
META TOPICPARENT 
name="SFstLibrary" 
OpenGrm SFst Available Operations
The following operations are provided for SFSTs. Care must be taken that the input FSTs meet the specified requirements (e.g. canonical, backoffcomplete or normalized).
The binary commands typically check their input requirements are satisfied or raise an error but the C++ versions may not check for efficiency (see the source code documentation for specific cases).


< < 

> > 




< < 
IsCanonical 
IsCanonical(fst, phi_label) 
checks the second property here holds for a weighted FST 
Time, Space: O(V + E) 
IsNormalized 
IsNormalized(fst, phi_label, delta) 
checks the two properties here hold for a weighted FST 
Time, Space: O(V + E) 

> > 
CountNormalize 
CountNormalize(&fst) 
normalizes a count FST (e.g. as output by Count()) 
Time, Space: 

sfstnormalize method={kl_min,summed} in.fst out.fst 




GlobalNormalize 
GlobalNormalize(&fst, phi_label, delta) 
globally normalizes, when possible^{1}, a canonical weighted FST preserving total path weights (up to a global constant) 
same as ShortestDistance 

sfstnormalize [method=global] [phi_label=$l][delta=$d] in.fst out.fst 




> > 
Info 
sfstinfo 
prints out information about a stochastic FST 
Time, Space: O(V + E * maxphiorder) 
Intersect 
Intersect() 
intersects two canonical stochastic FSAs 
Time^{2}: O(E_{1}V_{2}(maxlabelmultiplicity_{2} + maxphiorder_{2} log(maxoutdegree_{2})) 

sfstintersect 


IsCanonical 
IsCanonical(fst, phi_label) 
checks the second property here holds for a weighted FST 
Time, Space: O(V + E) 
IsNormalized 
IsNormalized(fst, phi_label, delta) 
checks the two properties here hold for a weighted FST 
Time, Space: O(V + E) 


LocalNormalize 
LocalNormalize(&fst) 
locally normalizes, when possible, a canonical weighted FST preserving each state's outgoing arc weights up to a statespecific constant 
Time, Space: O(V + E) 

sfstnormalize method=local in.fst out.fst 


NGramApprox 
NGramApprox(ifst, &ofst, order, phi_label, delta) 
approximates a normalized stochastic FST as an ngram model (having phi_labels in OpenGrm NGram format) 
same as ShortestDistance on the intersection of the input and output FSTs 

sfstngramapprox [order=$o][phi_label=$l][delta=$d] in.fst out.fst 




< < 
Perplexity 
Perplexity(fst, phi_label, unknown_label, unknown_class_size) 
computes perplexity for a stochastic FST 


> > 
Perplexity 
Perplexity(fst, phi_label, unknown_label, unknown_class_size) 
computes perplexity for a stochastic FST 
Time, Space: 



sfstperplexity [phi_label=$l] [unknown_label=$u][unknown_class_size=$s] in.fst test.far 
(test sentences are in FST archive format) 

PhiNormalize 
PhiNormalize(&fst, phi_label) 
normalizes, when possible, a canonical weighted FST by only modifying the failure transitions 
Time, Space: O(V + E) 

sfstnormalize method=phi [phi_label=$l][delta=$d] in.fst out.fst 


RandGen 
fst::RandGen(ifst, &ofst, fst::RandGenOptions<SFstArcSelector>(...)) 
randomly generates paths in a stochastic FST (correctly dealing with failure transitions) 
see RandGen 

sfstrandgen [phi_label=$l] [max_length=$l] [npath=$n] [seed=$s] in.fst out.fst 




> > 
ShortestDistance 
ShortestDistance() 
computes the shortest distance in the presence of failure transitions 
same as ShortestDistance 

sfstshorttestdistance 


Topology 
Topology() 
algorithms for constructing specific FST topologies 
Time, Space: O(V + E) 

sfsttopology 




Trim 
Trim(&fst, phi_label) 
removes useless states and transitions in stochastic automata (irrespective of weights) 
Time, Space: O(V + E * maxphiorder) 

sfsttrim phi_label=$l in.fst out.fst 




> >  

^{1}Possible when the sum of weight of all successful paths from the initial state is finite (and the input is trim). 

< <  Michael Riley  20180716 
> >  ^{2}Assumes for each state (s1, s2) in the output, the outdegree of state s1 in FST1 is less than state s2 in FST2; otherwise the term for that state's contribution swaps s1 and s2. 