The following operations are provided for SFSTs. Care must be taken that the input FSTs meet the specified requirements (e.g.
).
The binary commands typically check their input requirements are satisfied or raise an error but the C++ versions may not check for efficiency (see the source code documentation for specific cases).
Operation |
Usage |
Description |
Complexity |
Trim |
Trim(&fst, phi_label) |
removes useless states and transitions in stochastic automata (irrespective of weights) |
Time, Space: O(V + E * max-phi-order) |
RandGen |
fst::RandGen(ifst, &ofst, fst::RandGenOptions<SFstArcSelector>(...)) |
randomly generates paths in a stochastic FST (correctly dealing with failure transitions) |
see RandGen |
Info |
sfstinfo |
prints out information about a stochastic FST |
Time, Space: O(V + E * max-phi-order) |
PhiNormalize |
PhiNormalize(&fst, phi_label) |
normalizes, when possible, a canonical weighted FST by only modifying the failure transitions |
Time, Space: O(V + E) |
CountNormalize |
CountNormalize(&fst) |
normalizes a count FST (e.g. as output by Count()) |
Time, Space: |
|
sfstapprox[--phi_label=$l][--delta=$d] in.fst backoff.fst out.fst |
|
|
|
sfstcount |
|
|
|
sfstnormalize -method={kl_min,summed} in.fst out.fst |
|
|
|
sfstnormalize [--method=global] [--phi_label=$l][--delta=$d] in.fst out.fst |
|
|
|
sfstintersect |
|
|
|
sfstnormalize -method=local in.fst out.fst |
|
|
|
sfstngramapprox [--order=$o][--phi_label=$l][--delta=$d] in.fst out.fst |
|
|
|
sfstnormalize --method=phi [-phi_label=$l][--delta=$d] in.fst out.fst |
|
|
|
sfstrandgen [--phi_label=$l] [--max_length=$l] [--npath=$n] [--seed=$s] in.fst out.fst |
|
|
|
sfstshorttestdistance |
|
|
|
sfsttopology |
|
|
|
sfsttrim -phi_label=$l in.fst out.fst |
|
|
LocalNormalize |
LocalNormalize(&fst) |
locally normalizes, when possible, a canonical weighted FST preserving each state's out-going arc weights up to a state-specific constant |
Time, Space: O(V + E) |
Intersect |
Intersect() |
intersects two canonical stochastic FSAs |
Time2: O(E1V2(max-label-multiplicity2 + max-phi-order2 log(max-out-degree2)) |
GlobalNormalize |
GlobalNormalize(&fst, phi_label, delta) |
globally normalizes, when possible1, a canonical weighted FST preserving total path weights (up to a global constant) |
same as ShortestDistance |
Count |
Count() |
counts from stochastic FST wrt to a provided backoff-complete FST |
same as ShortestDistance on the intersection of the input and output FSTs |
ShortestDistance |
ShortestDistance() |
computes the shortest distance in the presence of failure transitions |
same as ShortestDistance |
Perplexity |
Perplexity(fst, phi_label, unknown_label, unknown_class_size) |
computes perplexity for a stochastic FST |
Time, Space: |
IsNormalized |
IsNormalized(fst, phi_label, delta) |
checks the two properties here hold for a weighted FST |
Time, Space: O(V + E) |
IsCanonical |
IsCanonical(fst, phi_label) |
checks the second property here holds for a weighted FST |
Time, Space: O(V + E) |
Approx |
Approx(ifst, &backoff_fst, phi_label, delta) |
approximates a normalized stochastic FST wrt a provided backoff-complete FST |
same as ShortestDistance on the intersection of the input and output FSTs |
NGramApprox |
NGramApprox(ifst, &ofst, order, phi_label, delta) |
approximates a normalized stochastic FST as an n-gram model (having phi_labels in OpenGrm NGram format) |
same as ShortestDistance on the intersection of the input and output FSTs |
Topology |
Topology() |
algorithms for constructing specific FST topologies |
Time, Space: O(V + E) |
|
sfstperplexity [--phi_label=$l] [-unknown_label=$u][--unknown_class_size=$s] in.fst test.far |
(test sentences are in FST archive format) |
|
Possible when the sum of weight of all successful paths from the initial state is finite (and the input is
).
Assumes for each state (s1, s2) in the output, the out-degree of state s1 in FST1 is less than state s2 in FST2; otherwise the term for that state's contribution swaps s1 and s2.