The following operations are provided for SFSTs. Care must be taken that the input FSTs meet the specified requirements (e.g. canonical, backoff-complete or normalized). The binary commands typically check their input requirements are satisfied or raise an error but the C++ versions may not check for efficiency (see the source code documentation for specific cases).

Operation | Usage | Description | Complexity |
---|---|---|---|

sfstperplexity [--phi_label=$l] [--unknown_label=$u] q.fst [p.{fst,far}] | (p.far is in FST archive format) | ||

Topology | sfstopology [--method=ngram] [--phi_label=$l] in.fst out.fst | algorithms for constructing specific FST topologies | Time, Space: O(V + E) |

NGramApprox | NGramApprox(ifst, &ofst, order, phi_label, delta) | approximates a normalized stochastic FST as an n-gram model (having `phi_labels` in OpenGrm NGram format) |
same as ShortestDistance on the intersection of the input and output FSTs |

Approx | Approx(ifst, &backoff_fst, phi_label, delta) | approximates a normalized stochastic FST wrt a provided backoff-complete FST | same as ShortestDistance on the intersection of the input and output FSTs |

IsCanonical | IsCanonical(fst, phi_label) | checks the second property here holds for a weighted FST | Time, Space: O(V + E) |

IsNormalized | IsNormalized(fst, phi_label, delta) | checks the two properties here hold for a weighted FST | Time, Space: O(V + E) |

Perplexity | Perplexity perp(qfst, phi_label, unknown_label) [perp.SetTarget(pfst)] perp.GetPerplexity() |
computes self/cross perplexity for stochastic FSTs | same as ShortestDistance on the intersection of the source and target FSTs |

ShortestDistance | ShortestDistance(fst, &distance, phi_label, reverse, delta) | computes the shortest distance in the presence of failure transitions | same as ShortestDistance |

Count | Counter counter.Count(infst) |
counts from stochastic FST wrt to a provided backoff-complete FST | same as ShortestDistance on the intersection of the input and output FSTs |

GlobalNormalize | GlobalNormalize(&fst, phi_label, delta) | globally normalizes, when possible^{1}, a canonical weighted FST preserving total path weights (up to a global constant) |
same as ShortestDistance |

Intersect | Intersect(ifst1, ifst2, &ofst, phi_label, trim) | intersects FSAs in the presence of failure transitions | Time^{2}: O(E_{1}V_{2}(max-label-multiplicity_{2} + max-phi-order_{2} log(max-out-degree_{2})) |

LocalNormalize | LocalNormalize(&fst) | locally normalizes, when possible, a canonical weighted FST preserving each state's out-going arc weights up to a state-specific constant | Time, Space: O(V + E) |

sfstapprox[--phi_label=$l][--delta=$d] in.fst backoff.fst out.fst | |||

sfstcount [--phi_label=$l] in.fst top.fst out.fst | |||

sfstnormalize -method={kl_min,summed} [--phi_label=$l] in.fst out.fst | |||

sfstnormalize [--method=global] [--phi_label=$l][--delta=$d] in.fst out.fst | |||

sfstintersect [--trim] [--phi_label=$l] in1.fst in2.fst out.fst | |||

sfstnormalize -method=local [--phi_label=$l] in.fst out.fst | |||

sfstngramapprox [--order=$o][--phi_label=$l][--delta=$d] in.fst out.fst | |||

sfstnormalize --method=phi [--phi_label=$l][--delta=$d] in.fst out.fst | |||

sfstrandgen [--phi_label=$l] [--max_length=$l] [--npath=$n] [--seed=$s] in.fst out.fst | |||

sfstshorttestdistance [--phi_label=$l][--reversse][--delta=$d] in.fst | |||

sfsttrim[- -phi_label=$l] in.fst out.fst | |||

CountNormalize | CountNormalize(&fst) | normalizes a count FST (e.g. as output by Count()) | Time: O(sE) where s is max iterations per state, Space: O(V) |

PhiNormalize | PhiNormalize(&fst, phi_label) | normalizes, when possible, a canonical weighted FST by only modifying the failure transitions | Time, Space: O(V + E) |

Info | sfstinfo [--phi_label=$l] in.fst | prints out information about a stochastic FST | Time, Space: O(V + E * max-phi-order) |

RandGen | fst::RandGen(ifst, &ofst, fst::RandGenOptions<SFstArcSelector |
randomly generates paths in a stochastic FST (correctly dealing with failure transitions) | see RandGen |

Trim | Trim(&fst, phi_label) | removes useless states and transitions in stochastic automata (irrespective of weights) | Time, Space: O(V + E * max-phi-order) |

^{2}Assumes for each state (s1, s2) in the output, the out-degree of state s1 in FST1 is less than state s2 in FST2; otherwise the term for that state's contribution swaps s1 and s2.

This topic: GRM > WebHome > SFstLibrary > SFstAvailableOperations

Topic revision: r8 - 2020-07-06 - MichaelRiley

Copyright © 2008-2022 by the contributing authors. All material on this collaboration platform is the property of the contributing authors.

Ideas, requests, problems regarding TWiki? Send feedback

Ideas, requests, problems regarding TWiki? Send feedback