TWiki> GRM Web>SFstLibrary (revision 3)EditAttach

OpenGrm SFst Library: Stochastic Finite-State Transducer Library

SFst is a library for normalizing, sampling, combining, and approximating stochastic (or probabilistic) finite-state transducers. These are weighted finite-state transducers, represented in OpenFst library format, that have the following two properties:

  1. normalized: the weights of the transitions (and any final weight) from each state sum to Weight::One()1
  2. canonical topology:
    • the states are sorted by input label
    • there may be failure transitions (see phi-labeled transitions) but
      • there is at most one such transition per state
      • there are no failure-transition cycles
      • let state s have a transition labeled by x: if there is a failure transition from s to s', then either s' also has a transition labeled by x or it is not possible to read x from s' even via failure transitions
      • the summation to admit normalized follows the failure transitions through to actual transitions accumulating the weight

For example, an n-gram model produced by the OpenGrm NGram Library is a stochastic FST (provided the phi_label is specified to match the backoff label, typically 0, of the n-gram model), but many other topologies are possible.

The following operations are provided for SFSTs:

Operation Usage Description Complexity
IsCanonical IsCanonical(fst, phi_label) checks the second property above holds for a weighted FST Time, Space: O(V + E)
IsNormalized IsNormalized(fst, phi_label, delta) checks the above two properties hold for a weighted FST Time, Space: O(V + E)
Normalize Normalize(&fst, phi_label, delta) normalizes a canonical weighted FST preserving total path weights (up to a global constant), where possible2 same as ShortestDistance
  sfstnormalize [-phi_label=$l][--delta=$d] in.fst out.fst    
PhiNormalize PhiNormalize(&fst, phi_label) normalizes a canonical weighted FST by only modifying the failure transitions, where possible Time, Space: O(V + E)
  sfstnormalize --phi_only [-phi_label=$l][--delta=$d] in.fst out.fst    
Randgen fst::Randgen(ifst, &ofst, fst::RandGenOptions<SFstArcSelector>(...)) randomly generates paths in a stochastic FST (correctly dealing with failure transitions) see Randgen
  sfstrandgen [--phi_label=$l] [--max_length=$l] [--npath=$n] [--seed=$s] in.fst out.fst    

1Computation is done internally with Log64Weight. Conversion from the input weight type is done using a WeightConvert functor, pre-defined for common weight types like TropicalWeight and LogWeight.
2Possible when the sum of all successful paths from the initial state is finite (and the input is trim).

Edit | Attach | Watch | Print version | History: r23 | r5 < r4 < r3 < r2 | Backlinks | Raw View | Raw edit | More topic actions...
Topic revision: r3 - 2016-04-06 - MichaelRiley
 
This site is powered by the TWiki collaboration platform Powered by PerlCopyright © 2008-2024 by the contributing authors. All material on this collaboration platform is the property of the contributing authors.
Ideas, requests, problems regarding TWiki? Send feedback