TWiki> GRM Web>SFstLibrary (revision 1)EditAttach

OpenGrm SFst Library: Stochastic Finite-State Transducer Library

SFst is a library for normalizing, sampling, combining, and approximating stochastic (or probabilistic) finite-state transducers. These are weighted finite-state transducers, represented in OpenFst library format, that have the following properties:

  • normalized: the weights of the transitions (and any final weight) from each state sum to Weight::One()1
  • canonical topology: there may be failure transitions (see phi-labeled transitions) but there is at most one such transition per state and there are no failure-transition cycles. The summation above follows the failure transitions through to actual transitions accumulating the weight.

For example, an n-gram model produced by the OpenGrm NGram Library is a stochastic FST (provided the phi_label is specified to match the backoff label, typically 0, of the n-gram model), but many other topologies are possible.

The following operations are provided for SFSTs:

Operation Usage Description Complexity
IsCanonical IsCanonical(fst, phi_label) checks the second property above holds for a weighted FST Time, Space: O(V + E)
IsNormalized IsNormalized(fst, phi_label, delta) checks the above two properties hold for a weighted FST Time, Space: O(V + E)
Normalize Normalize(&fst, phi_label, delta) normalizes a canonical weighted FST preserving total path weights (up to a global constant), where possible same as ShortestDistance
  sfstnormalize [-phi_label=$l][--delta=$d] in.fst out.fst    
PhiNormalize PhiNormalize(&fst, phi_label) normalizes a canonical weighted FST by only modifying the failure transitions, where possible Time, Space: O(V + E)
  sfstnormalize --phi_only [-phi_label=$l][--delta=$d] in.fst out.fst    
Randgen fst::Randgen(ifst, &ofst, fst::RandGenOptions<SFstArcSelector>(...)) randomly generates paths in a stochastic FST (correctly dealing with failure transitions) see Randgen
  sfstrandgen [--phi_label=$l] [--max_length=$l] [--npath=$n] [--seed=$s] in.fst out.fst    

1Computation is done internally with Log64Weight. Conversion from the input weight type is done using a WeightConvert functor, pre-defined for common weight types like TropicalWeight and LogWeight.

Edit | Attach | Watch | Print version | History: r23 | r4 < r3 < r2 < r1 | Backlinks | Raw View | Raw edit | More topic actions...
Topic revision: r1 - 2016-03-29 - MichaelRiley
 
This site is powered by the TWiki collaboration platform Powered by PerlCopyright © 2008-2024 by the contributing authors. All material on this collaboration platform is the property of the contributing authors.
Ideas, requests, problems regarding TWiki? Send feedback