OpenGrm SFst Library: Stochastic Finite-State Transducer Library
SFst is a library for normalizing, sampling, combining, and approximating
stochastic (or
probabilistic) finite-state transducers. These are weighted finite-state transducers, represented in
OpenFst library format, that have the following properties:
- normalized: the weights of the transitions (and any final weight) from each state sum to Weight::One()1
- canonical topology: there may be failure transitions (see phi-labeled transitions) but there is at most one such transition per state and there are no failure-transition cycles. The summation above follows the failure transitions through to actual transitions accumulating the weight.
For example, an n-gram model produced by the
OpenGrm NGram Library is a stochastic FST (provided the
phi_label
is specified to match the backoff label, typically 0, of the n-gram model), but many other topologies are possible.
The following operations are provided for SFSTs:
Operation |
Usage |
Description |
Complexity |
IsCanonical |
IsCanonical(fst, phi_label) |
checks the second property above holds for a weighted FST |
Time, Space: O(V + E) |
IsNormalized |
IsNormalized(fst, phi_label, delta) |
checks the above two properties hold for a weighted FST |
Time, Space: O(V + E) |
Normalize |
Normalize(&fst, phi_label, delta) |
normalizes a canonical weighted FST preserving total path weights (up to a global constant), where possible |
same as ShortestDistance |
|
sfstnormalize [-phi_label=$l][--delta=$d] in.fst out.fst |
|
|
PhiNormalize |
PhiNormalize(&fst, phi_label) |
normalizes a canonical weighted FST by only modifying the failure transitions, where possible |
Time, Space: O(V + E) |
|
sfstnormalize --phi_only [-phi_label=$l][--delta=$d] in.fst out.fst |
|
|
Randgen |
fst::Randgen(ifst, &ofst, fst::RandGenOptions<SFstArcSelector>(...)) |
randomly generates paths in a stochastic FST (correctly dealing with failure transitions) |
see Randgen |
|
sfstrandgen [--phi_label=$l] [--max_length=$l] [--npath=$n] [--seed=$s] in.fst out.fst |
|
|
1Computation is done internally with
Log64Weight. Conversion from the input weight type is done using a
WeightConvert
functor, pre-defined for common weight types like
TropicalWeight
and
LogWeight
.