Difference: SFstLibrary (1 vs. 19)

Revision 192019-07-18 - MichaelRiley

Line: 1 to 1
 
META TOPICPARENT name="WebHome"

OpenGrm SFst Library: Stochastic Finite-State Transducer Library


Changed:
<
<
Red led OpenGrm SFst Version 1.0.0 is now available for download.
>
>
Red led OpenGrm SFst Version 1.1.0 is now available for download.
 

SFst is a library for normalizing, sampling, combining, and approximating stochastic (or probabilistic) finite-state transducers. These are weighted finite-state transducers, represented in OpenFst library format, that have two properties:

Revision 182018-07-17 - MichaelRiley

Line: 1 to 1
 
META TOPICPARENT name="WebHome"

OpenGrm SFst Library: Stochastic Finite-State Transducer Library

Changed:
<
<
SFst is a library for normalizing, sampling, combining, and approximating stochastic (or probabilistic) finite-state transducers. These are weighted finite-state transducers, represented in OpenFst library format, that have the following two properties:

  1. normalized: the weights of the paths into the future from each state sum to Weight::One()1
  2. canonical topology:
    • the states are sorted by input label
    • there may be failure transitions (see phi-labeled transitions) but
      • there is at most one such transition per state
      • there are no failure-transition (and/or epsilon-transition) cycles
        <--       * let state s have a transition labeled by x: if there is a failure transition from s to s', then either s' also has a transition labeled by x or it is not possible to read x from s' even via failure transitions -->
    • no assumption is made of general determinism or what transitions must be present on failure (unlike in a canonical n-gram model).

For example, an n-gram model produced by the OpenGrm NGram Library is a stochastic FST2 but many other topologies are possible.

>
>

Red led OpenGrm SFst Version 1.0.0 is now available for download.

SFst is a library for normalizing, sampling, combining, and approximating stochastic (or probabilistic) finite-state transducers. These are weighted finite-state transducers, represented in OpenFst library format, that have two properties:

  1. a canonical format that may include epsilon and failure transitions under specified constraints
  2. a normalized weight distribution that assigns a (negative log) probability to each path leaving a state.
An n-gram model produced by the OpenGrm NGram Library is a stochastic FST1 but many other topologies are possible.
 
Added:
>
>
 
Changed:
<
<
1Computation is done internally assuming the weights are negative log probabilities using Log64Weight. Conversion from the input weight type is done using a WeightConvert functor, pre-defined for common weight types like TropicalWeight and LogWeight.
2Provided the phi_label is specified to match the backoff label, typically 0, of the n-gram model.
>
>

1Provided the failure label (phi_label) is specified to match the backoff label, typically 0, of the n-gram model.
 

Revision 172018-07-17 - MichaelRiley

Line: 1 to 1
 
META TOPICPARENT name="WebHome"

OpenGrm SFst Library: Stochastic Finite-State Transducer Library

Line: 13 to 13
 
<--       * let state s have a transition labeled by x: if there is a failure transition from s to s', then either s' also has a transition labeled by x or it is not possible to read x from s' even via failure transitions -->
    • no assumption is made of general determinism or what transitions must be present on failure (unlike in a canonical n-gram model).
Changed:
<
<
For example, an n-gram model produced by the OpenGrm NGram Library is a stochastic FST (provided the phi_label is specified to match the backoff label, typically 0, of the n-gram model), but many other topologies are possible.
>
>
For example, an n-gram model produced by the OpenGrm NGram Library is a stochastic FST2 but many other topologies are possible.

1Computation is done internally assuming the weights are negative log probabilities using Log64Weight. Conversion from the input weight type is done using a WeightConvert functor, pre-defined for common weight types like TropicalWeight and LogWeight.
2Provided the phi_label is specified to match the backoff label, typically 0, of the n-gram model.

 
Deleted:
<
<
The following operations are provided for SFSTs:
 
Deleted:
<
<
Operation Usage Description Complexity
Condition Condition(fst, phi_label, delta) modifies input moving towards a globally normalizable FST, controlled by delta >= 0 Time, Space: O(V + E)
Approx Approx(ifst, &ofst, phi_label, delta) approximates a normalized stochastic FST wrt a provided backoff model same as ShortestDistance on the intersection of the input and output FSTs
  sfstapprox[--phi_label=$l][--delta=$d] in.fst backoff.fst out.fst    
IsCanonical IsCanonical(fst, phi_label) checks the second property above holds for a weighted FST Time, Space: O(V + E)
IsNormalized IsNormalized(fst, phi_label, delta) checks the above two properties hold for a weighted FST Time, Space: O(V + E)
GlobalNormalize GlobalNormalize(&fst, phi_label, delta) globally normalizes, when possible2, a canonical weighted FST preserving total path weights (up to a global constant) same as ShortestDistance
  sfstnormalize [--method=global] [--phi_label=$l][--delta=$d] in.fst out.fst    
LocalNormalize LocalNormalize(&fst) locally normalizes, when possible, a canonical weighted FST preserving each state's out-going arc weights up to a state-specific constant Time, Space: O(V + E)
  sfstnormalize -method=local in.fst out.fst    
NGramApprox NGramApprox(ifst, &ofst, order, phi_label, delta) approximates a normalized stochastic FST as an n-gram model (having phi_labels in OpenGrm NGram format) same as ShortestDistance on the intersection of the input and output FSTs
  sfstngramapprox [--order=$o][--phi_label=$l][--delta=$d] in.fst out.fst    
Perplexity Perplexity(fst, phi_label, unknown_label, unknown_class_size) computes perplexity for a stochastic FST  
  sfstperplexity [--phi_label=$l] [-unknown_label=$u][--unknown_class_size=$s] in.fst test.far (test sentences are in FST archive format)  
PhiNormalize PhiNormalize(&fst, phi_label) normalizes, when possible, a canonical weighted FST by only modifying the failure transitions Time, Space: O(V + E)
  sfstnormalize --method=phi [-phi_label=$l][--delta=$d] in.fst out.fst    
RandGen fst::RandGen(ifst, &ofst, fst::RandGenOptions<SFstArcSelector>(...)) randomly generates paths in a stochastic FST (correctly dealing with failure transitions) see RandGen
  sfstrandgen [--phi_label=$l] [--max_length=$l] [--npath=$n] [--seed=$s] in.fst out.fst    
Trim Trim(&fst, phi_label) Removes useless states and transitions in stochastic automata (irrespective of weights) Time, Space: O(V + E * max-phi-order)
  sfsttrim -phi_label=$l in.fst out.fst    
 
Deleted:
<
<
1Computation is done internally with Log64Weight. Conversion from the input weight type is done using a WeightConvert functor, pre-defined for common weight types like TropicalWeight and LogWeight.
2Possible when the sum of weight of all successful paths from the initial state is finite (and the input is trim).

Revision 162018-05-10 - MichaelRiley

Line: 1 to 1
 
META TOPICPARENT name="WebHome"

OpenGrm SFst Library: Stochastic Finite-State Transducer Library

Line: 11 to 11
 
      • there is at most one such transition per state
      • there are no failure-transition (and/or epsilon-transition) cycles
        <--       * let state s have a transition labeled by x: if there is a failure transition from s to s', then either s' also has a transition labeled by x or it is not possible to read x from s' even via failure transitions -->
Changed:
<
<
    • no assumption is made of general determinism or what transitions must be present on failure (unlike in a canonical n-gram model).
>
>
    • no assumption is made of general determinism or what transitions must be present on failure (unlike in a canonical n-gram model).
  For example, an n-gram model produced by the OpenGrm NGram Library is a stochastic FST (provided the phi_label is specified to match the backoff label, typically 0, of the n-gram model), but many other topologies are possible.
Line: 19 to 19
 
Operation Usage Description Complexity
Condition Condition(fst, phi_label, delta) modifies input moving towards a globally normalizable FST, controlled by delta >= 0 Time, Space: O(V + E)
Changed:
<
<
Approx Approx(ifst, &ofst, phi_label, backoff_label, delta) approximates a normalized stochastic FST (having phi_labels) wrt a provided backoff model (having backoff_labels) same as ShortestDistance on the intersection of the input and output FSTs
  sfstapprox[--phi_label=$l][--backoff_label=$j][--delta=$d] in.fst backoff.fst out.fst    
>
>
Approx Approx(ifst, &ofst, phi_label, delta) approximates a normalized stochastic FST wrt a provided backoff model same as ShortestDistance on the intersection of the input and output FSTs
  sfstapprox[--phi_label=$l][--delta=$d] in.fst backoff.fst out.fst    
 
IsCanonical IsCanonical(fst, phi_label) checks the second property above holds for a weighted FST Time, Space: O(V + E)
IsNormalized IsNormalized(fst, phi_label, delta) checks the above two properties hold for a weighted FST Time, Space: O(V + E)
GlobalNormalize GlobalNormalize(&fst, phi_label, delta) globally normalizes, when possible2, a canonical weighted FST preserving total path weights (up to a global constant) same as ShortestDistance
  sfstnormalize [--method=global] [--phi_label=$l][--delta=$d] in.fst out.fst    
LocalNormalize LocalNormalize(&fst) locally normalizes, when possible, a canonical weighted FST preserving each state's out-going arc weights up to a state-specific constant Time, Space: O(V + E)
  sfstnormalize -method=local in.fst out.fst    
Changed:
<
<
NGramApprox NGramApprox(ifst, &ofst, order, phi_label, backoff_label, delta) approximates a normalized stochastic FST (having phi_labels) as an n-gram model (having backoff_labels in OpenGrm NGram format) same as ShortestDistance on the intersection of the input and output FSTs
  sfstngramapprox [--order=$o][--phi_label=$l][--backoff_label=$j][--delta=$d] in.fst out.fst    
>
>
NGramApprox NGramApprox(ifst, &ofst, order, phi_label, delta) approximates a normalized stochastic FST as an n-gram model (having phi_labels in OpenGrm NGram format) same as ShortestDistance on the intersection of the input and output FSTs
  sfstngramapprox [--order=$o][--phi_label=$l][--delta=$d] in.fst out.fst    
 
Perplexity Perplexity(fst, phi_label, unknown_label, unknown_class_size) computes perplexity for a stochastic FST  
  sfstperplexity [--phi_label=$l] [-unknown_label=$u][--unknown_class_size=$s] in.fst test.far (test sentences are in FST archive format)  
PhiNormalize PhiNormalize(&fst, phi_label) normalizes, when possible, a canonical weighted FST by only modifying the failure transitions Time, Space: O(V + E)

Revision 152017-11-02 - MichaelRiley

Line: 1 to 1
 
META TOPICPARENT name="WebHome"

OpenGrm SFst Library: Stochastic Finite-State Transducer Library

Line: 23 to 23
 
  sfstapprox[--phi_label=$l][--backoff_label=$j][--delta=$d] in.fst backoff.fst out.fst    
IsCanonical IsCanonical(fst, phi_label) checks the second property above holds for a weighted FST Time, Space: O(V + E)
IsNormalized IsNormalized(fst, phi_label, delta) checks the above two properties hold for a weighted FST Time, Space: O(V + E)
Added:
>
>
GlobalNormalize GlobalNormalize(&fst, phi_label, delta) globally normalizes, when possible2, a canonical weighted FST preserving total path weights (up to a global constant) same as ShortestDistance
  sfstnormalize [--method=global] [--phi_label=$l][--delta=$d] in.fst out.fst    
 
LocalNormalize LocalNormalize(&fst) locally normalizes, when possible, a canonical weighted FST preserving each state's out-going arc weights up to a state-specific constant Time, Space: O(V + E)
  sfstnormalize -method=local in.fst out.fst    
NGramApprox NGramApprox(ifst, &ofst, order, phi_label, backoff_label, delta) approximates a normalized stochastic FST (having phi_labels) as an n-gram model (having backoff_labels in OpenGrm NGram format) same as ShortestDistance on the intersection of the input and output FSTs
  sfstngramapprox [--order=$o][--phi_label=$l][--backoff_label=$j][--delta=$d] in.fst out.fst    
Deleted:
<
<
Normalize Normalize(&fst, phi_label, delta) globally normalizes, when possible2, a canonical weighted FST preserving total path weights (up to a global constant) same as ShortestDistance
  sfstnormalize [--phi_label=$l][--delta=$d] in.fst out.fst    
 
Perplexity Perplexity(fst, phi_label, unknown_label, unknown_class_size) computes perplexity for a stochastic FST  
  sfstperplexity [--phi_label=$l] [-unknown_label=$u][--unknown_class_size=$s] in.fst test.far (test sentences are in FST archive format)  
PhiNormalize PhiNormalize(&fst, phi_label) normalizes, when possible, a canonical weighted FST by only modifying the failure transitions Time, Space: O(V + E)

Revision 142017-07-01 - MichaelRiley

Line: 1 to 1
 
META TOPICPARENT name="WebHome"

OpenGrm SFst Library: Stochastic Finite-State Transducer Library

Line: 35 to 35
 
  sfstnormalize --method=phi [-phi_label=$l][--delta=$d] in.fst out.fst    
RandGen fst::RandGen(ifst, &ofst, fst::RandGenOptions<SFstArcSelector>(...)) randomly generates paths in a stochastic FST (correctly dealing with failure transitions) see RandGen
  sfstrandgen [--phi_label=$l] [--max_length=$l] [--npath=$n] [--seed=$s] in.fst out.fst    
Changed:
<
<
Trim Trim(&fst, phi_label) Removes useless states and transitions in stochastic automata (irrespective of weights) Time: O(E log E * max-phi-order), Space: O(E)
>
>
Trim Trim(&fst, phi_label) Removes useless states and transitions in stochastic automata (irrespective of weights) Time, Space: O(V + E * max-phi-order)
 
  sfsttrim -phi_label=$l in.fst out.fst    

1Computation is done internally with Log64Weight. Conversion from the input weight type is done using a WeightConvert functor, pre-defined for common weight types like TropicalWeight and LogWeight.

Revision 132017-06-20 - MichaelRiley

Line: 1 to 1
 
META TOPICPARENT name="WebHome"

OpenGrm SFst Library: Stochastic Finite-State Transducer Library

Line: 19 to 19
 
Operation Usage Description Complexity
Condition Condition(fst, phi_label, delta) modifies input moving towards a globally normalizable FST, controlled by delta >= 0 Time, Space: O(V + E)
Changed:
<
<
  sfstnormalize -condition=$d ... in.fst out.fst (as a pre-conditioning step in normalization)  
>
>
Approx Approx(ifst, &ofst, phi_label, backoff_label, delta) approximates a normalized stochastic FST (having phi_labels) wrt a provided backoff model (having backoff_labels) same as ShortestDistance on the intersection of the input and output FSTs
  sfstapprox[--phi_label=$l][--backoff_label=$j][--delta=$d] in.fst backoff.fst out.fst    
 
IsCanonical IsCanonical(fst, phi_label) checks the second property above holds for a weighted FST Time, Space: O(V + E)
IsNormalized IsNormalized(fst, phi_label, delta) checks the above two properties hold for a weighted FST Time, Space: O(V + E)
LocalNormalize LocalNormalize(&fst) locally normalizes, when possible, a canonical weighted FST preserving each state's out-going arc weights up to a state-specific constant Time, Space: O(V + E)
  sfstnormalize -method=local in.fst out.fst    
Changed:
<
<
NGramApprox NGramApprox(ifst, &ofst, phi_label, backoff_label, delta) approximates a normalized stochastic FST (having phi_labels) as an n-gram model (having backoff_labels in OpenGrm NGram format) same as ShortestDistance on the intersection of the input and output FSTs
>
>
NGramApprox NGramApprox(ifst, &ofst, order, phi_label, backoff_label, delta) approximates a normalized stochastic FST (having phi_labels) as an n-gram model (having backoff_labels in OpenGrm NGram format) same as ShortestDistance on the intersection of the input and output FSTs
 
  sfstngramapprox [--order=$o][--phi_label=$l][--backoff_label=$j][--delta=$d] in.fst out.fst    
Normalize Normalize(&fst, phi_label, delta) globally normalizes, when possible2, a canonical weighted FST preserving total path weights (up to a global constant) same as ShortestDistance
  sfstnormalize [--phi_label=$l][--delta=$d] in.fst out.fst    
Line: 34 to 35
 
  sfstnormalize --method=phi [-phi_label=$l][--delta=$d] in.fst out.fst    
RandGen fst::RandGen(ifst, &ofst, fst::RandGenOptions<SFstArcSelector>(...)) randomly generates paths in a stochastic FST (correctly dealing with failure transitions) see RandGen
  sfstrandgen [--phi_label=$l] [--max_length=$l] [--npath=$n] [--seed=$s] in.fst out.fst    
Added:
>
>
Trim Trim(&fst, phi_label) Removes useless states and transitions in stochastic automata (irrespective of weights) Time: O(E log E * max-phi-order), Space: O(E)
  sfsttrim -phi_label=$l in.fst out.fst    
  1Computation is done internally with Log64Weight. Conversion from the input weight type is done using a WeightConvert functor, pre-defined for common weight types like TropicalWeight and LogWeight.
2Possible when the sum of weight of all successful paths from the initial state is finite (and the input is trim).

Revision 122017-06-03 - MichaelRiley

Line: 1 to 1
 
META TOPICPARENT name="WebHome"

OpenGrm SFst Library: Stochastic Finite-State Transducer Library

SFst is a library for normalizing, sampling, combining, and approximating stochastic (or probabilistic) finite-state transducers. These are weighted finite-state transducers, represented in OpenFst library format, that have the following two properties:

Changed:
<
<
  1. normalized: the weights of the transitions (and any final weight) from each state sum to Weight::One()1
>
>
  1. normalized: the weights of the paths into the future from each state sum to Weight::One()1
 
  1. canonical topology:
    • the states are sorted by input label
    • there may be failure transitions (see phi-labeled transitions) but
      • there is at most one such transition per state
      • there are no failure-transition (and/or epsilon-transition) cycles
        <--       * let state s have a transition labeled by x: if there is a failure transition from s to s', then either s' also has a transition labeled by x or it is not possible to read x from s' even via failure transitions -->
Deleted:
<
<
      • the summation to admit normalized follows the failure transitions through to actual transitions accumulating the weight
 
    • no assumption is made of general determinism or what transitions must be present on failure (unlike in a canonical n-gram model).

For example, an n-gram model produced by the OpenGrm NGram Library is a stochastic FST (provided the phi_label is specified to match the backoff label, typically 0, of the n-gram model), but many other topologies are possible.

Revision 112016-12-23 - MichaelRiley

Line: 1 to 1
 
META TOPICPARENT name="WebHome"

OpenGrm SFst Library: Stochastic Finite-State Transducer Library

Line: 9 to 9
 
    • the states are sorted by input label
    • there may be failure transitions (see phi-labeled transitions) but
      • there is at most one such transition per state
Changed:
<
<
      • there are no failure-transition cycles
>
>
      • there are no failure-transition (and/or epsilon-transition) cycles
 
<--       * let state s have a transition labeled by x: if there is a failure transition from s to s', then either s' also has a transition labeled by x or it is not possible to read x from s' even via failure transitions -->
      • the summation to admit normalized follows the failure transitions through to actual transitions accumulating the weight
    • no assumption is made of general determinism or what transitions must be present on failure (unlike in a canonical n-gram model).

Revision 102016-12-15 - MichaelRiley

Line: 1 to 1
 
META TOPICPARENT name="WebHome"

OpenGrm SFst Library: Stochastic Finite-State Transducer Library

Line: 19 to 19
 The following operations are provided for SFSTs:

Operation Usage Description Complexity
Added:
>
>
Condition Condition(fst, phi_label, delta) modifies input moving towards a globally normalizable FST, controlled by delta >= 0 Time, Space: O(V + E)
  sfstnormalize -condition=$d ... in.fst out.fst (as a pre-conditioning step in normalization)  
 
IsCanonical IsCanonical(fst, phi_label) checks the second property above holds for a weighted FST Time, Space: O(V + E)
IsNormalized IsNormalized(fst, phi_label, delta) checks the above two properties hold for a weighted FST Time, Space: O(V + E)
Added:
>
>
LocalNormalize LocalNormalize(&fst) locally normalizes, when possible, a canonical weighted FST preserving each state's out-going arc weights up to a state-specific constant Time, Space: O(V + E)
  sfstnormalize -method=local in.fst out.fst    
 
NGramApprox NGramApprox(ifst, &ofst, phi_label, backoff_label, delta) approximates a normalized stochastic FST (having phi_labels) as an n-gram model (having backoff_labels in OpenGrm NGram format) same as ShortestDistance on the intersection of the input and output FSTs
  sfstngramapprox [--order=$o][--phi_label=$l][--backoff_label=$j][--delta=$d] in.fst out.fst    
Changed:
<
<
Normalize Normalize(&fst, phi_label, delta) normalizes a canonical weighted FST preserving total path weights (up to a global constant), where possible2 same as ShortestDistance
>
>
Normalize Normalize(&fst, phi_label, delta) globally normalizes, when possible2, a canonical weighted FST preserving total path weights (up to a global constant) same as ShortestDistance
 
  sfstnormalize [--phi_label=$l][--delta=$d] in.fst out.fst    
Perplexity Perplexity(fst, phi_label, unknown_label, unknown_class_size) computes perplexity for a stochastic FST  
  sfstperplexity [--phi_label=$l] [-unknown_label=$u][--unknown_class_size=$s] in.fst test.far (test sentences are in FST archive format)  
Changed:
<
<
PhiNormalize PhiNormalize(&fst, phi_label) normalizes a canonical weighted FST by only modifying the failure transitions, where possible Time, Space: O(V + E)
  sfstnormalize --phi_only [-phi_label=$l][--delta=$d] in.fst out.fst    
>
>
PhiNormalize PhiNormalize(&fst, phi_label) normalizes, when possible, a canonical weighted FST by only modifying the failure transitions Time, Space: O(V + E)
  sfstnormalize --method=phi [-phi_label=$l][--delta=$d] in.fst out.fst    
 
RandGen fst::RandGen(ifst, &ofst, fst::RandGenOptions<SFstArcSelector>(...)) randomly generates paths in a stochastic FST (correctly dealing with failure transitions) see RandGen
  sfstrandgen [--phi_label=$l] [--max_length=$l] [--npath=$n] [--seed=$s] in.fst out.fst    

1Computation is done internally with Log64Weight. Conversion from the input weight type is done using a WeightConvert functor, pre-defined for common weight types like TropicalWeight and LogWeight.

Changed:
<
<

2Possible when the sum of all successful paths from the initial state is finite (and the input is trim).
>
>

2Possible when the sum of weight of all successful paths from the initial state is finite (and the input is trim).

Revision 92016-08-26 - MichaelRiley

Line: 1 to 1
 
META TOPICPARENT name="WebHome"

OpenGrm SFst Library: Stochastic Finite-State Transducer Library

Line: 14 to 14
 
      • the summation to admit normalized follows the failure transitions through to actual transitions accumulating the weight
    • no assumption is made of general determinism or what transitions must be present on failure (unlike in a canonical n-gram model).
Changed:
<
<
For example, an n-gram model produced by the OpenGrm NGram Library is a stochastic FST (provided the phi_label is specified to match the backoff label, typically 0, of the n-gram model), but many other topologies are possible.
>
>
For example, an n-gram model produced by the OpenGrm NGram Library is a stochastic FST (provided the phi_label is specified to match the backoff label, typically 0, of the n-gram model), but many other topologies are possible.
  The following operations are provided for SFSTs:

Operation Usage Description Complexity
IsCanonical IsCanonical(fst, phi_label) checks the second property above holds for a weighted FST Time, Space: O(V + E)
IsNormalized IsNormalized(fst, phi_label, delta) checks the above two properties hold for a weighted FST Time, Space: O(V + E)
Added:
>
>
NGramApprox NGramApprox(ifst, &ofst, phi_label, backoff_label, delta) approximates a normalized stochastic FST (having phi_labels) as an n-gram model (having backoff_labels in OpenGrm NGram format) same as ShortestDistance on the intersection of the input and output FSTs
  sfstngramapprox [--order=$o][--phi_label=$l][--backoff_label=$j][--delta=$d] in.fst out.fst    
 
Normalize Normalize(&fst, phi_label, delta) normalizes a canonical weighted FST preserving total path weights (up to a global constant), where possible2 same as ShortestDistance
  sfstnormalize [--phi_label=$l][--delta=$d] in.fst out.fst    
Perplexity Perplexity(fst, phi_label, unknown_label, unknown_class_size) computes perplexity for a stochastic FST  

Revision 82016-05-18 - MichaelRiley

Line: 1 to 1
 
META TOPICPARENT name="WebHome"
Changed:
<
<

OpenGrm SFst Library: Stochastic Finite-State Transducer Library

>
>

OpenGrm SFst Library: Stochastic Finite-State Transducer Library

  SFst is a library for normalizing, sampling, combining, and approximating stochastic (or probabilistic) finite-state transducers. These are weighted finite-state transducers, represented in OpenFst library format, that have the following two properties:

Revision 72016-05-17 - MichaelRiley

Line: 1 to 1
 
META TOPICPARENT name="WebHome"

OpenGrm SFst Library: Stochastic Finite-State Transducer Library

Line: 27 to 27
 
  sfstperplexity [--phi_label=$l] [-unknown_label=$u][--unknown_class_size=$s] in.fst test.far (test sentences are in FST archive format)  
PhiNormalize PhiNormalize(&fst, phi_label) normalizes a canonical weighted FST by only modifying the failure transitions, where possible Time, Space: O(V + E)
  sfstnormalize --phi_only [-phi_label=$l][--delta=$d] in.fst out.fst    
Changed:
<
<
RandGen fst::RandGen(ifst, &ofst, fst::RandGenOptions<SFstArcSelector>(...)) randomly generates paths in a stochastic FST (correctly dealing with failure transitions) see Randgen
>
>
RandGen fst::RandGen(ifst, &ofst, fst::RandGenOptions<SFstArcSelector>(...)) randomly generates paths in a stochastic FST (correctly dealing with failure transitions) see RandGen
 
  sfstrandgen [--phi_label=$l] [--max_length=$l] [--npath=$n] [--seed=$s] in.fst out.fst    

1Computation is done internally with Log64Weight. Conversion from the input weight type is done using a WeightConvert functor, pre-defined for common weight types like TropicalWeight and LogWeight.

Revision 62016-05-14 - MichaelRiley

Line: 1 to 1
 
META TOPICPARENT name="WebHome"

OpenGrm SFst Library: Stochastic Finite-State Transducer Library

Line: 22 to 22
 
IsCanonical IsCanonical(fst, phi_label) checks the second property above holds for a weighted FST Time, Space: O(V + E)
IsNormalized IsNormalized(fst, phi_label, delta) checks the above two properties hold for a weighted FST Time, Space: O(V + E)
Normalize Normalize(&fst, phi_label, delta) normalizes a canonical weighted FST preserving total path weights (up to a global constant), where possible2 same as ShortestDistance
Changed:
<
<
  sfstnormalize [-phi_label=$l][--delta=$d] in.fst out.fst    
>
>
  sfstnormalize [--phi_label=$l][--delta=$d] in.fst out.fst    
Perplexity Perplexity(fst, phi_label, unknown_label, unknown_class_size) computes perplexity for a stochastic FST  
  sfstperplexity [--phi_label=$l] [-unknown_label=$u][--unknown_class_size=$s] in.fst test.far (test sentences are in FST archive format)  
 
PhiNormalize PhiNormalize(&fst, phi_label) normalizes a canonical weighted FST by only modifying the failure transitions, where possible Time, Space: O(V + E)
  sfstnormalize --phi_only [-phi_label=$l][--delta=$d] in.fst out.fst    
Changed:
<
<
Randgen fst::Randgen(ifst, &ofst, fst::RandGenOptions<SFstArcSelector>(...)) randomly generates paths in a stochastic FST (correctly dealing with failure transitions) see Randgen
>
>
RandGen fst::RandGen(ifst, &ofst, fst::RandGenOptions<SFstArcSelector>(...)) randomly generates paths in a stochastic FST (correctly dealing with failure transitions) see Randgen
 
  sfstrandgen [--phi_label=$l] [--max_length=$l] [--npath=$n] [--seed=$s] in.fst out.fst    

1Computation is done internally with Log64Weight. Conversion from the input weight type is done using a WeightConvert functor, pre-defined for common weight types like TropicalWeight and LogWeight.

Revision 52016-05-11 - MichaelRiley

Line: 1 to 1
 
META TOPICPARENT name="WebHome"

OpenGrm SFst Library: Stochastic Finite-State Transducer Library

Line: 12 to 12
 
      • there are no failure-transition cycles
        <--       * let state s have a transition labeled by x: if there is a failure transition from s to s', then either s' also has a transition labeled by x or it is not possible to read x from s' even via failure transitions -->
      • the summation to admit normalized follows the failure transitions through to actual transitions accumulating the weight
Added:
>
>
    • no assumption is made of general determinism or what transitions must be present on failure (unlike in a canonical n-gram model).
  For example, an n-gram model produced by the OpenGrm NGram Library is a stochastic FST (provided the phi_label is specified to match the backoff label, typically 0, of the n-gram model), but many other topologies are possible.

Revision 42016-04-06 - MichaelRiley

Line: 1 to 1
 
META TOPICPARENT name="WebHome"

OpenGrm SFst Library: Stochastic Finite-State Transducer Library

Line: 10 to 10
 
    • there may be failure transitions (see phi-labeled transitions) but
      • there is at most one such transition per state
      • there are no failure-transition cycles
Changed:
<
<
      • let state s have a transition labeled by x: if there is a failure transition from s to s', then either s' also has a transition labeled by x or it is not possible to read x from s' even via failure transitions
>
>
<--       * let state s have a transition labeled by x: if there is a failure transition from s to s', then either s' also has a transition labeled by x or it is not possible to read x from s' even via failure transitions -->
 
      • the summation to admit normalized follows the failure transitions through to actual transitions accumulating the weight

For example, an n-gram model produced by the OpenGrm NGram Library is a stochastic FST (provided the phi_label is specified to match the backoff label, typically 0, of the n-gram model), but many other topologies are possible.

Revision 32016-04-06 - MichaelRiley

Line: 1 to 1
 
META TOPICPARENT name="WebHome"

OpenGrm SFst Library: Stochastic Finite-State Transducer Library

Changed:
<
<
SFst is a library for normalizing, sampling, combining, and approximating stochastic (or probabilistic) finite-state transducers. These are weighted finite-state transducers, represented in OpenFst library format, that have the following properties:
>
>
SFst is a library for normalizing, sampling, combining, and approximating stochastic (or probabilistic) finite-state transducers. These are weighted finite-state transducers, represented in OpenFst library format, that have the following two properties:
 
Changed:
<
<
  • normalized: the weights of the transitions (and any final weight) from each state sum to Weight::One()1
  • canonical topology: there may be failure transitions (see phi-labeled transitions) but there is at most one such transition per state and there are no failure-transition cycles. The summation above follows the failure transitions through to actual transitions accumulating the weight.
>
>
  1. normalized: the weights of the transitions (and any final weight) from each state sum to Weight::One()1
  2. canonical topology:
    • the states are sorted by input label
    • there may be failure transitions (see phi-labeled transitions) but
      • there is at most one such transition per state
      • there are no failure-transition cycles
      • let state s have a transition labeled by x: if there is a failure transition from s to s', then either s' also has a transition labeled by x or it is not possible to read x from s' even via failure transitions
      • the summation to admit normalized follows the failure transitions through to actual transitions accumulating the weight
  For example, an n-gram model produced by the OpenGrm NGram Library is a stochastic FST (provided the phi_label is specified to match the backoff label, typically 0, of the n-gram model), but many other topologies are possible.

Revision 22016-03-30 - MichaelRiley

Line: 1 to 1
 
META TOPICPARENT name="WebHome"

OpenGrm SFst Library: Stochastic Finite-State Transducer Library

Line: 14 to 14
 
Operation Usage Description Complexity
IsCanonical IsCanonical(fst, phi_label) checks the second property above holds for a weighted FST Time, Space: O(V + E)
IsNormalized IsNormalized(fst, phi_label, delta) checks the above two properties hold for a weighted FST Time, Space: O(V + E)
Changed:
<
<
Normalize Normalize(&fst, phi_label, delta) normalizes a canonical weighted FST preserving total path weights (up to a global constant), where possible same as ShortestDistance
>
>
Normalize Normalize(&fst, phi_label, delta) normalizes a canonical weighted FST preserving total path weights (up to a global constant), where possible2 same as ShortestDistance
 
  sfstnormalize [-phi_label=$l][--delta=$d] in.fst out.fst    
PhiNormalize PhiNormalize(&fst, phi_label) normalizes a canonical weighted FST by only modifying the failure transitions, where possible Time, Space: O(V + E)
  sfstnormalize --phi_only [-phi_label=$l][--delta=$d] in.fst out.fst    
Line: 22 to 22
 
  sfstrandgen [--phi_label=$l] [--max_length=$l] [--npath=$n] [--seed=$s] in.fst out.fst    

1Computation is done internally with Log64Weight. Conversion from the input weight type is done using a WeightConvert functor, pre-defined for common weight types like TropicalWeight and LogWeight.

Added:
>
>

2Possible when the sum of all successful paths from the initial state is finite (and the input is trim).

Revision 12016-03-29 - MichaelRiley

Line: 1 to 1
Added:
>
>
META TOPICPARENT name="WebHome"

OpenGrm SFst Library: Stochastic Finite-State Transducer Library

SFst is a library for normalizing, sampling, combining, and approximating stochastic (or probabilistic) finite-state transducers. These are weighted finite-state transducers, represented in OpenFst library format, that have the following properties:

  • normalized: the weights of the transitions (and any final weight) from each state sum to Weight::One()1
  • canonical topology: there may be failure transitions (see phi-labeled transitions) but there is at most one such transition per state and there are no failure-transition cycles. The summation above follows the failure transitions through to actual transitions accumulating the weight.

For example, an n-gram model produced by the OpenGrm NGram Library is a stochastic FST (provided the phi_label is specified to match the backoff label, typically 0, of the n-gram model), but many other topologies are possible.

The following operations are provided for SFSTs:

Operation Usage Description Complexity
IsCanonical IsCanonical(fst, phi_label) checks the second property above holds for a weighted FST Time, Space: O(V + E)
IsNormalized IsNormalized(fst, phi_label, delta) checks the above two properties hold for a weighted FST Time, Space: O(V + E)
Normalize Normalize(&fst, phi_label, delta) normalizes a canonical weighted FST preserving total path weights (up to a global constant), where possible same as ShortestDistance
  sfstnormalize [-phi_label=$l][--delta=$d] in.fst out.fst    
PhiNormalize PhiNormalize(&fst, phi_label) normalizes a canonical weighted FST by only modifying the failure transitions, where possible Time, Space: O(V + E)
  sfstnormalize --phi_only [-phi_label=$l][--delta=$d] in.fst out.fst    
Randgen fst::Randgen(ifst, &ofst, fst::RandGenOptions<SFstArcSelector>(...)) randomly generates paths in a stochastic FST (correctly dealing with failure transitions) see Randgen
  sfstrandgen [--phi_label=$l] [--max_length=$l] [--npath=$n] [--seed=$s] in.fst out.fst    

1Computation is done internally with Log64Weight. Conversion from the input weight type is done using a WeightConvert functor, pre-defined for common weight types like TropicalWeight and LogWeight.

 
This site is powered by the TWiki collaboration platform Powered by PerlCopyright © 2008-2019 by the contributing authors. All material on this collaboration platform is the property of the contributing authors.
Ideas, requests, problems regarding TWiki? Send feedback