Difference: NGramMake (1 vs. 7)

Revision 7 (2012-03-08) - MichaelRiley

Line: 1 to 1
 
META TOPICPARENT name="NGramQuickTour"

NGramMake

Line: 48 to 48
 class NGramWittenBell ngram(StdMutableFst *countfst);
Added:
>
>
In addition to the simple C++ usage above, optional constructor arguments permit passing non-default values for various parameters, as with the command-line version.
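For illustration only, a hedged C++ sketch of such a call. The trailing constructor arguments shown here are assumptions modeled on the command-line flags (--backoff, --witten_bell_k); the actual parameter list and ordering should be taken from the library headers, not from this sketch.

StdMutableFst *counts = StdMutableFst::Read("earnest.cnts", true);
// Hypothetical signature: backoff and witten_bell_k are assumed to be
// optional constructor parameters mirroring the command-line flags.
NGramWittenBell ngram(counts, /*backoff=*/true, /*witten_bell_k=*/10.0);
ngram.MakeNGramModel();
ngram.GetFst().Write("earnest.wb.mod");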
 

Examples

To make a Kneser-Ney smoothed model from given counts:

Revision 6 (2012-03-04) - BrianRoark

Line: 1 to 1
 
META TOPICPARENT name="NGramQuickTour"

NGramMake

Line: 81 to 81
Ney, H., Essen, U., Kneser, R., 1994. On structuring probabilistic dependences in stochastic language modeling. Computer Speech and Language 8, 1–38.

Witten, I. H., Bell, T. C., 1991. The zero-frequency problem: Estimating the probabilities of novel events in adaptive text compression. IEEE Transactions on Information Theory 37 (4), 1085–1094.

Deleted:
<
<
-- MichaelRiley - 09 Dec 2011

Revision 5 (2011-12-16) - BrianRoark

Line: 1 to 1
 
META TOPICPARENT name="NGramQuickTour"

NGramMake

Line: 6 to 6
 This operation produces a smoothed, normalized language model from an input n-gram count FST. It smooths the model in one of six ways:
Changed:
<
<
  • witten_bell: smooths using Witten-Bell (cite), with a hyperparameter k, as presented in Carpenter (2005).
  • absolute: smooths based on Absolute Discounting (cite), using bins and discount parameters.
  • katz: smooths based on Katz Backoff (cite), using bins parameters.
  • kneser_ney: smooths based on Kneser-Ney (cite), a variant of Absolute Discounting.
>
>
  • witten_bell: smooths using Witten-Bell (Witten and Bell, 1991), with a hyperparameter k, as presented in Carpenter (2005).
  • absolute: smooths based on Absolute Discounting (Ney, Essen and Kneser, 1994), using bins and discount parameters.
  • katz: smooths based on Katz Backoff (Katz, 1987), using bins parameters.
  • kneser_ney: smooths based on Kneser-Ney (Kneser and Ney, 1995), a variant of Absolute Discounting.
 
  • presmoothed: normalizes at each state based on the n-gram count of the history.
  • unsmoothed: normalizes the model but provides no smoothing.
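
For example, any of these methods can be selected with the --method flag documented under Usage; a sketch using the count file from the examples below:

$ ngrammake --method=witten_bell --witten_bell_k=10 earnest.cnts >earnest.wb.mod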
Line: 49 to 49

Examples

Added:
>
>
To make a Kneser-Ney smoothed model from given counts:

$ ngrammake --method=kneser_ney earnest.cnts >earnest.kn.mod


StdMutableFst *counts = StdMutableFst::Read("in.fst", true);  // read the input count FST
NGramKneserNey ngram(counts);                                 // construct the smoothing object
ngram.MakeNGramModel();                                       // smooth and normalize the model
ngram.GetFst().Write("out.mod");                              // write the resulting model FST
 

Caveats

Line: 56 to 70
 

References

Added:
>
>
Carpenter, B., 2005. Scaling high-order character language models to gigabytes. In Proceedings of the ACL Workshop on Software, pages 86–99.

Chen, S., Goodman, J., 1998. An empirical study of smoothing techniques for language modeling. Technical Report TR-10-98, Harvard University.

Katz, S. M., 1987. Estimation of probabilities from sparse data for the language model component of a speech recognizer. IEEE Transactions on Acoustics, Speech, and Signal Processing 35 (3), 400–401.

Kneser, R., Ney, H., 1995. Improved backing-off for m-gram language modeling. In Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP). pp. 181–184.

 
Added:
>
>
Ney, H., Essen, U., Kneser, R., 1994. On structuring probabilistic dependences in stochastic language modeling. Computer Speech and Language 8, 1–38.
 
Added:
>
>
Witten, I. H., Bell, T. C., 1991. The zero-frequency problem: Estimating the probabilities of novel events in adaptive text compression. IEEE Transactions on Information Theory 37 (4), 1085–1094.
  -- MichaelRiley - 09 Dec 2011

Revision 4 (2011-12-15) - BrianRoark

Line: 1 to 1
 
META TOPICPARENT name="NGramQuickTour"

NGramMake

Description

Added:
>
>
This operation produces a smoothed, normalized language model from an input n-gram count FST. It smooths the model in one of six ways:

  • witten_bell: smooths using Witten-Bell (cite), with a hyperparameter k, as presented in Carpenter (2005).
  • absolute: smooths based on Absolute Discounting (cite), using bins and discount parameters.
  • katz: smooths based on Katz Backoff (cite), using bins parameters.
  • kneser_ney: smooths based on Kneser-Ney (cite), a variant of Absolute Discounting.
  • presmoothed: normalizes at each state based on the n-gram count of the history.
  • unsmoothed: normalizes the model but provides no smoothing.

See Chen and Goodman (1998) for a discussion of these smoothing methods.

All of the smoothing methods can be used to build either a mixture model (in which higher order n-gram distributions are interpolated with lower order n-gram distributions) or a backoff model (using the --backoff option, in which lower order n-gram distributions are used only if the higher order n-gram was unobserved in the corpus). Even though some of the methods are typically used primarily with either mixture or backoff smoothing (e.g., Katz with backoff), in this library each can be used with either. Note that mixture models are converted to a backoff topology by pre-summing the mixtures and placing the mixed probability on the highest order transition.
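For instance, a backoff (rather than mixture) Katz model can be requested with the documented --backoff flag; the file names here are illustrative:

$ ngrammake --method=katz --backoff earnest.cnts >earnest.katz.mod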

If the --bins option is left at the default (-1), then the number of bins for the discounting methods (katz, absolute, kneser_ney) is set to a method-appropriate default (5 for katz, 1 for absolute).
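As a sketch, a non-default number of bins can also be passed explicitly (the value 3 here is arbitrary, chosen only for illustration):

$ ngrammake --method=katz --bins=3 earnest.cnts >earnest.katz3.mod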

The C++ classes are all derived from the base class NGramMake.

 

Usage

Added:
>
>
ngrammake [--options] [in.fst [out.fst]]
  --method: type = string, one of: witten_bell (default) | absolute | katz | 
                                   kneser_ney | presmoothed | unsmoothed
  --backoff: type = bool, default = false
  --bins: type = int64, default = -1
  --witten_bell_k: type = double, default = 1.0
  --discount_D: type = double, default = 1.0
 
 class NGramAbsolute ngram(StdMutableFst *countfst);
 
 class NGramKatz ngram(StdMutableFst *countfst);
 
 class NGramKneserNey ngram(StdMutableFst *countfst);
 
 class NGramUnsmoothed ngram(StdMutableFst *countfst);
 
 class NGramWittenBell ngram(StdMutableFst *countfst);
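
By analogy with the Kneser-Ney example in revision 5 above, each of these classes follows the same call pattern; for instance, a Witten-Bell model (the pattern is taken from that example, with only the class name changed):

StdMutableFst *counts = StdMutableFst::Read("in.fst", true);   // read the input count FST
NGramWittenBell ngram(counts);                                 // default Witten-Bell smoothing
ngram.MakeNGramModel();                                        // smooth and normalize
ngram.GetFst().Write("out.mod");                               // write the model FST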
 
 

Examples

Caveats

Added:
>
>
The presmoothed method normalizes at each state based on the n-gram count of the history, which is only appropriate under specialized circumstances, such as when the counts have been derived from strings with backoff transitions indicated.
 

References

Revision 3 (2011-12-13) - MichaelRiley

Line: 1 to 1
 
META TOPICPARENT name="NGramQuickTour"

NGramMake

Line: 6 to 6
 

Usage

Added:
>
>

Examples

 

Caveats

References

Revision 2 (2011-12-10) - MichaelRiley

Line: 1 to 1
 
META TOPICPARENT name="NGramQuickTour"

NGramMake

Line: 6 to 6
 

Usage

Deleted:
<
<

Complexity

 

Caveats

References

Revision 1 (2011-12-09) - MichaelRiley

Line: 1 to 1
Added:
>
>
META TOPICPARENT name="NGramQuickTour"

NGramMake

Description

Usage

Complexity

Caveats

References

-- MichaelRiley - 09 Dec 2011

 