Difference: NGramSuggests (1 vs. 18)

Revision 18 2012-03-08 - BrianRoark

Line: 1 to 1
 
META TOPICPARENT name="NGramLibrary"

OpenGrm NGram Library Suggestions

  1. Improve memory usage of ngramread - uses 14 gb for 1.5gb input (Brian)
Changed:
<
<
  1. support for printing pair LM in a way that facilitates building of transducer
>
>
  1. Counting from cyclic FSTs (Cyril)
  2. support for printing pair LM in a way that facilitates building of transducer
 

META TOPICMOVED by="MichaelRiley" date="1296788058" from="GRM.GrmSuggests" to="GRM.NGramSuggests"

Revision 17 2012-03-08 - BrianRoark

Line: 1 to 1
 
META TOPICPARENT name="NGramLibrary"

OpenGrm NGram Library Suggestions

  1. Improve memory usage of ngramread - uses 14 gb for 1.5gb input (Brian)
Changed:
<
<
  1. Methods for providing probability mass to OOV, e.g., use good-turing
  2. counting from cyclic (Cyril).
  3. support for printing pair LM in a way that facilitates building of transducer
>
>
  1. support for printing pair LM in a way that facilitates building of transducer
 
Deleted:
<
<
-- MichaelRiley - 05 Nov 2010
 
META TOPICMOVED by="MichaelRiley" date="1296788058" from="GRM.GrmSuggests" to="GRM.NGramSuggests"

Revision 16 2012-03-08 - MichaelRiley

Line: 1 to 1
 
META TOPICPARENT name="NGramLibrary"

OpenGrm NGram Library Suggestions

  1. Improve memory usage of ngramread - uses 14 gb for 1.5gb input (Brian)
  2. Methods for providing probability mass to OOV, e.g., use good-turing
Changed:
<
<
  1. counting from cyclic (Cyril)
  2. shift VectorFst to MutableFst where possible (Cyril)
  3. create full purpose, efficient ngramrandgen (Michael) - done.
  4. change ngramrandcorput to output a far rather than text. - done (in ngramrandgen)
>
>
  1. counting from cyclic (Cyril).
 
  1. support for printing pair LM in a way that facilitates building of transducer

-- MichaelRiley - 05 Nov 2010

Revision 15 2011-12-12 - BrianRoark

Line: 1 to 1
 
META TOPICPARENT name="NGramLibrary"

OpenGrm NGram Library Suggestions

Line: 6 to 6
 
  1. Methods for providing probability mass to OOV, e.g., use good-turing
  2. counting from cyclic (Cyril)
  3. shift VectorFst to MutableFst where possible (Cyril)
Changed:
<
<
  1. create full purpose, efficient ngramrandgen (Michael)
  2. change ngramrandcorput to output a far rather than text.
>
>
  1. create full purpose, efficient ngramrandgen (Michael) - done.
  2. change ngramrandcorput to output a far rather than text. - done (in ngramrandgen)
 
  1. support for printing pair LM in a way that facilitates building of transducer

-- MichaelRiley - 05 Nov 2010

Revision 14 2011-12-11 - BrianRoark

Line: 1 to 1
 
META TOPICPARENT name="NGramLibrary"

OpenGrm NGram Library Suggestions

Line: 8 to 8
 
  1. shift VectorFst to MutableFst where possible (Cyril)
  2. create full purpose, efficient ngramrandgen (Michael)
  3. change ngramrandcorput to output a far rather than text.
Added:
>
>
  1. support for printing pair LM in a way that facilitates building of transducer
  -- MichaelRiley - 05 Nov 2010

Revision 13 2011-12-06 - MichaelRiley

Line: 1 to 1
 
META TOPICPARENT name="NGramLibrary"

OpenGrm NGram Library Suggestions

Line: 7 to 7
 
  1. counting from cyclic (Cyril)
  2. shift VectorFst to MutableFst where possible (Cyril)
  3. create full purpose, efficient ngramrandgen (Michael)
Added:
>
>
  1. change ngramrandcorput to output a far rather than text.
  -- MichaelRiley - 05 Nov 2010

Revision 12 2011-12-02 - BrianRoark

Line: 1 to 1
 
META TOPICPARENT name="NGramLibrary"

OpenGrm NGram Library Suggestions

  1. Improve memory usage of ngramread - uses 14 gb for 1.5gb input (Brian)
Changed:
<
<
  1. random test suite (Brian)
  2. parameter for unknown word prob (Brian)
  3. normalize based on prefix count, no additional smoothing (Brian)
  4. counting from cyclic (Cyril)
  5. giving counts to strings for counting (Brian)
  6. shift VectorFst to MutableFst where possible (Cyril)
  7. create ngramperplexity as separate from ngramapply (Brian)
  8. create full purpose, efficient ngramrandgen (Michael)
  9. fold ngramintersect into ngramapply (Richard and Brian)
>
>
  1. Methods for providing probability mass to OOV, e.g., use good-turing
  2. counting from cyclic (Cyril)
  3. shift VectorFst to MutableFst where possible (Cyril)
  4. create full purpose, efficient ngramrandgen (Michael)
  -- MichaelRiley - 05 Nov 2010

Revision 11 2011-11-15 - BrianRoark

Line: 1 to 1
 
META TOPICPARENT name="NGramLibrary"

OpenGrm NGram Library Suggestions

Changed:
<
<
  1. ngramapply takes --verbose, but -v=n (e.g. -v=1) is already provided (BER: fixed to use --v)
  2. ngramapply appears to be more of a debugging tool than an applicator, given its human-readable output. Perhaps it should get a name that suggests this, or at least a comment in the quick tour saying so and noting that 'fstintersect' (possibly with a phi matcher) is the heavy-duty way to do this.
  3. Improve memory usage of ngramread - uses 14 gb for 1.5gb input
  4. allow unk words to contribute to perplexity measure
  5. random test suite
  6. parameter for unknown word prob
  7. normalize based on prefix count, no additional smoothing
  8. counting from cyclic
  9. giving counts to strings for counting
>
>
  1. Improve memory usage of ngramread - uses 14 gb for 1.5gb input (Brian)
  2. random test suite (Brian)
  3. parameter for unknown word prob (Brian)
  4. normalize based on prefix count, no additional smoothing (Brian)
  5. counting from cyclic (Cyril)
  6. giving counts to strings for counting (Brian)
  7. shift VectorFst to MutableFst where possible (Cyril)
  8. create ngramperplexity as separate from ngramapply (Brian)
  9. create full purpose, efficient ngramrandgen (Michael)
  10. fold ngramintersect into ngramapply (Richard and Brian)
  -- MichaelRiley - 05 Nov 2010

Revision 10 2011-11-15 - BrianRoark

Line: 1 to 1
 
META TOPICPARENT name="NGramLibrary"

OpenGrm NGram Library Suggestions

Line: 6 to 6
 
  1. ngramapply appears to be more of a debugging tool than an applicator, given its human-readable output. Perhaps it should get a name that suggests this, or at least a comment in the quick tour saying so and noting that 'fstintersect' (possibly with a phi matcher) is the heavy-duty way to do this.
  2. Improve memory usage of ngramread - uses 14 gb for 1.5gb input
  3. allow unk words to contribute to perplexity measure
Added:
>
>
  1. random test suite
  2. parameter for unknown word prob
  3. normalize based on prefix count, no additional smoothing
  4. counting from cyclic
  5. giving counts to strings for counting
  -- MichaelRiley - 05 Nov 2010

Revision 9 2011-09-27 - MichaelRiley

Line: 1 to 1
 
META TOPICPARENT name="NGramLibrary"

OpenGrm NGram Library Suggestions

  1. ngramapply takes --verbose, but -v=n (e.g. -v=1) is already provided (BER: fixed to use --v)
  2. ngramapply appears to be more of a debugging tool than an applicator, given its human-readable output. Perhaps it should get a name that suggests this, or at least a comment in the quick tour saying so and noting that 'fstintersect' (possibly with a phi matcher) is the heavy-duty way to do this.
Changed:
<
<
  1. Improve memory usage of ngramread - uses 14 gb for 1.5gb input
>
>
  1. Improve memory usage of ngramread - uses 14 gb for 1.5gb input
  2. allow unk words to contribute to perplexity measure
  -- MichaelRiley - 05 Nov 2010

Revision 8 2011-02-26 - MichaelRiley

Line: 1 to 1
 
META TOPICPARENT name="NGramLibrary"

OpenGrm NGram Library Suggestions

  1. ngramapply takes --verbose, but -v=n (e.g. -v=1) is already provided (BER: fixed to use --v)
  2. ngramapply appears to be more of a debugging tool than an applicator, given its human-readable output. Perhaps it should get a name that suggests this, or at least a comment in the quick tour saying so and noting that 'fstintersect' (possibly with a phi matcher) is the heavy-duty way to do this.
Deleted:
<
<
  1. Classes such as NgramModel have their member functions defined entirely inside the class body. Assuming the class is not intended to be templated, all but the trivial (read: short) definitions should be turned into declarations and the definitions moved to a companion cc file (e.g. lib/ngram-model.cc). If it is intended to be templated, then long definitions should be moved out below the class definition in the header file.
  2. Use true/false not 1/0 for bool values.
  3. NgramModel has a lot of member functions. Would it make sense to refactor to have a few key member functions and make other operations either part of derived classes or separate functions?
  4. Functions like NGramModel::AppendWordToNgramHistory should be static.
  5. NgramModel has useful methods defined privately. Shouldn't they be public or external?
  6. Use enum for ngrammodel 'method' e.g. enum NgramSmoothing { KATZ_SMOOTHING, WITTEN_BELL_SMOOTHING, ... };
  7. Use string as flag to ngrammake --smoothing=katz (and a switch to set enum).
 
  1. Improve memory usage of ngramread - uses 14 gb for 1.5gb input

-- MichaelRiley - 05 Nov 2010

Revision 7 2011-02-04 - MichaelRiley

Line: 1 to 1
Changed:
<
<
META TOPICPARENT name="WebHome"

OpenGrm Suggestions

>
>
META TOPICPARENT name="NGramLibrary"

OpenGrm NGram Library Suggestions

 
  1. ngramapply takes --verbose, but -v=n (e.g. -v=1) is already provided (BER: fixed to use --v)
  2. ngramapply appears to be more of a debugging tool than an applicator, given its human-readable output. Perhaps it should get a name that suggests this, or at least a comment in the quick tour saying so and noting that 'fstintersect' (possibly with a phi matcher) is the heavy-duty way to do this.
Line: 15 to 15
 
  1. Improve memory usage of ngramread - uses 14 gb for 1.5gb input

-- MichaelRiley - 05 Nov 2010 \ No newline at end of file

Added:
>
>
META TOPICMOVED by="MichaelRiley" date="1296788058" from="GRM.GrmSuggests" to="GRM.NGramSuggests"

Revision 6 2011-01-29 - MichaelRiley

Line: 1 to 1
 
META TOPICPARENT name="WebHome"

OpenGrm Suggestions

Line: 12 to 12
 
  1. NgramModel has useful methods defined privately. Shouldn't they be public or external?
  2. Use enum for ngrammodel 'method' e.g. enum NgramSmoothing { KATZ_SMOOTHING, WITTEN_BELL_SMOOTHING, ... };
  3. Use string as flag to ngrammake --smoothing=katz (and a switch to set enum).
Added:
>
>
  1. Improve memory usage of ngramread - uses 14 gb for 1.5gb input
  -- MichaelRiley - 05 Nov 2010 \ No newline at end of file

Revision 5 2010-11-30 - MichaelRiley

Line: 1 to 1
 
META TOPICPARENT name="WebHome"

OpenGrm Suggestions

Line: 10 to 10
 
  1. NgramModel has a lot of member functions. Would it make sense to refactor to have a few key member functions and make other operations either part of derived classes or separate functions?
  2. Functions like NGramModel::AppendWordToNgramHistory should be static.
  3. NgramModel has useful methods defined privately. Shouldn't they be public or external?
Added:
>
>
  1. Use enum for ngrammodel 'method' e.g. enum NgramSmoothing { KATZ_SMOOTHING, WITTEN_BELL_SMOOTHING, ... };
  2. Use string as flag to ngrammake --smoothing=katz (and a switch to set enum).
  -- MichaelRiley - 05 Nov 2010
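
The two suggestions added in this revision (an enum for the smoothing method, set from a string-valued flag) could look roughly like the sketch below. This is only an illustration, not the library's code: the helper names ParseSmoothing and SmoothingName and the flag value "witten_bell" are assumptions; only the enum name, its two enumerators, and the value "katz" appear in the list above. Since C++ cannot switch on a string, the flag value is matched with an if-chain and the switch is used in the reverse direction.

```cpp
#include <string>

// Enum from the suggestion; further smoothing methods would be added here.
enum NgramSmoothing { KATZ_SMOOTHING, WITTEN_BELL_SMOOTHING };

// Hypothetical helper: maps the value of a --smoothing flag to the enum.
// Returns false, leaving *method untouched, when the name is not recognized.
// "witten_bell" is an assumed spelling for the flag value.
bool ParseSmoothing(const std::string &name, NgramSmoothing *method) {
  if (name == "katz") {
    *method = KATZ_SMOOTHING;
    return true;
  }
  if (name == "witten_bell") {
    *method = WITTEN_BELL_SMOOTHING;
    return true;
  }
  return false;
}

// Hypothetical helper: maps the enum back to its flag value, e.g. for
// logging or error messages.
const char *SmoothingName(NgramSmoothing method) {
  switch (method) {
    case KATZ_SMOOTHING:
      return "katz";
    case WITTEN_BELL_SMOOTHING:
      return "witten_bell";
  }
  return "unknown";
}
```

A command-line tool would call ParseSmoothing once on the flag value and report an error on failure; the rest of the code then works with the enum rather than with raw integers or strings.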

Revision 4 2010-11-29 - MichaelRiley

Line: 1 to 1
 
META TOPICPARENT name="WebHome"
Changed:
<
<

OpenGrm Suggestions

>
>

OpenGrm Suggestions

 
  1. ngramapply takes --verbose, but -v=n (e.g. -v=1) is already provided (BER: fixed to use --v)
  2. ngramapply appears to be more of a debugging tool than an applicator, given its human-readable output. Perhaps it should get a name that suggests this, or at least a comment in the quick tour saying so and noting that 'fstintersect' (possibly with a phi matcher) is the heavy-duty way to do this.
Changed:
<
<
>
>
  1. Classes such as NgramModel have their member functions defined entirely inside the class body. Assuming the class is not intended to be templated, all but the trivial (read: short) definitions should be turned into declarations and the definitions moved to a companion cc file (e.g. lib/ngram-model.cc). If it is intended to be templated, then long definitions should be moved out below the class definition in the header file.
  2. Use true/false not 1/0 for bool values.
  3. NgramModel has a lot of member functions. Would it make sense to refactor to have a few key member functions and make other operations either part of derived classes or separate functions?
  4. Functions like NGramModel::AppendWordToNgramHistory should be static.
  5. NgramModel has useful methods defined privately. Shouldn't they be public or external?
  -- MichaelRiley - 05 Nov 2010 \ No newline at end of file

Revision 3 2010-11-11 - BrianRoark

Line: 1 to 1
 
META TOPICPARENT name="WebHome"

OpenGrm Suggestions

Changed:
<
<
  1. ngramapply takes --verbose, but -v=n (e.g. -v=1) is already provided (BER: will make use of --v)
>
>
  1. ngramapply takes --verbose, but -v=n (e.g. -v=1) is already provided (BER: fixed to use --v)
 
  1. ngramapply appears to be more of a debugging tool than an applicator, given its human-readable output. Perhaps it should get a name that suggests this, or at least a comment in the quick tour saying so and noting that 'fstintersect' (possibly with a phi matcher) is the heavy-duty way to do this.

Revision 2 2010-11-10 - BrianRoark

Line: 1 to 1
 
META TOPICPARENT name="WebHome"

OpenGrm Suggestions

Changed:
<
<
  1. ngramapply takes --verbose, but -v=n (e.g. -v=1) is already provided
>
>
  1. ngramapply takes --verbose, but -v=n (e.g. -v=1) is already provided (BER: will make use of --v)
 
  1. ngramapply appears to be more of a debugging tool than an applicator, given its human-readable output. Perhaps it should get a name that suggests this, or at least a comment in the quick tour saying so and noting that 'fstintersect' (possibly with a phi matcher) is the heavy-duty way to do this.

Revision 1 2010-11-05 - MichaelRiley

Line: 1 to 1
Added:
>
>
META TOPICPARENT name="WebHome"

OpenGrm Suggestions

  1. ngramapply takes --verbose, but -v=n (e.g. -v=1) is already provided
  2. ngramapply appears to be more of a debugging tool than an applicator, given its human-readable output. Perhaps it should get a name that suggests this, or at least a comment in the quick tour saying so and noting that 'fstintersect' (possibly with a phi matcher) is the heavy-duty way to do this.

-- MichaelRiley - 05 Nov 2010

 