
OpenGrm NGram Library Known Bugs

  1. Giving ngramread a text file where lines are terminated with a space causes models built from it to have a missing symbol. (BER: fixed -- deals with leading and trailing spaces)
  2. (empty) -- causes wiki cross referencing to be wonky if omitted
  3. (empty)
  4. (empty)
  5. Giving ngramread a blank line causes models built from it to have a missing symbol (at least in the version I'm using).
  6. Giving ngramprint --ARPA an FST with no symbol tables segfaults
  7. (empty)
  8. Checking for normalized model is perhaps too exact (allow small delta) (MDR: made --norm_eps a flag for relevant binaries)
  9. ngramread doesn't complain about missing labels if --symbols is passed (e.g. <s> in ARPA format but <S> in symbols file)
  10. ngramread fails if first line isn't blank. Can't read Google ARPA files. OK if first line is \data\? (fixed.)
  11. Editing the Google file to fix 10. results in an OK read but an empty FST. (seems to be fixed.)
  12. Perplexity measure on the War of the Worlds corpus gives a warning about a bad FST (but otherwise seems to work)
  13. ngrammake fails with zero arguments (BER: fixed)
  14. random error message when symbol tables are missing
  15. empty or non-coaccess machines give odd errors / segfaults
  16. ngramcount using -epsilon_as_backoff has a problem finding a state if the counting order is different from the model order
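
Regarding item 8: a rough sketch of why an exact normalization check is too strict and what a tolerance-based check looks like. This is plain Python for illustration only, not the library's code. The function name is_normalized and the default tolerance of 1e-3 are assumptions made here; only the flag name --norm_eps comes from the list above.

```python
def is_normalized(probs, norm_eps=1e-3):
    """Return True if probs sum to 1 within norm_eps.

    Hypothetical helper for illustration: the name and the default
    tolerance are assumptions, not taken from the OpenGrm sources.
    """
    total = sum(probs)
    return abs(total - 1.0) <= norm_eps


if __name__ == "__main__":
    # Ten equal probabilities: mathematically they sum to exactly 1,
    # but floating-point rounding can make an exact comparison fail.
    probs = [0.1] * 10
    print("exact check:    ", sum(probs) == 1.0)
    print("tolerance check:", is_normalized(probs))

    # A distribution that is really missing mass is still rejected.
    print("missing mass:   ", is_normalized([0.5, 0.4]))
```

The tolerance should stay small compared with any probability mass that matters, so that a state that is really under-normalized is still reported.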
-- MichaelRiley - 04 Nov 2010
Topic revision: r14 - 2011-11-07 - BrianRoark
 