Hi Richard (etc.), using Thrax 1.1.0 (and with OpenFst 1.3.4 already installed), compilation fails while making the file `ast/identifier-node.cc` due to an issue in the `include/thrax/compat/utils.h` header. Here's the error:
/bin/sh ../../libtool --tag=CXX --mode=compile g++ -DHAVE_CONFIG_H -I./../include -g -O2 -MT identifier-node.lo -MD -MP -MF .deps/identifier-node.Tpo -c -o identifier-node.lo `test -f 'ast/identifier-node.cc' || echo './'`ast/identifier-node.cc
libtool: compile: g++ -DHAVE_CONFIG_H -I./../include -g -O2 -MT identifier-node.lo -MD -MP -MF .deps/identifier-node.Tpo -c ast/identifier-node.cc -fno-common -DPIC -o .libs/identifier-node.o
In file included from ast/identifier-node.cc:22:
./../include/thrax/compat/utils.h:119:8: error: field has incomplete type 'char []'
  char buf[];
       ^
I presume this is because buf[] doesn't have a length defined (nor is it initialized with a string), and when I change the line to
char buf[1024];
compilation goes through. (I'm not sure this is a sensible default; I spent no time trying to understand what this code is doing.)
I'd include a patch but it's one line.
Kyle
Just remove that line: that variable is not used. Apparently it's a holdover from some earlier implementation, and I just forgot to update it. I'll fix this in the next release.
Hi,
I am currently using Thrax to extend some features of an alignment tool I wrote for my g2p system.
The basic idea is that the user can specify some alignment correspondence rules and optional default penalties, and then these can be incorporated into the EM training process.
At present I have kind of hacked the functionality of the thraxcompiler command tool to read in the grammar, and then return the desired FST+symbol table to the alignment program.
EDIT: Maybe it makes more sense to just provide a couple of snippets:
GetFstFromGrammar
sy = SymbolTable['simple.syms'];
zero = "0".sy : "zero".sy;
units = ( "these're".sy : ( "these're".sy | "[these]" | "[these]" "are".sy ) );
split = ( "[these]" "are".sy : "these're".sy );
sigma = "<sigma>".sy : "<sigma>".sy;
abc = ( "a".sy "b c".sy : "a b b".sy );
export RULES = Optimize[ sigma* ( units | zero | abc ) sigma* ];
Here the 'sigma' is used in combination with a specialized 1-state alignment transducer that relies on RHO and SIGMA matchers.
Is there an alternative or recommended way to do this? It would be great if I could either specify the symbol table just once at the beginning, or automatically infer/generate the whole symbol table and return it. Even better would be to modify the grammar from my C++ application, to simplify what the user is responsible for doing.
I went through the FAQ but did not notice any answers to these questions.
Thanks for your time.
UPDATE:
I solved this by creating some bindings with pybindgen and then writing a generator that interprets a simplified version of the Thrax grammar and expands it to the verbose version with the extra quotes, symbol-table suffixes, etc.
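For readers who want to try something similar, here is a minimal sketch of the expansion step only. It is not the generator described above: the function names `quote_token` and `expand_rule`, and the idea of passing each rule as plain token lists, are assumptions made for illustration. It quotes every token and appends the symbol-table suffix, leaving bracketed tokens such as `[these]` without a suffix, as in the snippet earlier in this comment.

```python
def quote_token(token: str, table: str = "sy") -> str:
    """Quote one token; append the symbol-table suffix unless it is bracketed."""
    # Bracketed tokens such as [these] are written without a suffix
    # in the grammar snippet above, so they are only quoted here.
    if token.startswith("[") and token.endswith("]"):
        return f'"{token}"'
    return f'"{token}".{table}'


def expand_rule(name: str, inputs: list, outputs: list, table: str = "sy") -> str:
    """Expand a hypothetical simplified rule into one verbose Thrax statement."""
    lhs = " ".join(quote_token(t, table) for t in inputs)
    rhs = " ".join(quote_token(t, table) for t in outputs)
    return f"{name} = ( {lhs} : {rhs} );"


print(expand_rule("abc", ["a", "b c"], ["a b b"]))
# prints: abc = ( "a".sy "b c".sy : "a b b".sy );
```

This sketch does not escape double quotes inside tokens and does not handle unions or closures; a real generator would need a small parser for those.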
Yes (OpenFst 1.3.4 compiled with --enable-far and some other enable options). Thrax compiled successfully, but compilation fails while making the file `batch_test.c` (extracted from export.tgz). Can you give me some advice?
I'd like to but first I need to understand what is going on. I can't reproduce your error (apparently) and I don't know what batch_test.c is since it's not part of the Thrax distribution. Is this your own code? If so then I need to see EXACTLY what you are doing, including probably your sending me a directory with all of the additional code.
If this is part of the Thrax distribution then please tell me where it is because I can't find it (nor do I remember such a file).
Thank you for your reply.
On this page:
http://openfst.cs.nyu.edu/twiki/bin/view/Contrib/ThraxContrib,
you can see
Projects using the OpenGrm Thrax tools:
export.tgz: Grammars and software developed as part of a text normalization class taught at the Center for Spoken Language Understanding, Fall 2011. URL for the course: http://www.cslu.ogi.edu/~sproatr/Courses/TextNorm/
I downloaded "export.tgz".
There is a file called batch_tester.cc in the batch_tester directory (extracted from export.tgz).
Ok that helps. Yes, I did write that, but it wasn't obvious from your query that this is what you were referring to. Please in future give all necessary information when reporting a bug.
In the meantime I will have a look. I do not know off the top of my head what the problem is.
So far I find thrax a very neat piece of software but I have two questions...
Can I somehow use probability semiring as weights, because it seems Thrax only allows specifying log and tropical semirings? How about the other ones... Or should I somehow postprocess the generated far file?
Another question: I tried to use "fstdraw" on a far file, but got: ERROR: FstHeader::Read: Bad FST header: example.far
Is this a version mismatch?
Sorry, I missed the earlier comment -- for some reason I didn't get email about it.
Unfortunately the restriction to Log and Tropical is due to a similar restriction in the fst library: the real semiring does not come predefined. The best suggestion would be to use Tropical and then just do the obvious e^-cost conversion.
-- CyrilAllauzen - 13 Aug 2012
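The e^-cost conversion suggested above can be sketched as follows, assuming the weights in the compiled FST are negative natural-log probabilities. The helper names `cost_to_probability` and `probability_to_cost` are made up for this example and are not part of Thrax or OpenFst.

```python
import math


def cost_to_probability(cost: float) -> float:
    """Map a tropical/log weight (a negative log probability) to a probability."""
    # An infinite cost is the semiring zero and maps to probability 0.
    return math.exp(-cost)


def probability_to_cost(prob: float) -> float:
    """Map a probability in [0, 1] back to a negative log cost."""
    if prob < 0.0 or prob > 1.0:
        raise ValueError("probability must lie in [0, 1]")
    if prob == 0.0:
        return math.inf
    return -math.log(prob)


cost = probability_to_cost(0.25)
print(cost_to_probability(cost))
```

Note that the tropical semiring combines alternative paths with min rather than a sum, so converting a shortest-path cost this way gives the probability of the single best path, not the total probability over all paths.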