TWiki
>
FST Web
>
FstAdvancedUsage
(revision 6) (raw view)
Edit
Attach
%TOC% ---+ <nop>OpenFst Advanced Usage Below are a variety of topics covered in greater depth or of more specialized interest than found in the Quick Tour. Reading the [[FstQuickTour][Quick Tour]] first is recommended. #ArcIterators ---++ Arc Iterators An arc iterator %DOX{fst::ArcIterator[%H%]}% is used to access the transitions leaving an FST state. It has the form: <pre> template <class F> class ArcIterator { typedef typename F::Arc Arc; typedef typename Arc::StateId StateId; public: ArcIterator(const &F fst, StateId s); %RED% // End of iterator? %ENDCOLOR% bool Done() const; %RED% // Current arc (when !Done) %ENDCOLOR% const Arc& Value() const; %RED% // Advance to next arc (when !Done) %ENDCOLOR% void Next(); %RED% // Return current position %ENDCOLOR% size_t Position(); %RED% // Return to initial position %ENDCOLOR% void Reset(); %RED% // Arc access by position %ENDCOLOR% void Seek(size_t pos); }; </pre> It is templated on the Fst class =F= to allow efficient specializations but defaults to a generic version on the abstract base [[FstAdvancedUsage#BaseFsts][Fst]] class. See [[FstConventions][here]] for conventions that arc iterator use must respect. All current <nop>OpenFst library =Seek()= methods are constant time. An example use of an arc iterator is shown [[FstQuickTour#ArcIterator][here]]. A =MutableArcIterator= %DOX{fst::MutableArcIterator[%H%]}% is similar to an =ArcIterator= except its constructor takes a pointer to a =MutableFst= and it additionally has a =SetValue()= method. #ArcFilters ---++ Arc Filters Arc filters are accepted by various operations to control which arcs are transitioned. An arc filter has the form: <pre> template <class Arc> class ArcFilter { public: %RED% // Return true iff arc is to be transitioned. %ENDCOLOR% bool operator()(const Arc &arc) const; }; </pre> Pre-defined arc filters include: | *Name* | *Description* | | | =AnyArcFilter= | Accept all arcs | %DOX{fst::AnyArcFilter[%H%]}% | | =EpsilonArcFilter= | Accept only arcs with input and output epsilons |%DOX{fst::AnyArcFilter[%H%]}% | | =InputEpsilonArcFilter= | Accept only arcs with input epsilons | %DOX{fst::InputEpsilonArcFilter[%H%]}% | | =OutputEpsilonArcFilter= | Accept only arcs with output epsilons | %DOX{fst::InputEpsilonArcFilter[%H%]}% | #BaseFsts ---++Base Fsts Every =Fst= %DOX{fst::Fst[%H%]}% must specify an initial state, the final weights, arc and epsilon counts per states, an Fst type name, the Fst's [[FstAdvancedUsage#FstProperties][properties]], how to copy, read and write the Fst, and the input and output symbol tables (if any). In particular, the base =Fst= class has the inteface: <pre> template <class A> class Fst { public: typedef A Arc; typedef typename A::Weight Weight; typedef typename A::StateId StateId; %RED% // Initial state %ENDCOLOR% virtual StateId Start() const = 0; %RED% // States's final weight %ENDCOLOR% virtual Final(StateId) const = 0: %RED% // State's arc count %ENDCOLOR% virtual NumArcs(StateId) const = 0; %RED% // States's input epsilon count %ENDCOLOR% virtual NumInputEpsilons(StateId) const = 0; %RED% // State's output epsilon count %ENDCOLOR% virtual NumOutputEpsilons(StateId) const = 0; %RED% // If test=false, return stored properties bits for mask (some poss. unknown) // If test=true, return property bits for mask (computing o.w. unknown) %ENDCOLOR% virtual Properties(uint64 mask, bool test) const = 0; %RED% // Fst type name %ENDCOLOR% virtual const string& Type() const = 0; %RED% // Get a copy of this Fst %ENDCOLOR% virtual Fst<A> *Copy() const = 0; %RED% // Read an Fst from an input stream; returns NULL on error %ENDCOLOR% static Fst<A> *Read(istream &strm, const FstReadOptions &opts); %RED% // Read an Fst from a file; return NULL on error // Empty filename reads from standard input %ENDCOLOR% static Fst<A> *Read(const string &filename); %RED% // Write an Fst to an output stream; return false on error %ENDCOLOR% virtual bool Write(ostream &strm, const FstWriteOptions &opts); %RED% // Write an Fst to a file; return false on error // Empty filename writes to standard output %ENDCOLOR% virtual bool Write(const string &filename); %RED% // Return input label symbol table; return NULL if not specified %ENDCOLOR% virtual const SymbolTable* InputSymbols() const = 0; %RED% // Return output label symbol table; return NULL if not specified %ENDCOLOR% virtual const SymbolTable* OutputSymbols() const = 0; }; </pre> =Fst= is an abstract class (note the pure virtual methods). All <nop>OpenFst FSTs must meet this interface. The companion [[FstAdvancedUsage#StateIterators][state iterator]] and [[FstAdvancedUsage#ArcIterators][arc iterator]] classes provide access to the states and transitions of the FST. #FstCaching ---++ Caching %ICON{wip}% #CommandLineOptions ---++ Command Line Flags <NOP>OpenFst has several global options in the library proper that most users can ignore, leaving them with their default values: | *Option* | *Type* | *Default* | *Description* | | =FLAGS_fst_default_cache_gc= | bool | true | Enable garbage collection of cached Fsts | | =FLAGS_fst_default_cache_gc_limit= | int64 | 1048576 | Byte size that triggers garbage collection of cached Fsts | | =FLAGS_fst_pair_parentheses= | string | "" | Characters enclosing the first weight of a printed pair weight (and derived classes) to ensure proper I/O of nested pair weights; must have size 0 (none) or 2 (open and close parenthesis) | | =FLAGS_fst_pair_separator= | string | "," | Character separator between printed pair weights; must be a single character | | =FLAGS_fst_verify_properties= | bool | false | Verify Fst properites are correctly set when queried | The first two are used to control the [[FstAdvancedUsage#FstCaching][caching]] of expanded state and arc information found in most of the on-the-fly Fst classes; the default values should normally be satisfactory. The second two are used to control the text formating of =ProductWeight= %DOX{fst::ProductWeight[%H%]}% and other weight pairs. The last is used to ensure that the [[FstAdvancedUsage#FstProperties][properties]] of an FST have been correctly set; is is used for debugging only, since it incurs considerable computational cost. In each of the Fst distribution installed binaries, the above options, as well as any of those defined specific to the binary, can be set from the command line using e.g. =--fst_default_cache_gc=false= or =--fst_pair_parenthesis="("= . Additionally, the option =--help= and =--v=N= (where N = 0,1,2,..) will print out usage information and set the verbosity level of logging, respectively. The flag processing is modeled after the Google [[http://code.google.com/p/google-gflags][gflags]] package. In a user-defined binary, the command line options processing will all also work if the user calls: <verbatim> SetFlags(usage, &argc, &argv, true); </verbatim> In that case, the user can set his own flags as well, following the conventions in [[http://cims.nyu.edu/~openfst/doxygen/html/flags_8h-source.html][<fst/flags.h>]]. Alternatively, the user can process options in his own way and directly assign to any of the above global options if he wishes to modify their defaults. #ComposeFilters ---++ Composition Filters %ICON{wip}% #ExpandedFsts ---++ Expanded Fsts An =ExpandedFst= %DOX{fst::ExpandedFst[%H%]}% is an [[FstAdvancedUsage#BaseFsts][Fst]] that has an additional method that specifies the state count as well as methods to copy and read the expanded FST. In particular, an =ExpandedFst= class has the interface: <pre> template <class A> class ExpandedFst : public Fst<class A> { public: typedef A Arc; typedef typename A::StateId StateId; %RED% // State count %ENDCOLOR% StateId NumStates(); %RED% // Get a copy of this ExpandedFst %ENDCOLOR% virtual ExpandedFst<A> *Copy() const = 0; %RED% // Read an ExpandedFst from an input stream; returns NULL on error %ENDCOLOR% static ExpandedFst<A> *Read(istream &strm, const FstReadOptions &opts); %RED% // Read an ExpandedFst from a file; return NULL on error // Empty filename reads from standard input %ENDCOLOR% static ExpandedFst<A> *Read(const string &filename); }; </pre> =ExpandedFst= is an abstract class (note the pure virtual methods). Examples are =VectorFst= %DOX{fst::VectorFst[%H%]}% and =ConstFst= %DOX{fst::ConstFst[%H%]}% #FstInputOutput ---++ FST Input/Output %ICON{wip}% #FstMatchers ---++ Matchers %ICON{wip}% #MutableFsts ---++ Mutable Fsts A =MutableFst= %DOX{fst::MutableFst[%H%]}% is an [[FstAdvancedUsage#ExpandedFsts][ExpandedFst]] that has additional methods that specifiy how to set the start state, final weights, [[FstAdvancedUsage#FstProperties][properties]] and the input and output symbols, how to add and delete states and arcs, as well as methods to copy and read the mutable FST. In particular, a =MutableFst= class has the interface: <pre> template <class A> class MutableFst : public ExpandedFst<class A> { public: typedef A Arc; typedef typename A::StateId StateId; typedef typename A::Weight Weight; %RED% // Set the initial state %ENDCOLOR% virtual void SetStart(StateId) = 0; %RED% // Set the initial state %ENDCOLOR% virtual void SetFinal(StateId, Weight) = 0; %RED% // Set property bits wrt mask %ENDCOLOR% virtual void SetProperties(uint64 props, uint64 mask) = 0; %RED% // Add a state, return its ID %ENDCOLOR% virtual StateId AddState() = 0; %RED% // Add an arc to state %ENDCOLOR% virtual void AddArc(StateId, const A &arc) = 0; %RED% // Delete some states %ENDCOLOR% virtual void DeleteStates(const vector<StateId>&) = 0; %RED% // Delete all states %ENDCOLOR% virtual void DeleteStates() = 0; %RED% // Delete some arcs at state %ENDCOLOR% virtual void DeleteArcs(StateId, size_t n) = 0; %RED% // Delete all arcs at state %ENDCOLOR% virtual void DeleteArcs(StateId) = 0; %RED% // Get a copy of this MutableFst %ENDCOLOR% virtual MutableFst<A> *Copy() const = 0; %RED% // Read an MutableFst from an input stream; returns NULL on error %ENDCOLOR% static MutableFst<A> *Read(istream &strm, const FstReadOptions &opts); %RED% // Read an MutableFst from a file; return NULL on error // Empty filename reads from standard input %ENDCOLOR% static MutableFst<A> *Read(const string &filename); %RED% // Set input label symbol table; NULL signifies not unspecified %ENDCOLOR% virtual void SetInputSymbols(const SymbolTable* isyms) = 0; %RED% // Set output label symbol table; NULL signifies not unspecified %ENDCOLOR% virtual void SetOutputSymbols(const SymbolTable* osyms) = 0; }; </pre> =MutableFst= is an abstract class (note the pure virtual methods). An example is =VectorFst= %DOX{fst::VectorFst[%H%]}%. The companion [[FstAdvancedUsage#ArcIterators][mutable arc iterator]] class provides access to and modification of the transitions of the FST #FstProperties ---++ Properties Each =Fst= has associated with it a set of stored properties that assert facts about it. These are queried in an FST with the =Properties()= method and set in a =MutableFst= with the =SetProperties()= method. <nop>OpenFst library operations use these properties to optimize their performance. <nop>OpenFst library operations and mutable FSTs attempt to preserve as much property information in their results as possible without significant added computation. Some properties are binary - they are either true or false. For each such property, there is a single stored bit that is set if true and not set if false. The binary =Fst= properties are: | *Name* | *Description* | | =kExpanded= | Is an [[FstAdvancedUsage#ExpandedFst][ExpandedFst]] | | =kMutable= | Is a [[FstAdvancedUsage#MutableFst][MutableFst]] | Other properties are trinary - they are either true, false or unknown. For each such property, there are two stored bits; one is set if true, the other is set if false and neither is set if unknown. |*Type* | *Name* | *Description* | | Acceptor | =kAcceptor= | Input and output label are equal for each arc | | | =kNotAcceptor= | Input and output label are not equal for some arc | | Accessible | =kAccessible= | All states reachable from the initial state | | | =kNotAccessible= | Not all states reachable from the initial state | | | =kCoAccessible= | All states can reach a final state | | | =kNotCoAccessible= | Not all states can reach a final state | | Cyclic | =kCyclic= | Has cycles | | | =kAcyclic= | Has no cycles | | | =kInitialCyclic= | Has cycles containing the initial state | | | =KInitialAcyclic= | Has no cycles containing the initial state | | Deterministic | =kIDeterministic=| Input labels are unique leaving each state | | | =kNonIDeterministic=| Input labels are not unique leaving some state | | | =kODeterministic=| Output labels are unique leaving each state | | | =kNonODeterministic=| Output labels are not unique leaving some state | | Epsilons | =kEpsilons= | Has input/output epsilons | | | =KNoEpsilons= | Has no input/output epsilons | | | =kIEpsilons= | Has input epsilons | | | =KNoIEpsilons= | Has no input epsilons | | | =kOEpsilons= | Has output epsilons | | | =KNoOEpsilons= | Has no output epsilons | | Sorted | =kILabelSorted= | Input labels sorted for each state | | | =kNotILabelSorted= | Input labels not sorted for each state | | | =kOLabelSorted= | Output labels sorted for each state | | | =kNotOLabelSorted= | Output labels not sorted for each state | | | =kTopSorted= | States topologically sorted | | | =kNotTopSorted= | States not topologically sorted | | Weighted | =kWeighted= | Non-trivial arc or final weights | | | =kNotWeighted= | Only trivial arc and final weights | The call =fst.Properties(mask, false)= returns the stored property bits set in the mask bits; some properties may be unknown. The call =fst.Properties(mask, true)= returns the stored property bits set in the mask bits after computing and updating any of those set in the mask that are unknown. #StateIterators ---++ State Iterators A state iterator %DOX{fst::StateIterator[%H%]}% is used to access the states of an FST. It has the form: <pre> template <class F> class StateIterator { typedef typename F::Arc Arc; typedef typename Arc::StateId StateId; public: StateIterator(const &F fst); %RED% // End of iterator? %ENDCOLOR% bool Done() const; %RED% // Current state ID (when !Done) %ENDCOLOR% StateId Value() const; %RED% // Advance to next state (when !Done) %ENDCOLOR% void Next(); %RED% // Return to initial position %ENDCOLOR% void Reset(); }; </pre> It is templated on the Fst class =F= to allow efficient specializations but defaults to a generic version on the abstract base [[FstAdvancedUsage#BaseFsts][Fst]] class. See [[FstConventions][here]] for conventions that state iterator use must respect. An example use of a state iterator is shown [[FstQuickTour#StateIterator][here]]. #StateQueues ---++ State Queues State queues are used by, among others, the [[ShortestPathDoc][shortest path]] and [[ShortestDistanceDoc][shortest distance]] algorithms and by the [[FstAdvancedUsage#FstVisitors][Visit]] operation. A =state queue= has the form: <pre> template <class StateId> class Queue { public: %RED% // Ctr: may need args (e.g., Fst, comparator) for some queues %ENDCOLOR% Queue(...); %RED% // Returns the head of the queue %ENDCOLOR% StateId Head() const; %RED% // Inserts a state %ENDCOLOR% void Enqueue(StateId s); %RED% // Removes the head of the queue %ENDCOLOR% void Dequeue(); %RED% // Updates ordering of state s when weight changes, if necessary %ENDCOLOR% void Update(StateId s); %RED% // Does the queue contain no elements? %ENDCOLOR% bool Empty() const; %RED% // Remove all states from queue %ENDCOLOR% void Clear(); }; </pre> Pre-defined state queues include: | *Queue* | *Description* | | | =AutoQueue= | Automatically-selected from Fst properties | %DOX{fst::AutoQueue[%H%]}% | | =FifoQueue= | First-In, first-Out | %DOX{fst::FifoQueue[%H%]}% | | =LifoQueue= | Last-In, first-Out | %DOX{fst::LifoQueue[%H%]}% | | =SccQueue= | Component graph top-ordered meta-queue | %DOX{fst::SccQueue[%H%]}% | | =ShortestFirstQueue= | Priority (least weight) | %DOX{fst::ShortestFirstQueue[%H%]}% | | =StateOrderQueue= | State-ID ordered | %DOX{fst::StateOrderQueue[%H%]}% | | =TopOrderQueue= | Topologically ordered | %DOX{fst::TopOrderQueue[%H%]}% | Some queues accept [[FstAdvancedUsage#ArcFilters][arc filters]] to control which transitions are explored. #StateTables ---++State Tables %ICON{wip}% #NewArcs ---++ User-defined Fst Arcs and Weights %ICON{wip}% #NewFsts ---++ User-defined Fst Classes %ICON{wip}% #FstVisitors ---++Visitors The simplest way to traverse an FST is in state order using a [[FstQuickTour#StateIterator][state iterator]]. A very general traversal method is to use: <pre> Visit(fst, visitor, queue); %DOX{namespacefst.html#Visit[%H%]}% </pre> where the =visitor= object specfies the actions taken in the traversal while the [[FstAdvancedUsage#StateQueues][state queue]] object specifies the traversal order. A =visitor= has the form: <pre>%RED%// Visitor Interface - class determines actions taken during a visit. // If any of the boolean member functions return false, the visit is // aborted by first calling FinishState() on all unfinished (grey) // states and then calling FinishVisit(). %ENDCOLOR% template <class Arc> class Visitor { public: typedef typename Arc::StateId StateId; Visitor(T *return_data); %RED% // Invoked before visit %ENDCOLOR% void InitVisit(const Fst<Arc> &fst); %RED% // Invoked when state discovered (2nd arg is visitation root) %ENDCOLOR% bool InitState(StateId s, StateId root); %RED% // Invoked when arc to white/undiscovered state examined %ENDCOLOR% bool WhiteArc(StateId s, const Arc &a); %RED% // Invoked when arc to grey/unfinished state examined %ENDCOLOR% bool GreyArc(StateId s, const Arc &a); %RED% // Invoked when arc to black/finished state examined %ENDCOLOR% bool BlackArc(StateId s, const Arc &a); %RED% // Invoked when state finished %ENDCOLOR% void FinishState(StateId s); %RED% // Invoked after visit %ENDCOLOR% void FinishVisit(); }; </pre> While a depth-first search can be implemented using =Visit()= with the =LifoQueue()=, it is often better to use the more specialized =DFSVisit()%DOX{namespacefst.html#DfsVisit[%H%]}%= in [[http://cims.nyu.edu/~openfst/doxygen/html/dfs-visit_8h-source.html][<fst/dfs-visit.h>]] since it is somewhat more space-efficient and the specialized visitor interface described there has additional funcitionality for a DFS. Pre-defined FST visitors include: | *Visitor* | *Type* | *Description* | | | =CopyVisitor= | Visit | Copies in a queue-specified order | %DOX{fst::CopyVisitor[%H%]}% | | =SccVisitor= | <nop>DfsVisit | Finds strongly-connected components, [[FstGlossary#AccessibleDef][accessibility]] and [[FstGlossary#CoAccessibleDef][coaccessibility]] | %DOX{fst::SccVisitor[%H%]}% | | =TopOrderVisitor= | <nop>DfsVisit | Finds topological order | %DOX{fst::TopOrderVisitor[%H%]}% | The visit operations optionally accept [[FstAdvancedUsage#ArcFilters][arc filters]] to control which transitions are explored. -- Main.MichaelRiley - 27 Feb 2009
Edit
|
Attach
|
Watch
|
P
rint version
|
H
istory
:
r73
|
r8
<
r7
<
r6
<
r5
|
B
acklinks
|
V
iew topic
|
Raw edit
|
More topic actions...
Topic revision: r6 - 2009-03-03
-
MichaelRiley
FST
Log In
or
Register
FST Web
Create New Topic
Index
Search
Changes
Notifications
Statistics
Preferences
Webs
Contrib
FST
Forum
GRM
Kernel
Main
Sandbox
TWiki
Main
Copyright © 2008-2024 by the contributing authors. All material on this collaboration platform is the property of the contributing authors.
Ideas, requests, problems regarding TWiki?
Send feedback