TWiki> FST Web>FstAdvancedUsage (revision 5)EditAttach

OpenFst Advanced Usage

Below are a variety of topics covered in greater depth or of more specialized interest than found in the Quick Tour. Reading the Quick Tour first is recommended.

Arc Iterators

An arc iterator doc has the form:

template <class F>
class ArcIterator {
  typedef typename F::Arc Arc;
  typedef typename Arc::StateId StateId;

 public:
  ArcIterator(const &F fst, StateId s);
  // End of iterator?
  bool Done() const;
  // Current arc (when !Done)     
  const Arc& Value() const;
  // Advance to next arc (when !Done)   
  void Next();
  // Return to initial  condition   
  void Reset();
  // Arc access by position                     
  void Seek(size_t posi);
};

All current OpenFst library Seek() methods are constant time.

An example use of an arc iterator is shown here.

A MutableArcIterator doc is similar to an ArcIterator except its constructor takes a pointer to a MutableFst and it additionally has a SetValue() method.

Arc Filters

Arc filters are accepted by various operations to control which arcs are transitioned. An arc filter has the form:

template <class Arc>
class ArcFilter {
public:
  // Return true iff arc is to be transitioned.
  bool operator()(const Arc &arc) const;
};

Pre-defined arc filters include:

Name Description  
AnyArcFilter Accept all arcs doc
EpsilonArcFilter Accept only arcs with input and output epsilons doc
InputEpsilonArcFilter Accept only arcs with input epsilons doc
OutputEpsilonArcFilter Accept only arcs with output epsilons doc

Base Fsts

TBA

Command Line Flags

OpenFst has several global options in the library proper that most users can ignore, leaving them with their default values:

Option Type Default Description
FLAGS_fst_default_cache_gc bool true Enable garbage collection of cached Fsts
FLAGS_fst_default_cache_gc_limit int64 1048576 Byte size that triggers garbage collection of cached Fsts
FLAGS_fst_pair_parentheses string "" Characters enclosing the first weight of a printed pair weight (and derived classes) to ensure proper I/O of nested pair weights; must have size 0 (none) or 2 (open and close parenthesis)
FLAGS_fst_pair_separator string "," Character separator between printed pair weights; must be a single character
FLAGS_fst_verify_properties bool false Verify Fst properites are correctly set when queried

The first two are used to control the caching of expanded state and arc information found in most of the on-the-fly Fst classes; the default values should normally be satisfactory. The second two are used to control the text formating of ProductWeight doc and other weight pairs. The last is used to ensure that the properties of an FST have been correctly set; it is for debugging only since it incurs considerable computational cost.

In each of the Fst distribution installed binaries, the above options, as well as any of those defined specific to the binary, can be set from the command line using e.g. --fst_default_cache_gc=false or --fst_pair_parenthesis="(" . Additionally, the option --help and --v=N (where N = 0,1,2,..) will print out usage information and set the verbosity level of logging, respectively. The flag processing is modeled after the Google gflags package.

In a user-defined binary, the command line options processing will all also work if the user calls:

SetFlags(usage, &argc, &argv, true);

In that case, the user can set his own flags as well, following the conventions in <fst/flags.h>.

Alternatively, the user can process options in his own way and directly assign to any of the above global options if he wishes to modify their defaults.

Composition Filters

TBA

Expanded Fsts

TBA

FST Input/Output

TBA

Matchers

TBA

Mutable Fsts

Properties

Each Fst has associated with it a set of stored properties that assert facts about it. These are queried in an FST with the Properties() method and set in a MutableFst with the SetProperties() method. OpenFst library operations use these properties to optimize their performance. OpenFst library operations and mutable FSTs attempt to preserve as much property information in their results as possible without significant added computation.

Some properties are binary - they are either true or false. For each such property, there is a single stored bit that is set if true and not set if false. The binary Fst properties are:

Name Description
kExpanded True if an ExpandedFst
kMutable True if a MutableFst

Other properties are trinary - they are either true, false or unknown. For each such property, there are two stored bits; one is set if true, the other is set if false and neither is set if unknown.

Names Description
kAcceptor True if the input and output label are equal for each arc
kNotAcceptor True if an input and output label is not equal for some arc
kIDeterministic True if input labels are unique leaving each state
kNonIDeterministic True if input labels are not unique leaving some state
kODeterministic True if output labels are unique leaving each state
kNonODeterministic True if output labels are not unique leaving some state

The call fst.Properties(mask, false) returns the stored property bits corresponding to the mask bits. Some properties may be unknown. The call fst.Properties(mask, true) will return the stored property bits corresponding to the mask bits after computing and updating any of those that are unknown.

State Iterators

A state iterator doc has the form:

template <class F>
class StateIterator {
  typedef typename F::Arc Arc;
  typedef typename Arc::StateId StateId;

 public:
  StateIterator(const &F fst);
  // End of iterator?
  bool Done() const;
  // Current arc (when !Done)     
  StateId Value() const;
  // Advance to next arc (when !Done)   
  void Next();
  // Return to initial  condition   
  void Reset();
};

An example use of a state iterator is shown here.

State Queues

State queues are used by, among others, the shortest path and shortest distance algorithms and by the Visit operation. A state queue has the form:

template <class StateId>                                                       
class Queue {
 public:
   // Ctr: may need args (e.g., Fst, comparator) for some queues
   Queue(...);
   // Returns the head of the queue
   StateId Head() const;
   // Inserts a state
   void Enqueue(StateId s);
   // Removes the head of the queue 
   void Dequeue();
   // Updates ordering of state s when weight changes, if necessary 
   void Update(StateId s);
   // Does the queue contain no elements?
   bool Empty() const;
   // Remove all states from queue 
   void Clear();
};

Pre-defined state queues include:

Queue Description  
AutoQueue Automatically-selected from Fst properties doc
FifoQueue First-In, first-Out doc
LifoQueue Last-In, first-Out doc
SccQueue Component graph top-ordered meta-queue doc
ShortestFirstQueue Priority (least weight) doc
StateOrderQueue State-ID ordered doc
TopOrderQueue topologically ordered doc

Some queues accept arc filters to control which transitions are explored.

State Tables

TBA

User-defined Fst Arcs and Weights

TBA

User-defined Fst Classes

Visitors

The simplest way to traverse an FST is in state order using a state iterator.

A very general traversal method is to use: doc [bad link?]

Visit(fst, visitor, queue); 

where the visitor object specfies the actions taken in the traversal while the state queue object specifies the traversal order. A visitor has the form:

// Visitor Interface - class determines actions taken during a visit.           
// If any of the boolean member functions return false, the visit is            
// aborted by first calling FinishState() on all unfinished (grey)              
// states and then calling FinishVisit().                                       
                                          
template <class Arc>
class Visitor {
public:
   typedef typename Arc::StateId StateId;
   Visitor(T *return_data);
   // Invoked before visit
   void InitVisit(const Fst<Arc> &fst);
   // Invoked when state discovered (2nd arg is visitation root) 
   bool InitState(StateId s, StateId root);
   // Invoked when arc to white/undiscovered state examined
   bool WhiteArc(StateId s, const Arc &a);
   // Invoked when arc to grey/unfinished state examined 
   bool GreyArc(StateId s, const Arc &a);
   // Invoked when arc to black/finished state examined
   bool BlackArc(StateId s, const Arc &a);
   // Invoked when state finished.
   void FinishState(StateId s); 
   // Invoked after visit
   void FinishVisit(); 
};

While a depth-first search can be implemented using Visit() with the LifoQueue(), it is often better to use the more specialized DFSVisit()doc [bad link?] in <fst/dfs-visit.h> since it is somewhat more space-efficient and the specialized visitor interface described there has additional funcitionality for a DFS.

Pre-defined FST visitors include:

Visitor Type Description  
CopyVisitor Visit Copies in a queue-specified order doc
SccVisitor DfsVisit Finds strongly-connected components, accessibility and coaccessibility doc
TopOrderVisitor DfsVisit Finds topological order doc

The visit operations optionally accept arc filters to control which transitions are explored.

-- MichaelRiley - 27 Feb 2009

Edit | Attach | Watch | Print version | History: r73 | r7 < r6 < r5 < r4 | Backlinks | Raw View | Raw edit | More topic actions...
Topic revision: r5 - 2009-03-03 - MichaelRiley
 
This site is powered by the TWiki collaboration platform Powered by PerlCopyright © 2008-2024 by the contributing authors. All material on this collaboration platform is the property of the contributing authors.
Ideas, requests, problems regarding TWiki? Send feedback