TWiki
>
GRM Web
>
PyniniOptimizeDoc
(2021-03-06,
KyleGorman
)
(raw view)
E
dit
A
ttach
---+ FST optimization There are several ways to "optimize" a weighted finite-state transducer (WFST). The C++ template function =fst::Optimize= underlies Pynini's =optimize= instance method and Thrax's =Optimize= function. But what exactly does this make more optimal [sic]? How does it accomplish that? And what, if anything, can we assert about the resulting transducer? ---++ What makes an WFST suboptimal... Tolstoy wrote that "All happy families are alike; each unhappy family is unhappy in its own way", and that's true of WFSTs too. * WFST operations (including composition, concatenation, and union) introduce _epsilon-transitions_, arcs with epsilon input and output labels (e.g., all but two arcs [[http://www.openfst.org/twiki/bin/view/FST/RmEpsilonDoc][here]]). *Epsilon-removal* removes these transitions. * A WFST is _deterministic_ if for each state, there is never more than one arc with a given label leaving that state (e.g., state 1 [[http://www.openfst.org/twiki/bin/view/FST/DeterminizeDoc][here]]). *Determinization* makes an WFST deterministic. While this can actually increase the size of the WFST, it is a prerequisite for the next algorithm. * A WFST is _minimal_ if it has the minimum number of states possible for expressing its relation. *Minimization* modifies a deterministic WFST's topology so that it becomes minimal (e.g., the examples [[http://www.openfst.org/twiki/bin/view/FST/MinimizeDoc][here]]). ---++ ...what we can do about it... =fst::Optimize= checks whether the WFST is known to be epsilon-free; if not, epsilon-removal is performed. Then, arc-sum mapping is performed. Then, it attempts to perform minimization, determinizing along the way, if necessary, as follows. If the WFST is (known to be) an acyclic acceptor but not (known to be) deterministic, we determinize it, then minimize it. This works because any acyclic transducer (over a _zero-sum-free_ semiring) is determinizable (Mohri 2009). However, if the WFST is a transducer, it may be non-functional (i.e., it describes a one-to-many relation); non-functional transducers can be determinized, but this is quite inefficient. And, if the WFST has weighted cycles (whether or not it's an acceptor or a transducer), it may lack the _twins property_, in which case the determinization algorithm will never terminate. However, in such a case, it is still usually possible to reduce the number of arcs and states in the WFST (Allauzen et al. 2004): the WFST is determinized and minimized _as if it were an acceptor_ (and if the WFST has weighted cycles, _as if it were an unweighted acceptor_). (Alternatively, we could have checked for the twins property and/or for non-functionalism, but both checks are themselves quite expensive, so we have chosen not to.) In that case, determinization and minimization may introduce _multi-arcs_, arcs with the same source state, destination state, input label, and output-label. We therefore perform *arc-sum mapping*, which merges such arcs using ⊕ to combine their weights, making the WFST _multi-arc-free_. ---++ ...and what we can assert about an optimized transducer Once we have performed optimization on a WFST, it is free of multi-arcs and unnecessary epsilon-epsilon arcs. And, if the WFST was an acyclic acceptor, it is guaranteed to also be deterministic and minimal. We *cannot* be sure that it is smaller (in terms of the number of states and arcs) than the input, though it usually is in practice. ---++ Other behind-the-scenes optimizations When computing the [[http://www.openfst.org/twiki/bin/view/FST/DifferenceDoc][difference]] of two WFSTs, the right-hand side must be epsilon-free and deterministic. If it is not known to have these properties, Thrax and Pynini perform epsilon-removal and attempt determinization. ---++ References Allauzen, C., Mohri, M., Riley, M., and Roark, B. 2004. A generalized construction of integrated speech recognition transducers. In _ICASSP_, 761-764. Mohri, M. 2009. Weighted automata algorithms. In M. Droste, W. Kuich, and H. Vogler, _Handbook of weighted automata_, pages 213-254. New York: Springer.
E
dit
|
A
ttach
|
Watch
|
P
rint version
|
H
istory
: r4
<
r3
<
r2
<
r1
|
B
acklinks
|
V
iew topic
|
WYSIWYG
|
M
ore topic actions
Topic revision: r4 - 2021-03-06
-
KyleGorman
GRM
Log In
or
Register
GRM Web
Create New Topic
Index
Search
Changes
Notifications
Statistics
Preferences
Webs
Contrib
FST
Forum
GRM
Kernel
Main
Sandbox
TWiki
Main
Copyright © 2008-2024 by the contributing authors. All material on this collaboration platform is the property of the contributing authors.
Ideas, requests, problems regarding TWiki?
Send feedback