TWiki
>
GRM Web
>
PyniniOperatorsDoc
(revision 4) (raw view)
Edit
Attach
---+ Specialty operators This describes specialty FST functions for grammar compilation. #CrossProductDoc ---++ =fst::CrossProduct= The _cross-product_ operation generates a transducer from two acceptors _A_, _B_. The resulting transducer represents a relation mapping from any string in _A_ to any string in _B_ (hence the name "cross-product"). In the case that _A_ is a transducer, it is implicitly reinterpreted as the acceptor given by its input projection; similarly, in the case that _B_ is a transducer, it is implicitly reintrepreted as the acceptor given by its output projection. The user may also specify a super-final weight to be concatenated to the resulting transducer to assign a non-Zero cost to the transduction. The algorithm is as follows: 1. Replace all output labels of _A_ with epsilon. 2. Replace all input labels of _B_ with epsilon. 3. Concatenate (1) and (2). 4. If a final weight (other than semiring One, the default), is specified, create a superfinal state with the desired final weight and concatenate it with (3). 5. Push the labels of (3-4) towards the initial state. 6. Perform epsilon-removal on (5). #LenientlyComposeDoc ---++ =fst::LenientlyCompose= The _lenient composition_ (Karttunen 1998) of two FSTs _A_, _B_ is simply the _priority union_ (to be defined) of (_A_ ∘ _B_) and _A_. The priority union of two FSTs _C_, _D_ is similar to the [[http://www.openfst.org/twiki/bin/view/FST/UnionDoc][union]] of _C_ and _D_ except that the relations in _C_ have "priority" over those in _D_. For example, let it be the case that: _C(a)_ → _b_ _D(a)_ → _c_ The ("vanilla") union of _C_, _D_ is then the relation _a_ → { _b, c_ }, whereas the priority union of _C_, _D_ is the narrower relation _a_ → _c_. We can implement this in Thrax as follows: <verbatim> func PriorityUnion[C, D, sigma_star] { input = Determinize[RmEpsilon[Project[C, 'input']]]; return C | ((sigma_star - input) @ D); } </verbatim> where =sigma_star= represents the closure over the alphabet. The lenient composition of FSTs _A_, _B_ is simply the priority union of (_A_ ∘ _B_), _A_. We can implement this in Thrax as follows: <verbatim> func LenientlyCompose[A, B, sigma_start] { return PriorityUnion[A @ B, A, sigma_star]; } </verbatim> where =sigma_star= represents the closure over the alphabet. #RepeatDoc ---++ =fst::Repeat= The _repeat operation_ is a generalization of [[http://www.openfst.org/twiki/bin/view/FST/ClosureDoc][closure]]. A path through the closure of an FST _A_ is valid if it can be generated by taking zero or more paths through _A_. The first argument to =fst::Repeat= represents a lower bound on the number of paths through the input FST required to form a valid path in the output FST. The second argument represents the upper bound on the number of paths through the input FST; by convention, 0 is used to indicate an infinite upper bound. Naturally, the lower bound must be less than or equal to the upper bound. We can then derive the following familiar operations as special cases: * To compute "star"-closure, use arguments =0, 0=. * To compute "plus"-closure, use arguments =1, 0=. * To generate exactly _k_ concatenations of an FST, use arguments =k, k=. ---++ References Karttunen, L. 1998. The proper treatment of Optimality Theory in computational phonology. In _Proc. FSMNLP_, 1-12.
Edit
|
Attach
|
Watch
|
P
rint version
|
H
istory
:
r8
|
r6
<
r5
<
r4
<
r3
|
B
acklinks
|
V
iew topic
|
Raw edit
|
More topic actions...
Topic revision: r4 - 2019-08-05
-
KyleGorman
GRM
Log In
or
Register
GRM Web
Create New Topic
Index
Search
Changes
Notifications
Statistics
Preferences
Webs
Contrib
FST
Forum
GRM
Kernel
Main
Sandbox
TWiki
Main
Copyright © 2008-2024 by the contributing authors. All material on this collaboration platform is the property of the contributing authors.
Ideas, requests, problems regarding TWiki?
Send feedback