Finitestate transducer
A finitestate transducer (FST) is a finitestate machine with two memory tapes, following the terminology for Turing machines: an input tape and an output tape. This contrasts with an ordinary finitestate automaton, which has a single tape. An FST is a type of finitestate automaton that maps between two sets of symbols.^{[1]} An FST is more general than a finitestate automaton (FSA). An FSA defines a formal language by defining a set of accepted strings while an FST defines relations between sets of strings.
An FST will read a set of strings on the input tape and generates a set of relations on the output tape. An FST can be thought of as a translator or relater between strings in a set.
In morphological parsing, an example would be inputting a string of letters into the FST, the FST would then output a string of morphemes.
Contents
Overview
An automaton can be said to recognize a string if we view the content of its tape as input. In other words, the automaton computes a function that maps strings into the set {0,1}. Alternatively, we can say that an automaton generates strings, which means viewing its tape as an output tape. On this view, the automaton generates a formal language, which is a set of strings. The two views of automata are equivalent: the function that the automaton computes is precisely the indicator function of the set of strings it generates. The class of languages generated by finite automata is known as the class of regular languages.
The two tapes of a transducer are typically viewed as an input tape and an output tape. On this view, a transducer is said to transduce (i.e., translate) the contents of its input tape to its output tape, by accepting a string on its input tape and generating another string on its output tape. It may do so nondeterministically and it may produce more than one output for each input string. A transducer may also produce no output for a given input string, in which case it is said to reject the input. In general, a transducer computes a relation between two formal languages.
Each stringtostring finitestate transducer relates the input alphabet Σ to the output alphabet Γ. Relations R on Σ*×Γ* that can be implemented as finitestate transducers are called rational relations. Rational relations that are partial functions, i.e. that relate every input string from Σ* to at most one Γ*, are called rational functions.
Finitestate transducers are often used for phonological and morphological analysis in natural language processing research and applications. Pioneers in this field include Ronald Kaplan, Lauri Karttunen, Martin Kay and Kimmo Koskenniemi.^{[2]}^{[nonprimary source needed]} A common way of using transducers is in a socalled "cascade", where transducers for various operations are combined into a single transducer by repeated application of the composition operator (defined below).
Formal construction
Formally, a finite transducer T is a 6tuple (Q, Σ, Γ, I, F, δ) such that:
 Q is a finite set, the set of states;
 Σ is a finite set, called the input alphabet;
 Γ is a finite set, called the output alphabet;
 I is a subset of Q, the set of initial states;
 F is a subset of Q, the set of final states; and
 (where ε is the empty string) is the transition relation.
We can view (Q, δ) as a labeled directed graph, known as the transition graph of T: the set of vertices is Q, and means that there is a labeled edge going from vertex q to vertex r. We also say that a is the input label and b the output label of that edge.
NOTE: This definition of finite transducer is also called letter transducer (Roche and Schabes 1997); alternative definitions are possible, but can all be converted into transducers following this one.
Define the extended transition relation as the smallest set such that:
 ;
 for all ; and
 whenever and then .
The extended transition relation is essentially the reflexive transitive closure of the transition graph that has been augmented to take edge labels into account. The elements of are known as paths. The edge labels of a path are obtained by concatenating the edge labels of its constituent transitions in order.
The behavior of the transducer T is the rational relation [T] defined as follows: if and only if there exists and such that . This is to say that T transduces a string into a string if there exists a path from an initial state to a final state whose input label is x and whose output label is y.
Weighted automata
Finite State Transducers can be weighted, where each transition is labelled with a weight in addition to the input and output labels. A Weighted Finite State Transducer (WFST) over a set K of weights can be defined similarly to an unweighted one as an 8tuple T=(Q, Σ, Γ, I, F, E, λ, ρ), where:
 Q, Σ, Γ, I, F are defined as above;
 (where ε is the empty string) is the finite set of transitions;
 maps initial states to weights;
 maps final states to weights.
In order to make certain operations on WFSTs welldefined, it is convenient to require the set of weights to form a semiring.^{[3]} Two typical semirings used in practice are the log semiring and tropical semiring: unweighted automata may be regarded as having weights in the Boolean semiring.^{[4]}
Stochastic FST
Stochastic FSTs (also known as probabilistic FSTs or statistical FSTs) are presumably a form of weighted FST.^{[citation needed]}
Operations on finitestate transducers
The following operations defined on finite automata also apply to finite transducers:
 Union. Given transducers T and S, there exists a transducer such that if and only if or .
 Concatenation. Given transducers T and S, there exists a transducer such that if and only if there exist with and
 Kleene closure. Given a transducer T, there exists a transducer with the following properties:

;
(k1)


and then ;
(k2)

 Composition. Given a transducer T on alphabets Σ and Γ and a transducer S on alphabets Γ and Δ, there exists a transducer on Σ and Δ such that if and only if there exists a string such that and . This operation extends to the weighted case.^{[5]}
 This definition uses the same notation used in mathematics for relation composition. However, the conventional reading for relation composition is the other way around: given two relations T and S, when there exist some y such that and
 Projection to an automaton. There are two projection functions: preserves the input tape, and preserves the output tape. The first projection, is defined as follows:
 Given a transducer T, there exists a finite automaton such that accepts x if and only if there exists a string y for which
 The second projection, is defined similarly.
 Determinization. Given a transducer T, we want to build an equivalent transducer that has a unique initial state and such that no two transitions leaving any state share the same input label. The powerset construction can be extended to transducers, or even weighted transducers, but sometimes fails to halt; indeed, some nondeterministic transducers do not admit equivalent deterministic transducers.^{[6]} Characterizations of determinizable transducers have been proposed^{[7]} along with efficient algorithms to test them:^{[8]} they rely on the semiring used in the weighted case as well as a general property on the structure of the transducer (the twins property).
 Weight pushing for the weighted case.^{[9]}
 Minimization for the weighted case.^{[10]}
 Removal of epsilontransitions.
Additional properties of finitestate transducers
 It is decidable whether the relation [T] of a transducer T is empty.
 It is decidable whether there exists a string y such that x[T]y for a given string x.
 It is undecidable whether two transducers are equivalent.^{[11]} Equivalence is however decidable in the special case where the relation [T] of a transducer T is a (partial) function.
 If one defines the alphabet of labels , finitestate transducers are isomorphic to NDFA over the alphabet , and may therefore be determinized (turned into deterministic finite automata over the alphabet ) and subsequently minimized so that they have the minimum number of states.^{[citation needed]}
Applications
Contextsensitive rewriting rules of the form a → b / c _ d, used in linguistics to model phonological rules and sound change, are computationally equivalent to finitestate transducers, provided that application is nonrecursive, i.e. the rule is not allowed to rewrite the same substring twice.^{[12]}
Weighted FSTs found applications in natural language processing, including machine translation, and in machine learning.^{[13]}^{[14]} An implementation for partofspeech tagging can be found as one component of the OpenGrm^{[15]} library.
See also
Notes
 ^ Jurafsky, Daniel (2009). Speech and Language Processing. Pearson. ISBN 9789332518414.
 ^ Koskenniemi 1983
 ^ Berstel, Jean; Reutenauer, Christophe (2011). Noncommutative rational series with applications. Encyclopedia of Mathematics and Its Applications. 137. Cambridge: Cambridge University Press. p. 16. ISBN 9780521190220. Zbl 1250.68007.
 ^ Lothaire, M. (2005). Applied combinatorics on words. Encyclopedia of Mathematics and Its Applications. 105. A collective work by Jean Berstel, Dominique Perrin, Maxime Crochemore, Eric Laporte, Mehryar Mohri, Nadia Pisanti, MarieFrance Sagot, Gesine Reinert, Sophie Schbath, Michael Waterman, Philippe Jacquet, Wojciech Szpankowski, Dominique Poulalhon, Gilles Schaeffer, Roman Kolpakov, Gregory Koucherov, JeanPaul Allouche and Valérie Berthé. Cambridge: Cambridge University Press. p. 211. ISBN 0521848024. Zbl 1133.68067.
 ^ Mohri 2004, pp. 3–5
 ^ [1]
 ^ Mohri 2004, pp. 5–6
 ^ Allauzen 2003
 ^ Mohri 2004, pp. 7–9
 ^ Mohri 2004, pp. 9–11
 ^ Griffiths 1968
 ^ "Regular Models of Phonological Rule Systems" (PDF). Retrieved August 25, 2012.
 ^ Kevin Knight and Jonathan May (2009). "Applications of Weighted Automata in Natural Language Processing". In Manfred Droste; Werner Kuich; Heiko Vogler. Handbook of Weighted Automata. Springer Science & Business Media. ISBN 9783642014925.CS1 maint: Uses authors parameter (link)
 ^ "Learning with Weighted Transducers" (PDF). Retrieved April 29, 2017.
 ^ OpenGrm
References
 Allauzen, Cyril; Mohri, Mehryar (2003). "Efficient Algorithms for Testing the Twins Property" (PDF). Journal of Automata, Languages and Combinatorics. 8 (2): 117–144.
 Koskenniemi, Kimmo (1983), Twolevel morphology: A general computational model of wordform recognition and production (PDF), Department of General Linguistics, University of Helsinki
 Mohri, Mehryar (2004). "Weighted FiniteState Transducer Algorithms: An Overview" (PDF). Formal Languages and Applications. 148 (620): 551–564. doi:10.1007/9783540398868_29.
 Griffiths, T. V. (1968). "The unsolvability of the Equivalence Problem for ΛFree nondeterministic generalized machines". 15 (3). ACM: 409–413.
External links
 OpenFst, an opensource library for FST operations.
 Stuttgart Finite State Transducer Tools, another opensource FST toolkit
 java FST Framework, an opensource java FST Framework capable of handling OpenFst text format.
 Vcsn, an opensource platform (C++ & IPython) platform for weighted automata and rational expressions.
 Finite State MorphologyThe Book XFST/ LEXC, a description of Xerox's implementation of finitestate transducers intended for linguistic applications.
 FOMA, an opensource implementation of most of the capabilities of the Xerox XFST/ LEXC implementation.
Further reading
 Jurafsky, Daniel; James H. Martin (2000). Speech and Language Processing. Prentice Hall. pp. 71–83. ISBN 0130950696.
 Kornai, Andras (1999). Extended Finite State Models of Language. Cambridge University Press. ISBN 052163198X.
 Roche, Emmanuel; Yves Schabes (1997). Finitestate language processing. MIT Press. pp. 1–65. ISBN 0262181827.
 Beesley, Kenneth R.; Lauri Karttunen (2003). Finite State Morphology. Center for the Study of Language and Information. ISBN 1575864347.
 Roark, Brian; Richard Sproat (2007). Computational Approaches to Morphology and Syntax. Oxford University Press. ISBN 0199274789.
 Berstel, Jean (1979). Transductions and Contextfree Languages. Teubner Verlag.. Free PDF version