Scientific report: "Finite State Transducers Approximating Hidden Markov Models"

This paper describes the conversion of a Hidden Markov Model into a sequential transducer that closely approximates the behavior of the stochastic model. This transformation is especially advantageous for part-of-speech tagging because the resulting transducer can be composed with other transducers that encode correction rules for the most frequent tagging errors. The speed of tagging is also improved. The described methods have been implemented and successfully tested on six languages.

Finite State Transducers Approximating Hidden Markov Models
André Kempe
Rank Xerox Research Centre - Grenoble Laboratory
6 chemin de Maupertuis - 38240 Meylan - France
http research mltt

1 Introduction

Finite-state automata have been successfully applied in many areas of computational linguistics. This paper describes two algorithms¹ which approximate a Hidden Markov Model (HMM) used for part-of-speech tagging by a finite-state transducer (FST). These algorithms may be useful beyond the current description, for any kind of analysis of written or spoken language based on both finite-state technology and HMMs, such as corpus analysis, speech recognition, etc. Both algorithms have been fully implemented.

An HMM used for tagging encodes, like a transducer, a relation between two languages. One language contains sequences of ambiguity classes obtained by looking up all words of a sentence in a lexicon. The other language contains sequences of tags obtained by statistically disambiguating the class sequences. From the outside, an HMM tagger behaves like a sequential transducer that deterministically maps every class sequence to a tag sequence, e.g.:

    [DET,PRO]  [ADJ,NOUN]  [ADJ,NOUN]  ...  [END]
       DET        ADJ         NOUN     ...   END

The aim of the conversion is not to generate FSTs that behave in the same way, or in as similar a way as possible, like HMMs, but rather FSTs that ...

¹ There is a different unpublished algorithm by Julian M. Kupiec and John T. Maxwell.
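The mapping from ambiguity-class sequences to tag sequences described in the introduction can be illustrated with a small sketch. It is not one of the paper's two algorithms: it assumes a first-order HMM decoded with standard Viterbi search, and the lexicon, tag set, and every probability in it are invented for illustration. The names `LEXICON`, `INITIAL`, `TRANSITION`, `EMISSION`, `DEFAULT`, and `tag_classes` are hypothetical.

```python
# Illustrative sketch only: the lexicon and all probabilities are invented
# for this example and are not taken from the paper.

# Hypothetical lexicon: word -> ambiguity class (tuple of possible tags).
LEXICON = {
    "the": ("DET", "PRO"),
    "light": ("ADJ", "NOUN"),
    "sound": ("ADJ", "NOUN"),
    ".": ("END",),
}

DEFAULT = 0.01  # assumed fallback probability for unlisted events

# Assumed P(tag at sentence start).
INITIAL = {"DET": 0.6, "PRO": 0.2, "ADJ": 0.1, "NOUN": 0.1}

# Assumed P(next tag | previous tag), keyed by (previous, next).
TRANSITION = {
    ("DET", "ADJ"): 0.4, ("DET", "NOUN"): 0.5,
    ("PRO", "ADJ"): 0.1, ("PRO", "NOUN"): 0.1,
    ("ADJ", "ADJ"): 0.2, ("ADJ", "NOUN"): 0.7, ("ADJ", "END"): 0.05,
    ("NOUN", "ADJ"): 0.05, ("NOUN", "NOUN"): 0.2, ("NOUN", "END"): 0.5,
}

# Assumed P(ambiguity class | tag), keyed by (tag, class).
EMISSION = {
    ("DET", ("DET", "PRO")): 0.9, ("PRO", ("DET", "PRO")): 0.3,
    ("ADJ", ("ADJ", "NOUN")): 0.5, ("NOUN", ("ADJ", "NOUN")): 0.4,
    ("END", ("END",)): 1.0,
}


def tag_classes(classes):
    """Return the most probable tag sequence for a sequence of ambiguity classes."""
    first = classes[0]
    # best[tag] = (probability, path) of the best path ending in `tag`.
    best = {
        t: (INITIAL.get(t, DEFAULT) * EMISSION.get((t, first), DEFAULT), [t])
        for t in first
    }
    for cls in classes[1:]:
        new_best = {}
        for t in cls:
            # Pick the best predecessor for tag `t` at this position.
            prob, path = max(
                (
                    (p * TRANSITION.get((prev, t), DEFAULT)
                     * EMISSION.get((t, cls), DEFAULT), prev_path)
                    for prev, (p, prev_path) in best.items()
                ),
                key=lambda item: item[0],
            )
            new_best[t] = (prob, path + [t])
        best = new_best
    return max(best.values(), key=lambda item: item[0])[1]


# Look up each word's ambiguity class, then disambiguate the class sequence.
words = ["the", "light", "sound", "."]
classes = [LEXICON[w] for w in words]
tags = tag_classes(classes)
```

Because the function always returns exactly one tag sequence for a given class sequence, it behaves from the outside like the deterministic mapping that the text attributes to an HMM tagger.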
