Scientific report: "Dynamic compilation of weighted context-free grammars"
Mehryar Mohri and Fernando C. N. Pereira
AT&T Labs - Research
180 Park Avenue, Florham Park, NJ 07932, USA
mohri pereira @

Abstract

Weighted context-free grammars are a convenient formalism for representing grammatical constructions and their likelihoods in a variety of language-processing applications. In particular, speech understanding applications require appropriate grammars both to constrain speech recognition and to help extract the meaning of utterances. In many of those applications, the actual languages described are regular, but context-free representations are much more concise and easier to create. We describe an efficient algorithm for compiling into weighted finite automata an interesting class of weighted context-free grammars that represent regular languages. The resulting automata can then be combined with other speech recognition components. Our method allows the recognizer to dynamically activate or deactivate grammar rules and to substitute a new regular language for some terminal symbols, depending on previously recognized inputs, all without recompilation. We also report experimental results showing the practicality of the approach.

1. Motivation

Context-free grammars (CFGs) are widely used in language processing systems. In many applications, in particular in speech recognition, in addition to recognizing grammatical sequences, it is necessary to provide some measure of the probability of those sequences.
It is then natural to use weighted CFGs, in which each rule is given a weight from an appropriate weight algebra (Salomaa and Soittola, 1978). Weights can encode probabilities, for instance by setting a rule's weight to the negative logarithm of the probability of the rule. Rule probabilities can be estimated in a variety of ways, which we will not discuss further in this paper. Since speech recognizers cannot be fully certain about the correct transcription of a spoken utterance, they instead
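
The negative-logarithm encoding of rule probabilities mentioned above can be illustrated with a small sketch. The toy grammar, its probabilities, and the names `rule_probs`, `neg_log_weight`, and `derivation_weight` are hypothetical and are not taken from the paper. The sketch only assumes the standard property of logarithms that a product of probabilities becomes a sum of negative-log weights.

```python
import math

# Hypothetical toy grammar: (left-hand side, right-hand side) -> rule probability.
# These rules and values are illustrative only, not from the paper.
rule_probs = {
    ("S", ("NP", "VP")): 1.0,
    ("NP", ("she",)): 0.6,
    ("NP", ("he",)): 0.4,
    ("VP", ("sleeps",)): 0.7,
    ("VP", ("runs",)): 0.3,
}


def neg_log_weight(probability):
    """Encode a probability as a weight: its negative natural logarithm."""
    return -math.log(probability)


# Weighted grammar: every rule carries the negative log of its probability.
rule_weights = {rule: neg_log_weight(p) for rule, p in rule_probs.items()}


def derivation_weight(rules):
    """Weight of a derivation: the sum of the weights of the rules it uses."""
    return sum(rule_weights[rule] for rule in rules)


# Derivation of "she runs": S -> NP VP, NP -> she, VP -> runs.
derivation = [("S", ("NP", "VP")), ("NP", ("she",)), ("VP", ("runs",))]
total_weight = derivation_weight(derivation)

# Exponentiating the negated weight recovers the product of the rule probabilities.
recovered_probability = math.exp(-total_weight)
```

Under this encoding a lower weight means a more probable derivation, so comparing alternatives amounts to comparing sums rather than products of small numbers.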