tailieunhanh - Báo cáo khoa học: "Parameter Estimation for Probabilistic Finite-State Transducers∗"

Weighted finite-state transducers suffer from the lack of a training algorithm. Training is even harder for transducers that have been assembled via finite-state operations such as composition, minimization, union, concatenation, and closure, as this yields tricky parameter tying. We formulate a “parameterized FST” paradigm and give training algorithms for it, including a general bookkeeping trick (“expectation semirings”) that cleanly and efficiently computes expectations and gradients. . | Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics ACL Philadelphia July 2002 pp. 1-8. Parameter Estimation for Probabilistic Finite-State Transducers Jason Eisner Department of Computer Science Johns Hopkins University Baltimore mD USA 21218-2691 jason@ Abstract Weighted finite-state transducers suffer from the lack of a training algorithm. Training is even harder for transducers that have been assembled via finite-state operations such as composition minimization union concatenation and closure as this yields tricky parameter tying. We formulate a parameterized FST paradigm and give training algorithms for it including a general bookkeeping trick expectation semirings that cleanly and efficiently computes expectations and gradients. 1 Background and Motivation Rational relations on strings have become widespread in language and speech engineering Roche and Schabes 1997 . Despite bounded memory they are well-suited to describe many linguistic and textual processes either exactly or approximately. A relation is a set of input output pairs. Relations are more general than functions because they may pair a given input string with more or fewer than one output string. The class of so-called rational relations admits a nice declarative programming paradigm. Source code describing the relation a regular expression is compiled into efficient object code in the form of a 2-tape automaton called a finite-state transducer . The object code can even be optimized for runtime and code size via algorithms such as deter-minization and minimization of transducers . This programming paradigm supports efficient nondeterminism including parallel processing over infinite sets of input strings and even allows reverse computation from output to input. Its unusual flexibility for the practiced programmer stems from the many operations under which rational relations are closed. It is common to define further useful operations as macros which

TỪ KHÓA LIÊN QUAN