Efficient Path Counting Transducers for Minimum Bayes-Risk Decoding of Statistical Machine Translation Lattices

Graeme Blackwood, Adrià de Gispert, William Byrne
Machine Intelligence Laboratory, Cambridge University Engineering Department, Trumpington Street, CB2 1PZ
{gwb24, ad465, wjb31}@

Abstract

This paper presents an efficient implementation of linearised lattice minimum Bayes-risk decoding using weighted finite state transducers. We introduce transducers to efficiently count lattice paths containing n-grams and use these to gather the required statistics. We show that these procedures can be implemented exactly through simple transformations of word sequences to sequences of n-grams. This yields a novel implementation of lattice minimum Bayes-risk decoding which is fast and exact even for very large lattices.

1 Introduction

This paper focuses on an exact implementation of the linearised form of lattice minimum Bayes-risk (LMBR) decoding using general purpose weighted finite state transducer (WFST) operations.¹ The LMBR decision rule in Tromble et al. (2008) has the form

\hat{E} = \operatorname*{argmax}_{E' \in \mathcal{E}} \Big\{ \theta_0 |E'| + \sum_{u \in \mathcal{N}} \theta_u \, \#_u(E') \, p(u \,|\, \mathcal{E}) \Big\}   (1)

where $\mathcal{E}$ is a lattice of translation hypotheses, $\mathcal{N}$ is the set of all n-grams in the lattice (typically, $n = 1 \ldots 4$), and the parameters $\theta$ are constants estimated on held-out data. The quantity $p(u|\mathcal{E})$ we refer to as the path posterior probability of the n-gram u. This particular posterior is defined as

p(u \,|\, \mathcal{E}) = p(\mathcal{E}_u \,|\, \mathcal{E}) = \sum_{E \in \mathcal{E}_u} P(E \,|\, F)   (2)

where $\mathcal{E}_u = \{ E \in \mathcal{E} : \#_u(E) > 0 \}$ is the subset of lattice paths containing the n-gram u at least once. It is the efficient computation of these path posterior n-gram probabilities that is the primary focus of this paper. We will show how general purpose WFST algorithms can be employed to efficiently compute $p(u|\mathcal{E})$ for all $u \in \mathcal{N}$. Tromble et al. (2008) use Equation (1) as an approximation to the general form of ...

¹We omit an introduction to WFSTs for space reasons. See Mohri et al. (2008) for details of the general purpose WFST operations used in this paper.
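As a concrete illustration of Equations (1) and (2), the minimal sketch below computes path posterior n-gram probabilities and applies the linearised decision rule over a small enumerated hypothesis set. It is not the paper's WFST-based implementation: the lattice is replaced by an explicit list of (hypothesis, posterior) pairs, and the hypotheses, weights, and function names are illustrative assumptions.

from collections import defaultdict

def ngram_counts(words, max_order=4):
    # #_u(E): number of times each n-gram u (orders 1..max_order) occurs in E.
    counts = defaultdict(int)
    for n in range(1, max_order + 1):
        for i in range(len(words) - n + 1):
            counts[tuple(words[i:i + n])] += 1
    return counts

def path_posteriors(hypotheses, max_order=4):
    # Equation (2): p(u|E) = sum of P(E|F) over the paths E that contain u.
    p = defaultdict(float)
    for words, prob in hypotheses:
        for u in ngram_counts(words, max_order):
            p[u] += prob
    return p

def lmbr_decode(hypotheses, theta0, theta, max_order=4):
    # Equation (1): argmax over E' of theta0*|E'| + sum_u theta_u * #_u(E') * p(u|E).
    p = path_posteriors(hypotheses, max_order)
    def gain(words):
        counts = ngram_counts(words, max_order)
        return theta0 * len(words) + sum(
            theta.get(u, 0.0) * c * p[u] for u, c in counts.items())
    return max(hypotheses, key=lambda h: gain(h[0]))[0]

# Toy usage with made-up hypotheses, posteriors P(E|F), and flat weights.
hyps = [("the cat sat".split(), 0.5),
        ("the cat sits".split(), 0.3),
        ("a cat sat".split(), 0.2)]
theta = {u: 0.2 for u in path_posteriors(hyps)}   # illustrative weights
print(" ".join(lmbr_decode(hyps, theta0=-0.3, theta=theta)))

The enumeration over explicit hypotheses is only workable for toy examples; the point of the paper is that the same statistics can be gathered exactly and efficiently over full lattices using general purpose WFST operations.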