Scientific report: "Effective Phrase Translation Extraction from Alignment Models"
Ashish Venugopal, Language Technologies Institute, Carnegie Mellon University, Pittsburgh, PA 15213, ashishv@
Stephan Vogel, Language Technologies Institute, Carnegie Mellon University, Pittsburgh, PA 15213, vogel@
Alex Waibel, Language Technologies Institute, Carnegie Mellon University, Pittsburgh, PA 15213, ahw@

Abstract

Phrase-level translation models improve translation quality by addressing the problem of local re-ordering across language boundaries. Methods that attempt to fundamentally modify the traditional IBM translation model to incorporate phrases typically do so at a prohibitive computational cost. We present a technique that begins with improved IBM models to create phrase-level knowledge sources that effectively represent local as well as global phrasal context. Our method is robust to noisy alignments at both the sentence and corpus level, delivering high-quality phrase-level translation pairs that contribute to significant improvements in translation quality, as measured by the BLEU metric, over word-based lexica as well as a competing alignment-based method.

1 Introduction

Statistical Machine Translation defines the task of translating a source language sentence $s$ into a target language sentence $t$. The traditional framework presented in (Brown et al., 1993) assumes a generative process where the source sentence is passed through a noisy stochastic process to produce the target sentence. The task can be formally stated as finding $\hat{t} = \arg\max_{t} p(t \mid s)$, where the search component is commonly referred to as the decoding step (Wang and Waibel, 1998).
Within the generative model, the Bayes reformulation is used to estimate $p(t \mid s) \propto p(t)\, p(s \mid t)$, where $p(t)$ is considered the language model and $p(s \mid t)$ is the translation model, the IBM models (Brown et al., 1993) being the de facto standard. Direct translation approaches (Foster, 2000) consider estimating $p(t \mid s)$ directly and .
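
The Bayes reformulation can be illustrated with a minimal sketch, which is not the decoder used in this work. It assumes the candidate translations can be enumerated exhaustively, and every name and number in it (`language_model`, `translation_model`, `noisy_channel_score`, `decode`, and the toy probabilities) is hypothetical and invented for illustration only.

```python
import math

# Hypothetical toy probabilities, invented purely for illustration.
# language_model[t] stands in for p(t);
# translation_model[(s, t)] stands in for p(s | t).
language_model = {
    "the house": 0.6,
    "house the": 0.1,
    "the home": 0.3,
}
translation_model = {
    ("das Haus", "the house"): 0.5,
    ("das Haus", "house the"): 0.5,
    ("das Haus", "the home"): 0.2,
}


def noisy_channel_score(source, target):
    """Return log p(t) + log p(s | t), or -inf if either factor is missing or zero."""
    p_t = language_model.get(target, 0.0)
    p_s_given_t = translation_model.get((source, target), 0.0)
    if p_t == 0.0 or p_s_given_t == 0.0:
        return float("-inf")
    return math.log(p_t) + math.log(p_s_given_t)


def decode(source, candidates):
    """Exhaustive search: pick the candidate t that maximises p(t) * p(s | t)."""
    return max(candidates, key=lambda t: noisy_channel_score(source, t))


best = decode("das Haus", list(language_model))
print(best)
```

In this toy setting the translation model scores "the house" and "house the" equally, so the language model term is what separates the fluent word order from the scrambled one. A practical decoder cannot enumerate all target sentences and has to search the space approximately.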