Scientific report: "Ordering Phrases with Function Words"

Hendra Setiawan and Min-Yen Kan
School of Computing, National University of Singapore, Singapore 117543
hendrase kanmy @

Haizhou Li
Institute for Infocomm Research, 21 Heng Mui Keng Terrace, Singapore 119613
hli@

Abstract

This paper presents a Function Word centered, Syntax-based (FWS) solution to address phrase ordering in the context of statistical machine translation (SMT). Motivated by the observation that function words often encode grammatical relationships among phrases within a sentence, we propose a probabilistic synchronous grammar to model the ordering of function words and their left and right arguments. We improve phrase ordering performance by lexicalizing the resulting rules in a small number of cases corresponding to function words. The experiments show that the FWS approach consistently outperforms the baseline system in ordering function words' arguments and improving translation quality in both perfect and noisy word alignment scenarios.

1 Introduction

The focus of this paper is on function words, a class of words that have little intrinsic meaning but are vital in expressing grammatical relationships among words within a sentence. Such encoded grammatical information, often implicit, makes function words pivotal in modeling structural divergences, as projecting them into different languages often results in long-range structural changes to the realized sentences.

Just as a foreign language learner often makes mistakes in using function words, we observe that current machine translation (MT) systems often perform poorly in ordering function words' arguments; lexically correct translations often end up reordered incorrectly. Thus, we are interested in modeling the structural divergence encoded by such function words. A key finding of our work is that modeling the ordering of the dependent arguments of function words results in better translation quality. Most current systems use statistical knowledge ...
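As a rough illustration of what lexicalizing ordering rules for a small set of function words could look like, the following Python sketch estimates, for each function word, how often its left and right arguments keep their order or swap in the target. It is not the grammar or training procedure of the paper: the two orientation labels, the shared `<other>` entry for all remaining words, the add-one smoothing, the placement of the function word between its arguments, and the toy tokens (`FW1`, `x`, `a`, `b`, `c`) are all assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical orientation labels: do the two arguments keep or swap order?
MONOTONE, REVERSE = "monotone", "reverse"
# Hypothetical shared entry for every word that is not lexicalized.
OTHER = "<other>"


def rule_key(word, function_words):
    """Lexicalize the rule only for function words; pool all other words."""
    return word if word in function_words else OTHER


def estimate_orientation(observations, function_words, smoothing=1.0):
    """Estimate P(orientation | rule key) from (word, orientation) pairs.

    Additive smoothing is an assumption of this sketch, not a detail
    taken from the paper.
    """
    counts = defaultdict(lambda: {MONOTONE: 0.0, REVERSE: 0.0})
    for word, orientation in observations:
        counts[rule_key(word, function_words)][orientation] += 1.0

    model = {}
    for key, c in counts.items():
        total = c[MONOTONE] + c[REVERSE] + 2.0 * smoothing
        model[key] = {o: (c[o] + smoothing) / total for o in (MONOTONE, REVERSE)}
    return model


def order_arguments(left, word, right, model, function_words):
    """Emit the arguments in the more probable orientation around the word."""
    uniform = {MONOTONE: 0.5, REVERSE: 0.5}
    probs = model.get(rule_key(word, function_words), uniform)
    if probs[REVERSE] > probs[MONOTONE]:
        return right + [word] + left
    return left + [word] + right


# Toy, invented observations: FW1 mostly swaps its arguments, other words do not.
function_words = {"FW1"}
observations = [("FW1", REVERSE)] * 3 + [("FW1", MONOTONE)] + [("x", MONOTONE)] * 2
model = estimate_orientation(observations, function_words)
print(order_arguments(["a", "b"], "FW1", ["c"], model, function_words))
```

Keeping one pooled entry for all non-function words mirrors the idea of lexicalizing only a small number of cases: the model stays compact, and only the words assumed to drive reordering get their own statistics.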
