Scientific report: "A Comparison of Syntactically Motivated Word Alignment Spaces"

Colin Cherry, Department of Computing Science, University of Alberta, Edmonton, AB, Canada, T6G 2E8, colinc@
Dekang Lin, Google Inc., 1600 Amphitheatre Parkway, Mountain View, CA, USA, 94043, lindek@

Abstract

This work is concerned with the space of alignments searched by word alignment systems. We focus on situations where word re-ordering is limited by syntax. We present two new alignment spaces that limit an ITG according to a given dependency parse. We provide D-ITG grammars to search these spaces completely and without redundancy. We conduct a careful comparison of five alignment spaces, and show that limiting search with an ITG reduces error rate by 10%, while a D-ITG produces a 31% reduction.

1 Introduction

Bilingual word alignment finds word-level correspondences between parallel sentences. The task originally emerged as an intermediate result of training the IBM translation models (Brown et al., 1993). These models use minimal linguistic intuitions: they essentially treat sentences as flat strings. They remain the dominant method for word alignment (Och and Ney, 2003). There have been several proposals to introduce syntax into word alignment. Some work within the framework of synchronous grammars (Wu, 1997; Melamed, 2003), while others create a generative story that includes a parse tree provided for one of the sentences (Yamada and Knight, 2001). There are three primary reasons to add syntax to word alignment.
First, one can incorporate syntactic features, such as grammar productions, into the models that guide the alignment search. Second, movement can be modeled more naturally: when a three-word noun phrase moves during translation, it can be modeled as one movement operation instead of three. Finally, one can restrict the type of movement that is considered, shrinking the number of alignments that are attempted. We investigate this last advantage of syntactic alignment. We fix an alignment scoring model .
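To make the idea of shrinking the alignment space concrete, the following is a minimal sketch, not the grammars from this paper. It assumes a simplified setting in which an alignment is a one-to-one permutation of word positions (no null or many-to-one links), and it checks whether a permutation can be built from binary straight and inverted combinations, which is the re-ordering constraint usually associated with an ITG. The function name `itg_reachable` is a hypothetical helper introduced for illustration.

```python
from itertools import permutations


def itg_reachable(perm):
    """Return True if `perm` can be built by recursively joining two
    adjacent blocks in straight or inverted order (assumed ITG constraint).

    `perm` is a tuple of distinct integers covering a contiguous range.
    """
    n = len(perm)
    if n <= 1:
        return True
    for k in range(1, n):
        left, right = perm[:k], perm[k:]
        # Straight: every target position on the left precedes the right.
        # Inverted: every target position on the left follows the right.
        straight = max(left) < min(right)
        inverted = min(left) > max(right)
        if (straight or inverted) and itg_reachable(left) and itg_reachable(right):
            return True
    return False


def count_reachable(n):
    """Count permutations of n positions that satisfy the constraint."""
    return sum(itg_reachable(p) for p in permutations(range(n)))


if __name__ == "__main__":
    for n in range(1, 7):
        total = sum(1 for _ in permutations(range(n)))
        print(n, count_reachable(n), "of", total)
```

Under these assumptions the restriction removes nothing for up to three words, and for four words it excludes only the two "inside-out" orders (1, 3, 0, 2) and (2, 0, 3, 1). The excluded fraction grows with sentence length, which is the sense in which a syntactic constraint shrinks the set of alignments a search must consider.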
