Scientific paper: "Unsupervised Word Alignment with Arbitrary Features"

Unsupervised Word Alignment with Arbitrary Features

Chris Dyer, Jonathan Clark, Alon Lavie, Noah A. Smith
Language Technologies Institute, Carnegie Mellon University
Pittsburgh, PA 15213, USA
{cdyer, jhclark, alavie, nasmith}@

Abstract

We introduce a discriminatively trained, globally normalized, log-linear variant of the lexical translation models proposed by Brown et al. (1993). In our model, arbitrary, non-independent features may be freely incorporated, thereby overcoming the inherent limitation of generative models, which require that features be sensitive to the conditional independencies of the generative process. However, unlike previous work on discriminative modeling of word alignment (which also permits the use of arbitrary features), the parameters in our models are learned from unannotated parallel sentences, rather than from supervised word alignments. Using a variety of intrinsic and extrinsic measures, including translation performance, we show that our model yields better alignments than generative baselines in a number of language pairs.

1 Introduction

Word alignment is an important subtask in statistical machine translation, and it is typically solved in one of two ways.
The more common approach uses a generative translation model that relates bilingual string pairs using a latent alignment variable to designate which source words (or phrases) generate which target words. The parameters in these models can be learned straightforwardly from parallel sentences using EM, and standard inference techniques can recover the most probable alignments (Brown et al., 1993). This approach is attractive because it only requires parallel training data. An alternative to the generative approach uses a discriminatively trained alignment model to predict word alignments in the parallel corpus. Discriminative models are attractive because they can incorporate arbitrary, overlapping features, meaning that errors observed in the predictions made by the model can be addressed by engineering new
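To make the generative baseline concrete, the following is a minimal illustrative sketch (not the paper's code) of EM training for IBM Model 1, the simplest of the Brown et al. (1993) lexical translation models: the E-step computes, for each target word, a posterior over which source word (or a hypothetical NULL token) generated it, and the M-step renormalizes the resulting expected counts. Function names and the toy corpus are assumptions for illustration only.

```python
from collections import defaultdict

def train_model1(bitext, iterations=5):
    """EM for IBM Model 1. bitext: list of (source_tokens, target_tokens).

    Illustrative sketch of the generative baseline family, not the
    paper's discriminative log-linear model.
    """
    # Uniform (unnormalized) initialization of t(f|e); the first E-step
    # normalizes per target word, so this yields uniform posteriors.
    t = defaultdict(lambda: 1.0)
    for _ in range(iterations):
        count = defaultdict(float)   # expected counts c(f, e)
        total = defaultdict(float)   # expected counts c(e)
        for src, tgt in bitext:
            src_null = [None] + src  # NULL allows unaligned target words
            for f in tgt:
                # E-step: posterior over which source word generated f
                z = sum(t[(f, e)] for e in src_null)
                for e in src_null:
                    p = t[(f, e)] / z
                    count[(f, e)] += p
                    total[e] += p
        # M-step: renormalize expected counts into t(f|e)
        t = defaultdict(float)
        for (f, e), c in count.items():
            t[(f, e)] = c / total[e]
    return t

def viterbi_align(t, src, tgt):
    """Most probable alignment: each target word picks its best source
    word independently (index -1 marks alignment to NULL)."""
    src_null = [None] + src
    return [max(range(len(src_null)),
                key=lambda i: t[(f, src_null[i])]) - 1
            for f in tgt]
```

Because each target word chooses its generator independently, both EM and Viterbi inference are exact and cheap here; it is exactly this factorized generative structure that restricts the features such models can use, which the paper's log-linear formulation relaxes.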
