Bayesian Learning of Non-compositional Phrases with Synchronous Parsing

Hao Zhang, Computer Science Department, University of Rochester, Rochester, NY 14627, zhanghao@
Robert C. Moore, Microsoft Research, One Microsoft Way, Redmond, WA 98052, USA, bobmoore@
Chris Quirk, Microsoft Research, One Microsoft Way, Redmond, WA 98052, USA, chrisq@
Daniel Gildea, Computer Science Department, University of Rochester, Rochester, NY 14627, gildea@

Abstract

We combine the strengths of Bayesian modeling and synchronous grammar in unsupervised learning of basic translation phrase pairs. The structured space of a synchronous grammar is a natural fit for phrase pair probability estimation, though the search space can be prohibitively large. Therefore we explore efficient algorithms for pruning this space that lead to empirically effective results. Incorporating a sparse prior using Variational Bayes biases the models toward generalizable, parsimonious parameter sets, leading to significant improvements in word alignment. This preference for sparse solutions, together with effective pruning methods, forms a phrase alignment regimen that produces better end-to-end translations than standard word alignment approaches.

1 Introduction

Most state-of-the-art statistical machine translation systems are based on large phrase tables extracted from parallel text using word-level alignments. These word-level alignments are most often obtained using Expectation Maximization on the conditional generative models of Brown et al. (1993) and Vogel et al. (1996).
As these word-level alignment models restrict the word alignment complexity by requiring each target word to align to zero or one source words, results are improved by aligning both source-to-target and target-to-source, then heuristically combining these alignments. Finally, the set of phrases consistent with the word alignments is extracted from every sentence pair; these phrases form the basis of the decoding process.
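The extraction step mentioned above can be sketched as follows. This is a minimal, illustrative implementation of the standard consistency criterion for phrase extraction (a phrase pair is kept only if no alignment link crosses its boundary); the function name, span representation, and the `max_len` limit are assumptions for this sketch, not details from the paper.

```python
def extract_consistent_phrases(alignment, src_len, tgt_len, max_len=4):
    """Collect phrase pairs consistent with a symmetrized word alignment.

    alignment: set of (src_idx, tgt_idx) links (0-based).
    Returns a set of ((src_start, src_end), (tgt_start, tgt_end))
    spans, inclusive on both ends.
    """
    phrases = set()
    for s1 in range(src_len):
        for s2 in range(s1, min(s1 + max_len, src_len)):
            # Target positions linked to any source word in [s1, s2].
            tgt = [t for (s, t) in alignment if s1 <= s <= s2]
            if not tgt:
                continue  # unaligned source span: no anchor on target side
            t1, t2 = min(tgt), max(tgt)
            if t2 - t1 >= max_len:
                continue
            # Consistency: every link touching target span [t1, t2]
            # must originate inside source span [s1, s2].
            if all(s1 <= s <= s2 for (s, t) in alignment if t1 <= t <= t2):
                phrases.add(((s1, s2), (t1, t2)))
    return phrases

# Toy example: three-word sentences with a monotone diagonal alignment.
links = {(0, 0), (1, 1), (2, 2)}
pairs = extract_consistent_phrases(links, 3, 3)
```

On the monotone toy alignment, every contiguous span pairs with its mirror image, yielding the six phrase pairs a diagonal alignment licenses. In practice this extraction is run over the heuristically combined bidirectional alignments described above.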
