tailieunhanh - manning schuetze statisticalnlp part 8

then we can calculate the most probable parse for a sentence as follows. The initialization step assigns to each unary production at a leaf node its probability. For the inductive step, we again know that the first rule applied must be a binary rule

Text Alignment

... Kong. One reason for using such texts is that they are easy to obtain in quantity, but we suspect that the nature of these texts has also been helpful to Statistical NLP researchers: the demands of accuracy lead the translators of this sort of material to use very consistent, literal translations. Other sources have been used (such as articles from newspapers and magazines published in several languages), and yet other sources are easily available (religious and literary works are often freely available in many languages), but these not only do not provide such a large supply of text from a consistent period and genre, but they also tend to involve much less literal translation, and hence good results are harder to come by.

Given that parallel texts are available online, a first task is to perform gross large-scale alignment, noting which paragraphs or sentences in one language correspond to which paragraphs or sentences in another language. This problem has been well studied and a number of quite successful methods have been proposed. Once this has been achieved, a second problem is to learn which words tend to be translated by which other words, which one could view as the problem of acquiring a bilingual dictionary from text. In this section we deal with the text alignment problem, while the next section deals with word alignment and induction of bilingual dictionaries from aligned text.

Aligning sentences and paragraphs

Text alignment is an almost obligatory first step for making use of multilingual text corpora.
Text alignment can be used not only for the two tasks considered in the following sections (bilingual lexicography and machine translation), but it is also a first step in using multilingual corpora as knowledge sources in other domains, such as for word sense disambiguation or multilingual information retrieval. Text alignment can also be a useful practical tool for assisting translators. In many situations, such as when dealing with
