Scientific report: "Machine Transliteration"
Machine Transliteration
Kevin Knight and Jonathan Graehl
Information Sciences Institute, University of Southern California, Marina del Rey, CA 90292

Abstract
It is challenging to translate names and technical terms across languages with different alphabets and sound inventories. These items are commonly transliterated, i.e., replaced with approximate phonetic equivalents. For example, computer in English comes out as コンピューター (konpyuutaa) in Japanese. Translating such items from Japanese back to English is even more challenging, and of practical interest, as transliterated items make up the bulk of text phrases not found in bilingual dictionaries. We describe and evaluate a method for performing backwards transliterations by machine. This method uses a generative model, incorporating several distinct stages in the transliteration process.

1 Introduction
Translators must deal with many problems, and one of the most frequent is translating proper names and technical terms. For language pairs like Spanish/English, this presents no great challenge: a phrase like Antonio Gil usually gets translated as Antonio Gil. However, the situation is more complicated for language pairs that employ very different alphabets and sound systems, such as Japanese/English and Arabic/English.
Phonetic translation across these pairs is called transliteration. We will look at Japanese/English transliteration in this paper.

Japanese frequently imports vocabulary from other languages, primarily (but not exclusively) from English. It has a special phonetic alphabet called katakana, which is used primarily (but not exclusively) to write down foreign names and loanwords. To write a word like golfbag in katakana, some compromises must be made. For example, Japanese has no distinct L and R sounds: the two English sounds collapse onto the same Japanese sound. A similar compromise must be struck for English H and F. Also, Japanese generally uses an alternating consonant-vowel structure, making it