Scientific report: "Punjabi Machine Transliteration"

Punjabi Machine Transliteration

M. G. Abbas Malik
Department of Linguistics, Denis Diderot University of Paris 7, Paris, France

Abstract

Machine transliteration transcribes a word written in one script into the script of another language with approximate phonetic equivalence. It is useful for machine translation, cross-lingual information retrieval, and multilingual text and speech processing. Punjabi Machine Transliteration (PMT) is a special case of machine transliteration: it converts a word from Shahmukhi (based on the Arabic script) to Gurmukhi (a derivation of Landa, Shardha and Takri, old scripts of the Indian subcontinent), the two scripts of Punjabi, irrespective of the type of word. The Punjabi Machine Transliteration System uses transliteration rules (character mappings and dependency rules) to transliterate Shahmukhi words into Gurmukhi. The PMT system can transliterate every word written in Shahmukhi.

1 Introduction

Punjabi is the mother tongue of more than 110 million people in Pakistan (66 million) and India (44 million), and of many millions more in America, Canada and Europe. It has been written in two mutually incomprehensible scripts, Shahmukhi and Gurmukhi, for centuries. Punjabis from Pakistan are unable to comprehend Punjabi written in Gurmukhi, and Punjabis from India are unable to comprehend Punjabi written in Shahmukhi. In contrast, they have no difficulty understanding each other's speech.
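The abstract describes the system as combining character mappings with dependency rules. The following is a minimal sketch of that general idea, not the PMT system's actual rule set. The mapping table is a small hand-picked subset of consonants, and the single context rule for alef (an independent vowel letter at the start of a word, a vowel sign elsewhere) is an assumed illustration of what a dependency rule might look like. The names `CHAR_MAP` and `transliterate` are hypothetical.

```python
# Illustrative subset of Shahmukhi -> Gurmukhi consonant correspondences.
# These are NOT the paper's mapping tables; code points are written as
# escapes to avoid mixing right-to-left text into the source.
CHAR_MAP = {
    "\u0628": "\u0A2C",  # beh    -> BA
    "\u067E": "\u0A2A",  # peh    -> PA
    "\u062A": "\u0A24",  # teh    -> TA
    "\u0631": "\u0A30",  # reh    -> RA
    "\u0633": "\u0A38",  # seen   -> SA
    "\u06A9": "\u0A15",  # keheh  -> KA
    "\u0644": "\u0A32",  # lam    -> LA
    "\u0645": "\u0A2E",  # meem   -> MA
    "\u0646": "\u0A28",  # noon   -> NA
}

ALEF = "\u0627"
GURMUKHI_A = "\u0A05"        # independent vowel letter A
GURMUKHI_AA_SIGN = "\u0A3E"  # dependent vowel sign AA


def transliterate(word: str) -> str:
    """Transliterate one Shahmukhi word using the toy rules above."""
    out = []
    for i, ch in enumerate(word):
        if ch == ALEF:
            # Assumed dependency rule: the output for alef depends on
            # its position in the word.
            out.append(GURMUKHI_A if i == 0 else GURMUKHI_AA_SIGN)
        elif ch in CHAR_MAP:
            # Plain one-to-one character mapping.
            out.append(CHAR_MAP[ch])
        else:
            # Characters outside the toy table pass through unchanged.
            out.append(ch)
    return "".join(out)


if __name__ == "__main__":
    # keheh + alef + reh -> KA + AA sign + RA
    print(transliterate("\u06A9\u0627\u0631"))
```

A real system would need many more rules than this sketch covers, for example for diacritics, aspirated consonants, and letters whose output depends on neighbouring characters.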
The Punjabi Machine Transliteration (PMT) system is an effort to bridge the written communication gap between the two scripts for the benefit of the millions of Punjabis around the globe. Transliteration refers to phonetic translation across two languages with different writing systems (Knight and Graehl, 1998), such as Arabic to English (Nasreen and Leah, 2003). Most prior work has been done for Machine Translation (MT) (Knight and Leah, 97; Paola and Sanjeev, 2003; Knight and Stall, 1998) from English to other major languages of the world, like Arabic, Chinese, etc., for cross-lingual information