Scientific report: "How do you pronounce your name? Improving G2P with transliterations"
Aditya Bhargava and Grzegorz Kondrak
Department of Computing Science, University of Alberta
Edmonton, Alberta, Canada, T6G 2E8
{abhargava,kondrak}@

Abstract

Grapheme-to-phoneme conversion (G2P) of names is an important and challenging problem. The correct pronunciation of a name is often reflected in its transliterations, which are expressed within a different phonological inventory. We investigate the problem of using transliterations to correct errors produced by state-of-the-art G2P systems. We present a novel re-ranking approach that incorporates a variety of score and n-gram features, in order to leverage transliterations from multiple languages. Our experiments demonstrate significant accuracy improvements when re-ranking is applied to n-best lists generated by three different G2P programs.

1 Introduction

Grapheme-to-phoneme conversion (G2P), in which the aim is to convert the orthography of a word to its pronunciation (phonetic transcription), plays an important role in speech synthesis and understanding. Names, which comprise over 75% of unseen words (Black et al., 1998), present a particular challenge to G2P systems because of their high pronunciation variability.
Guessing the correct pronunciation of a name is often difficult, especially if it is of foreign origin; this is attested by the ad hoc transcriptions which sometimes accompany new names introduced in news articles, especially for international stories with many foreign names. Transliterations provide a way of disambiguating the pronunciation of names. They are more abundant than phonetic transcriptions, for example when news items of international or global significance are reported in multiple languages. In addition, writing scripts such as Arabic, Korean, or Hindi are more consistent and easier to identify than various phonetic transcription schemes. The process of transliteration, also called phonetic translation (Li et al., 2009b), ...
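
The re-ranking idea described in the abstract can be illustrated with a minimal sketch. It is not the method of the paper: the function names `similarity` and `rerank`, the hand-set `weight`, and the use of `difflib.SequenceMatcher` as a stand-in for score and n-gram features are all assumptions made for illustration, and the transliterations are assumed to be already romanized so that they can be compared with ASCII pronunciation strings.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Crude string similarity in [0, 1]; a stand-in for real features."""
    return SequenceMatcher(None, a, b).ratio()


def rerank(nbest, transliterations, weight=1.0):
    """Re-rank an n-best list of G2P outputs using transliterations.

    nbest: list of (pronunciation, g2p_score) pairs; higher score is better.
    transliterations: romanized transliteration strings (hypothetical input).
    weight: hand-set trade-off between the G2P score and the similarity
        term (an assumption; a trained model would normally set this).
    """
    def combined(item):
        pron, score = item
        # Average similarity of this candidate to every transliteration;
        # max(..., 1) avoids division by zero when none are available.
        sim = sum(similarity(pron, t) for t in transliterations)
        sim /= max(len(transliterations), 1)
        return score + weight * sim

    # Highest combined score first.
    return sorted(nbest, key=combined, reverse=True)
```

Calling `rerank(nbest, transliterations)` returns the same candidates in a new order; when no transliterations are supplied, the similarity term is zero and the original G2P ranking is preserved.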