NICT-ATR Speech-to-Speech Translation System

Eiichiro Sumita, Tohru Shimizu, Satoshi Nakamura
National Institute of Information and Communications Technology
ATR Spoken Language Communication Research Laboratories
2-2-2 Hikaridai, Keihanna Science City, Kyoto 619-0288, Japan

Abstract
This paper describes the latest version of the speech-to-speech translation systems that the NICT-ATR team has developed over more than twenty years. The system is now ready to be deployed for the travel domain. A new noise-suppression technique notably improves speech recognition performance. Corpus-based approaches to recognition, translation, and synthesis enable coverage of a wide variety of topics and portability to other languages.

1 Introduction
Research on speech recognition, speech synthesis, and machine translation started about half a century ago. The three fields developed independently for a long time, until speech-to-speech translation research was proposed in the 1980s. At the beginning, research focused on the feasibility of speech-to-speech translation, because each component was difficult to build and their integration seemed even more difficult. After two decades of groundbreaking work, corpus-based speech and language processing technology has recently made it possible to achieve speech-to-speech translation that is usable in the real world.

This paper introduces, at ACL 2007, the state-of-the-art speech-to-speech translation system developed by NICT-ATR, Japan.
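The integration of the three components named above (recognition, translation, synthesis) can be pictured as a simple cascade. The following is only a minimal sketch under stated assumptions: the class `SpeechTranslator`, the toy stage functions, and the one-entry phrasebook are all hypothetical illustrations, not the NICT-ATR implementation, whose stages are corpus-trained models.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class SpeechTranslator:
    """Hypothetical cascade: each stage is an interchangeable callable."""
    recognize: Callable[[bytes], str]    # source audio -> source text
    translate: Callable[[str], str]      # source text -> target text
    synthesize: Callable[[str], bytes]   # target text -> target audio

    def run(self, audio: bytes) -> bytes:
        # Feed the output of each stage into the next one.
        source_text = self.recognize(audio)
        target_text = self.translate(source_text)
        return self.synthesize(target_text)


# Toy stand-ins (assumptions) so the sketch runs end to end.
TOY_PHRASEBOOK = {"where is the station": "eki wa doko desu ka"}


def toy_recognize(audio: bytes) -> str:
    # Pretend the audio bytes already spell out the utterance.
    return audio.decode("utf-8")


def toy_translate(text: str) -> str:
    # Look the utterance up; fall back to the input if it is unknown.
    return TOY_PHRASEBOOK.get(text, text)


def toy_synthesize(text: str) -> bytes:
    # Pretend that encoding the text produces a waveform.
    return text.encode("utf-8")


pipeline = SpeechTranslator(toy_recognize, toy_translate, toy_synthesize)
result = pipeline.run(b"where is the station")
```

Keeping each stage behind a plain callable reflects the point that the components were developed independently: any one of them can be replaced without touching the other two.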
2 SPEECH-TO-SPEECH TRANSLATION SYSTEM
A speech-to-speech translation system is very large and complex. In this paper, we focus on recent progress. Detailed information can be found in [1, 2, 3] and their references.

Speech recognition
To obtain a compact, accurate model from corpora of limited size, we use MDL-SSS [4] and composite multi-class N-gram models [5] for acoustic and language modeling, respectively. MDL-SSS is an algorithm that automatically determines the appropriate ...