tailieunhanh - Scientific report: "Processing Broadcast Audio for Information Access"
Processing Broadcast Audio for Information Access

Jean-Luc Gauvain, Lori Lamel, Gilles Adda, Martine Adda-Decker, Claude Barras, Langzhou Chen, and Yannick de Kercadio
Spoken Language Processing Group, LIMSI-CNRS, 133 91403 Orsay cedex, France
gauvain@ http tlp

Abstract

This paper addresses recent progress in speaker-independent, large vocabulary, continuous speech recognition, which has opened up a wide range of near and mid-term applications. One rapidly expanding application area is the processing of broadcast audio for information access. At LIMSI, broadcast news transcription systems have been developed for English, French, German, Mandarin and Portuguese, and systems for other languages are under development. Audio indexation must take into account the specificities of audio data, such as the need to deal with a continuous data stream and an imperfect word transcription. Some near-term application areas are audio data mining, selective dissemination of information, and media monitoring.

1 Introduction

A major advance in speech processing technology is the ability of today's systems to deal with non-homogeneous data, as exemplified by broadcast data. With the rapid expansion of different media sources, there is a pressing need for automatic processing of such audio streams. Broadcast audio is challenging because it contains segments of various acoustic and linguistic natures, each of which requires appropriate modeling.
A special section in the Communications of the ACM devoted to "News on Demand" (Maybury, 2000) includes contributions from many of the sites carrying out active research in this area. Via speech recognition, spoken document retrieval (SDR) can support random access to relevant portions of audio documents, reducing the time needed to identify recordings in large multimedia databases. The TREC (Text REtrieval Conference) SDR evaluation showed that only small differences in information retrieval performance are observed between automatic and manual transcriptions.