Combining Speech Retrieval Results with Generalized Additive Models

J. Scott Olsson* and Douglas W. Oard†
UMIACS Laboratory for Computational Linguistics and Information Processing, University of Maryland, College Park, MD 20742
Human Language Technology Center of Excellence, Johns Hopkins University, Baltimore, MD 21211
olsson@math.umd.edu, oard@umd.edu

* Dept. of Mathematics, AMSC, UMD
† College of Information Studies, UMD

Abstract

Rapid and inexpensive techniques for automatic transcription of speech have the potential to dramatically expand the types of content to which information retrieval techniques can be productively applied, but limitations in accuracy and robustness must be overcome before that promise can be fully realized. Combining retrieval results from systems built on various errorful representations of the same collection offers some potential to address these challenges. This paper explores that potential by applying Generalized Additive Models to optimize the combination of ranked retrieval results obtained using transcripts produced automatically for the same spoken content by substantially different recognition systems. Topic-averaged retrieval effectiveness better than any previously reported for the same collection was obtained, and even larger gains are apparent when using an alternative measure emphasizing results on the most difficult topics.

1 Introduction

Speech retrieval, like other tasks that require transforming the representation of language, suffers from both random and systematic errors that are introduced by the speech-to-text transducer. Limitations in signal processing, acoustic modeling, pronunciation, vocabulary, and language modeling can be accommodated in several ways, each of which makes different trade-offs and thus induces different error characteristics. Moreover, different applications produce different types of challenges and different opportunities. As a result, optimizing a single recognition system for all transcription tasks is well beyond .