Scientific report: "Position Specific Posterior Lattices for Indexing Speech"
Ciprian Chelba and Alex Acero
Microsoft Research, Microsoft Corporation, One Microsoft Way, Redmond, WA 98052
chelba alexac @

Abstract
The paper presents the Position Specific Posterior Lattice (PSPL), a novel representation of automatic speech recognition (ASR) lattices that naturally lends itself to efficient indexing of position information and subsequent relevance ranking of spoken documents using proximity. In experiments performed on a collection of lecture recordings (MIT iCampus data), the spoken document ranking accuracy was improved by 20% relative over the commonly used baseline of indexing the 1-best output from an automatic speech recognizer. The Mean Average Precision (MAP) increased when using the new lattice representation instead of the 1-best output. The reference used for evaluation is the output of a standard retrieval engine working on the manual transcription of the speech collection. Albeit lossy, the PSPL lattice is also much more compact than the ASR 3-gram lattice from which it is computed, which translates into a reduced inverted index size as well, at virtually no degradation in word-error-rate performance. Since new paths are introduced in the lattice, the ORACLE accuracy increases over the original ASR lattice.
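To make the idea of indexing position information with posteriors concrete, here is a minimal sketch in Python. It rests on assumptions not stated in the abstract: the lattice is approximated by a short list of weighted word sequences, and each word's posterior is accumulated in a bin keyed by its position in the path. The function names `position_posteriors` and `build_inverted_index`, the document id, and the toy weights are all hypothetical; this is an illustration, not the authors' algorithm, which operates on the full ASR lattice.

```python
from collections import defaultdict


def position_posteriors(paths):
    """Bin word posteriors by position.

    paths: list of (word_list, weight) pairs with non-negative weights
    (an assumed stand-in for the paths of an ASR lattice).
    Returns {position: {word: posterior}}.
    """
    total = sum(weight for _, weight in paths)
    if total <= 0:
        raise ValueError("path weights must sum to a positive value")
    bins = defaultdict(lambda: defaultdict(float))
    for words, weight in paths:
        for position, word in enumerate(words):
            # Each path contributes its normalized weight to the
            # (position, word) bin it passes through.
            bins[position][word] += weight / total
    return {pos: dict(words) for pos, words in bins.items()}


def build_inverted_index(doc_paths):
    """Map each word to a list of (doc_id, position, posterior) hits."""
    index = defaultdict(list)
    for doc_id, paths in doc_paths.items():
        for position, words in position_posteriors(paths).items():
            for word, posterior in words.items():
                index[word].append((doc_id, position, posterior))
    return dict(index)


# Hypothetical toy data: three competing hypotheses for one document.
toy_paths = [
    (["speech", "search", "works"], 0.5),
    (["speech", "church", "works"], 0.3),
    (["beach", "search"], 0.2),
]
pspl = position_posteriors(toy_paths)
index = build_inverted_index({"lecture1": toy_paths})
```

In this sketch a word that appears at the same position in several hypotheses accumulates their combined weight, so a query can match words that the 1-best output missed while still using positions for proximity scoring. Bins at later positions may sum to less than one when some hypotheses are shorter than others.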
1 Introduction
Ever-increasing computing power and connectivity bandwidth, together with falling storage costs, result in an overwhelming amount of data of various types being produced, exchanged, and stored. Consequently, search has emerged as a key application as more and more data is being saved (Church, 2003). Text search in particular is the most active area, with applications that range from web and intranet search to searching for private information residing on one's hard drive. Speech search has not received much attention because large collections of untranscribed spoken material have not been available, mostly due to storage constraints. As storage is becoming ...