Scientific report: "Real-Time Correction of Closed-Captions"
P. Cardinal, G. Boulianne, M. Comeau, M. Boisvert
Centre de recherche Informatique de Montreal (CRIM), Montreal, Canada

Abstract

Live closed-captions for deaf and hard of hearing audiences are currently produced by stenographers, or by voice writers using speech recognition. Both techniques can produce captions with errors. We are currently developing a correction module that allows a user to intercept the real-time caption stream and correct it before it is broadcast. We report results of preliminary experiments on correction rate and actual user performance using a prototype correction module connected to the output of a speech recognition captioning system.

1 Introduction

CRIM's automatic speech recognition system has been applied to live closed-captioning of French-Canadian television programs (Boulianne et al., 2006). The low error rate of our approach depends notably on the integration of the re-speak method (Imai et al., 2002) for a controlled acoustic environment, automatic speaker adaptation, and dynamic updates of language models and vocabularies. The approach was deemed acceptable by several Canadian broadcasters (RDS, CPAC, GTVA and TQS), who have adopted it over the past few years for captioning sports, public affairs and newscasts.
However, for sensitive applications where error rates must practically be zero, or in other situations where speech recognition error rates are too high, we are currently developing a real-time correction interface. In essence, this interface allows a user to correct the word stream from speech recognition before it arrives at the closed-caption encoder.

2 Background

Real-time correction must be done within difficult constraints: with typical captioning rates of 130 words per minute and a 5 to 10% word error rate, the user must correct between 6 and 13 errors per minute. In addition, the process should not introduce more than a few seconds of additional delay over the 3 seconds already needed by speech
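
The correction load quoted in the Background section can be checked with a few lines of arithmetic. The sketch below is only an illustration, not part of the authors' system: it assumes the 5 to 10 word error rate is a percentage and that every misrecognized word counts as one error to correct. The function and constant names are hypothetical.

```python
def errors_per_minute(words_per_minute: float, word_error_rate: float) -> float:
    """Expected number of erroneous words per minute of captions.

    Assumption: each misrecognized word is one error the user must fix.
    """
    return words_per_minute * word_error_rate


# Typical captioning rate quoted in the text (words per minute).
CAPTION_RATE_WPM = 130

# Word error rates of 5% and 10%, read as fractions (assumption).
low = errors_per_minute(CAPTION_RATE_WPM, 0.05)
high = errors_per_minute(CAPTION_RATE_WPM, 0.10)

# Average time available per correction at the higher error rate.
seconds_per_error_at_high = 60 / high

print(f"errors per minute: {low:.1f} to {high:.1f}")
print(f"seconds per correction at 10% WER: {seconds_per_error_at_high:.1f}")
```

Under these assumptions the range is 6.5 to 13 errors per minute, consistent with the "between 6 and 13" figure in the text, and at the higher rate the user has a little under 5 seconds on average for each correction.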