Scientific report: "Domain Adaptation of Maximum Entropy Language Models"

Tanel Alumäe
Adaptive Informatics Research Centre, School of Science and Technology, Aalto University, Helsinki, Finland
tanel@

Abstract

We investigate a recently proposed Bayesian adaptation method for building style-adapted maximum entropy language models for speech recognition, given a large corpus of written language data and a small corpus of speech transcripts. Experiments show that the method consistently outperforms linear interpolation, which is typically used in such cases.

1 Introduction

In large-vocabulary speech recognition, a language model (LM) is typically estimated from large amounts of written text data. However, recognition is typically applied to speech that is stylistically different from written language. For example, in an often-tried setting, speech recognition is applied to broadcast news that includes introductory segments, conversations, and spontaneous interviews. To decrease the mismatch between training and test data, a small amount of speech data is often human-transcribed. An LM is then built by interpolating the models estimated from the large corpus of written language and the small corpus of transcribed data. In practice, however, the models may differ in importance depending on the word context. Global interpolation does not take such variability into account: all predictions are weighted across models identically, regardless of the context.
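To make the baseline concrete, the following is a minimal sketch of global linear interpolation between a written-text LM and a speech-transcript LM. It is an illustration only, not the authors' implementation: the function names, the grid search over the weight, and the toy per-word probabilities are all assumptions introduced here. Note that a single weight `lam` is applied to every prediction, which is exactly the context-independence the introduction points out.

```python
import math


def interpolate(p_written, p_speech, lam):
    """Mix two model probabilities for the same word with one global weight."""
    return lam * p_speech + (1.0 - lam) * p_written


def perplexity(probs):
    """Perplexity of a word sequence given its per-word probabilities."""
    log_sum = sum(math.log(p) for p in probs)
    return math.exp(-log_sum / len(probs))


def tune_global_weight(written_probs, speech_probs, steps=100):
    """Grid-search the single weight that minimises held-out perplexity.

    The grid search is a stand-in chosen for simplicity; it is an
    assumption of this sketch, not a procedure described in the paper.
    """
    best_lam, best_ppl = 0.0, float("inf")
    for i in range(steps + 1):
        lam = i / steps
        mixed = [
            interpolate(pw, ps, lam)
            for pw, ps in zip(written_probs, speech_probs)
        ]
        ppl = perplexity(mixed)
        if ppl < best_ppl:
            best_lam, best_ppl = lam, ppl
    return best_lam, best_ppl


# Hypothetical probabilities that each model assigns to four held-out words.
written_probs = [0.20, 0.01, 0.05, 0.30]
speech_probs = [0.05, 0.10, 0.08, 0.02]

lam, ppl = tune_global_weight(written_probs, speech_probs)
print(f"best global weight: {lam:.2f}, perplexity: {ppl:.2f}")
```

Whatever value of `lam` the search selects, it is shared by all four words, even though the written-text model is clearly better for some of them and the speech model for others.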
In this paper, we investigate a recently proposed Bayesian adaptation approach (Daumé III, 2007; Finkel and Manning, 2009) for adapting a conditional maximum entropy (ME) LM (Rosenfeld, 1996) to a new domain, given a large corpus of out-of-domain training data and a small corpus of in-domain data. The main contribution of this paper is that we show how the .

Co-author: Mikko Kurimo, Adaptive Informatics Research Centre, School of Science and Technology, Aalto University, Helsinki, Finland

Author footnote: Currently with Tallinn University of Technology, Estonia.