Scientific report: "A Dynamic Bayesian Framework to Model Context and Memory in Edit Distance Learning: An Application to Pronunciation Classification"
Karim Filali and Jeff Bilmes
Departments of Computer Science & Engineering and Electrical Engineering
University of Washington, Seattle, WA 98195, USA
karim@cs, bilmes@ee

This material was supported by NSF under Grant No. ISS-0326276.

Abstract

Sitting at the intersection between statistics and machine learning, Dynamic Bayesian Networks (DBNs) have been applied with much success in many domains, such as speech recognition, vision, and computational biology. While Natural Language Processing increasingly relies on statistical methods, we think it has yet to use Graphical Models to their full potential. In this paper, we report on experiments in learning edit distance costs using Dynamic Bayesian Networks and present results on a pronunciation classification task. By exploiting the ability within the DBN framework to rapidly explore a large model space, we obtain a 40% reduction in error rate compared to a previous transducer-based method of learning edit distance.

1 Introduction

Edit distance (ED) is a common measure of the similarity between two strings. It has a wide range of applications in classification, natural language processing, computational biology, and many other fields. It has been extended in various ways, for example to handle simple (Lowrance and Wagner, 1975) or constrained block transpositions (Leusch et al., 2003) and other types of block operations (Shapira and Storer, 2003), and to measure similarity between graphs (Myers et al., 2000; Klein, 1998) or automata (Mohri, 2002).

Another important development has been the use of data-driven methods for the automatic learning of edit costs, such as in (Ristad and Yianilos, 1998) in the case of string edit distance and in (Neuhaus and Bunke, 2004) for graph edit distance.

In this paper, we revisit the problem of learning string edit distance costs within the Graphical Models framework. We apply our method ...
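As a point of reference for the string edit distance discussed in the introduction, the following is a minimal sketch of the classic dynamic-programming computation with configurable costs. It is not the DBN-based method of the paper: the costs here are fixed by hand rather than learned from data, and the function name `edit_distance` and its parameters `sub_cost`, `ins_cost`, and `del_cost` are hypothetical names chosen for illustration.

```python
def edit_distance(source, target, sub_cost=None, ins_cost=1.0, del_cost=1.0):
    """Weighted edit distance between two sequences (strings or lists of symbols).

    Assumption: costs are supplied by hand. A learned model would instead
    estimate them from training pairs.
    """
    if sub_cost is None:
        # Default: unit cost for a mismatch, zero cost for a match.
        sub_cost = lambda a, b: 0.0 if a == b else 1.0

    m, n = len(source), len(target)
    # dist[i][j] = cheapest way to turn source[:i] into target[:j].
    dist = [[0.0] * (n + 1) for _ in range(m + 1)]

    # First column: delete every symbol of the source prefix.
    for i in range(1, m + 1):
        dist[i][0] = dist[i - 1][0] + del_cost
    # First row: insert every symbol of the target prefix.
    for j in range(1, n + 1):
        dist[0][j] = dist[0][j - 1] + ins_cost

    for i in range(1, m + 1):
        for j in range(1, n + 1):
            dist[i][j] = min(
                dist[i - 1][j] + del_cost,                                   # deletion
                dist[i][j - 1] + ins_cost,                                   # insertion
                dist[i - 1][j - 1] + sub_cost(source[i - 1], target[j - 1]),  # substitution or match
            )
    return dist[m][n]
```

Because the function only indexes and compares symbols, it accepts lists of phone labels as well as plain strings, which is the shape of input a pronunciation task would use. Passing a different `sub_cost` function is the simplest way to see how changing edit costs changes which strings count as similar.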