Scientific report: "Limitations of Current Grammar Induction Algorithms"

Bart Cramer
School of Behavioral and Cognitive Neurosciences, University of Groningen, Groningen, the Netherlands

Abstract

I review a number of grammar induction algorithms (ABL, Emile, Adios) and test them on the Eindhoven corpus, with disappointing results compared to those on the usually tested corpora (ATIS, OVIS). I also show that using neither POS-tags induced by Biemann's unsupervised POS-tagging algorithm nor hand-corrected POS-tags as input improves this situation. Last, I argue for the development of entirely incremental grammar induction algorithms instead of the approaches taken by the systems discussed before.

1 Introduction

Grammar induction is a task within the field of natural language processing that attempts to construct a grammar of a given language solely on the basis of positive examples of this language. If a successful method is found, this will have both practical applications and considerable theoretical implications. On the practical side, it will make the engineering of NLP systems easier, especially for less widely studied languages. One can also conceive of successful grammar induction (GI) algorithms as an inspiration for statistical machine translation systems. Theoretically, grammar induction is important as well.
One of the main assertions in the nativist's position is the Poverty of the Stimulus argument, which holds that the child does not perceive enough positive examples of language throughout his early youth to have learned the grammar from his parents without the help of innate knowledge (or Universal Grammar) that severely constrains the number of hypotheses (i.e. grammars) that he can learn. Proved more strictly for formal grammars, Gold's (1967) work showed that one cannot learn any superfinite class of languages (e.g. regular languages, context-free languages) if one only perceives an unlimited number of positive examples. After, say, n examples, there is always more than one grammar that would be able to