A Generative Constituent-Context Model for Improved Grammar Induction

Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics (ACL), Philadelphia, July 2002, pp. 128-135.

A Generative Constituent-Context Model for Improved Grammar Induction
Dan Klein and Christopher D. Manning
Computer Science Department, Stanford University, Stanford, CA 94305-9040
{klein, manning}@

Abstract

We present a generative distributional model for the unsupervised induction of natural language syntax which explicitly models constituent yields and contexts. Parameter search with EM produces higher-quality analyses than previously exhibited by unsupervised systems, giving the best published unsupervised parsing results on the ATIS corpus. Experiments on Penn treebank sentences of comparable length show an even higher F1 of 71% on non-trivial brackets. We compare distributionally induced and actual part-of-speech tags as input data, and examine extensions to the basic model. We discuss errors made by the system, compare the system to previous models, and discuss upper bounds, lower bounds, and stability for this task.

1 Introduction

The task of inducing hierarchical syntactic structure from observed yields alone has received a great deal of attention (Carroll and Charniak, 1992; Pereira and Schabes, 1992; Brill, 1993; Stolcke and Omohundro, 1994). Researchers have explored this problem for a variety of reasons: to argue empirically against the poverty of the stimulus (Clark, 2001), to use induction systems as a first stage in constructing large treebanks (van Zaanen, 2000), or to build better language models (Baker, 1979; Chen, 1995). In previous work, we presented a conditional model over trees which gave the best published results for unsupervised parsing of the ATIS corpus (Klein and Manning, 2001b). However, it suffered from several drawbacks, primarily stemming from the conditional model used for induction. Here we improve on that model in several ways. First, we construct a generative model which utilizes the same features. Then we extend the model to
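The abstract describes the model only at a high level: each span of a sentence contributes a yield term and a context term, and EM searches over the parameters. The following is a minimal, hypothetical Python sketch of that yield/context idea, not the authors' implementation: it log-scores one candidate bracketing of a POS-tagged sentence using two assumed probability tables, p_yield and p_context, which are the quantities an EM parameter search would re-estimate.

```python
from math import log

def span_yield(tags, i, j):
    """The yield of span (i, j): the tag sequence it covers."""
    return tuple(tags[i:j])

def span_context(tags, i, j):
    """The context of span (i, j): the tags immediately to its left and right
    (None marks a sentence boundary)."""
    left = tags[i - 1] if i > 0 else None
    right = tags[j] if j < len(tags) else None
    return (left, right)

def score_bracketing(tags, brackets, p_yield, p_context, smooth=1e-6):
    """Log-score a bracketing under a simplified constituent-context model:
    every span marked as a constituent contributes P(yield | constituent)
    and P(context | constituent). Distituent spans are ignored here for
    brevity; unseen yields/contexts fall back to a small smoothing value."""
    logp = 0.0
    for (i, j) in brackets:
        logp += log(p_yield.get(span_yield(tags, i, j), smooth))
        logp += log(p_context.get(span_context(tags, i, j), smooth))
    return logp

# Toy usage: a POS-tagged sentence and one candidate bracketing.
tags = ["DT", "NN", "VBD", "DT", "NN"]
brackets = [(0, 2), (3, 5), (2, 5), (0, 5)]   # (i, j) spans, end-exclusive
p_yield = {("DT", "NN"): 0.2}                 # illustrative numbers only
p_context = {(None, "VBD"): 0.1}
print(score_bracketing(tags, brackets, p_yield, p_context))
```

In an EM setting, p_yield and p_context would be re-estimated from the expected counts of spans being constituents across all bracketings of the corpus; that machinery is omitted from this sketch.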
