Scientific paper: "Annealing Structural Bias in Multilingual Weighted Grammar Induction∗"

Annealing Structural Bias in Multilingual Weighted Grammar Induction
Noah A. Smith and Jason Eisner
Department of Computer Science / Center for Language and Speech Processing
Johns Hopkins University, Baltimore, MD 21218 USA
nasmith jason @

Abstract

We first show how a structural locality bias can improve the accuracy of state-of-the-art dependency grammar induction models trained by EM from unannotated examples (Klein and Manning, 2004). Next, by annealing the free parameter that controls this bias, we achieve further improvements. We then describe an alternative kind of structural bias, toward "broken" hypotheses consisting of partial structures over segmented sentences, and show a similar pattern of improvement. We relate this approach to contrastive estimation (Smith and Eisner, 2005a), apply the latter to grammar induction in six languages, and show that our new approach improves accuracy by 1–17% (absolute) over CE, and 8–30% over EM, achieving (to our knowledge) the best results on this task to date. Our method, structural annealing, is a general technique with broad applicability to hidden-structure discovery problems.

1 Introduction

Inducing a weighted context-free grammar from flat text is a hard problem.
A common starting point for weighted grammar induction is the Expectation-Maximization (EM) algorithm (Dempster et al., 1977; Baker, 1979). EM's mediocre performance (Table 1) reflects two problems. First, it seeks to maximize likelihood, but a grammar that makes the training data likely does not necessarily assign a linguistically defensible syntactic structure. Second, the likelihood surface is not globally concave, and learners such as the EM algorithm can get trapped on local maxima (Charniak, 1993). We seek here to capitalize on the intuition that, at least early in learning, the learner should search primarily for string-local structure, because most structure is local.¹ By penalizing dependencies between two words that are farther apart in the string, we obtain consistent …
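The locality bias described above can be illustrated with a short sketch. This is a hypothetical rendering, not the authors' implementation: the function names, the additive log-space form of the penalty (a parameter delta multiplied by string distance between head and child), and the particular annealing schedule values are all assumptions introduced here for illustration.

```python
def biased_score(base_logprob, head_idx, child_idx, delta):
    """Locality-biased log-score for a single dependency (hypothetical form).

    Adds delta * |head_idx - child_idx| to the model's log-probability,
    so a negative delta penalizes long string-distance attachments
    and delta = 0 recovers the unbiased model.
    """
    return base_logprob + delta * abs(head_idx - child_idx)

def anneal_schedule(delta_start=-1.0, delta_end=0.0, step=0.25):
    """Yield successive values of the bias parameter delta.

    Starting from a strong locality bias (very negative delta), the bias
    is gradually relaxed toward delta_end, so structure found under the
    strong bias can guide later, less-constrained rounds of EM training.
    """
    delta = delta_start
    while delta < delta_end:
        yield delta
        delta += step
    yield delta_end
```

In an annealed training loop, one would run EM to convergence at each delta produced by the schedule, reusing the learned parameters as the starting point for the next, weaker bias setting.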
