tailieunhanh - Scientific report: "Probabilistic disambiguation models for wide-coverage HPSG parsing"

Yusuke Miyao
Department of Computer Science, University of Tokyo
Hongo 7-3-1, Bunkyo-ku, Tokyo, Japan
yusuke@

Jun'ichi Tsujii
Department of Computer Science, University of Tokyo
Hongo 7-3-1, Bunkyo-ku, Tokyo, Japan
CREST, JST
tsujii@

Abstract

This paper reports the development of log-linear models for disambiguation in wide-coverage HPSG parsing. Estimating log-linear models is computationally expensive, especially with wide-coverage grammars. Using techniques to reduce the estimation cost, we trained the models on 20 sections of the Penn Treebank. A series of experiments empirically evaluated the estimation techniques and also examined the performance of the disambiguation models on the parsing of real-world sentences.

1 Introduction

Head-Driven Phrase Structure Grammar (HPSG) (Pollard and Sag, 1994) has been studied extensively from both linguistic and computational points of view. However, despite research on HPSG processing efficiency (Oepen et al., 2002a), the application of HPSG parsing is still limited to specific domains and short sentences (Oepen et al., 2002b; Toutanova and Manning, 2002). Scaling up HPSG parsing to assess real-world texts is an emerging research field with both theoretical and practical applications.

Recently, a wide-coverage grammar and a large treebank have become available for English HPSG (Miyao et al., 2004). A large treebank can be used as training and test data for statistical models. Therefore, we now have the basis for the development and evaluation of statistical disambiguation models for wide-coverage HPSG parsing.

The aim of this paper is to report the development of log-linear models for disambiguation in wide-coverage HPSG parsing, and their empirical evaluation through the parsing of the Wall Street Journal section of Penn Treebank II (Marcus et al., 1994). This is challenging because the estimation of log-linear models is computationally
