Scientific report: "Parsing the Wall Street Journal with the Inside-Outside Algorithm"

Yves Schabes, Michal Roth, Randy Osborne
Mitsubishi Electric Research Laboratories, Cambridge MA 02139, USA
schabes, roth, osborne@

Abstract

We report grammar inference experiments on partially parsed sentences taken from the Wall Street Journal corpus using the inside-outside algorithm for stochastic context-free grammars. The initial grammar for the inference process makes no assumption about the kinds of structures and their distributions. The inferred grammar is evaluated by its predictive power and by comparing the bracketing of held-out sentences imposed by the inferred grammar with the partial bracketings of these sentences given in the corpus. Using part-of-speech tags as the only source of lexical information, high bracketing accuracy is achieved even with a small subset of the available training material (1045 sentences), both for test sentences shorter than 10 words and for sentences shorter than 15 words.

1 Introduction

Most broad coverage natural language parsers have been designed by incorporating hand-crafted rules. These rules are also very often further refined by statistical training.
Furthermore, it is widely believed that high performance can only be achieved by disambiguating lexically sensitive phenomena such as prepositional attachment ambiguity, coordination, or subcategorization. So far, grammar inference has not been shown to be effective for designing wide coverage parsers.

Baker (1979) describes a training algorithm for stochastic context-free grammars (SCFGs) which can be used for grammar reestimation (Fujisaki et al., 1989; Sharman et al., 1990; Black et al., 1992; Briscoe and Waegner, 1992) or grammar inference from scratch (Lari and Young, 1990). However, the application of SCFGs and the original inside-outside algorithm for grammar inference has been inconclusive for two reasons. First, each iteration of the algorithm on a grammar with n nonterminals requires $O(n^3 |w|^3)$ time per training sentence $w$.
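
To make the cubic cost concrete, here is a minimal sketch of the inside pass alone, assuming a grammar in Chomsky normal form whose terminals are part-of-speech tags. It is an illustration under those assumptions, not the authors' implementation: the function name, the dictionary encoding of the rules, and the toy grammar are all hypothetical. The nested loops over span length, start position, and split point account for the cubic factor in sentence length, and the scan over binary rules (at most n^3 of them for n nonterminals) accounts for the cubic factor in grammar size.

```python
from collections import defaultdict


def inside_probability(tags, binary, lexical, start="S"):
    """Return P(tags | grammar) for an SCFG in Chomsky normal form.

    binary:  dict mapping (A, B, C) -> P(A -> B C)   (hypothetical encoding)
    lexical: dict mapping (A, tag)  -> P(A -> tag)   (hypothetical encoding)
    """
    n = len(tags)
    # inside[(i, j)][A] = probability that nonterminal A derives tags[i:j]
    inside = defaultdict(lambda: defaultdict(float))

    # Spans of length 1 are covered by the lexical rules.
    for i, tag in enumerate(tags):
        for (a, terminal), p in lexical.items():
            if terminal == tag:
                inside[(i, i + 1)][a] += p

    # Longer spans: sum over every split point and every binary rule.
    for length in range(2, n + 1):
        for i in range(n - length + 1):
            j = i + length
            for k in range(i + 1, j):
                left, right = inside[(i, k)], inside[(k, j)]
                for (a, b, c), p in binary.items():
                    if b in left and c in right:
                        inside[(i, j)][a] += p * left[b] * right[c]

    return inside[(0, n)][start]


# Toy grammar over part-of-speech tags, invented for illustration only.
TOY_BINARY = {
    ("S", "NP", "VP"): 1.0,
    ("NP", "DT", "N"): 0.7,
    ("VP", "V", "NP"): 1.0,
}
TOY_LEXICAL = {
    ("DT", "dt"): 1.0,
    ("N", "nn"): 1.0,
    ("V", "vb"): 1.0,
    ("NP", "nn"): 0.3,
}

if __name__ == "__main__":
    print(inside_probability(["dt", "nn", "vb", "dt", "nn"], TOY_BINARY, TOY_LEXICAL))
```

A full reestimation step would also need the outside pass and the expected rule counts, which repeat the same loop structure. This is why the per-sentence cost quickly dominates training when the grammar starts with every possible rule.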
