Scientific report: "A Probabilistic"

Pearl: A Probabilistic Chart Parser

David M. Magerman
CS Department, Stanford University, Stanford, CA 94305
magerman@cs.stanford.edu

Abstract

This paper describes a natural language parsing algorithm for unrestricted text which uses a probability-based scoring function to select the "best" parse of a sentence. The parser, Pearl, is a time-asynchronous bottom-up chart parser with Earley-type top-down prediction which pursues the highest-scoring theory in the chart, where the score of a theory represents the extent to which the context of the sentence predicts that interpretation. This parser differs from previous attempts at stochastic parsers in that it uses a richer form of conditional probabilities based on context to predict likelihood. Pearl also provides a framework for incorporating the results of previous work in part-of-speech assignment, unknown word models, and other probabilistic models of linguistic features into one parsing tool, interleaving these techniques instead of using the traditional pipeline architecture. In preliminary tests, Pearl has been successful at resolving part-of-speech and word (in speech processing) ambiguity, determining categories for unknown words, and selecting correct parses first using a very loosely fitting covering grammar.

Introduction

All natural language grammars are ambiguous. Even tightly fitting natural language grammars are ambiguous in some ways.
Loosely fitting grammars, which are necessary for handling the variability and complexity of unrestricted text and speech, are worse. The standard technique for dealing with this ambiguity, pruning [...]

This work was partially supported by DARPA grant No. N0014-85-K0018, ONR contract No. N00014-89-C-0171, by DARPA and AFOSR jointly under grant No. AFOSR-90-0066, and by grant No. DAAL 03-89-C0031 PRI. Special thanks to Carl Weir and Lynette Hirschman at Unisys for their valued input, guidance and support.

The grammar used for our experiments is the string grammar used in [...].
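
The abstract's idea of a chart parser that always pursues the highest-scoring theory can be illustrated with a minimal best-first sketch in Python. This is not Pearl itself: a plain context-free rule probability stands in for Pearl's context-based conditional score, there is no Earley-type top-down prediction, and the toy grammar (LEXICON, RULES) and the function name best_first_parse are hypothetical, introduced only for illustration.

```python
import heapq
from collections import defaultdict

# Hypothetical toy grammar in Chomsky normal form (not the paper's grammar).
LEXICON = {("Det", "the"): 1.0, ("N", "dog"): 0.5,
           ("N", "cat"): 0.5, ("V", "saw"): 1.0}
RULES = {("S", ("NP", "VP")): 1.0,
         ("NP", ("Det", "N")): 1.0,
         ("VP", ("V", "NP")): 1.0}


def best_first_parse(words, goal="S"):
    """Return (score, tree) for the best full parse, or None if there is none."""
    n = len(words)
    agenda = []                    # heap of candidate theories, best score first
    chart = {}                     # (label, start, end) -> (score, tree)
    by_start = defaultdict(list)   # start -> [(label, end)] of finished edges
    by_end = defaultdict(list)     # end -> [(label, start)] of finished edges
    counter = 0                    # tie-breaker so trees are never compared

    def push(score, label, i, j, tree):
        nonlocal counter
        heapq.heappush(agenda, (-score, counter, label, i, j, tree))
        counter += 1

    # Bottom-up seeding: one theory per (word, part-of-speech) reading.
    for i, word in enumerate(words):
        for (tag, w), p in LEXICON.items():
            if w == word:
                push(p, tag, i, i + 1, (tag, word))

    while agenda:
        neg, _, label, i, j, tree = heapq.heappop(agenda)
        score = -neg
        if (label, i, j) in chart:
            continue               # a theory at least as good is already in the chart
        chart[(label, i, j)] = (score, tree)
        if (label, i, j) == (goal, 0, n):
            return score, tree     # first goal edge popped is the best one
        by_start[i].append((label, j))
        by_end[j].append((label, i))
        # Combine the new edge with adjacent finished edges.
        for (lhs, (left, right)), p in RULES.items():
            if left == label:
                for other, k in by_start[j]:
                    if other == right:
                        s2, t2 = chart[(right, j, k)]
                        push(p * score * s2, lhs, i, k, (lhs, tree, t2))
            if right == label:
                for other, h in by_end[i]:
                    if other == left:
                        s1, t1 = chart[(left, h, i)]
                        push(p * s1 * score, lhs, h, j, (lhs, t1, tree))
    return None


result = best_first_parse("the dog saw the cat".split())
```

Because every score here is a product of probabilities no greater than 1, a theory's score never exceeds the scores of its parts, so the first time an edge is popped from the agenda it already carries its best score. A context-sensitive score of the kind the abstract describes would need its own argument for that property.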
