tailieunhanh - Scientific report: "Parsing in the Absence of a Complete Lexicon"
Parsing in the Absence of a Complete Lexicon

Jim Davidson and S. Jerrold Kaplan
Computer Science Department, Stanford University
Stanford, CA 94305

I. Introduction

It is impractical for natural language parsers which serve as front ends to large or changing databases to maintain a complete in-core lexicon of words and meanings. This note discusses a practical approach to using alternative sources of lexical knowledge by postponing word categorization decisions until the parse is complete, and resolving remaining lexical ambiguities using a variety of information available at that time.

II. The Problem

A natural language parser working with a database query system (e.g., PLANES [Waltz et al., 1976], LADDER [Hendrix, 1977], ROBOT [Harris, 1977], CO-OP [Kaplan, 1979]) encounters lexical difficulties not present in simpler applications. In particular, the description of the domain of discourse may be quite large (millions of words) and varies as the underlying database changes. This precludes reliance upon an explicit, fixed lexicon (a dictionary which records all the terms known to the system) because of:

a) redundancy. Keeping the same information in two places (the lexicon and the database) leads to problems of integrity. Updating is more difficult if it must occur simultaneously in two places.

b) size. A database of, say, 30,000 entries cannot be duplicated in primary memory.

For example, it may be impractical for a system dealing with a database of ships to store the names of all the ships in a separate in-core lexicon. If not all allowable lexical entries are explicitly encoded, there will be terms encountered by the parser about which nothing is known. The problem is to assign these terms to a particular class in the absence of a specific lexical entry. Thus, given the sentence "Where is the Fox docked?", the parser would have to decide, in the absence of any prior information about "Fox", that it was the name of a ship and not, say, a port.

III. Previous approaches

There are several methods by which ...