Scientific report: "Long Sentence Analysis by Domain-Specific Pattern Grammar"

Shinichi Doi, Kazunori Muraki, Shinichiro Kamei, Kiyoshi Yamabana
NEC Corp., C&C Information Technology Research Laboratories
4-1-1 Miyazaki, Miyamae-ku, Kawasaki 216, JAPAN

1 Long Sentence Analysis

We propose a method for analyzing long complex and compound sentences that utilizes global structure analysis with a domain-specific pattern grammar. Previously, long sentence analysis with global information used one of two methods: two-level analysis, in which the global structure of a long sentence is analyzed with domain-independent function words and its constituents are then parsed [Doi et al., 1991]; and pattern matching, in which domain-specific fixed patterns are fitted to input sentences. By utilizing domain-dependent information, the latter method could analyze long sentences of that domain. But since the matching is made only on the surface, the sentence is not analyzed well when patterns appear recursively.

2 Domain-Specific Pattern Grammar

Our method analyzes the global structure of long sentences by using three knowledge bases: domain-specific patterns that can be described as a phrase structure grammar, a list of keywords that denote constituents of the patterns, and a pure basic grammar.
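
As a rough illustration of the first two knowledge bases, the sketch below writes a pattern grammar as phrase structure rules over keywords and matches it recursively against a sentence. The English rules, the keyword list, and the names `PATTERNS`, `KEYWORDS`, `chunk`, `parse`, `analyze`, and `constituents` are all hypothetical; they are not the grammar of Fig. 1, and the pure basic grammar is left out (each constituent is returned as unparsed text).

```python
# Hypothetical keyword list: surface words that mark pattern boundaries.
KEYWORDS = {"said", "that", "because", "and"}

# Hypothetical pattern grammar as phrase structure rules. Upper-case names
# are nonterminals, lower-case strings are keywords, and "CONST" stands for
# any keyword-free constituent.
PATTERNS = {
    "SENT": [["REPORT"], ["CLAUSE", "and", "SENT"], ["CLAUSE"]],
    "REPORT": [["CONST", "said", "that", "SENT"]],
    "CLAUSE": [["CONST", "because", "CONST"], ["CONST"]],
}


def chunk(tokens):
    """Group tokens into keywords (str) and keyword-free spans (list)."""
    units = []
    for tok in tokens:
        if tok in KEYWORDS:
            units.append(tok)
        elif units and isinstance(units[-1], list):
            units[-1].append(tok)
        else:
            units.append([tok])
    return units


def parse(symbol, units, pos):
    """Yield (tree, next_pos) for every way symbol matches units[pos:]."""
    if symbol == "CONST":
        if pos < len(units) and isinstance(units[pos], list):
            yield ("CONST", " ".join(units[pos])), pos + 1
    elif symbol in PATTERNS:
        for rule in PATTERNS[symbol]:
            yield from expand(symbol, rule, 0, units, pos, [])
    elif pos < len(units) and units[pos] == symbol:
        yield symbol, pos + 1  # keyword terminal


def expand(symbol, rule, i, units, pos, children):
    """Match rule[i:] from pos, backtracking over alternatives."""
    if i == len(rule):
        yield (symbol, children), pos
        return
    for child, nxt in parse(rule[i], units, pos):
        yield from expand(symbol, rule, i + 1, units, nxt, children + [child])


def analyze(sentence):
    """Return the first global structure covering the whole sentence."""
    units = chunk(sentence.split())
    for tree, end in parse("SENT", units, 0):
        if end == len(units):
            return tree
    return None


def constituents(tree):
    """List the unparsed constituents, left to right."""
    if isinstance(tree, str):
        return []
    label, body = tree
    if label == "CONST":
        return [body]
    return [c for child in body for c in constituents(child)]
```

Because `SENT` appears on the right-hand side of its own rules, a pattern embedded inside another pattern is still matched, which is the case that surface-only pattern matching is said to handle poorly. Calling `analyze("the firm said that sales fell because demand dropped and profits shrank")` yields a tree whose constituents could then be handed to a separate constituent parser.
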
An input sentence is initially parsed and divided into its constituents with these knowledge bases, and then each constituent is parsed with a general grammar. The uniformity of each constituent must be guaranteed by parsing it with the pure basic grammar. To obtain a pattern grammar of Japanese long sentences, we analyzed the structures of about 750 long sentences from the leads of news articles in a Japanese newspaper (Asahi Shinbun) and identified several fixed global patterns. An example of the pattern grammar is shown in Fig. 1. Using the pattern grammar and keyword list (a-c), the global structure of the sentence (d) was analyzed as (f).

3 Conclusion

Our method takes advantage of both two-level analysis and pattern matching, and can deal with the irregular appearance of patterns.
