Scientific report: "A Unified Statistical Model for the Identification of English BaseNP"
Endong Xun
Microsoft Research China, No. 49 Zhichun Road, Haidian District, 100080, China
i-edxun@microsoft.com

Ming Zhou
Microsoft Research China, No. 49 Zhichun Road, Haidian District, 100080, China
Mingzhou@microsoft.com

Changning Huang
Microsoft Research China, No. 49 Zhichun Road, Haidian District, 100080, China
cnhuang@microsoft.com

Abstract

This paper presents a novel statistical model for the automatic identification of English baseNPs. It uses two steps: N-best Part-Of-Speech (POS) tagging, and baseNP identification given the N-best POS sequences. Unlike other approaches, in which the two steps are separated, we integrate them into a unified statistical framework. Our model also integrates lexical information. Finally, the Viterbi algorithm is applied to perform a global search over the entire sentence, which gives linear complexity for the entire process. Compared with other methods using the same test set, our approach achieves 92.3% precision and 93.2% recall. The result is comparable with or better than previously reported results.

1 Introduction

Finding simple, non-recursive base noun phrases (baseNPs) is an important subtask for many natural language processing applications, such as partial parsing, information retrieval and machine translation. A baseNP is a simple noun phrase that does not recursively contain another noun phrase. For example, the elements within [...] in the following example are baseNPs, where NNS, IN, VBG, etc. are part-of-speech tags as defined in (M. Marcus 1993).

[Measures/NNS] of/IN [manufacturing/VBG activity/NN] fell/VBD more/RBR than/IN [the/DT overall/JJ measures/NNS] ./.

Figure 1: An example sentence with baseNP brackets

A number of researchers have dealt with the problem of baseNP identification (Church 1988; Bourigault 1992; Voutilainen 1993; Justeson & Katz 1995). Recently, some researchers have run experiments with the same test corpus, extracted from the 20th section of the Penn Treebank.
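
To make the annotation of Figure 1 concrete, the following is a minimal sketch of how a sentence with baseNP brackets might be read into tokens and phrase spans. It assumes a plain-text notation in which each token is written as word/TAG and each baseNP is wrapped in square brackets; the function name `read_bracketed` and this input format are illustrative assumptions, not part of the model described in the paper.

```python
import re

# Assumed notation (not specified by the paper): tokens are written as
# word/TAG and each baseNP is wrapped in square brackets.
TOKEN = re.compile(r"\[|\]|[^\s\[\]]+")


def read_bracketed(text):
    """Return (tokens, spans).

    tokens: list of (word, tag) pairs in sentence order.
    spans:  list of (start, end) token indices, end exclusive,
            one per bracketed baseNP.
    """
    tokens, spans, start = [], [], None
    for piece in TOKEN.findall(text):
        if piece == "[":
            # baseNPs are non-recursive, so brackets may not nest.
            if start is not None:
                raise ValueError("nested '[' found")
            start = len(tokens)
        elif piece == "]":
            if start is None:
                raise ValueError("unmatched ']'")
            spans.append((start, len(tokens)))
            start = None
        else:
            # Split on the last slash so that './.' yields ('.', '.').
            word, _, tag = piece.rpartition("/")
            tokens.append((word, tag))
    if start is not None:
        raise ValueError("unclosed '['")
    return tokens, spans


sentence = ("[Measures/NNS] of/IN [manufacturing/VBG activity/NN] fell/VBD "
            "more/RBR than/IN [the/DT overall/JJ measures/NNS] ./.")
tokens, spans = read_bracketed(sentence)
base_nps = [" ".join(word for word, _ in tokens[s:e]) for s, e in spans]
print(base_nps)
```

Representing each baseNP as a half-open token span keeps the POS sequence and the bracketing separate, which is convenient when predicted spans are later compared against a reference annotation to compute precision and recall.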