Scientific report: "A Unified Statistical Model for the Identification of English BaseNP"

Endong Xun
Microsoft Research China, No. 49 Zhichun Road, Haidian District, 100080, China
i-edxun@

Ming Zhou
Microsoft Research China, No. 49 Zhichun Road, Haidian District, 100080, China
Mingzhou@

Changning Huang
Microsoft Research China, No. 49 Zhichun Road, Haidian District, 100080, China
cnhuang@

Abstract

This paper presents a novel statistical model for the automatic identification of English baseNPs. It uses two steps: N-best Part-Of-Speech (POS) tagging, and baseNP identification given the N-best POS sequences. Unlike other approaches, where the two steps are separated, we integrate them into a unified statistical framework. Our model also integrates lexical information. Finally, the Viterbi algorithm is applied to perform a global search over the entire sentence, allowing us to obtain linear complexity for the entire process. Compared with other methods using the same testing set, our approach achieves in precision and in recall. The result is comparable with or better than the previously reported results.

1 Introduction

Finding simple, non-recursive base noun phrases (baseNPs) is an important subtask for many natural language processing applications, such as partial parsing, information retrieval, and machine translation. A baseNP is a simple noun phrase that does not recursively contain another noun phrase. For example, the elements within [...] in the following example are baseNPs, where NNS, IN, VBG, etc. are part-of-speech tags as defined in M. Marcus (1993).

[Measures/NNS] of/IN [manufacturing/VBG activity/NN] fell/VBD more/RBR than/IN [the/DT overall/JJ measures/NNS] ./.

Figure 1: An example sentence with baseNP brackets

A number of researchers have dealt with the problem of baseNP identification (Church 1988; Bourigault 1992; Voutilainen 1993; Justeson & Katz 1995). Recently, some researchers have run experiments with the same test corpus, extracted from the 20th section of the Penn Treebank.
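
To make the baseNP annotation concrete, the following is a minimal sketch of how a sentence like the Figure 1 example could be read into tokens and baseNP spans. It assumes a word/TAG token notation with square brackets around each baseNP. That notation, the function name `parse_base_nps`, and the span representation are illustrative assumptions, not part of the authors' model. The sketch only reads an annotated sentence and does not perform the statistical identification the paper describes.

```python
import re

# Assumed notation (not taken from the paper's implementation):
# word/TAG tokens, with [ ... ] enclosing each baseNP.
SENTENCE = (
    "[Measures/NNS] of/IN [manufacturing/VBG activity/NN] fell/VBD "
    "more/RBR than/IN [the/DT overall/JJ measures/NNS] ./."
)

# A piece is an opening bracket, a closing bracket, or a word/TAG token.
TOKEN_RE = re.compile(r"\[|\]|[^\s\[\]]+")


def parse_base_nps(annotated):
    """Return (tokens, spans).

    tokens is a list of (word, tag) pairs; spans is a list of
    (start, end) token indices, end exclusive, one per baseNP.
    """
    tokens, spans, start = [], [], None
    for piece in TOKEN_RE.findall(annotated):
        if piece == "[":
            if start is not None:
                # baseNPs are non-recursive, so brackets must not nest.
                raise ValueError("nested bracket in baseNP annotation")
            start = len(tokens)
        elif piece == "]":
            if start is None:
                raise ValueError("closing bracket without an opening one")
            spans.append((start, len(tokens)))
            start = None
        else:
            # Split on the last slash so that "./." yields (".", ".").
            word, _, tag = piece.rpartition("/")
            tokens.append((word, tag))
    if start is not None:
        raise ValueError("unclosed bracket in baseNP annotation")
    return tokens, spans


tokens, spans = parse_base_nps(SENTENCE)
for start, end in spans:
    print(" ".join(word for word, _ in tokens[start:end]))
```

Run on the example sentence, this prints the three baseNPs "Measures", "manufacturing activity", and "the overall measures". Rejecting nested brackets mirrors the definition above: a baseNP never contains another noun phrase.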
