Scientific report: "Representing Text Chunks"

Proceedings of EACL '99

Erik F. Tjong Kim Sang
Center for Dutch Language and Speech
University of Antwerp
Universiteitsplein 1
B-2610 Wilrijk, Belgium
erikt@

Jorn Veenstra
Computational Linguistics
Tilburg University
P.O. Box 90153
5000 LE Tilburg, The Netherlands
veenstra@

Abstract

Dividing sentences into chunks of words is a useful preprocessing step for parsing, information extraction and information retrieval. Ramshaw and Marcus (1995) introduced a "convenient" data representation for chunking by converting it to a tagging task. In this paper we examine seven different data representations for the problem of recognizing noun phrase chunks. We show that the choice of data representation has a minor influence on chunking performance. However, equipped with the most suitable data representation, our memory-based learning chunker was able to improve on the best published chunking results for a standard data set.

1 Introduction

The text corpus tasks of parsing, information extraction and information retrieval can benefit from dividing sentences into chunks of words. Ramshaw and Marcus (1995) describe an error-driven transformation-based learning (TBL) method for finding NP chunks in texts. NP chunks (or baseNPs) are non-overlapping, non-recursive noun phrases.
In their experiments they modeled chunk recognition as a tagging task: words inside a baseNP were marked I, words outside a baseNP received an O tag, and a special tag B was used for the first word inside a baseNP immediately following another baseNP. A text example:

original:
In [N early trading N] in [N Hong Kong N] [N Monday N] , [N gold N] was quoted at [N ... ... N] [N an ounce N] .

tagged:
In_O early_I trading_I in_O Hong_I Kong_I Monday_B ,_O gold_I was_O quoted_O at_O ..._I ..._I an_B ounce_I ._O

Other representations for NP chunking can be used as well. An example is the representation used in Ratnaparkhi (1998), where all the chunk-initial words receive the same start tag.
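The two tagging schemes described above can be illustrated with a minimal Python sketch. It assumes the bracket notation `[N ... N]` for baseNPs and whitespace-separated tokens. The function names `brackets_to_iob_tags` and `mark_all_chunk_starts` are hypothetical and chosen only for this illustration; they do not come from the paper or from any library. The sample sentence is a shortened form of the example above.

```python
def brackets_to_iob_tags(text):
    """Turn '[N ... N]' bracketed baseNPs into (word, tag) pairs.

    Tags follow the scheme described in the text: I inside a baseNP,
    O outside, and B only for the first word of a baseNP that
    immediately follows another baseNP.
    """
    pairs = []
    inside = False        # are we currently inside a baseNP?
    chunk_start = False   # is the next word the first word of a baseNP?
    just_closed = False   # did a baseNP end right before this point?
    for token in text.split():
        if token == "[N":
            inside = True
            chunk_start = True
        elif token == "N]":
            inside = False
            just_closed = True
        elif inside:
            tag = "B" if (chunk_start and just_closed) else "I"
            pairs.append((token, tag))
            chunk_start = False
            just_closed = False
        else:
            pairs.append((token, "O"))
            just_closed = False
    return pairs


def mark_all_chunk_starts(tags):
    """Rewrite the tags so that every chunk-initial word gets the start tag B.

    This is an assumed reading of the alternative representation in which
    all chunk-initial words receive the same start tag.
    """
    result = []
    previous = "O"
    for tag in tags:
        # An I that follows an O opens a new chunk, so it becomes B.
        result.append("B" if (tag == "I" and previous == "O") else tag)
        previous = tag
    return result


sentence = ("In [N early trading N] in [N Hong Kong N] "
            "[N Monday N] , [N gold N] was quoted")
pairs = brackets_to_iob_tags(sentence)
tags = [tag for _, tag in pairs]
print(" ".join(f"{word}_{tag}" for word, tag in pairs))
print(" ".join(mark_all_chunk_starts(tags)))
```

In the first scheme, B appears only where two baseNPs are adjacent ("Hong Kong" followed by "Monday"); in the second, every baseNP begins with B, so chunk boundaries can be read off without looking at the preceding tag.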
