Scientific report: "Semantic Links on a Thesaurus*"
Projecting Corpus-Based Semantic Links on a Thesaurus

Emmanuel Morin
IRIN, 2 chemin de la housinière - BP 92208, 44322 NANTES Cedex 3, FRANCE
morin@irin.univ-nantes.fr

Christian Jacquemin
LIMSI-CNRS, BP 133, 91403 ORSAY Cedex, FRANCE
jacquemin@limsi.fr

Abstract

Hypernym links acquired through an information extraction procedure are projected on multi-word terms through the recognition of semantic variations. The quality of the projected links resulting from corpus-based acquisition is compared with projected links extracted from a technical thesaurus.

1 Motivation

In the domain of corpus-based terminology, there are two main topics of research: term acquisition (the discovery of candidate terms) and automatic thesaurus construction (the addition of semantic links to a term bank). Several studies have focused on automatic acquisition of terms from corpora (Bourigault 1993; Justeson and Katz 1995; Daille 1996). The output of these tools is a list of unstructured multi-word terms. On the other hand, contributions to automatic construction of thesauri provide classes or links between single words. Classes are produced by clustering techniques based on similar word contexts (Schütze 1993) or similar distributional contexts (Grefenstette 1994). Links result from automatic acquisition of relevant predicative or discursive patterns (Hearst 1992; Basili et al. 1993; Riloff 1993).
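As a rough illustration of pattern-based link acquisition, the sketch below applies one simplified lexico-syntactic pattern of the "X such as Y, Y and Y" kind, in the spirit of Hearst (1992), to an English sentence. The pattern, the function name `extract_hypernym_links`, and the example sentence are assumptions made for illustration only. They are not the extraction procedure used in this paper, which works on French text and on full noun phrases rather than single words.

```python
import re

# Hypothetical, simplified pattern: "<hypernym> such as <hyponym>, <hyponym> and <hyponym>".
# Terms are approximated by single words; a real system would use noun-phrase chunking.
PATTERN = re.compile(r"(\w+) such as ((?:\w+, )*\w+(?: (?:and|or) \w+)?)")


def extract_hypernym_links(sentence):
    """Return (hyponym, hypernym) pairs found by the single pattern above."""
    links = []
    for match in PATTERN.finditer(sentence.lower()):
        hypernym = match.group(1)
        # Split the enumeration on commas and on the final coordination.
        hyponyms = re.split(r", | and | or ", match.group(2))
        links.extend((hyponym, hypernym) for hyponym in hyponyms)
    return links


if __name__ == "__main__":
    example = "Cereals such as wheat, barley and maize were tested."
    for hyponym, hypernym in extract_hypernym_links(example):
        print(hyponym, "->", hypernym)
```

Each extracted pair links a single word to a more general single word, which is the kind of link the paper then projects onto multi-word terms.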
Predicative patterns yield predicative relations such as cause or effect, whereas discursive patterns yield non-predicative relations such as generic/specific or synonymy links.

The experiments presented in this paper were performed on [AGRO], a 1.3-million word French corpus of scientific abstracts in the agricultural domain. The termer used for multi-word term acquisition is ACABIT (Daille 1996). It has produced 15,875 multi-word terms composed of 4,194 single words. For expository purposes, some examples are taken from [MEDIC], a 1.56-million word English corpus of scientific abstracts in the medical domain.

The main .