Scientific report: "Exploiting Shallow Linguistic Information for Relation Extraction from Biomedical Literature"

Claudio Giuliano, Alberto Lavelli and Lorenza Romano
ITC-irst, Via Sommarive 18, 38050 Povo (TN), Italy
giuliano lavelli romano @

Abstract

We propose an approach for extracting relations between entities from biomedical literature based solely on shallow linguistic information. We use a combination of kernel functions to integrate two different information sources: (i) the whole sentence where the relation appears, and (ii) the local contexts around the interacting entities. We performed experiments on extracting gene and protein interactions from two different data sets. The results show that our approach outperforms most of the previous methods based on syntactic and semantic information.

1 Introduction

Information Extraction (IE) is the process of finding relevant entities and their relationships within textual documents. Applications of IE range from the Semantic Web to Bioinformatics. For example, there is an increasing interest in automatically extracting relevant information from biomedical literature. Recent evaluation campaigns on bio-entity recognition, such as BioCreAtIvE and the JNLPBA 2004 shared task, have shown that several systems are able to achieve good performance, even if it is a bit worse than that reported on news articles.
Relation identification, however, is more useful from an applicative perspective, yet it is still a considerable challenge for automatic tools. In this work, we propose a supervised machine learning approach to relation extraction which is applicable even when deep linguistic processing is not available or reliable. In particular, we explore a kernel-based approach based solely on shallow linguistic processing, such as tokenization, sentence splitting, Part-of-Speech (PoS) tagging and lemmatization. Kernel methods (Shawe-Taylor and Cristianini, 2004) show their full potential when an explicit computation of the feature map becomes computationally infeasible
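
The idea of combining a whole-sentence kernel with local-context kernels, as outlined in the abstract, can be illustrated with a minimal sketch. This excerpt does not give the actual kernel definitions, so everything below is an assumption made for illustration: the bag-of-words kernel, the cosine normalization, the window size, the offset-tagged context features, and the names `bow_kernel`, `normalized`, `local_context` and `combined_kernel` are all hypothetical. The only property relied upon is the standard one that a sum of valid kernels is itself a valid kernel. Note that no explicit feature vector over the full vocabulary is ever built; similarities are computed directly from the tokens the two examples share.

```python
from collections import Counter
from math import sqrt


def bow_kernel(tokens_a, tokens_b):
    """Dot product of bag-of-words count vectors, computed sparsely."""
    ca, cb = Counter(tokens_a), Counter(tokens_b)
    return float(sum(ca[t] * cb[t] for t in ca.keys() & cb.keys()))


def normalized(kernel, a, b):
    """Cosine-normalize a kernel so that k(x, x) == 1 for non-empty x."""
    denom = sqrt(kernel(a, a) * kernel(b, b))
    return kernel(a, b) / denom if denom > 0 else 0.0


def local_context(tokens, index, window=2):
    """Tokens within `window` positions of the entity, tagged by relative offset."""
    lo, hi = max(0, index - window), min(len(tokens), index + window + 1)
    return [f"{i - index}:{tokens[i]}" for i in range(lo, hi)]


def combined_kernel(ex_a, ex_b, window=2):
    """Whole-sentence kernel plus one local-context kernel per entity (assumed form)."""
    k_global = normalized(bow_kernel, ex_a["tokens"], ex_b["tokens"])
    k_local = 0.0
    for key in ("e1", "e2"):  # token positions of the two candidate entities
        ctx_a = local_context(ex_a["tokens"], ex_a[key], window)
        ctx_b = local_context(ex_b["tokens"], ex_b[key], window)
        k_local += normalized(bow_kernel, ctx_a, ctx_b)
    return k_global + k_local


# Hypothetical examples; PROT1/PROT2 are placeholder entity tokens.
a = {"tokens": "PROT1 interacts with PROT2 in yeast".split(), "e1": 0, "e2": 3}
b = {"tokens": "PROT1 binds to PROT2 in vitro".split(), "e1": 0, "e2": 3}
print(combined_kernel(a, b))
```

A kernel of this kind could be passed to any learner that accepts a precomputed Gram matrix, such as a support vector machine. Only tokenization is needed here; lemmas or PoS tags could be appended to each token as extra shallow features under the same scheme.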
