Scientific report: "A System for Semantic Analysis of Chemical Compound Names"
Henriette Engelken
EML Research gGmbH, Schloss-Wolfsbrunnenweg 33, 69118 Heidelberg, Germany
Institute for Natural Language Processing, University of Stuttgart, Azenbergstr. 12, 70174 Stuttgart, Germany
engelken@

Abstract

Mapping and classification of chemical compound names are important aspects of the tasks of BioNLP. This paper introduces the architecture of a system for the syntactic and semantic analysis of such names. Our system aims at yielding both the denoted chemical structure and a classification of a given name. We employ a novel approach to the task which promises an elegant and efficient way of solving the problem. The proposed system differs significantly from existing systems in that it is also able to deal with underspecifying names and class names.

1 Introduction

BioNLP is the branch of computational linguistics developing tools and algorithms tailored to the life sciences domain. Scientific and patent literature in this domain is growing at an enormous pace. This results in a valuable resource for researchers, but at the same time it poses the problem that it can hardly be processed manually by humans. Thus, a major goal of BioNLP is to automatically support humans by means of research in the areas of information retrieval, data mining and information extraction. Term identification is of great importance in these tasks.
Krauthammer and Nenadic (2004) divide the identification task into the subtasks of term recognition (marking the interesting words in a text), term classification (classifying them according to a taxonomy or an ontology) and term mapping¹ (identifying a term with respect to a referent data source).

Chemical compound names, i.e. names of molecules, are terms which prominently occur in scientific publications, patents and in biochemical databases. Any chemical compound can be unambiguously denoted .

¹ Term mapping is also called term grounding, amongst others by Kim and Park (2004).
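
To make the three subtasks of term identification concrete, the following minimal Python sketch chains recognition, classification and mapping over a short text. It is an illustration only and not the system proposed in this paper: it uses plain dictionary lookup rather than syntactic and semantic analysis of the names. The lexicon, the `REF:` identifiers and all function names are hypothetical and introduced solely for this example.

```python
import re
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical toy lexicon: lower-cased term -> (class label, reference id).
# The REF:... identifiers are invented placeholders, not real database keys.
LEXICON = {
    "ethanol": ("alcohol", "REF:0001"),
    "methanol": ("alcohol", "REF:0002"),
    "benzene": ("aromatic hydrocarbon", "REF:0003"),
}


@dataclass
class Term:
    text: str                     # surface form as found in the input
    start: int                    # character offset where the term begins
    end: int                      # character offset just past the term
    label: Optional[str] = None   # filled in by term classification
    ref_id: Optional[str] = None  # filled in by term mapping


def recognize(text: str) -> List[Term]:
    """Term recognition: mark the interesting words in a text."""
    alternatives = "|".join(re.escape(name) for name in LEXICON)
    pattern = re.compile(r"\b(?:" + alternatives + r")\b", re.IGNORECASE)
    return [Term(m.group(0), m.start(), m.end()) for m in pattern.finditer(text)]


def classify(term: Term) -> Term:
    """Term classification: assign a class from the (toy) taxonomy."""
    term.label = LEXICON[term.text.lower()][0]
    return term


def map_term(term: Term) -> Term:
    """Term mapping: link the term to an entry in a referent data source."""
    term.ref_id = LEXICON[term.text.lower()][1]
    return term


def identify(text: str) -> List[Term]:
    """Run recognition, classification and mapping in sequence."""
    return [map_term(classify(term)) for term in recognize(text)]


if __name__ == "__main__":
    for term in identify("Ethanol and benzene were mixed."):
        print(term)
```

A lookup of this kind can only handle names that are already listed in the lexicon, which is why it serves here purely to separate the three subtasks from one another.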