Accessing GermaNet Data and Computing Semantic Relatedness

Iryna Gurevych and Hendrik Niederlich
EML Research gGmbH
Schloss-Wolfsbrunnenweg 33
69118 Heidelberg, Germany
http gurevych

Abstract

We present an API developed to access GermaNet, a lexical semantic database for German represented in XML. The API provides a set of software functions for parsing and retrieving information from GermaNet. We then present a case study that builds upon the GermaNet API and implements an application for computing semantic relatedness according to five different metrics. The package can, in turn, serve as a software library to be deployed in natural language processing applications. A graphical user interface allows users to experiment with the system interactively.

1 Motivation

The knowledge encoded in WordNet (Fellbaum, 1998) has proved valuable in many natural language processing (NLP) applications. One particular way to integrate semantic knowledge into applications is to compute the semantic similarity of WordNet concepts. This can be used, for example, to perform word sense disambiguation (Patwardhan et al., 2003), to find predominant word senses in untagged text (McCarthy et al., 2004), to automatically generate spoken dialogue summaries (Gurevych and Strube, 2004), and to perform spelling correction (Hirst and Budanitsky, 2005).
Extensive research on integrating semantic knowledge into NLP for the English language has arguably been fostered by the emergence of the WordNet::Similarity package (Pedersen et al., 2004).1 In turn, the development of WordNet-based semantic similarity software has been facilitated by the availability of tools for easily retrieving data from WordNet, such as WordNet::QueryData.2

Research integrating semantic knowledge into NLP for languages other than English is scarce. On the one hand, there are fewer computational knowledge resources, such as dictionaries, that are broad enough in coverage to be integrated in robust NLP .

1 http tpederse