Scientific report: "Mapping Lexical Entries in a Verbs Database to WordNet Senses"
Rebecca Green, Lisa Pearl, Bonnie J. Dorr, and Philip Resnik
Institute for Advanced Computer Studies, Department of Computer Science
University of Maryland, College Park, MD 20742 USA
rgreen llsp bonnie resnik @

Abstract
This paper describes automatic techniques for mapping 9611 entries in a database of English verbs to WordNet senses. The verbs were initially grouped into 491 classes based on syntactic features. Mapping these verbs into WordNet senses provides a resource that supports disambiguation in multilingual applications such as machine translation and cross-language information retrieval. Our techniques make use of (1) a training set of 1791 disambiguated entries, representing 1442 verb entries from 167 classes; (2) word sense probabilities, from frequency counts in a tagged corpus; (3) semantic similarity of WordNet senses for verbs within the same class; (4) probabilistic correlations between WordNet data and attributes of the verb classes. The best results achieved 72% precision and 58% recall, versus a lower bound of 62% precision and 38% recall for assigning the most frequently occurring WordNet sense, and an upper bound of 87% precision and 75% recall for human judgment.
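The lower bound quoted in the abstract comes from assigning each verb its most frequently occurring WordNet sense and scoring the result by precision and recall. The sketch below illustrates that kind of baseline and scoring on toy data. The sense labels (such as "drop#1"), the counts, the gold assignments, and the function names are all hypothetical, and the scoring shown (correct assignments over assignments made, and over gold senses) is one common reading of precision and recall, not necessarily the exact procedure used by the authors.

```python
def most_frequent_sense(verb, tagged_counts):
    """Return the sense of `verb` with the highest corpus count, or None if unseen."""
    counts = tagged_counts.get(verb)
    if not counts:
        return None
    return max(counts, key=counts.get)


def precision_recall(predicted, gold):
    """Score predicted sense sets against gold sense sets, entry by entry.

    precision = correct assignments / assignments made
    recall    = correct assignments / gold senses
    """
    assigned = correct = expected = 0
    for entry, gold_senses in gold.items():
        guess = predicted.get(entry, set())
        assigned += len(guess)
        correct += len(guess & gold_senses)
        expected += len(gold_senses)
    precision = correct / assigned if assigned else 0.0
    recall = correct / expected if expected else 0.0
    return precision, recall


# Hypothetical frequency counts from a sense-tagged corpus.
tagged_counts = {
    "drop": {"drop#1": 5, "drop#2": 2},
    "sink": {"sink#1": 1, "sink#3": 4},
}

# Hypothetical human-assigned (gold) senses for each entry.
gold = {
    "drop": {"drop#1", "drop#2"},
    "sink": {"sink#1"},
}

# Baseline: assign only the most frequent sense to each entry.
predicted = {}
for verb in gold:
    sense = most_frequent_sense(verb, tagged_counts)
    predicted[verb] = {sense} if sense is not None else set()

precision, recall = precision_recall(predicted, gold)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

Because this baseline assigns a single sense per entry while an entry may have several correct senses, its recall is capped below its precision, which matches the shape of the lower-bound figures reported above.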
1 Introduction
Our goal is to map entries in a lexical database of 4076 English verbs automatically to WordNet senses (Miller and Fellbaum, 1991; Fellbaum, 1998) to support such applications as machine translation and cross-language information retrieval. For example, the verb drop is multiply ambiguous, with many potential translations in Spanish: bajar, caerse, dejar caer, derribar, disminuir, echar, hundir, soltar, etc. The database specifies a set of interpretations for drop, depending on its context in the source language (SL). Inclusion of WordNet senses in the database enables the selection of an appropriate verb in the target language (TL). Final selection is based on a frequency count of WordNet senses across all classes to which