Towards Robust Animacy Classification Using Morphosyntactic Distributional Features

Lilja Øvrelid
NLP-unit, Dept. of Swedish
Göteborg University
SE-40530 Göteborg, Sweden
lilja.ovrelid@svenska.gu.se

Abstract

This paper presents results from experiments in automatic classification of animacy for Norwegian nouns using decision-tree classifiers. The method makes use of relative frequency measures for linguistically motivated morphosyntactic features extracted from an automatically annotated corpus of Norwegian. The classifiers are evaluated using leave-one-out training and testing. The initial results are promising (approaching 90% accuracy) for high frequency nouns, but deteriorate gradually as lower frequency nouns are classified. Experiments attempting to empirically locate a frequency threshold for the classification method indicate that a subset of the chosen morphosyntactic features exhibits a notable resilience to data sparseness. Results will be presented which show that the classification accuracy obtained for high frequency nouns (with absolute frequencies above 1000) can be maintained for nouns with considerably lower frequencies (around 50) by backing off to a smaller set of features at classification.
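The pipeline summarised in the abstract (relative frequency features, a decision-tree classifier, leave-one-out evaluation, and back-off to a smaller feature set) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration: the feature names (`subj`, `obj`, `gen`), the toy counts, and the small hand-written Gini-based tree, which stands in for whatever decision-tree learner the experiments actually used. The numbers are invented and are not results from the paper.

```python
from collections import Counter

# Invented toy counts: (noun, total occurrences, per-feature counts, class).
# Feature names and values are hypothetical, not taken from the paper.
TOY_NOUNS = [
    ("man",    1000, {"subj": 400, "obj": 100, "gen": 50}, "animate"),
    ("girl",    800, {"subj": 280, "obj": 120, "gen": 40}, "animate"),
    ("doctor",  500, {"subj": 200, "obj":  50, "gen": 30}, "animate"),
    ("house",  1000, {"subj":  50, "obj": 300, "gen":  5}, "inanimate"),
    ("book",    600, {"subj":  60, "obj": 240, "gen":  6}, "inanimate"),
    ("car",     900, {"subj":  90, "obj": 270, "gen":  9}, "inanimate"),
]

def relative_frequencies(counts, total):
    """Turn raw feature counts for one noun into relative frequencies."""
    return {feat: n / total for feat, n in counts.items()}

def gini(labels):
    """Gini impurity of a non-empty list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def build_tree(rows, labels, features, depth=0, max_depth=3):
    """Grow a small binary tree; leaves are class labels (strings)."""
    majority = Counter(labels).most_common(1)[0][0]
    if depth == max_depth or len(set(labels)) == 1:
        return majority
    best = None  # (weighted impurity, feature, threshold)
    for f in features:
        values = sorted({r[f] for r in rows})
        # Candidate thresholds lie midway between neighbouring values.
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2
            left = [l for r, l in zip(rows, labels) if r[f] <= t]
            right = [l for r, l in zip(rows, labels) if r[f] > t]
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
            if best is None or score < best[0]:
                best = (score, f, t)
    if best is None:  # no feature has two distinct values
        return majority
    _, f, t = best
    l_idx = [i for i, r in enumerate(rows) if r[f] <= t]
    r_idx = [i for i, r in enumerate(rows) if r[f] > t]
    return (
        f, t,
        build_tree([rows[i] for i in l_idx], [labels[i] for i in l_idx],
                   features, depth + 1, max_depth),
        build_tree([rows[i] for i in r_idx], [labels[i] for i in r_idx],
                   features, depth + 1, max_depth),
    )

def predict(tree, row):
    """Follow threshold tests down to a leaf label."""
    while isinstance(tree, tuple):
        f, t, left, right = tree
        tree = left if row[f] <= t else right
    return tree

def leave_one_out_accuracy(rows, labels, features):
    """Train on all nouns but one, test on the held-out noun, and average."""
    correct = 0
    for i in range(len(rows)):
        tree = build_tree(rows[:i] + rows[i + 1:], labels[:i] + labels[i + 1:], features)
        correct += predict(tree, rows[i]) == labels[i]
    return correct / len(rows)

ROWS = [relative_frequencies(counts, total) for _, total, counts, _ in TOY_NOUNS]
LABELS = [label for _, _, _, label in TOY_NOUNS]

if __name__ == "__main__":
    print("all features:", leave_one_out_accuracy(ROWS, LABELS, ["subj", "obj", "gen"]))
    # Back-off: classify with a smaller, hypothetical subset of the features.
    print("back-off subset:", leave_one_out_accuracy(ROWS, LABELS, ["subj", "gen"]))
```

Passing a shorter `features` list to `leave_one_out_accuracy` is how the sketch mimics backing off to a smaller feature set. Which features actually prove robust for low frequency nouns is an empirical question that the toy data cannot answer.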
1 Introduction

Animacy is an inherent property of the referents of nouns which has been claimed to figure as an influencing factor in a range of different grammatical phenomena in various languages, and it is correlated with central linguistic concepts such as agentivity and discourse salience. Knowledge about the animacy of a noun is therefore relevant for several different kinds of NLP problems, ranging from coreference resolution to parsing and generation.

In recent years a range of linguistic studies have examined the influence of argument animacy in grammatical phenomena such as differential object marking (Aissen, 2003), the passive construction (Dingare, 2001), the dative alternation (Bresnan et al., 2005), etc. A variety of languages are sensitive to the dimension of animacy.