Scientific report: "Improving Arabic Dependency Parsing with Form-based and Functional Morphological Features"

Yuval Marton
Watson Research Center, IBM
yymarton@

Nizar Habash and Owen Rambow
Center for Computational Learning Systems, Columbia University
habash rambow @

Abstract

We explore the contribution of morphological features, both lexical and inflectional, to dependency parsing of Arabic, a morphologically rich language. Using controlled experiments, we find that definiteness, person, number, gender, and the undiacritized lemma are most helpful for parsing on automatically tagged input. We further contrast the contribution of form-based and functional features, and show that functional gender and number (e.g., broken plurals) and the related rationality feature improve over form-based features. It is the first time functional morphological features are used for Arabic NLP.

1 Introduction

Parsers need to learn the syntax of the modeled language in order to project structure on newly seen sentences. Parsing model design aims to come up with features that best help parsers to learn the syntax and choose among different parses. One aspect of syntax which is often not explicitly modeled in parsing involves morphological constraints on syntactic structure, such as agreement, which often plays an important role in morphologically rich languages. In this paper, we explore the role of morphological features in parsing Modern Standard Arabic (MSA). For MSA, the space of possible morphological features is fairly large. We determine which morphological features help and why.
We also explore going beyond the easily detectable, regular form-based surface features by representing functional values for some morphological features. We expect that representing lexical abstractions and inflectional features participating in agreement relations would help parsing quality, but other inflectional features would not help. We further expect functional features to be superior to ...
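The contrast between form-based and functional features can be made concrete with a small sketch. It is an illustration only, not the feature representation or parser used in the paper: the `Token` class, its field names, the two agreement functions, and the transliterations are all hypothetical. It relies on two well-known facts about MSA: a broken plural such as "kutub" (books) carries no plural suffix, so its form looks masculine singular although it is functionally plural, and plural nouns with non-human (irrational) referents take feminine singular adjectives.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Token:
    """Hypothetical token record; field names are illustrative assumptions."""
    word: str            # rough transliteration, for readability only
    form_gender: str     # "M" or "F", as suggested by the surface suffix
    form_number: str     # "S", "D", or "P", as suggested by the surface suffix
    func_gender: str     # functional gender
    func_number: str     # functional number
    rational: bool = False  # True for human referents


def form_agreement(noun: Token, adj: Token) -> bool:
    """Naive check: compare surface (form-based) gender and number only."""
    return (noun.form_gender == adj.form_gender
            and noun.form_number == adj.form_number)


def functional_agreement(noun: Token, adj: Token) -> bool:
    """Check using functional values plus the rationality feature."""
    if noun.func_number == "P" and not noun.rational:
        # Irrational plural nouns take feminine singular adjectives.
        return adj.func_gender == "F" and adj.func_number == "S"
    return (noun.func_gender == adj.func_gender
            and noun.func_number == adj.func_number)


# "kutub jadida" (new books): a broken plural with a feminine singular adjective.
kutub = Token("kutub", "M", "S", "M", "P", rational=False)
jadida = Token("jadida", "F", "S", "F", "S")

# "kitab jadid" (a new book): a regular masculine singular pair.
kitab = Token("kitab", "M", "S", "M", "S", rational=False)
jadid = Token("jadid", "M", "S", "M", "S")

print(form_agreement(kutub, jadida))        # surface features miss the agreement
print(functional_agreement(kutub, jadida))  # functional features capture it
print(functional_agreement(kitab, jadid))   # the regular case still agrees
```

In this toy setting, the surface features of "kutub jadida" disagree in gender, so a model that sees only form-based values gets a misleading signal, while the functional values together with rationality expose the agreement relation.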
