Scientific report: "Arabic Syntactic Trees: from Constituency to Dependency"

Zdeněk Žabokrtský and Otakar Smrž
Center for Computational Linguistics
Faculty of Mathematics and Physics, Charles University in Prague
{zabokrtsky, smrz}@ckl.

Abstract

This research note reports on work in progress on the automatic transformation of phrase-structure syntactic trees of Arabic into dependency-driven analytical ones. Guidelines for these descriptions have been developed at the Linguistic Data Consortium, University of Pennsylvania, and at the Faculty of Mathematics and Physics and the Faculty of Arts, Charles University in Prague, respectively. The transformation consists of (i) a recursive function translating the topology of a phrase tree into a corresponding dependency tree, and (ii) a procedure assigning analytical functions to the nodes of the dependency tree. Apart from an outline of the annotation schemes and a deeper insight into these procedures, a model application of the transformation is given herein.

1 Introduction

Exploring the relationship between constituency and dependency sentence representations is not a new issue; the first studies go back to the 1960s (Gaifman, 1965; for more references see Schneider, 1998).
Still, some theoretical findings were not applicable until the first dependency treebanks with well-defined annotation schemes came into existence in the last few years (Hajič et al., 2001). The need to convert Arabic treebank data between different descriptions arises from a co-operation between the Linguistic Data Consortium (LDC), University of Pennsylvania, and the three institutions concerned at Charles University in Prague, namely the Center for Computational Linguistics, the Institute of Formal and Applied Linguistics, and the Institute of Comparative Linguistics. The two parties intend to share the resources they create. Prior to this exchange, 10,000 words from the LDC Arabic Newswire A Corpus were manually annotated in both syntactic styles as a step to ensure that
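
The two-step transformation named in the abstract, (i) a recursive topology conversion and (ii) analytical function assignment, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the head-preference table `HEAD_PREFS`, the mapping `FUNC_MAP`, the fallback rules, and the toy sentence are all hypothetical and chosen only to make the idea concrete.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Phrase:
    label: str                                   # phrase label or POS tag
    children: List["Phrase"] = field(default_factory=list)
    word: Optional[str] = None                   # set only on leaves


@dataclass
class DepNode:
    word: str
    tag: str                                     # POS tag of the word
    phrase_label: str                            # highest phrase this word heads
    afun: str = ""                               # analytical function
    children: List["DepNode"] = field(default_factory=list)


# Hypothetical head preferences: which child label heads which phrase.
HEAD_PREFS = {"S": ["VP"], "VP": ["VERB"], "NP": ["NOUN"], "PP": ["PREP"]}
# Hypothetical mapping from phrase function tags to analytical functions.
FUNC_MAP = {"SBJ": "Sb", "OBJ": "Obj"}


def base(label: str) -> str:
    """Strip a function tag, e.g. 'NP-SBJ' -> 'NP'."""
    return label.split("-")[0]


def pick_head(phrase: Phrase) -> int:
    """Index of the head child; falls back to the first child (assumption)."""
    for wanted in HEAD_PREFS.get(base(phrase.label), []):
        for i, child in enumerate(phrase.children):
            if base(child.label) == wanted:
                return i
    return 0


def to_dependency(phrase: Phrase) -> DepNode:
    """Step (i): recursively turn a phrase tree into a dependency tree."""
    if phrase.word is not None:
        return DepNode(phrase.word, phrase.label, phrase.label)
    subtrees = [to_dependency(child) for child in phrase.children]
    h = pick_head(phrase)
    head = subtrees[h]
    # Heads of the non-head children become dependents of the head word.
    head.children.extend(s for i, s in enumerate(subtrees) if i != h)
    head.phrase_label = phrase.label             # remember the projection
    return head


def assign_afuns(node: DepNode, parent: Optional[DepNode] = None) -> None:
    """Step (ii): label each node using simple, illustrative rules."""
    func_tag = node.phrase_label.partition("-")[2]
    if parent is None:
        node.afun = "Pred"
    elif func_tag in FUNC_MAP:
        node.afun = FUNC_MAP[func_tag]
    elif parent.tag == "NOUN":
        node.afun = "Atr"
    else:
        node.afun = "Adv"
    for child in node.children:
        assign_afuns(child, node)


# Toy verb-initial clause (transliterated, invented for illustration).
example = Phrase("S", [Phrase("VP", [
    Phrase("VERB", word="kataba"),
    Phrase("NP-SBJ", [Phrase("NOUN", word="al-waladu")]),
    Phrase("NP-OBJ", [Phrase("NOUN", word="risalatan")]),
])])
root = to_dependency(example)
assign_afuns(root)
```

For the toy tree, the verb ends up as the root labelled `Pred`, with the two nouns attached to it as `Sb` and `Obj`. Keeping the topology conversion separate from the labelling mirrors the two-part description in the abstract and lets each rule set be revised independently.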
