Scientific report: "An Ensemble Model that Combines Syntactic and Semantic Clustering for Discriminative Dependency Parsing"

Gholamreza Haffari
Faculty of Information Technology, Monash University, Melbourne, Australia
reza@

Marzieh Razavi and Anoop Sarkar
School of Computing Science, Simon Fraser University, Vancouver, Canada
{mrazavi,anoop}@

Abstract

We combine multiple word representations based on semantic clusters extracted from the Brown et al. (1992) algorithm and syntactic clusters obtained from the Berkeley parser (Petrov et al., 2006) in order to improve discriminative dependency parsing in the MSTParser framework (McDonald et al., 2005). We also provide an ensemble method for combining diverse cluster-based models. Together, the two contributions significantly improve unlabeled dependency accuracy from [...] to [...].

1 Introduction

A simple method for using unlabeled data in discriminative dependency parsing was provided by Koo et al. (2008). It involved clustering the labeled and unlabeled data and then assigning a cluster identifier to each word in the dependency treebank. These identifiers were used to augment the feature representation of the edge-factored or second-order features, and this extended feature set was used to discriminatively train a dependency parser.

The use of clusters leads to the question of how to integrate various types of clusters (possibly from different clustering algorithms) in discriminative dependency parsing. Clusters obtained from the Brown et al. (1992) clustering algorithm are typically viewed as semantic; e.g., one cluster might contain plan, letter, request, memo, ... while another may contain people, customers, employees, students, .... Another clustering view that is more syntactic in nature comes from the use of state-splitting in PCFGs. For instance, we could extract a syntactic cluster loss, time, profit, earnings, performance, rating, ...: all head words of noun phrases corresponding to a cluster of direct objects of verbs like improve. In this paper, we [...]
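
The idea of augmenting edge-factored features with cluster identifiers, as described in the Introduction, can be illustrated with a minimal sketch. It is not the feature set of MSTParser or of Koo et al. (2008): the function name `edge_features`, the feature templates, the `UNK` placeholder, and the bit-string cluster identifiers in the usage note are all hypothetical and chosen only for illustration.

```python
from typing import Dict, List


def edge_features(
    words: List[str],
    tags: List[str],
    clusters: Dict[str, str],
    head: int,
    mod: int,
    unknown: str = "UNK",
) -> List[str]:
    """Return string features for one candidate head -> modifier edge.

    Hypothetical templates: a few word/POS baseline features, plus copies
    in which a word is replaced by its cluster identifier.
    """
    hw, mw = words[head], words[mod]
    ht, mt = tags[head], tags[mod]
    # Words missing from the clustering fall back to a placeholder identifier.
    hc = clusters.get(hw, unknown)
    mc = clusters.get(mw, unknown)
    # Attachment direction: "R" if the modifier is to the right of the head.
    direction = "R" if head < mod else "L"

    baseline = [
        f"hw={hw}|mw={mw}",
        f"ht={ht}|mt={mt}",
        f"hw={hw}|mt={mt}",
        f"ht={ht}|mw={mw}",
    ]
    cluster_based = [
        f"hc={hc}|mc={mc}",
        f"hc={hc}|mt={mt}",
        f"ht={ht}|mc={mc}",
    ]
    return [f"{feat}|dir={direction}" for feat in baseline + cluster_based]
```

For example, with made-up identifiers `{"improve": "1011", "earnings": "0110"}`, calling `edge_features(["analysts", "improve", "earnings"], ["NNS", "VBP", "NNS"], clusters, head=1, mod=2)` yields the four baseline features plus three cluster-based ones. A second clustering (for instance, a syntactic one) could be plugged in simply by passing a different `clusters` mapping, which is one way to obtain several cluster-based feature sets to compare or combine.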
