Scientific report: "Predicate Argument Structure Analysis using Transformation-based Learning"
Hirotoshi Taira, Sanae Fujita, Masaaki Nagata
NTT Communication Science Laboratories, 2-4 Hikaridai, Seika-cho, Souraku-gun, Kyoto 619-0237, Japan
taira sanae @

Abstract

Maintaining high annotation consistency in large corpora is crucial for statistical learning; however, such work is hard, especially for tasks containing semantic elements. This paper describes predicate argument structure analysis using transformation-based learning. An advantage of transformation-based learning is the readability of learned rules. A disadvantage is that the rule extraction procedure is time-consuming. We present incremental-based transformation-based learning for semantic processing tasks. As an example, we deal with Japanese predicate argument analysis and show some tendencies of annotators for constructing a corpus with our method.

1 Introduction

Automatic predicate argument structure analysis (PAS) provides information on "who did what to whom" and is an important base tool for various text processing tasks such as machine translation, information extraction (Hirschman et al., 1999), question answering (Narayanan and Harabagiu, 2004; Shen and Lapata, 2007), and summarization (Melli et al., 2005). Most recent approaches to predicate argument structure analysis are statistical machine learning methods such as support vector machines (SVMs) (Pradhan et al., 2004). For predicate argument structure analysis, we have the following representative large corpora: FrameNet (Fillmore et al., 2001), PropBank (Palmer et al., 2005), and NomBank (Meyers et al., 2004) in English; the Chinese PropBank (Xue, 2008) in Chinese; and the GDA Corpus (Hashida, 2005), the Kyoto Text Corpus (Kawahara et al., 2002), and the NAIST Text Corpus (Iida et al., 2007) in Japanese. The construction of such large corpora is strenuous and time-consuming. Additionally, maintaining high annotation consistency in such corpora is crucial for statistical .