tailieunhanh - Scientific report: "A Novel Discourse Parser Based on Support Vector Machine Classification"

David A. duVerle
National Institute of Informatics, Tokyo, Japan
Pierre Marie Curie University, Paris, France
dave@

Helmut Prendinger
National Institute of Informatics, Tokyo, Japan
helmut@

Abstract

This paper introduces a new algorithm to parse discourse within the framework of Rhetorical Structure Theory (RST). Our method is based on recent advances in the field of statistical machine learning (multivariate capabilities of Support Vector Machines) and a rich feature space. RST offers a formal framework for hierarchical text organization with strong applications in discourse analysis and text generation. We demonstrate automated annotation of a text with RST hierarchically organised relations, with results comparable to those achieved by specially trained human annotators. Using a rich set of shallow lexical, syntactic and structural features from the input text, our parser achieves, in linear time, [...] of professional annotators' human agreement F-score. The parser is 5 to 12 more accurate than current state-of-the-art parsers.

1 Introduction

According to Mann and Thompson (1988), all well-written text is supported by a hierarchically structured set of coherence relations which reflect the author's intent. The goal of discourse parsing is to extract this high-level rhetorical structure.

Dependency parsing and other forms of syntactic analysis provide information on the grammatical structure of text at the sentential level. Discourse parsing, on the other hand, focuses on a higher-level view of text, allowing some flexibility in the choice of formal representation while providing a wide range of applications in both analytical and computational linguistics.

Rhetorical Structure Theory (Mann and Thompson, 1988) provides a framework to analyze and study text coherence by defining and applying a set of structural relations to composing units (spans) of text. Annotation of a text within the RST formalism
