Scientific report: "Hierarchical Directed Acyclic Graph Kernel: Methods for Structured Natural Language Data"

Jun Suzuki, Tsutomu Hirao, Yutaka Sasaki, and Eisaku Maeda
NTT Communication Science Laboratories, NTT Corp.
2-4 Hikaridai, Seika-cho, Soraku-gun, Kyoto 619-0237, Japan
jun hirao sasaki maeda @

Abstract

This paper proposes the "Hierarchical Directed Acyclic Graph (HDAG) Kernel" for structured natural language data. The HDAG Kernel directly accepts several levels of both chunks and their relations, and then efficiently computes the weighted sum of the number of common attribute sequences of the HDAGs. We applied the proposed method to question classification and sentence alignment tasks to evaluate its performance as a similarity measure and a kernel function. The results of the experiments demonstrate that the HDAG Kernel is superior to other kernel functions and baseline methods.

1 Introduction

As structured corpora such as annotated texts have become easy to obtain, many researchers have applied statistical and machine learning techniques to NLP tasks. As a result, the accuracy of basic NLP tools, such as POS taggers, NP chunkers, named entity taggers, and dependency analyzers, has improved to the point that they can support practical NLP applications.
The motivation of this paper is to identify and use richer information within texts that will improve the performance of NLP applications; this is in contrast to using feature vectors constructed from a bag-of-words (Salton et al., 1975). We now focus on methods that use numerical feature vectors to represent the features of natural language data. Because the original natural language data is symbolic, researchers must convert the symbolic data into numeric data. This process, feature extraction, is ad hoc in nature and differs with each NLP task; there has been no neat formulation for generating feature vectors from the semantic and grammatical structures inside texts. Kernel methods Vapnik 1995 Cristianini and .
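
To make the phrase "weighted sum of the number of common attribute sequences" more concrete, the following is a minimal sketch on flat token sequences only. It is not the HDAG Kernel: it ignores the hierarchy of chunks and their relations, and it counts only contiguous sequences. The function names, the `max_len` limit, and the `decay` weight are assumptions introduced here for illustration.

```python
from collections import Counter


def ngram_counts(tokens, max_len):
    """Count every contiguous token sequence of length 1..max_len."""
    counts = Counter()
    for n in range(1, max_len + 1):
        for i in range(len(tokens) - n + 1):
            counts[tuple(tokens[i:i + n])] += 1
    return counts


def weighted_common_sequence_kernel(a, b, max_len=3, decay=0.5):
    """Weighted count of contiguous sequences shared by a and b.

    Each shared sequence contributes decay ** length times the product
    of its occurrence counts in the two inputs. This is an illustrative
    stand-in (hypothetical weighting), not the paper's formulation.
    """
    ca = ngram_counts(a, max_len)
    cb = ngram_counts(b, max_len)
    shared = ca.keys() & cb.keys()
    return sum((decay ** len(seq)) * ca[seq] * cb[seq] for seq in shared)


if __name__ == "__main__":
    q1 = ["what", "is", "the", "capital"]
    q2 = ["what", "is", "a", "capital"]
    print(weighted_common_sequence_kernel(q1, q2, max_len=2, decay=0.5))
```

Because the score is an inner product of non-negatively weighted sequence counts, it is symmetric in its two arguments and can serve either as a similarity measure or as a kernel function, which are the two roles in which the abstract evaluates the HDAG Kernel.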
