tailieunhanh - Scientific report: "Extending Latent Semantic Analysis with features for dialogue act classification"

FLSA: Extending Latent Semantic Analysis with features for dialogue act classification

Riccardo Serafin
CEFRIEL
Via Fucini 2
20133 Milano, Italy

Barbara Di Eugenio
Computer Science
University of Illinois
Chicago, IL 60607, USA
bdieugen@

Abstract

We discuss Feature Latent Semantic Analysis (FLSA), an extension to Latent Semantic Analysis (LSA). LSA is a statistical method that is ordinarily trained on words only; FLSA adds to LSA the richness of the many other linguistic features that a corpus may be labeled with. We applied FLSA to dialogue act classification with excellent results. We report results on three corpora: CallHome Spanish, MapTask, and our own corpus of tutoring dialogues.

1 Introduction

In this paper, we propose Feature Latent Semantic Analysis (FLSA) as an extension to Latent Semantic Analysis (LSA). LSA can be thought of as representing the meaning of a word as a kind of average of the meanings of all the passages in which it appears, and the meaning of a passage as a kind of average of the meaning of all the words it contains (Landauer and Dumais, 1997). It builds a semantic space where words and passages are represented as vectors. LSA is based on Singular Value Decomposition (SVD), a mathematical technique that causes the semantic space to be arranged so as to reflect the major associative patterns in the data. LSA has been successfully applied to many tasks, such as assessing the quality of student essays (Foltz et al., 1999) or interpreting the student's input in an Intelligent Tutoring system (Wiemer-Hastings, 2001).
A common criticism of LSA is that it uses only words and ignores anything else, e.g. syntactic information: to LSA, "man bites dog" is identical to "dog bites man". We suggest that an LSA semantic space can be built from the co-occurrence of arbitrary textual features, not just words. We are calling LSA augmented with features FLSA, for Feature LSA. Relevant prior work on LSA only includes Structured Latent Semantic Analysis
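
The word-order criticism can be made concrete with a minimal sketch. It assumes that a passage enters the co-occurrence matrix as a column of token counts. The `with_subject_feature` helper and its `<subj:...>` tag are hypothetical, introduced only for illustration; they are not the feature set used in the paper. The SVD step that LSA applies to the matrix is omitted here.

```python
from collections import Counter


def column(tokens):
    """Count column that one passage contributes to the co-occurrence matrix."""
    return Counter(tokens)


def with_subject_feature(tokens):
    """Hypothetical feature: append a tag marking the first token as the subject.

    This stands in for any extra label a corpus might carry; it is an
    assumption made for this sketch, not a feature taken from the paper.
    """
    return tokens + ["<subj:" + tokens[0] + ">"]


a = "man bites dog".split()
b = "dog bites man".split()

# Words only: both passages yield the same counts, so word order is lost.
words_only_equal = column(a) == column(b)

# Words plus the hypothetical feature: the two columns now differ.
augmented_equal = column(with_subject_feature(a)) == column(with_subject_feature(b))

print(words_only_equal, augmented_equal)  # True False
```

In a fuller pipeline, these columns would be assembled into a matrix and reduced with SVD; the point of the sketch is only that adding feature rows changes what the matrix can distinguish.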
