tailieunhanh - Scientific report: "A corpus-based approach to topic in Danish dialog∗"

Philip Diderichsen
Lund University Cognitive Science
Lund University
Sweden

Jakob Elming
CMOL Dept. of Computational Linguistics
Copenhagen Business School
Denmark

Abstract

We report on an investigation of the pragmatic category of topic in Danish dialog and its correlation to surface features of NPs. Using a corpus of 444 utterances, we trained a decision tree system on 16 features. The system achieved near-human performance with success rates of 84–89% and F1-scores of [...] in 10-fold cross-validation tests (human performance: 89% and [...]). The most important features turned out to be preverbal position, definiteness, pronominalisation, and non-subordination. We discovered that NPs in epistemic matrix clauses (e.g. "I think ...") were seldom topics, and we suspect that this holds for other interpersonal matrix clauses as well.

1 Introduction

The pragmatic category of topic is notoriously difficult to pin down, and it has been defined in many ways (Büring, 1999; Davison, 1984; Engdahl and Vallduví, 1996; Gundel, 1988; Lambrecht, 1994; Reinhart, 1982; Vallduví, 1992). The common denominator is the notion of topic as what an utterance is about. We take this as our point of departure in this corpus-based investigation of the correlations between linguistic surface features and pragmatic topicality in Danish dialog.
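As a reading aid, the two evaluation notions named in the abstract (F1-score and 10-fold cross-validation) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' system: the function names `f1_score` and `k_fold_indices`, the boolean "is topic" labels, and the round-robin fold assignment are all hypothetical choices made for the example.

```python
from typing import List, Sequence, Tuple


def f1_score(gold: Sequence[bool], predicted: Sequence[bool]) -> float:
    """F1 for the positive class (assumed here to mean 'this NP is the topic')."""
    tp = sum(1 for g, p in zip(gold, predicted) if g and p)
    fp = sum(1 for g, p in zip(gold, predicted) if not g and p)
    fn = sum(1 for g, p in zip(gold, predicted) if g and not p)
    if tp == 0:
        # No true positives: precision and recall are zero or undefined.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def k_fold_indices(n_items: int, k: int = 10) -> List[Tuple[List[int], List[int]]]:
    """Partition range(n_items) into k folds and return (train, test) index pairs.

    Fold assignment is round-robin (item i goes to fold i % k); this is an
    assumption for illustration, not the split used in the paper.
    """
    folds = [list(range(i, n_items, k)) for i in range(k)]
    splits = []
    for test_idx, test in enumerate(folds):
        train = [j for idx, fold in enumerate(folds) if idx != test_idx for j in fold]
        splits.append((train, test))
    return splits
```

With 444 items, `k_fold_indices(444)` yields ten splits whose test folds hold 44 or 45 items each; a classifier would be trained on each `train` list and scored on the matching `test` list, and the scores averaged.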
Danish is a verb-second language. Its word order is fixed, but only to a certain degree, in that it allows any main clause constituent to occur in the preverbal position. The first position thus has a privileged status in Danish, often associated with topicality (Harder and Poulsen, 2000; Togeby, 2003). We were thus interested in investigating how well the topic correlates with the preverbal position, along with other features, if any. Our findings could prove useful for the further investigation of

∗ We thank Daniel Hardt and two anonymous reviewers for many helpful comments on drafts of this paper.
