Scientific report: "Discovering Corpus-Specific Word Senses"

Discovering Corpus-Specific Word Senses

Beate Dorow
Institut für Maschinelle Sprachverarbeitung
Universität Stuttgart, Germany
beate.dorow@ims.uni-stuttgart.de

Dominic Widdows
Center for the Study of Language and Information
Stanford University, California
dwiddows@csli.Stanford.edu

Abstract

This paper presents an unsupervised algorithm which automatically discovers word senses from text. The algorithm is based on a graph model representing words and relationships between them. Sense clusters are iteratively computed by clustering the local graph of similar words around an ambiguous word. Discrimination against previously extracted sense clusters enables us to discover new senses. We use the same data for both recognising and resolving ambiguity.

1 Introduction

This paper describes an algorithm which automatically discovers word senses from free text and maps them to the appropriate entries of existing dictionaries or taxonomies. Automatic word sense discovery has applications of many kinds. It can greatly facilitate a lexicographer's work and can be used to automatically construct corpus-based taxonomies or to tune existing ones. The same corpus evidence which supports a clustering of an ambiguous word into distinct senses can be used to decide which sense is referred to in a given context (Schütze, 1998).

This paper is organised as follows. In section 2 we present the graph model from which we discover word senses. Section 3 describes the way we divide graphs surrounding ambiguous words into different areas corresponding to different senses, using Markov clustering (van Dongen, 2000). The quality of the Markov clustering depends strongly on several parameters such as a granularity factor and the size of the local graph. In section 4 we outline a word sense discovery algorithm which bypasses the problem of parameter tuning. We conducted a pilot experiment to examine the performance of our algorithm on a set of words with varying degree of ambiguity. Section 5 describes
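
To make the idea of Markov clustering on a local word graph more concrete, here is a minimal sketch in Python. It is not the authors' implementation: the function name markov_cluster, the parameters inflation, iterations and tol, and the toy graph of words are all illustrative assumptions. The inflation exponent is the usual knob for cluster granularity in Markov clustering; whether it corresponds exactly to the granularity factor mentioned above is also an assumption. The toy graph simply assumes that the neighbours of an ambiguous word fall into two unconnected groups.

```python
from typing import Dict, FrozenSet, List, Set


def normalise_columns(m: List[List[float]]) -> List[List[float]]:
    """Scale every column so that it sums to 1 (column-stochastic matrix)."""
    n = len(m)
    sums = [sum(m[i][j] for i in range(n)) for j in range(n)]
    return [[m[i][j] / sums[j] if sums[j] > 0 else 0.0 for j in range(n)]
            for i in range(n)]


def multiply(a: List[List[float]], b: List[List[float]]) -> List[List[float]]:
    """Plain dense matrix product, adequate for small local graphs."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def markov_cluster(adjacency: Dict[str, Set[str]],
                   inflation: float = 2.0,
                   iterations: int = 50,
                   tol: float = 1e-6) -> List[FrozenSet[str]]:
    """Cluster an undirected word graph with a basic Markov clustering loop.

    Hypothetical helper for illustration; parameter names are assumptions.
    """
    # Collect every word that appears as a key or as a neighbour.
    nodes = sorted(set(adjacency) | {v for nbrs in adjacency.values() for v in nbrs})
    index = {w: i for i, w in enumerate(nodes)}
    n = len(nodes)

    # Symmetric adjacency matrix with self-loops.
    m = [[0.0] * n for _ in range(n)]
    for i in range(n):
        m[i][i] = 1.0
    for w, nbrs in adjacency.items():
        for v in nbrs:
            m[index[w]][index[v]] = 1.0
            m[index[v]][index[w]] = 1.0
    m = normalise_columns(m)

    for _ in range(iterations):
        expanded = multiply(m, m)  # expansion: two-step random walk
        inflated = normalise_columns(
            [[x ** inflation for x in row] for row in expanded]
        )  # inflation: strengthen strong flows, weaken weak ones
        delta = max((abs(inflated[i][j] - m[i][j])
                     for i in range(n) for j in range(n)), default=0.0)
        m = inflated
        if delta < tol:
            break

    # Each row that still carries flow spans one cluster; drop duplicates.
    clusters: List[FrozenSet[str]] = []
    for row in m:
        members = frozenset(nodes[j] for j, x in enumerate(row) if x > 1e-3)
        if members and members not in clusters:
            clusters.append(members)
    return clusters


# Toy local graph: assumed neighbours of an ambiguous word such as "mouse".
graph = {
    "cat": {"dog", "rat"},
    "dog": {"cat", "rat"},
    "rat": {"cat", "dog"},
    "keyboard": {"monitor", "screen"},
    "monitor": {"keyboard", "screen"},
    "screen": {"keyboard", "monitor"},
}
clusters = markov_cluster(graph)
print(sorted(sorted(c) for c in clusters))
```

Raising inflation generally yields finer clusters and lowering it yields coarser ones, which is one way to see why clustering quality can depend on such parameters.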
