tailieunhanh - manning schuetze statisticalnlp part 5

We are scanning a text for a particular type of event, such as a natural disaster, a terrorist attack, or a corporate acquisition. The task is to identify the participants in the event and other information typical of such an event (for example, the purchase price in a corporate merger).

256   7 Word Sense Disambiguation

Word     Sense                    Accuracy
                                  μ     σ
suit     lawsuit                  95    0
         the suit you wear        96    0
motion   physical movement        85    1
         proposal for action      88    13
train    line of railroad cars    79    19
         to teach                 55    31

Table: Some results of unsupervised disambiguation. The table shows the mean μ and standard deviation σ for ten experiments with different initial conditions for the EM algorithm. Data are from Schütze (1998: 110).

... collocations are hard to isolate in unsupervised disambiguation. Senses like the use of "suit" in the sense "to be appropriate for", as in "This suits me fine", are unlikely to be discovered. However, such hard-to-identify senses often carry less content than senses that are tied to a particular subject area. For an information retrieval system, it is probably more important to make the distinction between usage types like "civil suit" vs. "criminal suit" than to isolate the verbal sense "to suit".

Some results of unsupervised disambiguation are shown in the table. We need to take into account the variability that is due to different initializations here (Step 1 in the figure). The table shows both the average accuracy and the standard deviation over ten trials. For senses with a clear correspondence to a particular topic, the algorithm works well and variability is low. The word "suit" is an example. But the algorithm fails for words whose senses are topic-independent, such as "to teach" for "train"; this failure is not unlike that of other methods that work with topic information only. In addition to the low average performance, variability is also quite high for topic-independent senses.
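The evaluation protocol described above (run EM from several random initializations, then report the mean and standard deviation of accuracy) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the text: the toy word lists and contexts, the function names `em_cluster`, `accuracy`, and `make_toy_data`, the naive Bayes mixture model, the smoothing constant, the majority-sense mapping used for scoring, and the use of the population standard deviation. It does not reproduce the experiments behind the table.

```python
import math
import random
import statistics
from collections import Counter

# Hypothetical topic words for two senses of "suit" (illustration only).
LEGAL = ["court", "judge", "filed", "lawyer"]
CLOTHES = ["wear", "jacket", "tie", "tailor"]


def make_toy_data(seed=0, n_per_sense=20, ctx_len=3):
    """Build toy bag-of-words contexts with known (gold) senses."""
    rng = random.Random(seed)
    contexts, gold = [], []
    for sense, words in (("lawsuit", LEGAL), ("clothing", CLOTHES)):
        for _ in range(n_per_sense):
            contexts.append([rng.choice(words) for _ in range(ctx_len)])
            gold.append(sense)
    return contexts, gold


def em_cluster(contexts, vocab, k, seed, iters=20, smooth=0.1):
    """Cluster contexts into k groups with EM on a naive Bayes mixture."""
    rng = random.Random(seed)
    # Random initialization: the source of run-to-run variability.
    prior = [1.0 / k] * k
    word_prob = []
    for _ in range(k):
        raw = {w: rng.random() + 0.5 for w in vocab}
        total = sum(raw.values())
        word_prob.append({w: v / total for w, v in raw.items()})
    posts = []
    for _ in range(iters):
        # E-step: posterior P(cluster | context), computed in log space.
        posts = []
        for ctx in contexts:
            logs = [
                math.log(prior[j]) + sum(math.log(word_prob[j][w]) for w in ctx)
                for j in range(k)
            ]
            top = max(logs)
            weights = [math.exp(v - top) for v in logs]
            z = sum(weights)
            posts.append([w / z for w in weights])
        # M-step: re-estimate priors and word probabilities with smoothing.
        for j in range(k):
            mass = sum(p[j] for p in posts)
            prior[j] = (mass + smooth) / (len(contexts) + k * smooth)
            counts = Counter()
            for ctx, p in zip(contexts, posts):
                for w in ctx:
                    counts[w] += p[j]
            total = sum(counts.values()) + smooth * len(vocab)
            word_prob[j] = {w: (counts[w] + smooth) / total for w in vocab}
    # Hard assignment: most probable cluster for each context.
    return [max(range(k), key=lambda j: p[j]) for p in posts]


def accuracy(assignments, gold, k):
    """Score clusters by mapping each one to its majority gold sense."""
    correct = 0
    for j in range(k):
        members = [g for a, g in zip(assignments, gold) if a == j]
        if members:
            correct += Counter(members).most_common(1)[0][1]
    return correct / len(gold)


contexts, gold = make_toy_data()
vocab = LEGAL + CLOTHES
# Ten trials that differ only in the random initialization.
accs = [accuracy(em_cluster(contexts, vocab, 2, seed), gold, 2) for seed in range(10)]
mean_acc = statistics.mean(accs)
std_acc = statistics.pstdev(accs)  # population std; an assumption
print(f"mean accuracy = {mean_acc:.2f}, std = {std_acc:.2f}")
```

Because the two toy senses share no context words, this data corresponds to the easy, topic-dependent case. Mixing shared words into both word lists would be one way to imitate topic-independent senses and observe how the spread across seeds changes.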
In general, performance is 5% to 10% lower than that of some of the dictionary-based algorithms, as one would expect given that no lexical resources for training or defining senses are used.

What Is a Word Sense?

Now that we have looked at a wide range of different approaches to word sense disambiguation, let us revisit the question of what precisely a word sense is. It would
