Scientific report: "Robust Approach to Abbreviating Terms: A Discriminative Latent Variable Model with Global Information"
Xu Sun, Naoaki Okazaki, Jun'ichi Tsujii
Department of Computer Science, University of Tokyo, Hongo 7-3-1, Bunkyo-ku, Tokyo 113-0033, Japan
School of Computer Science, University of Manchester, UK
National Centre for Text Mining, UK
{sunxu, okazaki, tsujii}@

Abstract

The present paper describes a robust approach for abbreviating terms. First, in order to incorporate non-local information into abbreviation generation tasks, we present both an implicit and an explicit solution: the latent variable model and, alternatively, the label encoding approach with global information. Although the two approaches compete with one another, we demonstrate that they are also complementary. Experiments revealed that an abbreviation generator combining the two approaches achieved the best results for both Chinese and English. Moreover, we directly apply our generator to a task very different from the traditional one: abbreviation recognition. Experiments revealed that the proposed model worked robustly and outperformed five out of six state-of-the-art abbreviation recognizers.

1 Introduction

Abbreviations represent fully expanded forms (e.g., hidden Markov model) through the use of shortened forms (e.g., HMM). At the same time, abbreviations increase the ambiguity in a text.
For example, in computational linguistics, the acronym HMM stands for hidden Markov model, whereas in the field of biochemistry, HMM is generally an abbreviation for heavy meromyosin. Associating abbreviations with their fully expanded forms is of great importance in various NLP applications (Pakhomov, 2002; Yu et al., 2006; HaCohen-Kerner et al., 2008). The core technology for abbreviation disambiguation is recognizing abbreviation definitions in the actual text. Chang and Schütze (2006) reported that 64,242 new abbreviations were introduced into the biomedical literature in 2004. As such ...