Scientific report: "Weakly Supervised Learning for Hedge Classification in Scientific Literature"
Ben Medlock
Computer Laboratory, University of Cambridge, Cambridge CB3 0FD
benmedlock@

Ted Briscoe
Computer Laboratory, University of Cambridge, Cambridge CB3 0FD
ejb@

Abstract

We investigate automatic classification of speculative language ("hedging") in biomedical text using weakly supervised machine learning. Our contributions include a precise description of the task with annotation guidelines, analysis and discussion, a probabilistic weakly supervised learning model, and experimental evaluation of the methods presented. We show that hedge classification is feasible using weakly supervised ML, and point toward avenues for future research.

1 Introduction

The automatic processing of scientific papers using NLP and machine learning (ML) techniques is an increasingly important aspect of technical informatics. In the quest for a deeper machine-driven "understanding" of the mass of scientific literature, a frequently occurring linguistic phenomenon that must be accounted for is the use of hedging to denote propositions of a speculative nature. Consider the following:

1. Our results prove that XfK89 inhibits Felin-9.
2. Our results suggest that XfK89 might inhibit Felin-9.

The second example contains a hedge, signaled by the use of "suggest" and "might", which renders the proposition inhibit(XfK89→Felin-9) speculative.
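The contrast between the two example sentences can be made concrete with a naive cue-word check. The sketch below is only an illustration of what a hedge cue is: the cue list `HEDGE_CUES` and the function names are hypothetical, and simple keyword matching is not the probabilistic weakly supervised model that the paper proposes.

```python
import re

# Hypothetical cue list for illustration only; it is not taken from the paper.
HEDGE_CUES = {"suggest", "suggests", "might", "may", "possibly"}


def hedge_cues(sentence):
    """Return the hedge cues found in a sentence, in order of appearance."""
    tokens = re.findall(r"[a-z]+", sentence.lower())
    return [token for token in tokens if token in HEDGE_CUES]


def is_speculative(sentence):
    """Label a sentence speculative if it contains at least one cue."""
    return bool(hedge_cues(sentence))


certain = "Our results prove that XfK89 inhibits Felin-9."
hedged = "Our results suggest that XfK89 might inhibit Felin-9."
```

Under these assumptions, `is_speculative(hedged)` is true because of the cues "suggest" and "might", while `is_speculative(certain)` is false. A fixed list like this misses unseen cues and ignores context, which is part of the motivation for learning the classification from data instead.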
Such analysis would be useful in various applications; for instance, consider a system designed to identify and extract interactions between genetic entities in the biomedical domain. Case 1 above provides clear textual evidence of such an interaction and justifies extraction of inhibit(XfK89→Felin-9), whereas case 2 provides only weak evidence for such an interaction. Hedging occurs across the entire spectrum of scientific literature, though it is particularly common in the experimental natural sciences. In this study we consider the problem of learning to automatically classify .