Assessing the Costs of Sampling Methods in Active Learning for Annotation

Robbie Haertel, Eric Ringger, Kevin Seppi, James Carroll, Peter McClanahan
Department of Computer Science, Brigham Young University, Provo, UT 84602, USA
robbie_haertel@ ringger@ kseppi@ jlcarroll@ petermcclanahan@

Abstract

Traditional Active Learning (AL) techniques assume that annotating each datum costs the same. This is not the case when annotating sequences: some sequences take longer to annotate than others. We show that the AL technique that performs best depends on how cost is measured. Applying an hourly cost model based on the results of an annotation user study, we approximate the amount of time necessary to annotate a given sentence. This model allows us to evaluate the effectiveness of AL sampling methods in terms of time spent in annotation. We achieve a 77% reduction in annotation hours, relative to a random baseline, in reaching a target tag accuracy on the Penn Treebank. More significantly, we make the case for measuring cost when assessing AL methods.

1 Introduction

Obtaining human annotations for linguistic data is labor intensive and is typically the costliest part of acquiring an annotated corpus. Hence, there is strong motivation to reduce annotation costs, but not at the expense of quality. Active learning (AL) can be employed to reduce the costs of corpus annotation (Engelson and Dagan, 1996; Ringger et al., 2007; Tomanek et al., 2007). With the assistance of AL, the role of the human oracle is either to label a datum or simply to correct the label proposed by an automatic labeler. For the present work, we assume that correction is less costly than annotation from scratch; testing this assumption is the subject of future work. In AL, the learner leverages newly provided annotations to select more informative sentences, which in turn can be used by the automatic labeler to provide more accurate annotations in future iterations. Ideally, this process yields accurate labels with less
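To make the kind of hourly cost model described above concrete, the following Python sketch estimates per-sentence annotation time as a linear function of sentence length and of the number of machine-proposed tags the annotator must correct. The coefficient values are illustrative placeholders, not the figures fit to the user-study data.

    def estimated_cost_hours(num_words, num_corrections,
                             per_word_sec=3.8, per_correction_sec=5.4,
                             overhead_sec=12.6):
        """Estimate the time (in hours) to annotate one sentence.

        Linear timing model: seconds spent grow with sentence length
        and with the number of machine-proposed tags the annotator
        corrects. The coefficients are hypothetical placeholders.
        """
        seconds = (per_word_sec * num_words
                   + per_correction_sec * num_corrections
                   + overhead_sec)
        return seconds / 3600.0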
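Likewise, a minimal sketch of how such a cost model could plug into the AL selection step, assuming a hypothetical uncertainty(sentence) score and a hypothetical expected_corrections(sentence) estimate supplied by the current tagger: candidates are ranked by uncertainty per estimated hour rather than by uncertainty alone.

    def select_batch(candidates, uncertainty, expected_corrections,
                     batch_size=10):
        """Rank unlabeled sentences by uncertainty per estimated hour.

        `uncertainty` and `expected_corrections` stand in for
        quantities produced by the current model; both names are
        hypothetical, not the specific methods compared in the paper.
        """
        def score(sentence):
            hours = estimated_cost_hours(len(sentence),
                                         expected_corrections(sentence))
            return uncertainty(sentence) / hours

        ranked = sorted(candidates, key=score, reverse=True)
        return ranked[:batch_size]

Cost-normalized ranking is only one way to fold time into sampling; the paper's broader point is that the relative ordering of AL methods can change once cost is measured in hours rather than in sentences or words.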