Scientific report: "Corpus-based interpretation of instructions in virtual environments"
Luciana Benotti (1), Martin Villalba (1), Tessa Lau (2), Julian Cerruti (3)
(1) FaMAF, Medina Allende s/n, Universidad Nacional de Cordoba, Cordoba, Argentina
(2) IBM Research - Almaden, 650 Harry Road, San Jose, CA 95120, USA
(3) IBM Argentina, Ing. Butty 275, C1001AFA, Buenos Aires, Argentina
benotti villalba @ tessalau@ jcerruti@

Abstract

Previous approaches to instruction interpretation have required either extensive domain adaptation or manually annotated corpora. This paper presents a novel approach to instruction interpretation that leverages a large amount of unannotated, easy-to-collect data from humans interacting with a virtual world. We compare several algorithms for automatically segmenting and discretizing this data into (utterance, reaction) pairs and training a classifier to predict reactions given the next utterance. Our empirical analysis shows that the best algorithm achieves 70% accuracy on this task, with no manual annotation required.

1 Introduction and motivation

Mapping instructions into automatically executable actions would enable the creation of natural language interfaces to many applications (Lau et al., 2009; Branavan et al., 2009; Orkin and Roy, 2009). In this paper we focus on the task of navigation and manipulation of a virtual environment (Vogel and Jurafsky, 2010; Chen and Mooney, 2011).
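The pipeline summarized in the abstract (segment an interaction log into (utterance, reaction) pairs, then predict a reaction for a new utterance) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the log format, the action names, the helper names `segment` and `predict`, and the nearest-neighbour classifier based on word overlap. It is not the segmentation or classification algorithm evaluated in the paper.

```python
# Hypothetical interaction log: utterances from the instruction giver
# interleaved with the actions the follower performed in the virtual world.
log = [
    ("utterance", "go through the door"),
    ("action", "move(door)"),
    ("utterance", "press the red button"),
    ("action", "move(red_button)"),
    ("action", "press(red_button)"),
    ("utterance", "ok"),  # no action follows, so no pair is produced
    ("utterance", "turn left"),
    ("action", "turn(left)"),
]


def segment(events):
    """Pair each utterance with the actions observed before the next utterance."""
    pairs = []
    utterance, actions = None, []
    for kind, payload in events:
        if kind == "utterance":
            # Close the previous segment if it produced a reaction.
            if utterance is not None and actions:
                pairs.append((utterance, tuple(actions)))
            utterance, actions = payload, []
        elif utterance is not None:
            actions.append(payload)
    if utterance is not None and actions:
        pairs.append((utterance, tuple(actions)))
    return pairs


def tokens(text):
    """Lowercased bag of words, used as a crude utterance representation."""
    return set(text.lower().split())


def jaccard(a, b):
    """Word-overlap similarity between two token sets."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def predict(pairs, utterance):
    """Return the reaction of the most similar training utterance."""
    query = tokens(utterance)
    best = max(pairs, key=lambda pair: jaccard(query, tokens(pair[0])))
    return best[1]


pairs = segment(log)
print(predict(pairs, "push the red button"))
```

In this sketch the reaction is simply the tuple of actions observed after an utterance; a real system would need a principled way to decide where a reaction ends and how to discretize it.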
Current symbolic approaches to the problem are brittle to the natural language variation present in instructions and require intensive rule authoring to be fit for a new task (Dzikovska et al., 2008). Current statistical approaches require extensive manual annotations of the corpora used for training (MacMahon et al., 2006; Matuszek et al., 2010; Gorniak and Roy, 2007; Rieser and Lemon, 2010). Manual annotation and rule authoring by natural language engineering experts are bottlenecks for developing conversational systems for new domains.

This paper proposes a fully automated approach to interpreting