The Software Architecture for the First Challenge on Generating Instructions in Virtual Environments
Alexander Koller, Saarland University, koller@
Donna Byron, Northeastern University, dbyron@
Justine Cassell, Northwestern University, justine@
Robert Dale, Macquarie University
Johanna Moore, University of Edinburgh
Jon Oberlander, University of Edinburgh
Kristina Striegnitz, Union College, striegnk@

Abstract

The GIVE Challenge is a new Internet-based evaluation effort for natural language generation systems. In this paper, we motivate and describe the software infrastructure that we developed to support this challenge.

1 Introduction

Natural language generation (NLG) systems are notoriously hard to evaluate. On the one hand, simply comparing system outputs to a gold standard is not appropriate because there can be multiple generated outputs that are equally good, and finding metrics that account for this variability and produce results consistent with human judgments and task performance measures is difficult (Belz and Gatt, 2008; Stent et al., 2005; Foster, 2008). On the other hand, lab-based evaluations with human subjects to assess each aspect of the system's functionality are expensive and time-consuming. These characteristics make it hard to compare different systems and measure progress.

GIVE ("Generating Instructions in Virtual Environments"; Koller et al., 2007) is a research challenge for the NLG community designed to provide a new approach to NLG system evaluation. In the GIVE scenario, users try to solve a treasure hunt in a virtual 3D world that they have not seen before. The computer has a complete symbolic representation of the virtual environment. The challenge for the NLG system is to generate, in real time, natural-language instructions that will guide the users to the successful completion of their task (see Fig. 1). One crucial advantage of this generation task is