Scientific report: "Reinforcement Learning for Mapping Instructions to Actions"

Branavan, Harr Chen, Luke S. Zettlemoyer, Regina Barzilay
Computer Science and Artificial Intelligence Laboratory
Massachusetts Institute of Technology
{branavan, harr, lsz, regina}@

Abstract

In this paper, we present a reinforcement learning approach for mapping natural language instructions to sequences of executable actions. We assume access to a reward function that defines the quality of the executed actions. During training, the learner repeatedly constructs action sequences for a set of documents, executes those actions, and observes the resulting reward. We use a policy gradient algorithm to estimate the parameters of a log-linear model for action selection. We apply our method to interpret instructions in two domains: Windows troubleshooting guides and game tutorials. Our results demonstrate that this method can rival supervised learning techniques while requiring few or no annotated training examples. [1]

1 Introduction

The problem of interpreting instructions written in natural language has been widely studied since the early days of artificial intelligence (Winograd, 1972; Di Eugenio, 1992). Mapping instructions to a sequence of executable actions would enable the automation of tasks that currently require human participation. Examples include configuring software based on how-to guides and operating simulators using instruction manuals. In this paper, we present a reinforcement learning framework for inducing mappings from text to actions without the need for annotated training examples.

For concreteness, consider instructions from a Windows troubleshooting guide on deleting temporary folders, shown in Figure 1. We aim to map

[1] Code, data, and annotations used in this work are available at http rbg code rl

o Click start, point to search, and then click for files or folders.
o In the search results dialog box, on the tools menu, click folder options.
o In the folder options dialog .
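
The training procedure summarized in the abstract (sample an action sequence from a log-linear policy, execute it, observe the reward, take a policy gradient step) can be illustrated with a minimal Python sketch. This is not the authors' implementation. The indicator feature map, the one-action-per-word toy task, the binary reward, and every name below (features, action_probs, toy_reward, ACTIONS, DOCS, GOLD) are assumptions made for illustration only.

```python
import math
import random


def features(word, action):
    """Hypothetical feature map phi(word, action): one indicator per pair."""
    return {(word, action): 1.0}


def action_probs(theta, word, actions):
    """Log-linear policy: p(a | word) is proportional to exp(theta . phi(word, a))."""
    scores = [
        sum(theta.get(k, 0.0) * v for k, v in features(word, a).items())
        for a in actions
    ]
    top = max(scores)  # subtract the max score for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def sample_actions(theta, words, actions, rng):
    """Construct an action sequence by sampling one action per instruction word."""
    return [
        rng.choices(actions, weights=action_probs(theta, w, actions))[0]
        for w in words
    ]


def policy_gradient_update(theta, words, actions, chosen, reward, lr):
    """One REINFORCE-style step: theta += lr * reward * grad log p(chosen)."""
    grad = {}
    for w, a in zip(words, chosen):
        probs = action_probs(theta, w, actions)
        # grad log p(a | w) = phi(w, a) - sum over a' of p(a' | w) * phi(w, a')
        for k, v in features(w, a).items():
            grad[k] = grad.get(k, 0.0) + v
        for other, p in zip(actions, probs):
            for k, v in features(w, other).items():
                grad[k] = grad.get(k, 0.0) - p * v
    # Apply the accumulated gradient only after the whole sequence is scored.
    for k, g in grad.items():
        theta[k] = theta.get(k, 0.0) + lr * reward * g


def train(documents, actions, reward_fn, episodes=500, lr=0.5, seed=0):
    """Repeatedly sample action sequences, observe rewards, update theta."""
    rng = random.Random(seed)
    theta = {}
    for _ in range(episodes):
        for words in documents:
            chosen = sample_actions(theta, words, actions, rng)
            reward = reward_fn(words, chosen)  # stands in for executing the actions
            policy_gradient_update(theta, words, actions, chosen, reward, lr)
    return theta


def greedy_decode(theta, words, actions):
    """Pick the most probable action for each word under the learned policy."""
    decoded = []
    for w in words:
        probs = action_probs(theta, w, actions)
        decoded.append(actions[probs.index(max(probs))])
    return decoded


# Toy task (assumption): each instruction word has one hidden correct action.
ACTIONS = ["click_folder_options", "click_search", "click_start"]
DOCS = [["start", "search", "options"]]
GOLD = {"start": "click_start", "search": "click_search",
        "options": "click_folder_options"}


def toy_reward(words, chosen):
    """Binary reward: 1.0 only if the whole sequence reaches the goal."""
    return 1.0 if all(GOLD[w] == a for w, a in zip(words, chosen)) else 0.0


theta = train(DOCS, ACTIONS, toy_reward)
print(greedy_decode(theta, DOCS[0], ACTIONS))
```

The learner never sees GOLD directly. It only observes the scalar returned by toy_reward, which mirrors the setting of learning from a reward signal instead of annotated action sequences. With a binary reward, parameters change only after a fully successful sequence. A graded reward or a baseline term would likely be needed for longer documents.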
