Scientific report: "Learning to Adapt to Unknown Users: Referring Expression Generation in Spoken Dialogue Systems"
Srinivasan Janarthanam
School of Informatics, University of Edinburgh
s.janarthanam@ed.ac.uk

Oliver Lemon
Interaction Lab, Mathematics and Computer Science (MACS), Heriot-Watt University
o.lemon@hw.ac.uk

Abstract

We present a data-driven approach to learn user-adaptive referring expression generation (REG) policies for spoken dialogue systems. Referring expressions can be difficult to understand in technical domains where users may not know the technical 'jargon' names of the domain entities. In such cases, dialogue systems must be able to model the user's (lexical) domain knowledge and use appropriate referring expressions. We present a reinforcement learning (RL) framework in which the system learns REG policies which can adapt to unknown users online. Furthermore, unlike supervised learning methods, which require a large corpus of expert adaptive behaviour to train on, we show that effective adaptive policies can be learned from a small dialogue corpus of non-adaptive human-machine interaction, by using an RL framework and a statistical user simulation. We show that, in comparison to adaptive hand-coded baseline policies, the learned policy performs significantly better, with an 18.6% average increase in adaptation accuracy. The best learned policy also takes less dialogue time (average 1.07 min less) than the best hand-coded policy. This is because the learned policies can adapt online to changing evidence about the user's domain expertise.
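The idea of learning an REG policy against a user simulation can be illustrated with a toy sketch. It is not the authors' system: the three evidence states, the two-way expert/novice user simulation, the 0/1 reward standing in for adaptation accuracy, and all hyperparameters are assumptions made for illustration, and tabular Q-learning is used only as a stand-in RL algorithm.

```python
import random

ACTIONS = ("jargon", "descriptive")
# Evidence the system has gathered so far about the user (assumed state space).
STATES = ("unknown", "knows", "confused")


def simulate_turn(user_is_expert, state, action):
    """Hypothetical user simulation: return (reward, next_state)."""
    if action == "jargon":
        # Jargon reveals the user's knowledge: experts accept it,
        # novices ask for clarification.
        return (1.0, "knows") if user_is_expert else (0.0, "confused")
    # A descriptive expression suits novices only and yields
    # no new evidence about the user.
    return (0.0 if user_is_expert else 1.0), state


def train(episodes=5000, turns=3, alpha=0.1, gamma=0.5, epsilon=0.2, seed=0):
    """Learn a greedy state -> expression-type policy with tabular Q-learning."""
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in STATES for a in ACTIONS}
    for _ in range(episodes):
        user_is_expert = rng.random() < 0.5  # hidden from the system
        state = "unknown"
        for t in range(turns):
            # Epsilon-greedy action selection.
            if rng.random() < epsilon:
                action = rng.choice(ACTIONS)
            else:
                action = max(ACTIONS, key=lambda a: q[(state, a)])
            reward, next_state = simulate_turn(user_is_expert, state, action)
            # No bootstrapping from the state after the final turn.
            if t == turns - 1:
                future = 0.0
            else:
                future = max(q[(next_state, a)] for a in ACTIONS)
            q[(state, action)] += alpha * (reward + gamma * future - q[(state, action)])
            state = next_state
    return {s: max(ACTIONS, key=lambda a: q[(s, a)]) for s in STATES}


policy = train()
```

Here `policy` maps each evidence state to the expression type the learned policy prefers. Under these assumptions, the sketch shows how evidence gathered during a dialogue (a clarification request, or its absence) can change which kind of referring expression is chosen next.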
1 Introduction

We present a reinforcement learning (Sutton and Barto, 1998) framework to learn user-adaptive referring expression generation policies from data-driven user simulations. A user-adaptive REG policy allows the system to choose appropriate expressions to refer to domain entities in a dialogue.

Jargon: Please plug one end of the broadband cable into the broadband filter.
Descriptive: Please plug one end of the thin white cable with grey ends into the .