Scientific report: "Learning More Effective Dialogue Strategies Using Limited Dialogue Move Features"

Matthew Frampton and Oliver Lemon
HCRC, School of Informatics, University of Edinburgh, Edinburgh EH8 9LW, UK
olemon@

Abstract

We explore the use of restricted dialogue contexts in reinforcement learning (RL) of effective dialogue strategies for information-seeking spoken dialogue systems (e.g. COMMUNICATOR (Walker et al., 2001)). The contexts we use are richer than those used in previous research in this area, e.g. (Levin and Pieraccini, 1997; Scheffler and Young, 2001; Singh et al., 2002; Pietquin, 2004), which uses only slot-based information, but are much less complex than the full dialogue "Information States" explored in (Henderson et al., 2005), for which tractable learning is an issue.

We explore how incrementally adding richer features allows learning of more effective dialogue strategies. We use 2 user simulations learned from COMMUNICATOR data (Walker et al., 2001; Georgila et al., 2005b) to explore the effects of different features on learned dialogue strategies. Our results show that adding the dialogue moves of the last system and user turns increases the average reward of the automatically learned strategies by [...] over the original hand-coded COMMUNICATOR systems and by [...] over a baseline RL policy that uses only slot-status features. We show that the learned strategies exhibit an emergent "focus switching" strategy and effective use of the "give help" action.
1 Introduction

Reinforcement Learning (RL) applied to the problem of dialogue management attempts to find optimal mappings from dialogue contexts to system actions. The idea of using Markov Decision Processes (MDPs) and reinforcement learning to design dialogue strategies for dialogue systems was first proposed by Levin and Pieraccini (1997). There, and in subsequent work such as (Singh et al., 2002; Pietquin, 2004; Scheffler and Young, 2001), only very limited state information was used in strategy learning, based always on the [...].
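
To make the idea of a mapping from dialogue contexts to system actions more concrete, the following is a minimal sketch, not the authors' implementation. It contrasts a slot-status-only context with one that also records the dialogue moves of the last system and user turns, and it learns action values with tabular Q-learning. The text above does not say which RL algorithm or which feature encoding was used, so Q-learning is swapped in purely for illustration, and every slot name, dialogue-move label, action name and hyperparameter below is hypothetical.

```python
import random
from collections import defaultdict
from typing import NamedTuple, Tuple

# All names below are hypothetical illustrations; they are not taken from
# the COMMUNICATOR systems or from the paper.
SLOTS = ("origin", "destination", "date")
ACTIONS = ("ask_slot", "confirm_slot", "give_help")


class SlotOnlyState(NamedTuple):
    """Baseline context: only the status of each slot."""
    slot_status: Tuple[str, ...]  # e.g. "empty", "filled", "confirmed"


class MoveState(NamedTuple):
    """Richer context: slot status plus the last system and user moves."""
    slot_status: Tuple[str, ...]
    last_system_move: str
    last_user_move: str


class QLearner:
    """Tabular Q-learning over hashable dialogue contexts (an assumption)."""

    def __init__(self, actions, alpha=0.1, gamma=0.95, epsilon=0.1, seed=0):
        self.q = defaultdict(float)  # (state, action) -> estimated value
        self.actions = tuple(actions)
        self.alpha, self.gamma, self.epsilon = alpha, gamma, epsilon
        self.rng = random.Random(seed)

    def best_action(self, state):
        # Greedy policy: the learned mapping from context to system action.
        return max(self.actions, key=lambda a: self.q[(state, a)])

    def choose(self, state):
        # Epsilon-greedy exploration during training.
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.actions)
        return self.best_action(state)

    def update(self, state, action, reward, next_state, done):
        # One-step Q-learning backup toward reward + discounted best value.
        target = reward
        if not done:
            target += self.gamma * max(
                self.q[(next_state, a)] for a in self.actions
            )
        self.q[(state, action)] += self.alpha * (target - self.q[(state, action)])


# Two contexts with identical slot status but different last user moves.
a = MoveState(("filled", "empty", "empty"), "ask_slot", "provide_info")
b = MoveState(("filled", "empty", "empty"), "ask_slot", "request_help")

learner = QLearner(ACTIONS)
learner.update(b, "give_help", reward=1.0, next_state=a, done=False)
```

In this sketch the two contexts `a` and `b` are distinct under `MoveState`, so the learner can prefer `give_help` in one and not the other, whereas under `SlotOnlyState` they collapse into a single state and must share one action. That distinction is the kind of extra information the dialogue-move features are meant to supply.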