Scientific report: "Automatic Detection of Poor Speech Recognition at the Dialogue Level"

Diane J. Litman, Marilyn A. Walker, and Michael S. Kearns
AT&T Labs Research
180 Park Ave, Bldg 103
Florham Park 07932
{diane,walker,mkearns}@

Abstract

The dialogue strategies used by a spoken dialogue system strongly influence performance and user satisfaction. An ideal system would not use a single fixed strategy, but would adapt to the circumstances at hand. To do so, a system must be able to identify dialogue properties that suggest adaptation. This paper focuses on identifying situations where the speech recognizer is performing poorly. We adopt a machine learning approach to learn rules from a dialogue corpus for identifying these situations. Our results show a significant improvement over the baseline and illustrate that both lower-level acoustic features and higher-level dialogue features can affect the performance of the learning algorithm.

1 Introduction

Builders of spoken dialogue systems face a number of fundamental design choices that strongly influence both performance and user satisfaction. Examples include choices between user, system, or mixed initiative, and between explicit and implicit confirmation of user commands. An ideal system wouldn't make such choices a priori, but rather would adapt to the circumstances at hand.
For instance, a system detecting that a user is repeatedly uncertain about what to say might move from user to system initiative, and a system detecting that speech recognition performance is poor might switch to a dialogue strategy with more explicit prompting, an explicit confirmation mode, or keyboard input mode. Any of these adaptations might have been appropriate in dialogue D1 from the Annie system (Kamm et al., 1998), shown in Figure 1. In order to improve performance through such adaptation, a system must first be able to identify, in real time, salient properties of an ongoing dialogue that call for some useful change in system strategy. In other words