Scientific report: "A Unified Syntactic Model for Parsing Fluent and Disfluent Speech∗"

Tim Miller
University of Minnesota
tmill@

William Schuler
University of Minnesota
schuler@

Abstract

This paper describes a syntactic representation for modeling speech repairs. This representation makes use of a right corner transform of syntax trees to produce a tree representation in which speech repairs require very few special syntax rules, making better use of training data. PCFGs trained on syntax trees using this model achieve high accuracy on the standard Switchboard parsing task.

1 Introduction

Speech repairs occur when a speaker makes a mistake and decides to partially retrace an utterance in order to correct it. Speech repairs are common in spontaneous speech: one study found that 30% of dialogue turns contained repairs (Carletta et al., 1993), and another study found one repair every [...] seconds (Blackmer and Mitton, 1991). Because of the relatively high frequency of this phenomenon, spontaneous speech recognition systems will need to be able to deal with repairs to achieve high levels of accuracy.

The speech repair terminology used here follows that of Shriberg (1994). A speech repair consists of a reparandum, an interruption point, and the alteration. The reparandum contains the words that the speaker means to replace, including both words that are in error and words that will be retraced. The interruption point is the point in time where the stream of speech is actually stopped and the repairing of the mistake can begin. The alteration contains the words that are meant to replace the words in the reparandum.

Recent advances in recognizing spontaneous speech with repairs (Hale et al., 2006; Johnson and Charniak, 2004) have used parsing approaches on transcribed speech to account for the structure inherent in speech repairs at the word level and above. One salient .

∗ This research was supported by NSF CAREER award 0447685. The views expressed are not necessarily endorsed by the sponsors.
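The repair terminology introduced in this section (reparandum, interruption point, alteration) can be illustrated with a small data structure. The following is only a minimal sketch, not the paper's parsing model: the class name SpeechRepair, the function fluent_tokens, the use of half-open token offsets, and the example utterance are all assumptions made for illustration. It also assumes the alteration starts right at the interruption point, with no intervening filler words.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class SpeechRepair:
    """One repair, stored as token offsets into an utterance (hypothetical layout)."""
    reparandum: Tuple[int, int]  # half-open span [start, end) of words to be replaced
    alteration: Tuple[int, int]  # half-open span [start, end) of the replacement words

    @property
    def interruption_point(self) -> int:
        # Assumption: the interruption point is the token boundary
        # immediately after the last word of the reparandum.
        return self.reparandum[1]


def fluent_tokens(tokens: List[str], repair: SpeechRepair) -> List[str]:
    """Return the utterance with the reparandum removed and the alteration kept."""
    start, end = repair.reparandum
    return tokens[:start] + tokens[end:]


# Invented example: the speaker says "to Boston", stops, and corrects to "to Denver".
tokens = "a flight to Boston to Denver".split()
repair = SpeechRepair(reparandum=(2, 4), alteration=(4, 6))
cleaned = fluent_tokens(tokens, repair)
```

Here the reparandum covers both the word in error ("Boston") and the retraced word ("to"), matching the definition above, and removing it leaves the utterance the speaker intended.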