Scientific report: "A SPEECH-FIRST MODEL FOR REPAIR DETECTION AND CORRECTION"

Christine Nakatani
Division of Applied Sciences, Harvard University, Cambridge, MA 02138
chn@

Julia Hirschberg
2D-450, AT&T Bell Laboratories, 600 Mountain Avenue, Murray Hill, NJ 07974-0636
julia@

Abstract

Interpreting fully natural speech is an important goal for spoken language understanding systems. However, while corpus studies have shown that about 10% of spontaneous utterances contain self-corrections, or REPAIRS, little is known about the extent to which cues in the speech signal may facilitate repair processing. We identify several cues based on acoustic and prosodic analysis of repairs in a corpus of spontaneous speech, and propose methods for exploiting these cues to detect and correct repairs. We test our acoustic-prosodic cues with other lexical cues to repair identification and find that precision rates of 89-93% and recall of 78-83% can be achieved, depending upon the cues employed, from a prosodically labeled corpus.

Introduction

Disfluencies in spontaneous speech pose serious problems for spoken language systems. First, a speaker may produce a partial word, or FRAGMENT, a string of phonemes that does not form the complete intended word.
Some fragments may coincidentally match words actually in the lexicon, such as fly in Example 1; others will be identified with the acoustically closest item(s) in the lexicon, as in Example 2. [1]

(1) What is the earliest fli- flight from Washington to Atlanta leaving on Wednesday September fourth

(2) Actual string: What is the fare fro- on American Airlines fourteen forty three
    Recognized string: With fare four American Airlines fourteen forty three

Even if all words in a disfluent segment are correctly recognized, failure to detect a disfluency may lead to interpretation errors during subsequent processing, as in Example 3.

[1] The presence of a word fragment in examples is indicated by the diacritic "-". Self-corrected portions of the utterance appear in boldface.
