Scientific report: "Semi-supervised Learning for Automatic Prosodic Event Detection Using Co-training Algorithm"

Je Hun Jeon and Yang Liu
Computer Science Department, The University of Texas at Dallas, Richardson, TX, USA
jhjeon yangl @

Abstract

Most previous approaches to automatic prosodic event detection are based on supervised learning, relying on the availability of a corpus annotated with the prosodic labels of interest in order to train the classification models. However, creating such resources is an expensive and time-consuming task. In this paper, we exploit semi-supervised learning with the co-training algorithm for automatic detection of coarse-level representations of prosodic events such as pitch accents, intonational phrase boundaries, and break indices. We propose a confidence-based method to assign labels to unlabeled data and demonstrate improved results using this method compared to the widely used agreement-based method. In addition, we examine various informative sample selection methods. In our experiments on the Boston University radio news corpus, using only a small amount of the labeled data as the initial training set, our proposed labeling method combined with most-confident sample selection can effectively use unlabeled data to improve performance and finally reach performance close to that of the supervised method using all the training data.
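To make the labeling and selection ideas in the abstract concrete, the following is a minimal sketch of a co-training loop over two feature views. It is not the authors' implementation: the abstract does not spell out the details, so the sketch assumes that confidence-based labeling means "when the two views disagree, take the label from the more confident classifier", and that agreement-based labeling means "keep only samples on which both classifiers agree". The nearest-centroid classifier is a stand-in chosen to keep the example dependency-free, and all names (`CentroidClassifier`, `co_train`, `per_round`, and so on) are hypothetical.

```python
import math


class CentroidClassifier:
    """Stand-in classifier: nearest class centroid with softmax-style confidences."""

    def fit(self, X, y):
        sums, counts = {}, {}
        for x, label in zip(X, y):
            acc = sums.setdefault(label, [0.0] * len(x))
            for i, value in enumerate(x):
                acc[i] += value
            counts[label] = counts.get(label, 0) + 1
        self.centroids = {c: [s / counts[c] for s in sums[c]] for c in sums}
        return self

    def predict_proba(self, x):
        # Negative squared distance to each centroid, normalized stably.
        neg = {c: -sum((a - b) ** 2 for a, b in zip(x, m))
               for c, m in self.centroids.items()}
        top = max(neg.values())
        exps = {c: math.exp(v - top) for c, v in neg.items()}
        total = sum(exps.values())
        return {c: e / total for c, e in exps.items()}


def best(proba):
    """Return (label, confidence) of the top-scoring class."""
    label = max(proba, key=proba.get)
    return label, proba[label]


def train_views(labeled):
    """Train one classifier per view on (view1, view2, label) triples."""
    y = [label for _, _, label in labeled]
    c1 = CentroidClassifier().fit([v1 for v1, _, _ in labeled], y)
    c2 = CentroidClassifier().fit([v2 for _, v2, _ in labeled], y)
    return c1, c2


def co_train(labeled, unlabeled, rounds=5, per_round=4, labeling="confidence"):
    """Grow the labeled set from a pool of unlabeled (view1, view2) pairs."""
    labeled, pool = list(labeled), list(unlabeled)
    for _ in range(rounds):
        c1, c2 = train_views(labeled)
        candidates = []
        for idx, (v1, v2) in enumerate(pool):
            l1, p1 = best(c1.predict_proba(v1))
            l2, p2 = best(c2.predict_proba(v2))
            if l1 == l2:
                candidates.append((max(p1, p2), idx, l1))
            elif labeling == "confidence":
                # Assumed rule: on disagreement, trust the more confident view.
                candidates.append((p1, idx, l1) if p1 >= p2 else (p2, idx, l2))
            # With labeling == "agreement", disagreeing samples are skipped.
        # Most-confident sample selection: keep the top-scoring candidates.
        candidates.sort(key=lambda cand: cand[0], reverse=True)
        chosen = candidates[:per_round]
        if not chosen:
            break
        for _, idx, label in chosen:
            labeled.append((pool[idx][0], pool[idx][1], label))
        taken = {idx for _, idx, _ in chosen}
        pool = [item for i, item in enumerate(pool) if i not in taken]
    c1, c2 = train_views(labeled)
    return c1, c2, labeled
```

Calling `co_train(seed, pool)` with a few labeled triples and a list of unlabeled view pairs returns the two retrained classifiers and the enlarged labeled set. Passing `labeling="agreement"` switches to the agreement-based baseline, which makes it easy to compare the two labeling rules on the same data.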
1 Introduction

Prosody represents suprasegmental information in speech, since it normally extends over more than one phoneme segment. Prosodic phenomena manifest themselves in speech in different ways, including changes in relative intensity to emphasize specific words or syllables, variations of the fundamental frequency range and contour, and subtle timing variations such as syllable lengthening and insertion of pauses. In spoken utterances, speakers use prosody to convey emphasis, intent, attitude, and emotion. These are important cues that aid the listener in interpreting speech. Prosody also plays an .