Scientific report: "Bootstrapping a Unified Model of Lexical and Phonetic Acquisition"

Micha Elsner (melsner0@), ILCC, School of Informatics, University of Edinburgh, Edinburgh EH8 9AB, UK
Sharon Goldwater (sgwater@), ILCC, School of Informatics, University of Edinburgh, Edinburgh EH8 9AB, UK
Jacob Eisenstein (jacobe@), School of Interactive Computing, Georgia Institute of Technology, Atlanta, GA 30308, USA

Abstract

During early language acquisition, infants must learn both a lexicon and a model of phonetics that explains how lexical items can vary in pronunciation; for instance, "the" might be realized as [ði] or [ðə]. Previous models of acquisition have generally tackled these problems in isolation, yet behavioral evidence suggests infants acquire lexical and phonetic knowledge simultaneously. We present a Bayesian model that clusters together phonetic variants of the same lexical item while learning both a language model over lexical items and a log-linear model of pronunciation variability based on articulatory features. The model is trained on transcribed surface pronunciations and learns by bootstrapping, without access to the true lexicon. We test the model using a corpus of child-directed speech with realistic phonetic variation and either gold-standard or automatically induced word boundaries. In both cases, modeling variability improves the accuracy of the learned lexicon over a system that assumes each lexical item has a unique pronunciation.

1 Introduction

Infants acquiring their first language confront two difficult cognitive problems: building a lexicon of word forms and learning basic phonetics and phonology.
The two tasks are closely related: knowing what sounds can substitute for one another helps in clustering together variant pronunciations of the same word, while knowing the environments in which particular words can occur helps determine which sound changes are meaningful and which are not (Feldman).

[Figure: (a) intended: juwantwAn wantekuki; (b) surface: jo wa WAn wan a kuki; (c) ...]
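
The idea of a log-linear model of pronunciation variability over articulatory features can be illustrated with a minimal sketch. This is not the model from the paper: the segment inventory, the feature table `FEATURES`, the weights in `WEIGHTS`, and the function names are all hypothetical and chosen only for illustration. The sketch scores each candidate surface segment by the features it shares with the intended segment, then normalizes the exponentiated scores into a distribution.

```python
import math

# Hypothetical articulatory features (manner, place, voicing) for a toy
# inventory. These values are illustrative, not taken from the paper.
FEATURES = {
    "t": ("stop", "alveolar", "voiceless"),
    "d": ("stop", "alveolar", "voiced"),
    "k": ("stop", "velar", "voiceless"),
    "s": ("fricative", "alveolar", "voiceless"),
}

# Hypothetical weights: a bonus for leaving the segment unchanged, plus a
# smaller bonus for each articulatory dimension that is preserved.
WEIGHTS = {
    "same_segment": 3.0,
    "same_manner": 1.0,
    "same_place": 1.0,
    "same_voicing": 0.5,
}

DIMENSIONS = ("same_manner", "same_place", "same_voicing")


def score(intended, surface):
    """Unnormalized log-score of realizing `intended` as `surface`."""
    total = 0.0
    if intended == surface:
        total += WEIGHTS["same_segment"]
    for name, a, b in zip(DIMENSIONS, FEATURES[intended], FEATURES[surface]):
        if a == b:
            total += WEIGHTS[name]
    return total


def surface_distribution(intended):
    """Log-linear distribution P(surface | intended) over the toy inventory."""
    exp_scores = {s: math.exp(score(intended, s)) for s in FEATURES}
    z = sum(exp_scores.values())  # normalizing constant
    return {s: v / z for s, v in exp_scores.items()}


dist_t = surface_distribution("t")
```

Under these assumed weights, the unchanged segment is the most probable realization, and a surface segment that shares more articulatory features with the intended one is preferred over one that shares fewer. In a learned model the weights would be estimated from data rather than set by hand.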
