
A MARKOV LANGUAGE LEARNING MODEL FOR FINITE PARAMETER SPACES

Partha Niyogi and Robert C. Berwick
Center for Biological and Computational Learning
Massachusetts Institute of Technology
E25-201, Cambridge, MA 02139, USA
Internet: pn@ berwick@

Abstract

This paper shows how to formally characterize language learning in a finite parameter space as a Markov structure. Important new language learning results follow directly: explicitly calculated sample complexity learning times under different input distribution assumptions (including CHILDES database language input) and learning regimes. We also briefly describe a new way to formally model (rapid) diachronic syntax change.

BACKGROUND MOTIVATION: TRIGGERS AND LANGUAGE ACQUISITION

Recently, several researchers, including Gibson and Wexler (1994, henceforth GW), Dresher and Kaye (1990), and Clark and Roberts (1993), have modeled language learning in a finite space whose grammars are characterized by a finite number of parameters, or n-length Boolean-valued vectors. Many current linguistic theories now employ such parametric models, explicitly or in spirit, including Lexical-Functional Grammar and versions of HPSG, besides GB variants.

With all such models, key questions about sample complexity, convergence time, and alternative modeling assumptions are difficult to assess without a precise mathematical formalization. Previous research has usually addressed only the question of convergence in the limit, without probing the equally important question of sample complexity: it is of little use that a learner can acquire a language if the sample complexity is extraordinarily high, hence psychologically implausible. This remains a relatively undeveloped area of language learning theory. The current paper aims to fill that gap.

We choose as a starting point the GW Triggering Learning Algorithm (TLA). Our central result is that the performance of this algorithm, and others like it, is completely modeled by a Markov chain. We explore the basic ...
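To make the Markov-chain construction concrete, the following is a minimal sketch (not the authors' code, and not GW's actual 3-parameter space) of how a TLA-style learner over n Boolean parameters induces a Markov chain on the 2^n grammar states. The toy languages, the uniform sentence distribution, and the function names are illustrative assumptions only.

```python
# Minimal sketch (illustrative assumptions only) of a TLA-style learner and the
# Markov transition matrix it induces over grammar states (Boolean parameter vectors).
import random

def tla_step(hypothesis, sentence, languages):
    """One TLA-style update: single-value constraint (flip one randomly chosen
    parameter) plus greediness (keep the flip only if the new grammar parses
    the sentence that the current hypothesis failed on)."""
    if sentence in languages[hypothesis]:
        return hypothesis                      # sentence analyzable: no change
    i = random.randrange(len(hypothesis))      # pick one parameter at random
    flipped = hypothesis[:i] + (1 - hypothesis[i],) + hypothesis[i + 1:]
    return flipped if sentence in languages[flipped] else hypothesis

def transition_matrix(languages, target):
    """Exact transition probabilities between grammar states, assuming sentences
    are drawn uniformly from the target language."""
    states = list(languages)
    n = len(next(iter(states)))
    target_sents = sorted(languages[target])
    p_sent = 1.0 / len(target_sents)
    P = {s: {t: 0.0 for t in states} for s in states}
    for s in states:
        for sent in target_sents:
            if sent in languages[s]:
                P[s][s] += p_sent              # no parsing failure: learner stays put
                continue
            for i in range(n):                 # each parameter flipped with prob. 1/n
                f = s[:i] + (1 - s[i],) + s[i + 1:]
                dest = f if sent in languages[f] else s
                P[s][dest] += p_sent / n
    return P

# Toy example: 2 parameters, hence 4 grammars; the "languages" (sets of sentence
# ids) are made up purely to illustrate the construction.
langs = {(0, 0): {1, 2}, (0, 1): {1, 3}, (1, 0): {2, 4}, (1, 1): {1, 2, 3, 4}}
P = transition_matrix(langs, target=(1, 1))
for s, row in P.items():
    print(s, {t: round(p, 2) for t, p in row.items() if p > 0})
```

Under these assumptions, the chain's absorption behavior and expected hitting time to the target state are what yield the explicitly calculated convergence and sample-complexity results discussed above.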