Scientific report: "Estimating Strictly Piecewise Distributions"

Jeffrey Heinz, University of Delaware, Newark, Delaware, USA, heinz@
James Rogers, Earlham College, Richmond, Indiana, USA, j rogers@

Abstract

Strictly Piecewise (SP) languages are a subclass of regular languages which encode certain kinds of long-distance dependencies that are found in natural languages. Like the classes in the Chomsky and Subregular hierarchies, there are many independently converging characterizations of the SP class (Rogers et al., to appear). Here we define SP distributions and show that they can be efficiently estimated from positive data.

1 Introduction

Long-distance dependencies in natural language are of considerable interest. Although much attention has focused on long-distance dependencies which are beyond the expressive power of models with finitely many states (Chomsky, 1956; Joshi, 1985; Shieber, 1985; Kobele, 2006), there are some long-distance dependencies in natural language which permit finite-state characterizations. For example, although it is well-known that vowel and consonantal harmony applies across any arbitrary number of intervening segments (Ringen, 1988; Bakovic, 2000; Hansson, 2001; Rose and Walker, 2004) and that phonological patterns are regular (Johnson, 1972; Kaplan and Kay, 1994), it is less well-known that harmony patterns are largely characterizable by the Strictly Piecewise languages, a subregular class of languages with independently-motivated, converging characterizations (see Heinz (2007, to appear) and especially Rogers et al. (2009)).

As shown by Rogers et al. (to appear), the Strictly Piecewise (SP) languages, which make distinctions on the basis of (potentially) discontiguous subsequences, are precisely analogous to the Strictly Local (SL) languages (McNaughton and Papert, 1971; Rogers and Pullum, to appear), which make distinctions on the basis of contiguous subsequences. The Strictly Local languages are the formal-language theoretic foundation for n-gram models (Garcia et al., 1990), which are widely
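
The contrast drawn in the introduction between contiguous and potentially discontiguous subsequences can be illustrated with a small sketch. It is not the estimation procedure of the paper. The function names, the fixed length k = 2, and the toy harmony constraint (the segment "s" may not precede "S" at any distance) are all assumptions made only for illustration.

```python
from itertools import combinations


def k_factors(word, k):
    """Contiguous substrings of length k: the units a Strictly Local check inspects."""
    return {word[i:i + k] for i in range(len(word) - k + 1)}


def k_subsequences(word, k):
    """Possibly discontiguous subsequences of length k: the units a
    Strictly Piecewise check inspects. Symbol order is preserved."""
    return {"".join(chars) for chars in combinations(word, k)}


def violates(word, forbidden, extract, k=2):
    """True if any forbidden unit occurs among the units extracted from word."""
    return not forbidden.isdisjoint(extract(word, k))


# Hypothetical toy constraint (an assumption, not taken from the paper):
# "s" may not be followed by "S", no matter how many segments intervene.
FORBIDDEN = {"sS"}
word = "sotaS"

sl_detects = violates(word, FORBIDDEN, k_factors)       # contiguous view
sp_detects = violates(word, FORBIDDEN, k_subsequences)  # discontiguous view
print(sl_detects, sp_detects)
```

In this toy case the forbidden pair is separated by three intervening segments, so no window of two adjacent symbols contains it, while the subsequence view finds it regardless of distance. This is the sense in which a piecewise description can capture a long-distance dependency that a bigram-sized local window misses.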
