tailieunhanh - Scientific report: "A Bayesian Model for Discovering Typological Implications"

A Bayesian Model for Discovering Typological Implications

Hal Daumé III, School of Computing, University of Utah, me@
Lyle Campbell, Department of Linguistics, University of Utah, lcampbel@

Abstract

A standard form of analysis for linguistic typology is the universal implication. These implications state facts about the range of extant languages, such as “if objects come after verbs, then adjectives come after nouns.” Such implications are typically discovered by painstaking hand analysis over a small sample of languages. We propose a computational model for assisting in this process. Our model is able to discover both well-known implications and some novel implications that deserve further study. Moreover, through a careful application of hierarchical analysis, we are able to cope with the well-known sampling problem: languages are not independent.

1 Introduction

Linguistic typology aims to distinguish between logically possible languages and actually observed languages. A fundamental building block for such an understanding is the universal implication (Greenberg, 1963). These are short statements that restrict the space of languages in a concrete way, for instance “object-verb ordering implies adjective-noun ordering.” Croft (2003), Hawkins (1983) and Song (2001) provide excellent introductions to linguistic typology.
We present a statistical model for automatically discovering such implications from a large typological database (Haspelmath et al., 2005). Analyses of universal implications are typically performed by linguists inspecting an array of 30-100 languages and a few pairs of features. Looking at all pairs of features (typically several hundred) is virtually impossible by hand. Moreover, it is insufficient to simply look at counts. For instance, results presented in the form “verb precedes object implies prepositions in 16/19 languages” are inconclusive. While compelling, this is not enough evidence to decide whether this is a statistically well-founded implication.
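To illustrate why a raw count such as 16 out of 19 is hard to interpret on its own, the following minimal sketch computes how likely such a count would be if verb-object order had no influence on adpositions at all. This is not the model proposed in the paper. It is a plain binomial tail computation, the base rates of prepositions are hypothetical values chosen for illustration, and it treats languages as independent, which is precisely the assumption the sampling problem calls into question.

```python
from math import comb


def binomial_tail(k: int, n: int, p: float) -> float:
    """Return P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


# Hypothetical base rates: the fraction of all languages that have
# prepositions, regardless of verb-object order. These numbers are
# assumptions for illustration, not values taken from any database.
for base_rate in (0.5, 0.7, 0.85):
    tail = binomial_tail(16, 19, base_rate)
    print(f"base rate {base_rate:.2f}: P(at least 16 of 19) = {tail:.4f}")
```

Under a base rate of 0.5 the observed count would be very surprising, while under a base rate of 0.85 it would be unremarkable. The count alone does not say which situation applies, so it cannot settle whether the implication is well founded.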
