Scientific report: "Extracting Social Power Relationships from Natural Language"

Extracting Social Power Relationships from Natural Language

Philip Bramsen, Louisville, KY, bramsen@alum.mit.edu
Ami Patel, Massachusetts Institute of Technology, Cambridge, MA, ampatel@mit.edu
Martha Escobar-Molano, San Diego, CA, mescobar@asgard.com
Rafael Alonso, SET Corporation, Arlington, VA, ralonso@setcorp.com

Abstract

Sociolinguists have long argued that social context influences language use in all manner of ways, resulting in lects¹. This paper explores a text classification problem we will call lect modeling, an example of what has been termed computational sociolinguistics. In particular, we use machine learning techniques to identify social power relationships between members of a social network, based purely on the content of their interpersonal communication. We rely on statistical methods, as opposed to language-specific engineering, to extract features which represent vocabulary and grammar usage indicative of social power lect. We then apply support vector machines to model the social power lects representing superior-subordinate communication in the Enron email corpus. Our results validate the treatment of lect modeling as a text classification problem, albeit a hard one, and constitute a case for future research in computational sociolinguistics.

1 Introduction

Linguists in sociolinguistics, pragmatics, and related fields have analyzed the influence of social context on language and have catalogued countless phenomena that are influenced by it, confirming many with qualitative and quantitative studies.
Indeed, social context and function influence language at every level: morphologically, lexically, syntactically, and semantically, through discourse structure, and through .

This work was done while these authors were at SET Corporation, an SAIC Company.
¹ Fields that deal with society and language have inconsistent terminology; "lect" is chosen here because "lect" has no other English definitions and the etymology of the word gives it the sense we consider most relevant.