tailieunhanh - Scientific report: "Finding Bursty Topics from Microblogs"

Qiming Diao, Jing Jiang, Feida Zhu, Ee-Peng Lim
Living Analytics Research Centre, School of Information Systems, Singapore Management University
jingjiang fdzhu eplim @

Abstract

Microblogs such as Twitter reflect the general public’s reactions to major events. Bursty topics from microblogs reveal what events have attracted the most online attention. Although bursty event detection from text streams has been studied before, previous work may not be suitable for microblogs because, compared with other text streams such as news articles and scientific publications, microblog posts are particularly diverse and noisy. To find topics that have bursty patterns on microblogs, we propose a topic model that simultaneously captures two observations: (1) posts published around the same time are more likely to have the same topic, and (2) posts published by the same user are more likely to have the same topic. The former helps find event-driven posts, while the latter helps identify and filter out “personal” posts. Our experiments on a large Twitter dataset show that there are more meaningful and unique bursty topics in the top-ranked results returned by our model than by an LDA baseline and two degenerate variations of our model. We also show some case studies that demonstrate the importance of considering both the temporal information and users’ personal interests for bursty topic detection from microblogs.

1 Introduction

With the fast growth of the Web, a vast amount of user-generated content has accumulated on the social Web. In particular, microblogging sites such as Twitter allow users to easily publish short, instant posts about any topic to be shared with the general public. The textual content, coupled with the temporal patterns of these microblog posts, provides important insight into the general public’s interest. A sudden increase of topically similar posts usually indicates a burst of interest in some event that has .
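
As a rough illustration of the intuition that a sudden increase of topically similar posts signals a burst, the sketch below flags days whose post count for one topic is unusually high. It is not the topic model proposed in the paper: it is a simple z-score heuristic swapped in for illustration, and it assumes the posts have already been assigned to a topic. The function name `find_bursts`, the `threshold` parameter, and the toy data are all hypothetical.

```python
from collections import Counter
from statistics import mean, pstdev


def find_bursts(day_indices, num_days, threshold=2.0):
    """Return the days whose post count lies more than `threshold`
    standard deviations above the mean daily count.

    day_indices: one integer day index (0 .. num_days - 1) per post,
                 for posts already assigned to a single topic (assumption).
    """
    counts = Counter(day_indices)
    # Build a dense daily series so that days without posts count as zero.
    series = [counts.get(day, 0) for day in range(num_days)]
    mu = mean(series)
    sigma = pstdev(series)
    if sigma == 0:
        # A perfectly flat series has no burst.
        return []
    return [day for day, c in enumerate(series) if (c - mu) / sigma > threshold]


# Toy data: two posts per day for ten days, plus 18 extra posts on day 6.
posts = [day for day in range(10) for _ in range(2)] + [6] * 18
print(find_bursts(posts, num_days=10))  # prints [6]
```

A heuristic like this treats every post on a topic alike. It cannot tell event-driven posts from personal ones, which is the distinction the two observations in the abstract are meant to capture.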
