Scientific report: "Structural, Transitive and Latent Models for Biographic Fact Extraction"

Nikesh Garera and David Yarowsky
Department of Computer Science, Johns Hopkins University
Human Language Technology Center of Excellence
Baltimore MD, USA
ngarera yarowsky @

Abstract

This paper presents six novel approaches to biographic fact extraction that model structural, transitive and latent properties of biographical data. The ensemble of these proposed models substantially outperforms standard pattern-based biographic fact extraction methods, and performance is further improved by modeling inter-attribute correlations and distributions over functions of attributes, achieving an average extraction accuracy of 80% over seven types of biographic attributes.

1 Introduction

Extracting biographic facts such as Birthdate, Occupation, Nationality, etc. is a critical step for advancing the state of the art in information processing and retrieval. An important aspect of web search is the ability to narrow down search results by distinguishing among people with the same name, which has led to multiple efforts focusing on web person name disambiguation in the literature (Mann and Yarowsky, 2003; Artiles et al., 2007; Cucerzan, 2007). While biographic facts are certainly useful for disambiguating person names, they also allow for automatic extraction of encyclopedic knowledge that has so far been limited to manual efforts such as Britannica, Wikipedia, etc.
Such encyclopedic knowledge can advance vertical search engines such as http that are focused on people searches, where one can get an enhanced search interface for searching by various biographic attributes. Biographic facts are also useful for powerful query mechanisms such as finding what attributes are common between two people (Auer and Lehmann, 2007).

Allison Wolfe
Allison Wolfe is a Washington DC-based singer and performer.
Background
Born an identical twin in Memphis, Tennessee, on November 9, 1969, Allison played a significant role in the formation of the
