Scientific report: "Learning foci for Question Answering over Topic Maps"

Alexander Mikhailian, Tiphaine Dalmas and Rani Pinchuk
Space Application Services, Leuvensesteenweg 325, B-1932 Zaventem, Belgium
Aethys

Abstract

This paper introduces the concepts of asking point and expected answer type as variations of the question focus. They are of particular importance for QA over semi-structured data, as represented by Topic Maps, OWL or custom XML formats. We describe an approach to the identification of the question focus from questions asked to a Question Answering system over Topic Maps by extracting the asking point and falling back to the expected answer type when necessary. We use known machine learning techniques for expected answer type extraction, and we implement a novel approach to the asking point extraction. We also provide a mathematical model to predict the performance of the system.

1 Introduction

Topic Maps is an ISO standard¹ for knowledge representation and information integration. It provides the ability to store complex meta-data together with the data itself.

This work addresses domain-portable Question Answering (QA) over Topic Maps, that is, a QA system capable of retrieving answers to a question asked against one particular topic map or topic maps collection at a time. We concentrate on an empirical approach to extract the question focus. The extracted focus is then anchored to a topic map construct.
This way, we map the type of the answer as provided in the question to the type of the answer as available in the source data. Our system runs over semi-structured data that encodes ontological information. The classification scheme we propose is based on one dynamic and one static layer, contrasting with previous work that uses static taxonomies (Li and Roth, 2002).

¹ ISO/IEC 13250:2003, http sam

We use the term asking point, or AP, when the type of the answer is explicit . the