Scientific report: "A Ranking Model of Proximal and Structural Text Retrieval Based on Region Algebra"

Katsuya Masuda
Department of Computer Science, Graduate School of Information Science and Technology, University of Tokyo
Hongo 7-3-1, Bunkyo-ku, Tokyo 113-0033, Japan
kmasuda@is.s.u-tokyo.ac.jp

Abstract

This paper investigates an application of the ranked region algebra to information retrieval from large-scale but unannotated documents. We automatically annotate documents with document structure and semantic tags by using taggers, and retrieve information by specifying structure represented by tags and words using ranked region algebra. We report in detail what kind of data can be retrieved in the experiments by this approach.

1 Introduction

In the biomedical area, the number of papers is very large and keeps increasing, so it is difficult to search for information. Although keyword-based retrieval systems can be applied to a database of papers, users may not get the information they want, since the relations between these keywords are not specified. If document structures such as title, sentence, and author, and relations between terms, are tagged in the texts, then retrieval is improved by specifying such structures. Models of retrieval that specify both structures and words have been pursued by many researchers (Chinenyanga and Kushmerick, 2001; Wolff et al., 1999; Theobald and Weilkum, 2000; Deutsch et al., 1998; Salminen and Tompa, 1994; Clarke et al., 1995). However, unlike keyword-based retrieval, these models are not robust; that is, they retrieve only the exact matches for queries.
In previous research (Masuda et al., 2003), we proposed a new ranking model that enables proximal and structural search for structured text. This paper investigates an application of the ranked region algebra to information retrieval from large-scale but unannotated documents. We report in detail what kind of data can be retrieved in the experiments. Our approach is to annotate documents with document structures and semantic tags by