Scientific report: "SYNTACTIC APPROACHES TO AUTOMATIC BOOK INDEXING"

Gerard Salton
Department of Computer Science
Cornell University
Ithaca NY 14853

This study was supported in part by a grant from OCLC Inc. and in part by the National Science Foundation under grant IRI-87-0273S.

ABSTRACT

Automatic book indexing systems are based on the generation of phrase structures capable of reflecting text content. Some approaches are given for the automatic construction of back-of-book indexes using a syntactic analysis of the available texts, followed by the identification of nominal constructions, the assignment of importance weights to the term phrases, and the choice of phrases as indexing units.

INTRODUCTION

Book indexing is of wide practical interest to authors, publishers, and readers of printed materials. For present purposes, a standard entry in a book index may be assumed to be a nominal construction listed in normal phrase order, or appearing in some permuted form with the principal term as phrase head. Cross-references ("see" or "see also" entries) between index entries are also normally used in the index. Excerpts from two typical book indexes appear in Fig. 1.

Attempts have been made over the years to mechanize the book indexing task, based in part on the occurrence characteristics of certain content words in the document texts (Borko 1970), and in part on more ambitious syntactic methodologies (Dillon 1983). However, as of now, completely viable automatic book indexing methods are not available.

Two main research advances may, however, lead to the development of improved automatic book indexing procedures. These include the generation of advanced syntactic analysis procedures capable of analyzing unrestricted English texts, as well as the construction of powerful automatic indexing systems using sophisticated term weighting systems to assess the importance of the indexing units (Salton 1975a, 1975b). By joining the available linguistic procedures with the available know-how in automatic indexing ...
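
The index entry format described in the introduction (a nominal construction in normal phrase order, or permuted so that the principal term leads) can be illustrated with a minimal Python sketch. It is not the procedure used in the paper: the function names `permuted_entry` and `build_index` are hypothetical, and the head of each phrase is simply assumed to be its last word, a rough stand-in for the head that a real syntactic analysis would identify.

```python
from collections import defaultdict


def permuted_entry(phrase: str) -> str:
    """Move the phrase head to the front, followed by its modifiers.

    Assumption: the head is the last word of the phrase. A syntactic
    analyzer would be needed to find the head reliably.
    """
    words = phrase.split()
    if len(words) < 2:
        return phrase  # a single word has no permuted form
    head, modifiers = words[-1], words[:-1]
    return f"{head}, {' '.join(modifiers)}"


def build_index(phrase_pages):
    """Map each entry, in normal and permuted form, to its sorted pages.

    phrase_pages is an iterable of (phrase, page_number) pairs.
    """
    index = defaultdict(set)
    for phrase, page in phrase_pages:
        index[phrase].add(page)
        index[permuted_entry(phrase)].add(page)
    return {entry: sorted(pages) for entry, pages in sorted(index.items())}
```

For example, `permuted_entry("automatic book indexing")` yields the entry `indexing, automatic book`, so the phrase can be found under either its first word or its head.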
