Scientific report: "The Multilingual Named Entity Recognition Framework"

Thierry Poibeau and the INaLCO Named Entity Group¹
INaLCO CRIM, 2 rue de Lille, 75007 Paris

Abstract

This paper presents a multilingual system designed to recognize named entities in a wide variety of languages (more than 12 languages are currently covered). The system includes original strategies to deal with a wide variety of character encodings, as well as analysis strategies and algorithms to process these languages.

1 Introduction

Since the MUC conferences on Information Extraction, named entity recognition (NERC) has been a well-established task in the NLP community (MUC-6, 1995). Examples of named entities are person names, location and company names, date and time indications, etc. Many systems have been developed to perform this task, ranging from manually created rule-based systems to fully automatic learning-based systems. We will briefly present these technologies.

Even though many systems have been developed for languages such as English or Japanese, a large range of languages do not have access to such technology. We propose an open framework to develop resources and tools for named entity recognition. A team of computational linguistics students develops this project¹, so it also has a pedagogic purpose. Even so, the project appears attractive enough to interest industrial partners.

We first describe the different approaches to named entity recognition. We then present the project and the different analysis techniques used. We conclude with some considerations on evaluation and future work.

¹ The members of the INaLCO Named Entity Group are A. Acoulon, C. Avaux, L. Beroff-Beneat-, A. Cadeau, M. Calberg, A. Delale, L. De Temmerman, . Guenet, D. Huis, M. Jamalpour, A. Krul, A. Marcus, F. Picoli and C. Plancq.

2 State of the art NERC systems

In this section we examine the different approaches to named entity recognition. We then examine previous experiments to compare systems and techniques. Sekine and .
