Scientific report: "A flexible distributed architecture for NLP system development and use"

Freddy Y. Y. Choi
Artificial Intelligence Group
University of Manchester
Manchester . choif@

Abstract

We describe a distributed, modular architecture for platform-independent natural language systems. It features automatic interface generation and self-organization. Adaptive and non-adaptive voting mechanisms are used for integrating discrete modules. The architecture is suitable for rapid prototyping and product delivery.

1 Introduction

This article describes TEA¹, a flexible architecture for developing and delivering platform-independent text engineering (TE) systems. TEA provides a generalized framework for organizing and applying reusable TE components (e.g. tokenizer, stemmer). Thus developers are able to focus on problem solving rather than implementation. For product delivery, the end user receives an exact copy of the developer's edition. The visibility of configurable options (different levels of detail) is adjustable along a simple gradient via the automatically generated user interface (Edwards, Forthcoming).

Our target application is telegraphic text compression (Choi, 1999b; of Roelofs, Forthcoming; Grefenstette, 1998). We aim to improve the efficiency of screen readers for the visually disabled by removing uninformative words (e.g. determiners) in text documents. This produces a stream of topic cues for rapid skimming. The information value of each word is to be estimated based on an unusually wide range of linguistic information. TEA was designed to be a development environment for this work.
However, the target application has led us to produce an interesting architecture and techniques that are more generally applicable, and it is these which we will focus on in this paper.

¹ TEA is an acronym for Text Engineering Architecture.

2 Architecture

Figure 1: An overview of the TEA system framework.

The central component of TEA is a frame-based data model F see . In this model a document