Scientific report: "Foma: a finite-state compiler and library"

Mans Hulden
University of Arizona
mhulden@

Abstract

Foma is a compiler, programming language, and C library for constructing finite-state automata and transducers for various uses. It has specific support for many natural language processing applications such as producing morphological and phonological analyzers. Foma is largely compatible with the Xerox/PARC finite-state toolkit. It also embraces Unicode fully and supports various different formats for specifying regular expressions: the Xerox/PARC format, a Perl-like format, and a mathematical format that takes advantage of the 'Mathematical Operators' Unicode block.

1 Introduction

Foma is a finite-state compiler, programming language, and regular expression/finite-state library designed for multi-purpose use with explicit support for automata theoretic research, constructing lexical analyzers for programming languages, and building morphological/phonological analyzers, as well as spellchecking applications. The compiler allows users to specify finite-state automata and transducers incrementally in a similar fashion to AT&T's fsm (Mohri et al., 1997) and Lextools (Sproat, 2003), the Xerox/PARC finite-state toolkit (Beesley and Karttunen, 2003) and the SFST toolkit (Schmid, 2005). One of Foma's design goals has been compatibility with the Xerox/PARC toolkit.
Another goal has been to allow for the ability to work with n-tape automata and a formalism for expressing first-order logical constraints over regular languages and n-tape transductions. Foma is licensed under the GNU General Public License: in keeping with traditions of free software, the distribution that includes the source code comes with a user manual and a library of examples. The compiler and library are implemented in C and an API is available. The API is in many ways similar to the standard C library and has similar calling conventions. However, all the low-level functions that operate directly on automata ...
