Scientific report: "The OpenGrm open-source finite-state grammar software libraries"

Brian Roark, Richard Sproat, Cyril Allauzen, Michael Riley, Jeffrey Sorensen, Terry Tai
Oregon Health & Science University, Portland, Oregon
Google, Inc., New York

Abstract

In this paper, we present a new collection of open-source software libraries that provides command line binary utilities and library classes and functions for compiling regular expression and context-sensitive rewrite rules into finite-state transducers, and for n-gram language modeling. The OpenGrm libraries use the OpenFst library to provide an efficient encoding of grammars and general algorithms for building, modifying and applying models.

1 Introduction

The OpenGrm libraries¹ are a growing collection of open-source software libraries for building and applying various kinds of formal grammars. The C++ libraries use the OpenFst library² for the underlying finite-state representation, which allows for easy inspection of the resulting grammars and models, as well as straightforward combination with other finite-state transducers. Like OpenFst, there are easy-to-use command line binaries for frequently used operations, as well as a C++ library interface, allowing library users to create their own algorithms from the basic classes and functions provided. The libraries can be used for a range of common string processing tasks, such as text normalization, as well as for building and using large statistical models for applications like speech recognition.
In the rest of the paper, we will present each of the two libraries, starting with the Thrax grammar compiler and then the NGram library. First, though, we will briefly present some preliminary informal background on weighted finite-state transducers (WFSTs), just as needed for this paper.

¹ http
² http

2 Informal WFST preliminaries

A weighted finite-state transducer consists of a set of states and transitions between states. There is an initial state and a subset of states are .
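
As a rough illustration of the states-and-transitions structure described above, the following is a minimal, self-contained C++ sketch. It does not use the OpenFst or OpenGrm APIs. All type and function names (`Arc`, `State`, `Wfst`, `Transduce`, `MakeExample`) are hypothetical, and it assumes the standard textbook formulation in which each transition carries an input label, an output label and a weight, some states are final, and weights are costs that are added along a path (tropical-semiring convention). It also assumes the machine is deterministic on its input labels.

```cpp
#include <limits>
#include <string>
#include <vector>

// One transition: reads `ilabel`, writes `olabel`, adds `weight` to the
// path cost, and moves to state `nextstate`. Names are illustrative only.
struct Arc {
  char ilabel;
  char olabel;
  double weight;
  int nextstate;
};

// A state owns its outgoing transitions. An infinite final weight marks
// a non-final state (an assumption borrowed from the tropical semiring).
struct State {
  std::vector<Arc> arcs;
  double final_weight = std::numeric_limits<double>::infinity();
};

// The transducer itself: a set of states plus one initial state.
struct Wfst {
  int start = 0;
  std::vector<State> states;
};

// Follows, for each input symbol, the first arc whose input label matches.
// Returns true and fills `output` and `cost` if the input is fully consumed
// and the path ends in a final state; returns false otherwise.
bool Transduce(const Wfst& fst, const std::string& input,
               std::string* output, double* cost) {
  if (fst.states.empty()) return false;
  const double kInf = std::numeric_limits<double>::infinity();
  int s = fst.start;
  std::string out;
  double total = 0.0;
  for (char c : input) {
    const Arc* match = nullptr;
    for (const Arc& arc : fst.states[s].arcs) {
      if (arc.ilabel == c) {
        match = &arc;
        break;
      }
    }
    if (match == nullptr) return false;  // No transition for this symbol.
    out.push_back(match->olabel);
    total += match->weight;
    s = match->nextstate;
  }
  if (fst.states[s].final_weight == kInf) return false;  // Not a final state.
  *output = out;
  *cost = total + fst.states[s].final_weight;
  return true;
}

// Builds a two-state toy machine: state 0 is initial, state 1 is final.
// It rewrites "a" followed by any number of "b" into "x" followed by "y"s.
Wfst MakeExample() {
  Wfst fst;
  fst.states.resize(2);
  fst.start = 0;
  fst.states[0].arcs.push_back({'a', 'x', 0.5, 1});
  fst.states[1].arcs.push_back({'b', 'y', 1.5, 1});
  fst.states[1].final_weight = 1.0;
  return fst;
}
```

Calling `Transduce(MakeExample(), "ab", &out, &cost)` walks the path 0 → 1 → 1, emits one output symbol per input symbol, and sums the two arc weights plus the final weight of state 1. A real WFST library additionally handles non-determinism, epsilon labels and other weight semirings, none of which this sketch attempts.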
