Scientific report: "A Domain-Specific Statistical Surface Realizer"

Jeffrey T. Russell
Center for the Study of Language and Information
Stanford University
jefe@

Abstract

We present a search-based approach to automatic surface realization given a corpus of domain sentences. Using heuristic search based on a statistical language model and a structure we introduce called an inheritance table, we overgenerate a set of complete syntactic-semantic trees that are consistent with the given semantic structure and have high likelihood relative to the language model. These trees are then lexicalized, linearized, scored, and ranked. This model is being developed to generate real-time navigation instructions.

1 Introduction

The target application for this work is real-time interactive navigation instructions. Good direction-givers respond actively to a driver's actions and questions and express instructions relative to a large variety of landmarks, times, and distances. These traits require robust, real-time natural language generation. This can be broken into three steps: (1) generating a route plan, (2) reasoning about the route and the user to produce an abstract representation of individual instructions, and (3) realizing these instructions as sentences in natural language (in our case, English). We focus on the last of these steps: given a structure that represents the semantic content of a sentence, we want to produce an English sentence that expresses this content.
According to the traditional division of content determination, sentence planning, and surface realization, our work is primarily concerned with surface realization but also includes aspects of sentence planning. Our application requires robust flexibility within a restricted domain that is not well represented in the traditional corpora or tools. These requirements suggest using trainable stochastic generation. A number of statistical surface realizers have been described, notably the FERGUS (Bangalore and Rambow, 2000) and HALogen systems.
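To make the overgenerate-and-rank idea concrete, the following is a minimal sketch of the final scoring and ranking step only. It assumes an add-one smoothed bigram model trained on a tiny invented corpus of navigation sentences; the function names, the corpus, and the candidate strings are all hypothetical, and the sketch does not model the inheritance table or the tree search described in the abstract.

```python
import math
from collections import Counter


def train_bigram_model(corpus):
    """Count unigrams and bigrams over whitespace-tokenized sentences."""
    unigrams, bigrams = Counter(), Counter()
    for sentence in corpus:
        tokens = ["<s>"] + sentence.lower().split() + ["</s>"]
        unigrams.update(tokens)
        bigrams.update(zip(tokens, tokens[1:]))
    return unigrams, bigrams


def log_prob(sentence, unigrams, bigrams):
    """Add-one smoothed bigram log-probability of one candidate sentence."""
    vocab_size = len(unigrams)
    tokens = ["<s>"] + sentence.lower().split() + ["</s>"]
    total = 0.0
    for prev, cur in zip(tokens, tokens[1:]):
        numerator = bigrams[(prev, cur)] + 1
        denominator = unigrams[prev] + vocab_size
        total += math.log(numerator / denominator)
    return total


def rank(candidates, unigrams, bigrams):
    """Sort candidate realizations from most to least likely."""
    return sorted(
        candidates,
        key=lambda s: log_prob(s, unigrams, bigrams),
        reverse=True,
    )


# Hypothetical domain corpus (assumption: not taken from the paper).
corpus = [
    "turn left at the next light",
    "turn right at the gas station",
    "turn left after the bridge",
]

# Hypothetical overgenerated linearizations of one semantic input.
candidates = [
    "left turn the at light next",
    "turn left at the light",
    "at the light left turn",
]

unigrams, bigrams = train_bigram_model(corpus)
ranked = rank(candidates, unigrams, bigrams)
for sentence in ranked:
    print(f"{log_prob(sentence, unigrams, bigrams):8.3f}  {sentence}")
```

Because the score is a sum of per-bigram log-probabilities, longer candidates are penalized relative to shorter ones; a practical system would likely normalize for length or use a stronger model, which this sketch leaves out.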
