Scientific report: "Stochastic Language Generation Using WIDL-expressions and its Application in Machine Translation and Summarization"
Radu Soricut
Information Sciences Institute, University of Southern California
4676 Admiralty Way, Suite 1001, Marina del Rey, CA 90292
radu@

Daniel Marcu
Information Sciences Institute, University of Southern California
4676 Admiralty Way, Suite 1001, Marina del Rey, CA 90292
marcu@

Abstract

We propose WIDL-expressions as a flexible formalism that facilitates the integration of a generic sentence realization system within end-to-end language processing applications. WIDL-expressions represent compactly probability distributions over finite sets of candidate realizations, and have optimal algorithms for realization via interpolation with language model probability distributions. We show the effectiveness of a WIDL-based NLG system in two sentence realization tasks: automatic translation and headline generation.

1 Introduction

The Natural Language Generation (NLG) community has produced over the years a considerable number of generic sentence realization systems: Penman (Matthiessen and Bateman, 1991), FUF (Elhadad, 1991), Nitrogen (Knight and Hatzivassiloglou, 1995), Fergus (Bangalore and Rambow, 2000), HALogen (Langkilde-Geary, 2002), Amalgam (Corston-Oliver et al., 2002), etc.
However, when it comes to end-to-end, text-to-text applications (Machine Translation, Summarization, Question Answering), these generic systems either cannot be employed or, in instances where they can be, the results are significantly below those of state-of-the-art, application-specific systems (Hajic et al., 2002; Habash, 2003). We believe two reasons explain this state of affairs. First, these generic NLG systems use input representation languages with complex syntax and semantics. These languages involve deep, semantic-based subject-verb or verb-object relations (such as ACTOR, agent, patient, etc., for Penman and FUF), syntactic relations (such as subject, object, premod, etc., for HALogen), or lexical dependencies Fergus