tailieunhanh - Báo cáo khoa học: "The Complexity of Phrase Alignment Problems"

Many phrase alignment models operate over the combinatorial space of bijective phrase alignments. We prove that finding an optimal alignment in this space is NP-hard, while computing alignment expectations is #P-hard. On the other hand, we show that the problem of finding an optimal alignment can be cast as an integer linear program, which provides a simple, declarative approach to Viterbi inference for phrase alignment models that is empirically quite efficient. | The Complexity of Phrase Alignment Problems John DeNero and Dan Klein Computer Science Division EECS Department University of California at Berkeley denero klein @ Abstract Many phrase alignment models operate over the combinatorial space of bijective phrase alignments. We prove that finding an optimal alignment in this space is NP-hard while computing alignment expectations is P-hard. On the other hand we show that the problem of finding an optimal alignment can be cast as an integer linear program which provides a simple declarative approach to Viterbi inference for phrase alignment models that is empirically quite efficient. 1 Introduction Learning in phrase alignment models generally requires computing either Viterbi phrase alignments or expectations of alignment links. For some restricted combinatorial spaces of alignments those that arise in ITG-based phrase models Cherry and Lin 2007 or local distortion models Zens et al. 2004 inference can be accomplished using polynomial time dynamic programs. However for more permissive models such as Marcu and Wong 2002 and DeNero et al. 2006 which operate over the full space of bijective phrase alignments see below no polynomial time algorithms for exact inference have been exhibited. Indeed Marcu and Wong 2002 conjectures that none exist. In this paper we show that Viterbi inference in this full space is NP-hard while computing expectations is P-hard. On the other hand we give a compact formulation of Viterbi inference as an integer linear program ILP . Using this formulation exact solutions to the Viterbi search problem can be found by highly optimized general purpose ILP solvers. While ILP is of course also NP-hard we show that empirically exact solutions are found very quickly for most problem instances. In an experiment intended to illustrate the practicality of the ILP approach we show speed and search accuracy results for aligning phrases under a standard phrase translation model. 2 Phrase .