Scientific report: "A Tree Sequence Alignment-based Tree-to-Tree Translation Model"
Min Zhang1, Hongfei Jiang2, Aiti Aw1, Haizhou Li1, Chew Lim Tan3 and Sheng Li2
1 Institute for Infocomm Research
2 Harbin Institute of Technology
3 National University of Singapore
hfjiang@mtlab.hit.edu.cn, tancl@. edu. sg, lisheng@, mzhang@, aaiti@, hli@

Abstract

This paper presents a translation model based on tree sequence alignment, where a tree sequence refers to a single sequence of subtrees that covers a phrase. The model leverages the strengths of both phrase-based and linguistically syntax-based methods. It automatically learns aligned tree sequence pairs with mapping probabilities from word-aligned, biparsed parallel texts. Compared with previous models, it not only captures non-syntactic phrases and discontinuous phrases with linguistically structured features, but also supports multi-level structure reordering of tree typology with larger span. This gives our model stronger expressive power than other reported models. Experimental results on the NIST MT-2005 Chinese-English translation task show that our method statistically significantly outperforms the baseline systems.

1 Introduction

The phrase-based modeling method (Koehn et al., 2003; Och and Ney, 2004a) is a simple but powerful mechanism for machine translation, since it can model local reorderings and translations of multiword expressions well. However, it cannot handle long-distance reorderings properly and does not exploit discontinuous phrases and linguistically syntactic structure features (Quirk and Menezes, 2006). Recently, many syntax-based models have been proposed to address the above deficiencies (Wu, 1997; Chiang, 2005; Eisner, 2003; Ding and Palmer, 2005; Quirk et al., 2005; Cowan et al., 2006; Zhang et al., 2007; Bod, 2007; Yamada and Knight, 2001; Liu et al., 2006; Liu et al., 2007; Gildea, 2003; Poutsma, 2000; Hearne and Way, 2003). Although good progress has been reported, the fundamental issues in