Scientific report: "Exploring Syntactic Structural Features for Sub-Tree Alignment using Bilingual Tree Kernels"

Jun Sun1,2, Min Zhang1, Chew Lim Tan2
1 Institute for Infocomm Research
2 School of Computing, National University of Singapore
sunjun@ mzhang@ tancl@

Abstract

We propose Bilingual Tree Kernels (BTKs) to capture the structural similarities across a pair of syntactic translational equivalences, and we apply BTKs to sub-tree alignment along with some plain features. Our study reveals that the structural features embedded in a bilingual parse tree pair are very effective for sub-tree alignment and that the bilingual tree kernels capture such features well. The experimental results show that our approach achieves a significant improvement over a heuristic similarity-based method on both gold-standard tree bank and automatically parsed tree pairs. We further apply the sub-tree alignment in machine translation with two methods. The results suggest that sub-tree alignment benefits both phrase-based and syntax-based systems by relaxing the constraint of the word alignment.

1 Introduction

Syntax-based Statistical Machine Translation (SMT) systems allow the translation process to be performed more grammatically, which provides decent reordering capability. However, most syntax-based systems construct their syntactic translation rules from word alignment, which not only suffers from pipeline errors but also fails to utilize syntactic structural features effectively. To address these deficiencies, Tinsley et al. (2007) attempt to capture the syntactic translational equivalences directly by automatically conducting sub-tree alignment, which can be defined as follows:

A sub-tree alignment process pairs up sub-tree pairs across bilingual parse trees whose contexts are semantically translational equivalent.

According to Tinsley et al. (2007), a sub-tree aligned parse tree pair satisfies the following criteria: (i) a node can only be linked once; (ii) descendants
