Scientific report: "Optimal and Syntactically-Informed Decoding for Monolingual Phrase-Based Alignment"
Kapil Thadani and Kathleen McKeown
Department of Computer Science
Columbia University
New York, NY 10027, USA
{kapil,kathy}@cs.columbia.edu

Abstract

The task of aligning corresponding phrases across two related sentences is an important component of approaches for natural language problems such as textual inference, paraphrase detection and text-to-text generation. In this work, we examine a state-of-the-art structured prediction model for the alignment task which uses a phrase-based representation and is forced to decode alignments using an approximate search approach. We propose instead a straightforward exact decoding technique based on integer linear programming that yields order-of-magnitude improvements in decoding speed. This ILP-based decoding strategy permits us to consider syntactically-informed constraints on alignments which significantly increase the precision of the model.

1 Introduction

Natural language processing problems frequently involve scenarios in which a pair or group of related sentences need to be aligned to each other, establishing links between their common words or phrases. For instance, most approaches for natural language inference (NLI) rely on alignment techniques to establish the overlap between the given premise and a hypothesis before determining if the former entails the latter.
Such monolingual alignment techniques are also frequently employed in systems for paraphrase generation, multi-document summarization, sentence fusion and question answering. Previous work (MacCartney et al., 2008) has presented a phrase-based monolingual aligner for NLI (MANLI) that has been shown to significantly outperform a token-based NLI aligner (Chambers et al., 2007) as well as popular alignment techniques borrowed from machine translation (Och and Ney, 2003; Liang et al., 2006). However, MANLI's use of a phrase-based alignment representation appears to pose a challenge to the decoding