Scientific report: "Improving Pronoun Translation for Statistical Machine Translation"

Liane Guillou
School of Informatics, University of Edinburgh, Edinburgh, UK, EH8 9AB

Abstract

Machine Translation is a well-established field, yet the majority of current systems translate sentences in isolation, losing valuable contextual information from previously translated sentences in the discourse. One important type of contextual information concerns who or what a coreferring pronoun corefers to (i.e., its antecedent). Languages differ significantly in how they achieve coreference, and awareness of antecedents is important in choosing the correct pronoun. Disregarding a pronoun's antecedent in translation can lead to inappropriate coreferring forms in the target text, seriously degrading a reader's ability to understand it. This work assesses the extent to which source-language annotation of coreferring pronouns can improve English-Czech Statistical Machine Translation (SMT). As with previous attempts that use this method, the results show little improvement. This paper attempts to explain why and to provide insight into the factors affecting performance.

1 Introduction

It is well known that in many natural languages a pronoun that corefers must bear similar features to its antecedent. These can include similar number, gender (morphological or referential), and/or animacy.
If a pronoun and its antecedent occur in the same unit of translation (N-gram or syntactic tree), these agreement features can influence the translation. But this locality cannot be guaranteed in either phrase-based or syntax-based Statistical Machine Translation (SMT). If it is not within the same unit, a coreferring pronoun will be translated without knowledge of its antecedent, meaning that its translation will simply reflect local frequency. Incorrectly translating a pronoun can result in readers/listeners identifying the wrong antecedent, which can mislead or confuse them. There have been two recent attempts to solve
