Scientific report: "Subsentential Translation Memory for Computer Assisted Writing and Translation"

Jian-Cheng Wu
Department of Computer Science, National Tsing Hua University, 101 Kuangfu Road, Hsinchu 300, Taiwan, ROC
D928322@

Thomas C. Chuang
Department of Computer Science, Van Nung Institute of Technology, No. 1 Van-Nung Road, Chung-Li, Tao-Yuan, Taiwan, ROC
tomchuang@

Abstract

This paper describes a database of translation memory, TotalRecall, developed to encourage authentic and idiomatic use in second language writing. TotalRecall is a bilingual concordancer that supports search queries in English or Chinese for relevant sentences and translations. Although initially intended for learners of English as a Foreign Language (EFL) in Taiwan, it is a gold mine of texts in English or Mandarin Chinese. TotalRecall is particularly useful for those who write in or translate into a foreign language. We exploited and structured existing high-quality translations from bilingual corpora, drawn from the Taiwan-based Sinorama Magazine and the Official Records of the Hong Kong Legislative Council, to build a bilingual concordance. Novel approaches were taken to provide high-precision bilingual alignment on the subsentential and lexical levels. A browser-based user interface was developed for ease of access over the Internet. Users can search for a word, phrase, or expression in English or Mandarin. The Web-based user interface facilitates the recording of user actions to provide data for further research.
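As a rough illustration of what a bilingual concordance lookup involves, the following Python sketch searches a small list of aligned sentence pairs for a query given in either language. It is not the TotalRecall implementation: the names `AlignedPair`, `search`, and `MEMORY`, as well as the example sentences, are assumptions made for this illustration, and the sketch uses plain substring matching rather than the subsentential and lexical alignment described above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AlignedPair:
    """One English sentence paired with its Chinese translation (hypothetical type)."""
    english: str
    chinese: str


def search(pairs, query):
    """Return every aligned pair whose English or Chinese side contains the query.

    Matching is a case-insensitive substring test; a real concordancer
    would also need word segmentation and finer-grained alignment.
    """
    needle = query.casefold()
    return [
        pair for pair in pairs
        if needle in pair.english.casefold() or needle in pair.chinese.casefold()
    ]


# Invented toy data, for illustration only (not taken from the paper's corpora).
MEMORY = [
    AlignedPair("He decided to take a risk.", "他決定冒險。"),
    AlignedPair("The weather is nice today.", "今天天氣很好。"),
]

if __name__ == "__main__":
    for hit in search(MEMORY, "risk"):
        print(hit.english, "|", hit.chinese)
```

A query in either language returns both sides of each matching pair, which is the basic behaviour a bilingual concordancer offers to a writer or translator.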
1 Introduction

Translation memory has been found to be a more effective alternative to machine translation for translators, especially when working with batches of similar texts. That is particularly true of so-called delta translation of successive versions of publications that need continuous revision, such as an encyclopaedia or user's manual. In another area of language study, researchers in English Language Teaching (ELT) have increasingly looked to concordancers of very large corpora as a new resource for