Scientific report: "Evaluating Machine Translations using mNCD"
Marcus Dobrinkat and Tero Tapiovaara and Jaakko Väyrynen
Adaptive Informatics Research Centre, Aalto University School of Science and Technology, P.O. Box 15400, FI-00076 Aalto, Finland

Kimmo Kettunen
Kymenlaakso University of Applied Sciences, P.O. Box 9, FI-48401 Kotka, Finland

Abstract

This paper introduces mNCD, a method for automatic evaluation of machine translations. The measure is based on normalized compression distance (NCD), a general information theoretic measure of string similarity, and flexible word matching provided by stemming and synonyms. The mNCD measure outperforms NCD in system-level correlation to human judgments in English.

1 Introduction

Automatic evaluation of machine translation (MT) systems requires automated procedures to ensure consistency and efficient handling of large amounts of data. In statistical MT systems, automatic evaluation of translations is essential for parameter optimization and system development. Human evaluation is too labor intensive, time consuming, and expensive for daily evaluations. However, manual evaluation is important in the comparison of different MT systems and for the validation and development of automatic MT evaluation measures, which try to model human assessments of translations as closely as possible. Furthermore, the ideal evaluation method would be language independent, fast to compute, and simple.

Recently, normalized compression distance (NCD) has been applied to the evaluation of machine translations.
NCD is a general information theoretic measure of string similarity, whereas most MT evaluation measures, e.g., BLEU and METEOR, are specifically constructed for the task. Parker (2008) introduced BADGER, an MT evaluation measure that uses NCD and a language independent word normalization method. BADGER scores were directly compared against the scores of METEOR and word error rate (WER). The correlation ...
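To make the idea of a compression-based string distance concrete, the following is a minimal sketch of plain NCD, not of mNCD or BADGER. It assumes the standard definition NCD(x, y) = (C(xy) - min(C(x), C(y))) / max(C(x), C(y)), where C(s) is the compressed length of s. The choice of zlib as the compressor, the helper names compressed_size and ncd, and the example sentences are all assumptions made for illustration; the excerpt above does not say which compressor or data the authors used.

```python
import zlib


def compressed_size(data: bytes) -> int:
    """Length in bytes of the zlib-compressed input (compressor choice is an assumption)."""
    return len(zlib.compress(data, 9))


def ncd(x: str, y: str) -> float:
    """Normalized compression distance between two strings.

    Lower values mean the strings share more compressible structure.
    """
    bx, by = x.encode("utf-8"), y.encode("utf-8")
    cx, cy = compressed_size(bx), compressed_size(by)
    cxy = compressed_size(bx + by)  # compress the concatenation
    return (cxy - min(cx, cy)) / max(cx, cy)


# Hypothetical reference translation and two candidate translations.
reference = "the cat sat on the mat and looked out of the window"
close_candidate = "the cat sat on the mat and looked through the window"
far_candidate = "stock prices fell sharply after the announcement yesterday"

print(ncd(reference, close_candidate))
print(ncd(reference, far_candidate))
```

In this sketch a candidate that shares long substrings with the reference should receive a smaller distance than an unrelated sentence. On strings this short the compressor's fixed overhead distorts the absolute values, so only the ordering of the two scores is meaningful here.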