
Designing a Task-Based Evaluation Methodology for a Spoken Machine Translation System

Kavita Thomas
Language Technologies Institute
Carnegie Mellon University
5000 Forbes Avenue, Pittsburgh, PA 15213, USA
kavita@

Abstract

In this paper, I discuss issues pertinent to the design of a task-based evaluation methodology for a spoken machine translation (MT) system processing human-to-human communication rather than human-to-machine communication. I claim that system-mediated human-to-human communication requires new evaluation criteria and metrics based on goal complexity and the speaker's prioritization of goals.

1 Introduction

Task-based evaluations for spoken language systems focus on evaluating whether the speaker's task is achieved, rather than evaluating utterance translation accuracy or other aspects of system performance. Our MT project focuses on the travel reservation domain and facilitates on-line translation of speech between clients and travel agents arranging travel plans. Our prior evaluations (Gates et al., 1996) have focused on end-to-end translation accuracy at the utterance level, i.e., the fraction of utterances translated perfectly, acceptably, and unacceptably. While this method of evaluation conveys translation accuracy, it does not give any information about how many of the client's travel-arrangement goals have been conveyed, nor does it take into account the complexity of the speaker's goals and task, or the priority that they assign to their goals. For example, the same end-to-end score for two dialogues may hide the fact that in one dialogue the speakers were able to communicate their most important goals, while in the other they were only able to communicate successfully the less important goals.

One common approach to evaluating spoken language systems focusing on human-machine dialogue is to compare system responses to correct reference answers; however, as discussed by Walker et al. (1997), the set of reference answers for any particular user query is …
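To make the contrast concrete, the following is a minimal sketch (not from the paper) of the two metrics just discussed: the utterance-level end-to-end score versus a priority-weighted goal-completion score. The dialogues, goal names, and priority weights are all hypothetical, invented purely for illustration; the paper itself does not specify a weighting scheme.

```python
# Hypothetical sketch: two dialogues with identical utterance-level
# accuracy can differ sharply once goals are weighted by priority.

def end_to_end_score(utterance_labels):
    """Fraction of utterances translated perfectly or acceptably."""
    ok = sum(1 for label in utterance_labels if label in ("perfect", "acceptable"))
    return ok / len(utterance_labels)

def goal_score(goals):
    """Priority-weighted fraction of the speaker's goals conveyed.

    `goals` maps goal name -> (priority weight, conveyed?).
    """
    total = sum(weight for weight, _ in goals.values())
    conveyed = sum(weight for weight, done in goals.values() if done)
    return conveyed / total

# Two hypothetical dialogues with identical utterance-level accuracy...
dialogue_a_utts = ["perfect", "acceptable", "unacceptable", "perfect"]
dialogue_b_utts = ["perfect", "unacceptable", "acceptable", "perfect"]

# ...but different outcomes on the goal the client cares most about.
dialogue_a_goals = {"book flight": (5, True), "window seat": (1, False)}
dialogue_b_goals = {"book flight": (5, False), "window seat": (1, True)}

print(end_to_end_score(dialogue_a_utts), end_to_end_score(dialogue_b_utts))  # 0.75 0.75
print(goal_score(dialogue_a_goals))  # ~0.83: top-priority goal conveyed
print(goal_score(dialogue_b_goals))  # ~0.17: only the minor goal conveyed
```

Under the end-to-end metric the two dialogues are indistinguishable; the weighted goal metric separates them, which is exactly the distinction motivating the proposed evaluation criteria.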
