Scientific report: "User studies and the design of Natural Language Systems"

User studies and the design of Natural Language Systems

Steve Whittaker and Phil Stenton
Hewlett-Packard Laboratories, Filton Road, Bristol BS12 6QZ, UK
email sjw@

Abstract

This paper presents a critical discussion of the various approaches that have been used in the evaluation of Natural Language systems. We conclude that previous approaches have neglected to evaluate systems in the context of their use, e.g. solving a task requiring data retrieval. This raises questions about the validity of such approaches. In the second half of the paper, we report a laboratory study using the Wizard of Oz technique to identify NL requirements for carrying out this task. We evaluate the demands that task dialogues collected using this technique place upon a prototype Natural Language system. We identify three important requirements which arose from the task that we gave our subjects: operators specific to the task of database access, complex contextual reference, and reference to the structure of the information source. We discuss how these might be satisfied by future Natural Language systems.

1 Introduction: Approaches to the evaluation of NL systems

It is clear that a number of different criteria might be employed in the evaluation of Natural Language (NL) systems. It is also clear that there is no consensus on how evaluation should be carried out [RQR88, GM84].
Among the different criteria that have been suggested are: (a) Coverage; (b) Learnability; (c) General software requirements; (d) Comparison with other interface media. Coverage is concerned with the set of inputs which the system should be capable of handling, and one issue we will discuss is how this set should be identified. Learnability is premised on the fact that complete coverage is not foreseeable in the near future. As a consequence, any NL system will have limitations, and one problem for users will be to learn to communicate within such limitations. Learnability is measured by the ease with which new users are .
