Data Analysis, Machine Learning and Applications, Episode 3, Part 5

Berenike Loos and Chris Biemann

As an application, we operate on automatic address extraction from web pages for the tourist domain.

Motivation: Address extraction from the web

In an open-domain spoken dialog system, the automatic learning of ontological concepts and the corresponding relations between them is essential, as complete manual modeling of them is neither practicable nor feasible due to the continuously changing denotation of real-world objects. Therefore, the emergence of new entities in the world entails the necessity of a method to deal with those entities in a spoken dialog system, as described in (Loos 2006).

As a use case for this challenging problem, we imagine a user asking the dialog system for a newly established restaurant in a city: "How do I get to the Auerstein?" So far, the system does not have information about the object and needs the help of an incremental learning component to be able to give the demanded answer to the user. A classification, as well as any other information for the word Auerstein, is hitherto not modeled in the knowledge base and can be obtained by text mining methods, as described in (Faulhaber et al. 2006). As soon as the object is classified and located in the system's domain ontology, it can be concluded that it is a building and that all buildings have addresses.

At this stage, the work described here comes into play, which deals with the extraction of addresses from unstructured text. With a web service as part of the dialog system's infrastructure, the newly found address for the demanded object can be used for a route instruction.

Even though structured and semi-structured texts such as online directories can be harvested as well, they often do not contain the addresses of new places and therefore do not cover all addresses needed. However, a search in such directories can be used in combination with the method described here, which then serves as a fallback solution.

Unsupervised learning supporting supervised .
