Scientific report: "Using Search Engines for Robust Cross-Domain Named Entity Recognition"


Piggyback: Using Search Engines for Robust Cross-Domain Named Entity Recognition

Stefan Rüd, Institute for NLP, University of Stuttgart, Germany
Massimiliano Ciaramita, Google Research, Zurich, Switzerland
Jens Müller and Hinrich Schütze, Institute for NLP, University of Stuttgart, Germany

Abstract

We use search engine results to address a particularly difficult cross-domain language processing task, the adaptation of named entity recognition (NER) from news text to web queries. The key novelty of the method is that we submit a token with context to a search engine and use similar contexts in the search results as additional information for correctly classifying the token. We achieve strong gains in NER performance on news, in-domain and out-of-domain, and on web queries.

1 Introduction

As statistical Natural Language Processing (NLP) matures, NLP components are increasingly used in real-world applications. In many cases, this means that some form of cross-domain adaptation is necessary because there are distributional differences between the labeled training set that is available and the real-world data in the application. To address this problem, we propose a new type of features for NLP data: features extracted from search engine results. Our motivation is that search engine results can be viewed as a substitute for the world knowledge that is required in NLP tasks but that can only be extracted from a standard training set or precompiled resources to a limited extent.

For example, a named entity (NE) recognizer trained on news text may tag the NE "London" in an out-of-domain web query like "London Klondike gold rush" as a location. But if we train the recognizer on features derived from search results for the sentence to be tagged, correct classification as person is possible. This is because the search results for "London Klondike gold rush" contain snippets in which the first name "Jack" precedes "London"; this is a sure indicator of a last name and hence an NE of type person. We call our .
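The "Jack London" example can be illustrated with a small sketch. This is not the authors' feature set or implementation; it only shows, under stated assumptions, how one snippet-derived feature might be computed. The snippets are hand-written stand-ins for real search results, and the names `SNIPPETS`, `FIRST_NAMES`, `preceding_words`, and `first_name_feature` are hypothetical.

```python
import re
from collections import Counter

# Hypothetical, hand-written snippets standing in for real search results
# for the query "London Klondike gold rush".
SNIPPETS = [
    "Jack London wrote about the Klondike gold rush in The Call of the Wild",
    "In 1897 Jack London joined the Klondike gold rush",
    "Klondike gold rush stories by Jack London",
]

# Tiny illustrative first-name list; an assumption, not a resource from the paper.
FIRST_NAMES = {"jack", "john", "mary"}


def preceding_words(token, snippets):
    """Count the words that appear directly before `token` in the snippets."""
    counts = Counter()
    target = token.lower()
    for snippet in snippets:
        words = re.findall(r"\w+", snippet.lower())
        for i, word in enumerate(words):
            if word == target and i > 0:
                counts[words[i - 1]] += 1
    return counts


def first_name_feature(token, snippets):
    """Fraction of occurrences of `token` (those with a left neighbour)
    that are directly preceded by a known first name."""
    counts = preceding_words(token, snippets)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    hits = sum(n for word, n in counts.items() if word in FIRST_NAMES)
    return hits / total
```

A value close to 1 for `first_name_feature("London", SNIPPETS)` would be one signal, among others, that a classifier could use to prefer the person label over the location label for this query.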
