Scientific report: "Paragraph-, word-, and coherence-based approaches to sentence ranking: A comparison of algorithm and human performance"

Excerpt: For each of the sentences in the text, they provided a ranking of how important that sentence is with respect to the content of the text, on an integer scale from 1 (not important) to 7 (very important). The approaches we evaluated are a simple paragraph-based approach that serves as a baseline, two word-based algorithms, and two coherence-based approaches.

Paragraph-, word-, and coherence-based approaches to sentence ranking: A comparison of algorithm and human performance

Florian WOLF
Massachusetts Institute of Technology
MIT NE20-448, 3 Cambridge Center, Cambridge, MA 02139, USA
fwolf@mit.edu

Edward GIBSON
Massachusetts Institute of Technology (MIT)

Abstract

Sentence ranking is a crucial part of generating text summaries. We compared human sentence rankings obtained in a psycholinguistic experiment to three different approaches to sentence ranking: a simple paragraph-based approach intended as a baseline, two word-based approaches, and two coherence-based approaches. In the paragraph-based approach, sentences in the beginning of paragraphs received higher importance ratings than other sentences. The word-based approaches determined sentence rankings based on relative word frequencies (Luhn, 1958; Salton & Buckley, 1988). Coherence-based approaches determined sentence rankings based on some property of the coherence structure of a text (Marcu, 2000; Page et al., 1998). Our results suggest poor performance for the simple paragraph-based approach, whereas word-based approaches perform remarkably well. The best performance was achieved by a coherence-based approach where coherence structures are represented in a non-tree structure. Most approaches also outperformed the commercially available MSWord summarizer.

1 Introduction

Automatic generation of text summaries is a natural language engineering application that has received considerable interest, particularly due to the ever-increasing volume of text information available through the internet. The task of a human generating a summary generally involves three subtasks (Brandow et al., 1995; Mitra et al., 1997): (1) understanding a text; (2) ranking text pieces (sentences, paragraphs, phrases, etc.) for importance; (3) generating a new text (the summary). Like most approaches to summarization, we are concerned with the second subtask (e.g., Carlson et al., 2001; Goldstein et al., 1999; Gong & Liu, 2001; Jing et al., 1998).
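
To make the word-based idea in the abstract concrete, the following is a minimal sketch of ranking sentences by relative word frequency. It is not the algorithm evaluated in the paper: the scoring rule (mean relative frequency of a sentence's content words), the tiny stopword list, and the names `tokenize` and `rank_sentences` are all assumptions made for illustration.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stopword list (an assumption for this sketch).
STOPWORDS = {"a", "an", "the", "of", "in", "to", "is", "and", "on"}


def tokenize(sentence):
    """Lowercase a sentence and return its non-stopword word tokens."""
    words = re.findall(r"[a-z]+", sentence.lower())
    return [w for w in words if w not in STOPWORDS]


def rank_sentences(sentences):
    """Return (score, sentence) pairs sorted from highest to lowest score.

    A sentence's score is the mean relative frequency, over the whole text,
    of the content words it contains. This is an illustrative scoring rule,
    not the one used in the paper.
    """
    tokens_per_sentence = [tokenize(s) for s in sentences]
    counts = Counter(w for toks in tokens_per_sentence for w in toks)
    total = sum(counts.values())
    scored = []
    for sentence, toks in zip(sentences, tokens_per_sentence):
        # Sentences with no content words get a score of 0.0.
        score = sum(counts[w] / total for w in toks) / len(toks) if toks else 0.0
        scored.append((score, sentence))
    return sorted(scored, key=lambda pair: pair[0], reverse=True)
```

Calling `rank_sentences` on a list of sentence strings returns them ordered by score, so sentences that reuse the text's most frequent content words come first. Mapping these scores onto the 1 to 7 scale used by the human raters would need an extra normalization step, which is left out here.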
