Discriminative Training of a Neural Network Statistical Parser

James Henderson
School of Informatics, University of Edinburgh
2 Buccleuch Place, Edinburgh EH8 9LW, United Kingdom
james.henderson@ed.ac.uk

Abstract

Discriminative methods have shown significant improvements over traditional generative methods in many machine learning applications, but there has been difficulty in extending them to natural language parsing. One problem is that much of the work on discriminative methods conflates changes to the learning method with changes to the parameterization of the problem. We show how a parser can be trained with a discriminative learning method while still parameterizing the problem according to a generative probability model. We present three methods for training a neural network to estimate the probabilities for a statistical parser: one generative, one discriminative, and one where the probability model is generative but the training criterion is discriminative. The latter model outperforms the previous two, achieving state-of-the-art levels of performance (90.1 F-measure on constituents).

1 Introduction

Much recent work has investigated the application of discriminative methods to NLP tasks, with mixed results.
Klein and Manning (2002) argue that these results show a pattern: discriminative probability models are inferior to generative probability models, but improvements can be achieved by keeping a generative probability model and training according to a discriminative optimization criterion. We show how this approach can be applied to broad-coverage natural language parsing. Our estimation and training methods successfully balance the conflicting requirements that the training method be both computationally tractable for large datasets and a good approximation to the theoretically optimal method. The parser which uses this approach outperforms both a generative model and a discriminative model, achieving state-of-the-art levels of performance (90.1 F-measure on constituents).
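To make the distinction concrete, the following minimal sketch (using hypothetical toy scores, not the paper's actual parser or network) contrasts the two training criteria. A generative criterion maximizes the joint log-probability log P(tree, sentence) of the correct parse alone, while a discriminative criterion maximizes the conditional log-probability log P(tree | sentence), i.e. the correct parse's share of probability mass after normalizing over a candidate list:

```python
import math

# Hypothetical joint log-scores log P(tree, sentence) for one sentence:
# the correct parse plus two incorrect candidate parses. In the paper's
# setting these would come from products of neural-network-estimated
# derivation-step probabilities; here they are illustrative constants.
log_joint = {"correct": -2.0, "cand_a": -3.0, "cand_b": -4.0}

def generative_loss(log_joint):
    """Generative criterion: negative joint log-likelihood of the
    correct parse, ignoring the competing candidates."""
    return -log_joint["correct"]

def discriminative_loss(log_joint):
    """Discriminative criterion: negative conditional log-likelihood
    log P(correct | sentence), computed by normalizing the joint
    scores over the candidate list (a softmax over parses)."""
    log_z = math.log(sum(math.exp(v) for v in log_joint.values()))
    return -(log_joint["correct"] - log_z)
```

Note that both losses are computed from the same generative parameterization (the joint scores); only the optimization criterion changes, which is the combination the paper argues for.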