Scientific report: "Joint Training of Dependency Parsing Filters through Latent Support Vector Machines"
Colin Cherry
Institute for Information Technology, National Research Council Canada

Shane Bergsma
Center for Language and Speech Processing, Johns Hopkins University
sbergsma@

Abstract

Graph-based dependency parsing can be sped up significantly if implausible arcs are eliminated from the search space before parsing begins. State-of-the-art methods for arc filtering use separate classifiers to make pointwise decisions about the tree; they label tokens with roles such as root, leaf, or attaches-to-the-left, and then filter arcs accordingly. Because these classifiers overlap substantially in their filtering consequences, we propose to train them jointly, so that each classifier can focus on the gaps of the others. We integrate the various pointwise decisions as latent variables in a single arc-level SVM classifier. This novel framework allows us to combine nine pointwise filters and adjust their sensitivity using a shared threshold based on arc length. Our system filters 32% more arcs than the independently-trained classifiers, without reducing filtering speed. This leads to faster parsing with no reduction in accuracy.

1 Introduction

A dependency tree represents syntactic relationships between words using directed arcs (Mel'čuk, 1987). Each token in the sentence is a node in the tree, and each arc connects a head to its modifier.
There are two dominant approaches to dependency parsing: graph-based and transition-based, where graph-based parsing is understood to be slower, but often more accurate (McDonald and Nivre, 2007). In the graph-based setting, a complete search finds the highest-scoring tree under a model that decomposes over one or two arcs at a time. Much of the time for parsing is spent scoring each potential arc in the complete dependency graph (Johnson, 2007), one for each ordered word-pair in the sentence. Potential arcs are scored using rich linear models
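
To make the idea of arc filtering concrete, the following is a minimal Python sketch, not the authors' implementation. It enumerates one candidate arc for each ordered word-pair (plus arcs from an artificial root node) and then discards arcs that contradict pointwise token roles. The names `ROOT`, `candidate_arcs`, and `filter_arcs` are hypothetical, and the exact filtering consequence attached to each role (for example, that a token labeled attaches-to-the-left may still take the artificial root as its head) is an assumption made for illustration.

```python
from itertools import permutations

ROOT = 0  # hypothetical index for an artificial root node placed left of the sentence


def candidate_arcs(n_tokens):
    """Return every (head, modifier) arc in the complete graph.

    Tokens are numbered 1..n_tokens; each ordered word-pair yields one arc,
    and each token may additionally take ROOT as its head.
    """
    tokens = range(1, n_tokens + 1)
    arcs = set(permutations(tokens, 2))
    arcs |= {(ROOT, modifier) for modifier in tokens}
    return arcs


def filter_arcs(arcs, roles):
    """Drop arcs that contradict pointwise token roles.

    `roles` maps a token index to a set of labels drawn from
    {"root", "leaf", "attaches-to-the-left"}. The rules below are
    assumed interpretations of those labels.
    """
    kept = set()
    for head, modifier in arcs:
        head_roles = roles.get(head, set())
        modifier_roles = roles.get(modifier, set())
        if "leaf" in head_roles:
            continue  # a leaf has no modifiers, so it heads no arc
        if "root" in modifier_roles and head != ROOT:
            continue  # the root token can only attach to ROOT
        if "attaches-to-the-left" in modifier_roles and head > modifier:
            continue  # the head must lie to the left of the modifier
        kept.add((head, modifier))
    return kept


# Three-token sentence: token 1 is labeled a leaf, token 3 the root.
all_arcs = candidate_arcs(3)
remaining = filter_arcs(all_arcs, {1: {"leaf"}, 3: {"root"}})
print(len(all_arcs), len(remaining))
```

In this toy setting the role labels are given by hand; in the setting the abstract describes, they would come from classifiers, and every arc removed here is one the parser no longer needs to score.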