Shallow Dependency Labeling

Manfred Klenner
Institute of Computational Linguistics
University of Zurich
klenner@cl.unizh.ch

Abstract

We present a formalization of dependency labeling with Integer Linear Programming (ILP). We focus on the integration of subcategorization into the decision-making process, where the various subcategorization frames of a verb compete with each other. A maximum entropy model provides the weights for the ILP optimization.

1 Introduction

Machine learning classifiers are widely used, although they lack one crucial model property: they cannot adhere to prescriptive knowledge. Take grammatical role (GR) labeling, which is a kind of shallow dependency labeling, as an example: chunk-verb pairs are classified according to a GR (cf. Buchholz (1999)). The trials are independent of each other; thus, local decisions are taken such that, e.g., a unique GR of a verb might erroneously get instantiated more than once. Moreover, if there are alternative subcategorization frames of a verb, they must not be confused by mixing up GRs from different frames into a non-existent one. Often, a subsequent filter is used to repair such inconsistent solutions. But usually there are alternative solutions, so the demand for an optimal repair arises.

We apply the optimization method Integer Linear Programming (ILP) to shallow dependency labeling in order to generate a globally optimized, consistent dependency labeling for a given sentence. A maximum entropy classifier, trained on vectors with morphological, syntactic, and positional information automatically derived from the (German) TIGER treebank, supplies probability vectors that are used as weights in the optimization process. Thus, the probabilities of the classifier no longer directly provide the solution (i.e., by picking the most probable candidate) but count as probabilistic suggestions towards a globally consistent solution.

More formally, the dependency labeling problem is: given a sentence with (i) verbs, V, and (ii) NP and PP chunks, CH, label all pairs (v, ch) with v ∈ V and ch ∈ CH with a dependency relation (including a class for the null assignment) such that all chunks get attached and for each verb exactly one subcategorization frame is instantiated.
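To illustrate this formalization, the following is a minimal sketch in Python using the PuLP solver library; it is not the paper's exact implementation. The toy verb, chunks, relation labels, frames, and probability scores are invented for the example (in the paper, the weights come from a maximum entropy classifier trained on TIGER). The constraints mirror the two requirements just stated: every chunk gets exactly one (possibly null) attachment, and each verb instantiates exactly one subcategorization frame.

    # A hypothetical toy instance of the ILP described above (sketch only).
    from itertools import product
    from pulp import LpProblem, LpMaximize, LpVariable, LpBinary, lpSum, value

    verbs = ["gibt"]                                   # (i) verbs V
    chunks = ["NP1", "NP2", "PP1"]                     # (ii) NP and PP chunks CH
    labels = ["subj", "obj", "iobj", "null"]           # dependency relations

    # Hypothetical subcategorization frames per verb: each frame is a set of
    # grammatical roles; exactly one frame per verb must be instantiated.
    frames = {"gibt": [{"subj", "obj"}, {"subj", "obj", "iobj"}]}

    # Hypothetical maximum entropy probabilities p(label | verb, chunk),
    # used as weights in the objective; unlisted combinations score 0.
    p = {("gibt", "NP1", "subj"): 0.7, ("gibt", "NP1", "obj"): 0.2,
         ("gibt", "NP2", "obj"): 0.6, ("gibt", "NP2", "iobj"): 0.3,
         ("gibt", "PP1", "iobj"): 0.5, ("gibt", "PP1", "null"): 0.4}

    prob = LpProblem("shallow_dependency_labeling", LpMaximize)

    # x[v,ch,r] = 1 iff chunk ch attaches to verb v with relation r
    x = {(v, ch, r): LpVariable(f"x_{v}_{ch}_{r}", cat=LpBinary)
         for v, ch, r in product(verbs, chunks, labels)}
    # y[v,i] = 1 iff the i-th frame of verb v is instantiated
    y = {(v, i): LpVariable(f"y_{v}_{i}", cat=LpBinary)
         for v in verbs for i in range(len(frames[v]))}

    # Objective: maximize the summed classifier weights of the chosen labels.
    prob += lpSum(p.get(k, 0.0) * x[k] for k in x)

    # Every chunk gets exactly one (possibly null) attachment.
    for ch in chunks:
        prob += lpSum(x[v, ch, r] for v in verbs for r in labels) == 1

    for v in verbs:
        # Exactly one subcategorization frame per verb.
        prob += lpSum(y[v, i] for i in range(len(frames[v]))) == 1
        # A role is filled at most once, and only if the chosen frame has it.
        for r in set(labels) - {"null"}:
            prob += lpSum(x[v, ch, r] for ch in chunks) <= lpSum(
                y[v, i] for i in range(len(frames[v])) if r in frames[v][i])

    prob.solve()
    print([(v, ch, r) for (v, ch, r) in x if value(x[v, ch, r]) == 1])

On this toy instance the solver selects the ditransitive frame and labels NP1 as subject, NP2 as object, and PP1 as indirect object (total weight 1.8), rather than locally discarding PP1 as null under the transitive frame (1.7): the classifier scores act as suggestions, while the constraints guarantee a globally consistent labeling.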