tailieunhanh - Scientific report: "Maximum Entropy Model Learning of the Translation Rules"

Maximum Entropy Model Learning of the Translation Rules

Kengo Sato and Masakazu Nakanishi
Department of Computer Science, Keio University, 3-14-1 Hiyoshi, Kohoku, Yokohama 223-8522, Japan
e-mail: satoken czl @nak. .

Abstract

This paper proposes a method for learning translation rules from parallel corpora. The method applies the maximum entropy principle to a probabilistic model of translation rules. First, we define feature functions that express statistical properties of this model. Next, in order to optimize the model, the system iterates the following steps: (1) it selects the feature function that maximizes the log-likelihood, and (2) it adds this function to the model incrementally. Because the computational cost associated with this model is too high, we propose several methods to suppress the overhead in order to realize the system. The result shows that it attained recall rate.

1 Introduction

Statistical natural language modeling can be viewed as estimating a combinational distribution X × Y → [0, 1] using training data (x_1, y_1), ..., (x_T, y_T) ∈ X × Y observed in corpora. For this topic, Baum (1972) proposed the EM algorithm, which was the basis of the Forward-Backward algorithm for the hidden Markov model (HMM) and the Inside-Outside algorithm (Lafferty, 1993) for the probabilistic context-free grammar (PCFG). However, these methods have problems such as increasing optimization costs, which are due to the large number of parameters. Therefore, estimating a natural language model based on the maximum entropy (ME) method (Pietra et al., 1995; Berger et al., 1996) has been highlighted recently.
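The selection loop summarized in the abstract (pick the feature function that maximizes the log-likelihood, then add it to the model) can be illustrated with a small sketch. This is not the authors' implementation: the toy word pairs, the candidate features, the names `distribution`, `fit`, `log_likelihood`, and `greedy_select`, and the plain gradient-ascent fitting are all assumptions made for illustration. The sketch also refits every weight for each candidate, which is exactly the kind of brute-force cost the paper says it needs to suppress.

```python
import math

def distribution(features, weights, space):
    """Return p(x, y) proportional to exp(sum_i w_i * f_i(x, y)) over `space`."""
    scores = {
        xy: math.exp(sum(w * f(*xy) for w, f in zip(weights, features)))
        for xy in space
    }
    z = sum(scores.values())
    return {xy: s / z for xy, s in scores.items()}

def fit(features, data, space, steps=200, lr=1.0):
    """Fit the weights by gradient ascent on the average log-likelihood."""
    n = len(data)
    weights = [0.0] * len(features)
    # Empirical expectation of each feature over the training pairs.
    empirical = [sum(f(x, y) for x, y in data) / n for f in features]
    for _ in range(steps):
        probs = distribution(features, weights, space)
        for i, f in enumerate(features):
            expected = sum(p * f(x, y) for (x, y), p in probs.items())
            # Gradient of the average log-likelihood: empirical minus model expectation.
            weights[i] += lr * (empirical[i] - expected)
    return weights

def log_likelihood(features, weights, data, space):
    """Total log-likelihood of the training pairs under the fitted model."""
    probs = distribution(features, weights, space)
    return sum(math.log(probs[xy]) for xy in data)

def greedy_select(candidates, data, space, n_select):
    """Repeatedly add the candidate feature that yields the highest log-likelihood."""
    selected, history = [], []
    remaining = dict(candidates)
    for _ in range(min(n_select, len(candidates))):
        best = None
        for name, f in remaining.items():
            trial = [candidates[s] for s in selected] + [f]
            ll = log_likelihood(trial, fit(trial, data, space), data, space)
            if best is None or ll > best[1]:
                best = (name, ll)
        selected.append(best[0])
        history.append(best)
        del remaining[best[0]]
    return history

# Hypothetical toy corpus of aligned (source word, target word) pairs.
SOURCE, TARGET = ["inu", "neko"], ["dog", "cat"]
SPACE = [(x, y) for x in SOURCE for y in TARGET]
DATA = [("inu", "dog")] * 5 + [("neko", "cat")] * 4 + [("inu", "cat")]
# Assumption: one indicator feature per word pair that was observed in the data.
CANDIDATES = {
    f"{x}~{y}": (lambda a, b, x=x, y=y: float(a == x and b == y))
    for x, y in sorted(set(DATA))
}

if __name__ == "__main__":
    for name, ll in greedy_select(CANDIDATES, DATA, SPACE, 2):
        print(name, round(ll, 3))
```

On this toy data the sketch should pick the `inu~dog` feature first and `neko~cat` second, since those pairs account for most of the observations. A practical system would replace the full refit inside the inner loop with a cheaper estimate of each candidate's gain.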
On the other hand, dictionaries for multilingual natural language processing, such as machine translation, have usually been made by hand. However, since this work requires a great deal of labor and it is difficult to keep the descriptions in dictionaries consistent, research on automatically building dictionaries for machine translation (translation rules) from corpora has recently become active (Kay and Röscheisen, 1993; Kaji and Aizono, 1996). In this
