Scientific report: "Detection of Japanese Homophone Errors by a Decision List Including a Written Word as a Default Evidence"

Proceedings of EACL '99

Hiroyuki Shinnou
Ibaraki University, Dept. of Systems Engineering
4-12-1 Nakanarusawa, Hitachi, Ibaraki 316-8511, Japan

Abstract

In this paper, we propose a practical method to detect Japanese homophone errors in Japanese texts. It is very important to detect homophone errors in Japanese revision systems because Japanese texts suffer from homophone errors frequently. In order to detect homophone errors, we have only to solve the homophone problem. We can use the decision list to do it because the homophone problem is equivalent to the word sense disambiguation problem. However, the homophone problem is different from the word sense disambiguation problem because the former can use the written word but the latter cannot. In this paper, we incorporate the written word into the original decision list by obtaining the identifying strength of the written word. The improved decision list can raise the F-measure of error detection.

1 Introduction

In this paper, we propose a method of detecting Japanese homophone errors in Japanese texts. Our method is based on the decision list proposed by Yarowsky (Yarowsky, 1994; Yarowsky, 1995). We improve the original decision list by using written words in the default evidence. The improved decision list can raise the F-measure of error detection.

Most Japanese texts are written using Japanese word processors. To input a word composed of kanji characters, we first input the phonetic hiragana sequence for the word and then convert it to the desired kanji sequence. However, multiple converted kanji sequences are generally produced, and we must then choose the correct kanji sequence. Therefore, Japanese texts suffer from homophone errors caused by incorrect choices. Carelessness of choice alone is not the cause of homophone errors; ignorance of the difference among homophone words is also serious. For .
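The idea described in the abstract, a decision list in which the written word itself serves as default evidence, can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration and not the paper's actual procedure: the function names `train_decision_list` and `detect_error`, the smoothing constant `alpha`, the toy training data, and in particular the `written_strength` parameter, which stands in for the identifying strength of the written word (the excerpt above does not say how that strength is obtained). The rule strength used is a smoothed log-likelihood ratio in the style of Yarowsky's decision lists. The homophone pair is 以外 / 意外, both read "igai".

```python
import math
from collections import defaultdict


def train_decision_list(examples, alpha=0.1):
    """Build a decision list for one homophone set.

    examples: list of (correct_word, set_of_context_evidence) pairs.
    Returns rules (strength, evidence, predicted_word), strongest first.
    alpha is a hypothetical smoothing constant to avoid division by zero.
    """
    counts = defaultdict(lambda: defaultdict(float))
    words = set()
    for word, evidences in examples:
        words.add(word)
        for ev in evidences:
            counts[ev][word] += 1

    rules = []
    for ev, by_word in counts.items():
        # The word most often seen with this evidence (ties broken by sort order).
        best = max(sorted(words), key=lambda w: by_word[w])
        best_count = by_word[best] + alpha
        rest_count = sum(by_word[w] for w in words if w != best) + alpha
        strength = math.log(best_count / rest_count)
        rules.append((strength, ev, best))

    rules.sort(key=lambda r: -r[0])
    return rules


def detect_error(written, context, rules, written_strength):
    """Return (is_error, predicted_word) for a written homophone.

    The written word acts as default evidence: only rules at least as
    strong as written_strength (an assumed, externally supplied value)
    may override it. Otherwise the written word is accepted.
    """
    for strength, ev, word in rules:
        if strength < written_strength:
            break  # remaining rules are weaker than the written word
        if ev in context:
            return word != written, word
    return False, written


# Toy training data, invented for illustration only.
examples = [
    ("以外", {"それ", "方法"}),
    ("以外", {"それ", "ない"}),
    ("以外", {"それ", "人"}),
    ("意外", {"結果", "な"}),
    ("意外", {"結果", "と"}),
    ("意外", {"な", "人"}),
]
rules = train_decision_list(examples)

# Strong context evidence overrides the written word and flags an error.
print(detect_error("意外", {"それ", "食べる"}, rules, written_strength=2.0))
# Weaker evidence is ignored when the written word is trusted more.
print(detect_error("意外", {"方法"}, rules, written_strength=3.0))
```

In this sketch, raising `written_strength` makes the detector more conservative: fewer context rules are allowed to contradict what the writer typed, which trades recall for precision in error detection.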
