tailieunhanh - Scientific report: "A HARDWARE ALGORITHM FOR HIGH SPEED MORPHEME EXTRACTION AND ITS IMPLEMENTATION"

A HARDWARE ALGORITHM FOR HIGH SPEED MORPHEME EXTRACTION AND ITS IMPLEMENTATION

Toshikazu Fukushima, Yutaka Ohyama and Hitoshi Miyai
C&C Systems Research Laboratories, NEC Corporation
1-1, Miyazaki 4-chome, Miyamae-ku, Kawasaki City, Kanagawa 213, Japan
fuku@ ohyama@ miya@

ABSTRACT

This paper describes a new hardware algorithm for morpheme extraction and its implementation on a specific machine (MEX-I), as the first step toward achieving natural language parsing accelerators. It also shows the machine's performance, which is 100-1,000 times faster than a personal computer. This machine can extract morphemes from 10,000-character Japanese text by searching an 80,000-morpheme dictionary in 1 second. It can treat multiple text streams, which are composed of character candidates, as well as a single text stream. On the machine, the algorithm runs in time linear in the number of candidates, while conventional sequential algorithms run in combinational time.

1 INTRODUCTION

Recent advances in natural language parsing technology have notably expanded the word processor market and the machine translation system market. For further market extension or new market creation for natural language applications, parsing speed-up, as well as improved parsing accuracy, is required.
First, the parsing speed-up directly reduces the system response time required in interactive natural language application systems, such as those using natural language interfaces, speech recognition, Kana-to-Kanji¹ conversion (the most popular Japanese text input method), and so on. Second, it also increases the advantage of applications such as machine translation, document proofreading, automatic indexing, and so on, which are used to treat large amounts of documents. Third, it realizes parsing methods based on larger-scale dictionaries or knowledge databases, which are necessary to improve parsing accuracy. Until now, in the natural language processing field, the speed-up
