REPORT: "OPTICAL CHARACTER RECOGNITION FOR VIETNAMESE SCANNED TEXT"
Proceedings of the 8th Student Scientific Research Conference, University of Da Nang, 2012

OPTICAL CHARACTER RECOGNITION FOR VIETNAMESE SCANNED TEXT

Authors: Tran Anh Viet Le Minh Hoang Hac Le Tuan Bao Ngoc Le Anh Duy
Class 08ECE, Electronic and Communication Engineering Department, Da Nang University of Technology

Advisors: Pham Van Tuan, Hoang Le Uyen Thuc
Electronic and Communication Engineering Department, Da Nang University of Technology

Abstract

Optical Character Recognition (OCR) is a technology that enables people to digitize scanned images by converting them into editable text on a computer, which speeds up the entry of data from many kinds of source documents. It is also useful for handwriting recognition and for making digital images searchable by text. In this paper we propose an OCR system capable of recognizing Vietnamese characters in typed text using two recognition methods: template matching and an artificial neural network. Each method has its own advantages and weaknesses, which are shown in this paper so that readers can decide which method to use for a specific Vietnamese typed-text OCR situation.

1. Introduction

In recent years OCR has become a popular industry around the world, covering a variety of languages, and Vietnamese is no exception. Compared with other languages, however, Vietnamese OCR technology is still young and needs improvement in both efficiency and applicability. This motivated our group to research OCR in order to find a simpler but still efficient alternative for the Vietnamese language. The process of performing OCR on printed Vietnamese script is discussed in detail throughout this paper.

2. Procedure

A general approach to any OCR problem [2] contains 7 steps, as shown in Figure 1.

[Figure 1: OCR flow from scanned image (input) to recognized text. Stages legible in the source: preprocessing, segmentation, feature extraction, post-processing.]
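To make the flow in Figure 1 more concrete, here is a minimal Python sketch of three of its parts: preprocessing (binarization), segmentation (splitting a text line at ink-free columns), and recognition by template matching, one of the two methods named in the abstract. It is an illustration under stated assumptions, not the authors' implementation. The function names, the fixed threshold of 128, the pixel-agreement score, and the toy 3-row glyphs are all hypothetical, and the feature extraction and post-processing stages are omitted.

```python
# Illustrative sketch only. All names, the threshold, the similarity score,
# and the toy data are assumptions, not taken from the paper.

def binarize(gray, threshold=128):
    """Preprocessing: map grayscale pixels (0-255) to 1 (ink) or 0 (background)."""
    return [[1 if px < threshold else 0 for px in row] for row in gray]


def segment_columns(binary):
    """Segmentation: cut one text line into glyphs at columns with no ink."""
    if not binary:
        return []
    width = len(binary[0])
    # Vertical projection profile: number of ink pixels in each column.
    profile = [sum(row[x] for row in binary) for x in range(width)]
    spans, start = [], None
    for x, count in enumerate(profile):
        if count and start is None:
            start = x                      # a glyph begins
        elif not count and start is not None:
            spans.append((start, x))       # a glyph ends at a blank column
            start = None
    if start is not None:
        spans.append((start, width))
    return [[row[a:b] for row in binary] for a, b in spans]


def match_template(glyph, templates):
    """Recognition: return (label, score) of the same-size template that
    agrees with the glyph on the largest fraction of pixels."""
    best_label, best_score = None, -1.0
    for label, tmpl in templates.items():
        if len(tmpl) != len(glyph) or len(tmpl[0]) != len(glyph[0]):
            continue                       # this sketch does no size normalization
        total = len(tmpl) * len(tmpl[0])
        agree = sum(g == t
                    for g_row, t_row in zip(glyph, tmpl)
                    for g, t in zip(g_row, t_row))
        score = agree / total
        if score > best_score:
            best_label, best_score = label, score
    return best_label, best_score


def recognize_line(gray, templates):
    """Run binarization, segmentation, and matching on one text line."""
    glyphs = segment_columns(binarize(gray))
    return "".join(match_template(g, templates)[0] or "?" for g in glyphs)


# Toy templates (3 pixels tall). A real Vietnamese set would also need the
# accented glyphs, for example "ă", "ê", "ơ".
TEMPLATES = {
    "I": [[1], [1], [1]],
    "L": [[1, 0], [1, 0], [1, 1]],
    "T": [[1, 1, 1], [0, 1, 0], [0, 1, 0]],
}

# Toy scanned line: "L", blank column, "I", blank column, "T" (1 = ink).
_INK = [
    [1, 0, 0, 1, 0, 1, 1, 1],
    [1, 0, 0, 1, 0, 0, 1, 0],
    [1, 1, 0, 1, 0, 0, 1, 0],
]
TOY_LINE = [[0 if px else 255 for px in row] for row in _INK]  # dark ink on white
```

Calling recognize_line(TOY_LINE, TEMPLATES) on this toy input returns "LIT". Exact pixel agreement is used here only because it is the simplest similarity measure. It requires the glyph and the template to have the same size, so a practical system would normalize glyph size first or use a correlation-based score.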