A high performance domain specific OCR for Bangla script

Includes bibliographical references (page 5).

Bibliographic details
Main authors: Hasnat, Md. Abul; Habib, S. M. Murtoza; Khan, Mumit
Other authors: Center for Research on Bangla Language Processing (CRBLP), BRAC University
Format: Article
Language: English
Published: BRAC University 2010
Links: http://hdl.handle.net/10361/327
Subjects: Optical character reader (OCR); Bangla language processing
Abstract: Research on recognizing Bengali script began in the mid-1980s. A variety of techniques have been applied and their performance examined. In this paper we present a high-performance, domain-specific OCR for recognizing Bengali script. We select the training data set from the script of the specified domain. We choose the Hidden Markov Model (HMM) for character classification because of its simple and straightforward representation. We examine the primary error types, which mainly occur at the preprocessing level, and handle them by adding a special error-correcting module to the recognizer. Finally, we add a dictionary and some error-specific rules to correct probable errors after word formation. Together, these techniques significantly increase the performance of the OCR for a specific domain.
Date: 2007
Extent: 5 pages (application/pdf)
Institution: BRAC University
Collection: Institutional Repository