A high performance domain specific OCR for Bangla script
Includes bibliographical references (page 5).
Main Authors: | Hasnat, Md. Abul; Habib, S. M. Murtoza; Khan, Mumit |
---|---|
Other Authors: | Center for Research on Bangla Language Processing (CRBLP), BRAC University |
Format: | Technical report |
Language: | English |
Published: | BRAC University, 2010 |
Subjects: | Bangla language processing; Bangla OCR |
Online Access: | http://hdl.handle.net/10361/641 |
id |
10361-641 |
---|---|
record_format |
dspace |
spelling |
10361-641; 2019-09-29T05:39:13Z; A high performance domain specific OCR for Bangla script; Hasnat, Md. Abul; Habib, S. M. Murtoza; Khan, Mumit; Center for Research on Bangla Language Processing (CRBLP), BRAC University; Bangla language processing; Bangla OCR; Includes bibliographical references (page 5). Abstract: Research on recognizing Bengali script began in the mid-1980s. A variety of techniques have been applied and their performance examined. In this paper we present a high-performance, domain-specific OCR for recognizing Bengali script. We select the training data set from the script of the specified domain. We choose a Hidden Markov Model (HMM) for character classification because of its simple and straightforward representation. We examine the primary error types, which occur mainly at the preprocessing level, and handle them with a special error-correcting module added to the recognizer. Finally, we add a dictionary and some error-specific rules to correct probable errors after word formation. Together, these techniques significantly increase the performance of the OCR for a specific domain. 2010-10-27T04:32:33Z; 2010-10-27T04:32:33Z; 2008; 2008; Technical report; http://hdl.handle.net/10361/641; en; 5 pages; application/pdf; BRAC University |
institution |
Brac University |
collection |
Institutional Repository |
language |
English |
topic |
Bangla language processing Bangla OCR |
description |
Includes bibliographical references (page 5). |
author2 |
Center for Research on Bangla Language Processing (CRBLP), BRAC University |
author_facet |
Center for Research on Bangla Language Processing (CRBLP), BRAC University Hasnat, Md. Abul Habib, S. M. Murtoza Khan, Mumit |
format |
Technical report |
author |
Hasnat, Md. Abul Habib, S. M. Murtoza Khan, Mumit |
author_sort |
Hasnat, Md. Abul |
title |
A high performance domain specific OCR for Bangla script |
title_sort |
high performance domain specific ocr for bangla script |
publisher |
BRAC University |
publishDate |
2010 |
url |
http://hdl.handle.net/10361/641 |