Bangla character recognition for Android devices
This thesis report is submitted in partial fulfillment of the requirements for the degree of Bachelor of Science in Computer Science and Engineering, 2015.
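Main Authors: | Manzur, Shahrin; Islam, Shafiqul; Foysal, Abu; Chowdhury, Aparajita |
---|---|
Format: | Thesis |
Language: | English |
Published: | BRAC University, 2016 |
Subjects: | Optical Character Recognition (OCR); Tesseract; Bangla language; Android; Leptonica |
Online Access: | http://hdl.handle.net/10361/4894 |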
id |
10361-4894 |
---|---|
record_format |
dspace |
spelling |
10361-4894 2022-01-26T10:04:57Z Bangla character recognition for Android devices. Manzur, Shahrin; Islam, Shafiqul; Foysal, Abu; Chowdhury, Aparajita. Optical Character Recognition (OCR); Tesseract; Bangla language; Android; Leptonica. This thesis report is submitted in partial fulfillment of the requirements for the degree of Bachelor of Science in Computer Science and Engineering, 2015. In this paper, we describe our attempt to create editable documents from images by extracting their text, a process widely known as Optical Character Recognition (OCR). We attempted to build an Android application for recognizing Bengali characters. Several earlier attempts have been made to develop a Bengali OCR, but their limitations motivated this project. To recognize more characters and joint letters, we focused on reducing the error rate so that more of the text is preserved. For this purpose, we found the Tesseract OCR engine and the Leptonica image processing library to be the best option. Tesseract is used to recognize the characters, and Leptonica is used to build an Android application by extracting data from the text. We use Tesseract 3.03, the version currently available, for this project. We also demonstrate how we obtained better results by using Tesseract together with Serak to create box files and trained data. In addition, we discuss how we dealt with joint letters, dangerous ambiguity and contrast issues in order to increase efficiency. Finally, we present our analyzed data, our progress and the future scope for improvement. 2016-01-19T13:20:26Z 2016-01-19T13:20:26Z 2015-12 Thesis ID 12101113 ID 12101128 ID 12101131 ID 12301056 http://hdl.handle.net/10361/4894 en application/pdf BRAC University |
institution |
Brac University |
collection |
Institutional Repository |
language |
English |
topic |
Optical Character Recognition (OCR); Tesseract; Bangla language; Android; Leptonica |
description |
This thesis report is submitted in partial fulfillment of the requirements for the degree of Bachelor of Science in Computer Science and Engineering, 2015. |
format |
Thesis |
author |
Manzur, Shahrin; Islam, Shafiqul; Foysal, Abu; Chowdhury, Aparajita |
title |
Bangla character recognition for Android devices |
publisher |
BRAC University |
publishDate |
2016 |
url |
http://hdl.handle.net/10361/4894 |