Bangla character recognition for Android devices

This thesis report is submitted in partial fulfillment of the requirements for the degree of Bachelor of Science in Computer Science and Engineering, 2015.

Bibliographic Details
Main Authors: Manzur, Shahrin; Islam, Shafiqul; Foysal, Abu; Chowdhury, Aparajita
Format: Thesis
Language: English
Published: BRAC University 2016
Subjects:
Online Access: http://hdl.handle.net/10361/4894
id 10361-4894
record_format dspace
spelling 10361-4894 2022-01-26T10:04:57Z
Bangla character recognition for Android devices
Manzur, Shahrin; Islam, Shafiqul; Foysal, Abu; Chowdhury, Aparajita
Optical Character Recognition (OCR); Tesseract; Bangla language; Android; Leptonica
This thesis report is submitted in partial fulfillment of the requirements for the degree of Bachelor of Science in Computer Science and Engineering, 2015.
In this paper, we describe our attempt to create editable documents from images by retrieving their text, a process widely known as Optical Character Recognition (OCR). We have tried to build an Android application for detecting Bengali characters. Several earlier attempts have been made to develop a Bengali OCR, but their limitations drove us to work on this project. To recognize more characters and joint letters, we worked on reducing the error rate so that more text is preserved. For this purpose, we found the Tesseract OCR engine and the Leptonica image processing library to be the best option. Tesseract is used to recognize the characters, and Leptonica is used to build an Android application by extracting data from the text. We use Tesseract 3.03, the version currently available, for this project. Moreover, we demonstrate how we obtained better results by using Tesseract along with Serak to create box files and trained data. We also discuss how we dealt with joint letters, dangerous ambiguity, and contrast issues in order to increase efficiency. Finally, we explain our analyzed data, our progress, and the future scope for improvement.
2016-01-19T13:20:26Z 2016-01-19T13:20:26Z 2015-12
Thesis
ID 12101113 ID 12101128 ID 12101131 ID 12301056
http://hdl.handle.net/10361/4894
en application/pdf
BRAC University
institution Brac University
collection Institutional Repository
language English
topic Optical Character Recognition (OCR)
Tesseract
Bangla language
Android
Leptonica
format Thesis
author Manzur, Shahrin
Islam, Shafiqul
Foysal, Abu
Chowdhury, Aparajita
title Bangla character recognition for Android devices
publisher BRAC University
publishDate 2016
url http://hdl.handle.net/10361/4894