Segmentation free Bangla OCR using HMM: Training and recognition

Includes bibliographical references (pages 7-8).

Bibliographic Details
Main Authors: Hasnat, Md. Abul; Habib, S. M. Murtoza; Khan, Mumit
Other Authors: Center for Research on Bangla language processing (CRBLP), BRAC University
Format: Article
Language: English
Published: BRAC University 2010
Online Access: http://hdl.handle.net/10361/666
Record ID: 10361-666
Record format: DSpace
Timestamps in record: 2019-09-29T05:27:30Z; 2010-12-06T10:37:04Z
Year in record: 2007
File format: application/pdf
Abstract: Hidden Markov Models (HMMs) are most widely applied in speech recognition, where each spoken word is treated as a single unit to be recognized from a trained word network. Some research has applied the same concept to character recognition. In this paper, we present the training and recognition mechanism of an HMM-based Optical Character Recognition (OCR) system for Bangla characters that supports multiple fonts. The central idea of our approach is a separate HMM for each segmented character or word. We emphasize word-level segmentation and treat a single character as a word when it appears alone after segmentation. The system uses the HTK toolkit for data preparation, model training from multiple samples, and recognition. Features of each training character are computed by applying the Discrete Cosine Transform (DCT) to the pixel values of the character image, which is divided into several frames according to its size. The features extracted from each frame are used as discrete probability distributions and given as input parameters to each HMM. For recognition, a model for each separated character or word is built using the same approach and passed to the HTK toolkit, which performs recognition using Viterbi decoding. The experimental results show significant performance.
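The frame-wise DCT feature extraction described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the fixed frame width, the zero padding, and the number of retained coefficients are all assumptions made for illustration, since the record does not state the paper's actual parameters. The sketch splits a character image into vertical frames and computes a 2D DCT-II per frame using only NumPy.

```python
import numpy as np


def dct2_matrix(n):
    """Return the orthonormal DCT-II basis as an n x n matrix."""
    k = np.arange(n)[:, None]  # frequency index
    i = np.arange(n)[None, :]  # sample index
    basis = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    basis[0, :] = np.sqrt(1.0 / n)  # DC row has its own scaling
    return basis


def frame_dct_features(image, frame_width=8, n_coeffs=10):
    """Split an image into vertical frames and return DCT features per frame.

    frame_width and n_coeffs are hypothetical values chosen for this sketch.
    """
    image = np.asarray(image, dtype=float)
    height, width = image.shape

    # Zero-pad on the right so the width is a multiple of frame_width.
    pad = (-width) % frame_width
    if pad:
        image = np.pad(image, ((0, 0), (0, pad)), mode="constant")

    row_basis = dct2_matrix(height)
    col_basis = dct2_matrix(frame_width)

    features = []
    for start in range(0, image.shape[1], frame_width):
        frame = image[:, start:start + frame_width]
        coeffs = row_basis @ frame @ col_basis.T  # separable 2D DCT-II
        # Keep the first coefficients in row-major order (a simplification;
        # a zigzag scan is another common choice).
        features.append(coeffs.ravel()[:n_coeffs])
    return np.array(features)


if __name__ == "__main__":
    # A synthetic 16 x 20 binary "character" image stands in for real data.
    rng = np.random.default_rng(0)
    demo_image = (rng.random((16, 20)) > 0.5).astype(float)
    demo_features = frame_dct_features(demo_image)
    print(demo_features.shape)
```

Each row of the returned array is the feature vector of one frame, so a character image becomes a left-to-right sequence of observations. In a pipeline like the one the abstract describes, such sequences would then be converted to the format the HTK tools expect before training and Viterbi decoding; that conversion is not shown here.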
Institution: BRAC University
Collection: Institutional Repository
Subjects: Optical character recognition (OCR); Hidden Markov Model (HMM); HTK; Discrete cosine transform (DCT)