Comprehensive analysis and development of deep learning models for Bengali character’s spectrogram image classification in child speech: introduction of spectro SETNet

This thesis is submitted in partial fulfillment of the requirements for the degree of Bachelor of Science in Computer Science, 2024.

Bibliographic Details
Main Authors: Ahmed, Syed Istiaque, Hossain, Md. Jubayer, Hoque, Kayes Mohammad Bin, Tusher, Mahmadur Rahman, Islam, Sajedur
Other Authors: Alam, Md. Golam Rabiul
Format: Thesis
Language: English
Published: Brac University 2024
Subjects:
Online Access: http://hdl.handle.net/10361/24029
id 10361-24029
record_format dspace
spelling 10361-24029 2024-09-10T06:30:46Z Comprehensive analysis and development of deep learning models for Bengali character’s spectrogram image classification in child speech: introduction of spectro SETNet Ahmed, Syed Istiaque Hossain, Md. Jubayer Hoque, Kayes Mohammad Bin Tusher, Mahmadur Rahman Islam, Sajedur Alam, Md. Golam Rabiul Nayla, Nishat Department of Computer Science and Engineering, Brac University Automatic speech recognition Character’s recognition Deep learning Mel-frequency spectrogram Spectro-SETNet Automatic speech recognition--Data processing. Deep learning (Machine learning). Spectrometer--Data processing. This thesis is submitted in partial fulfillment of the requirements for the degree of Bachelor of Science in Computer Science, 2024. Cataloged from PDF version of thesis. Includes bibliographical references (pages 82-84). In rapidly developing linguistic technology, phoneme recognition plays a key role in language understanding and language learning. This research develops a recognition system for Bangla vowels, consonants, and numbers spoken by children aged three to six years. By combining advanced technological methods with classical phonetic education, we classify spectrogram images of the speech of the Bengali children we investigate. Image recognition and large language models (LLMs) are among the most pervasive techniques in modern machine learning (ML), and they have extended to the less explored domain of Bangla phoneme spectrogram image recognition. From our group of 21 participants, we generated a new, balanced dataset of 31,147 spectrogram images, created from scratch. The dataset was prepared meticulously to serve as a complete resource for researchers working on phoneme recognition for Bangla-speaking children.
We then trained ten existing deep learning models on our dataset and optimized their performance for Bangla phoneme recognition. Of these, the SENet model stood out with 96.89% accuracy on our test set. The ResNet50 and VGG19 models ranked second and third, with accuracies of 88.8% and 87%, respectively. Based on these findings, we propose a novel architecture, Spectrogram SE-Transformer Block Network (Spectro SETNet), a hybrid that adds SE and Transformer blocks to the ResNet50 model in order to cope with more complicated data while limiting the computational power required. Our hypothesis is that the model not only improves the accuracy of Bengali speech recognition for children but also offers a new standard for more complex data processing with less computational power. B.Sc in Computer Science 2024-09-09T05:00:39Z 2024-09-09T05:00:39Z ©2024 2024-05 Thesis ID 20101273 ID 20101470 ID 20101471 ID 20101005 ID 23141093 http://hdl.handle.net/10361/24029 en Brac University theses are protected by copyright. They may be viewed from this source for any purpose, but reproduction or distribution in any format is prohibited without written permission. 84 pages application/pdf Brac University
institution Brac University
collection Institutional Repository
language English
topic Automatic speech recognition
Character’s recognition
Deep learning
Mel-frequency spectrogram
Spectro-SETNet
Automatic speech recognition--Data processing.
Deep learning (Machine learning).
Spectrometer--Data processing.
description This thesis is submitted in partial fulfillment of the requirements for the degree of Bachelor of Science in Computer Science, 2024.
author2 Alam, Md. Golam Rabiul
author_facet Alam, Md. Golam Rabiul
Ahmed, Syed Istiaque
Hossain, Md. Jubayer
Hoque, Kayes Mohammad Bin
Tusher, Mahmadur Rahman
Islam, Sajedur
format Thesis
author Ahmed, Syed Istiaque
Hossain, Md. Jubayer
Hoque, Kayes Mohammad Bin
Tusher, Mahmadur Rahman
Islam, Sajedur
author_sort Ahmed, Syed Istiaque
title Comprehensive analysis and development of deep learning models for Bengali character’s spectrogram image classification in child speech: introduction of spectro SETNet
publisher Brac University
publishDate 2024
url http://hdl.handle.net/10361/24029