Classification of Bangla regional languages and recognition of artificial Bangla speech using deep learning
This thesis is submitted in partial fulfillment of the requirements for the degree of Master of Science in Computer Science and Engineering, 2022.
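Main author: | Hossain, Prommy Sultana |
---|---|
Other authors: | Chakrabarty, Amitabha |
Format: | Thesis |
Language: | English |
Published: | Brac University 2022 |
Subjects: | Convolutional autoencoder; Extreme learning machine; Bangla regional language; Speech recognition; Cognitive learning theory (Deep learning); Automatic speech recognition; Machine learning |
Online access: | http://hdl.handle.net/10361/17206 |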
id |
10361-17206 |
---|---|
record_format |
dspace |
spelling |
10361-17206 2022-09-13T21:01:38Z Classification of Bangla regional languages and recognition of artificial Bangla speech using deep learning Hossain, Prommy Sultana Chakrabarty, Amitabha Department of Computer Science and Engineering, Brac University Convolutional autoencoder Extreme learning machine Bangla regional language Speech recognition Cognitive learning theory (Deep learning) Automatic speech recognition. Machine learning This thesis is submitted in partial fulfillment of the requirements for the degree of Master of Science in Computer Science and Engineering, 2022. Cataloged from PDF version of thesis. Includes bibliographical references (pages 72-77). Since 1970, researchers have been attempting to recognize and comprehend spontaneous speech, and many techniques have been employed to build automatic voice recognition systems. English is the usual choice for voice recognition because most research and implementation work has focused on it. However, Bangla is the fifth most widely spoken language in the world, and Bangla regional language voice recognition has the potential to significantly influence human-computer interaction and internet of things applications. Most Bangla speech recognition research in the past decade involves classifying age and gender, identifying speakers, and detecting specific words. However, classifying regional Bangla languages from Bangla speech and identifying artificial Bangla speech have not been researched heavily, owing to the limited grammatical and phonetic databases covering the various Bangla regional languages. Hence the author of this paper has created a 30-hour Bangla regional language speech dataset that covers the dialects spoken by locals in seven districts/divisions of Bangladesh. Bangla speech was generated by first converting Bangla words to English word abbreviations (often used as text language) that would ultimately translate to an English phrase. 
Additionally, to classify the regional language spoken in the audio signal and determine its authenticity, the suggested technique combines a stacked convolutional autoencoder (SCAE) with a sequence of multi-label extreme learning machines (MLELMs). The SCAE section of the model creates a detailed feature map from Mel Frequency Energy Coefficient (MFEC) input data by identifying its spatially and temporally salient qualities. The feature vector is then fed to the first MLELM network to produce a soft classification score for each sample, based on which the second MLELM network generates hard labels. The suggested method was extensively trained and tested on unsupervised data, formed by composing new sentences from the unique Bangla/English abbreviation words. The model is also able to categorize speaker characteristics such as age and gender. Experiments showed that the model classifies regional language more accurately when the age class is taken into consideration, since aging produces physiological changes in the brain that alter the processing of aural information: classification accuracy rises from 75% without age class consideration to 92% with it. This was achieved through the use of MLELM networks, which, given a multi-label input dataset, classify labels based on linked patterns between classes. The classification accuracy is 93% for synthesised Bangla speech labels, 95% for age, and 92% for gender class labels. The proposed methodology works well with an English speech audio set as well. Prommy Sultana Hossain M. Computer Science and Engineering 2022-09-13T06:09:35Z 2022-09-13T06:09:35Z 2022 2022-03 Thesis ID 20166014 http://hdl.handle.net/10361/17206 en Brac University theses are protected by copyright. They may be viewed from this source for any purpose, but reproduction or distribution in any format is prohibited without written permission. 77 pages application/pdf Brac University |
institution |
Brac University |
collection |
Institutional Repository |
language |
English |
topic |
Convolutional autoencoder; Extreme learning machine; Bangla regional language; Speech recognition; Cognitive learning theory (Deep learning); Automatic speech recognition; Machine learning |
author2 |
Chakrabarty, Amitabha |
author_facet |
Chakrabarty, Amitabha Hossain, Prommy Sultana |
format |
Thesis |
author |
Hossain, Prommy Sultana |
title |
Classification of Bangla regional languages and recognition of artificial Bangla speech using deep learning |
publisher |
Brac University |
publishDate |
2022 |
url |
http://hdl.handle.net/10361/17206 |
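The abstract describes a first MLELM that produces soft classification scores and a second MLELM that turns them into hard labels. The sketch below illustrates that two-stage idea with a generic extreme learning machine (random, untrained hidden layer plus a closed-form ridge readout). It is a minimal illustration under stated assumptions, not the thesis implementation: the `ELM` class, the `two_stage_predict` helper, the layer sizes, the tanh activation, the zero threshold, and the synthetic features standing in for SCAE outputs are all hypothetical.

```python
import numpy as np


class ELM:
    """Generic extreme learning machine: random hidden layer, ridge readout.

    Hypothetical helper for illustration; not taken from the thesis.
    """

    def __init__(self, n_hidden, reg=1e-3, seed=0):
        self.n_hidden = n_hidden
        self.reg = reg
        self.rng = np.random.default_rng(seed)

    def _hidden(self, X):
        # Fixed random projection followed by a tanh nonlinearity.
        return np.tanh(X @ self.W + self.b)

    def fit(self, X, T):
        n_features = X.shape[1]
        # Hidden weights are drawn once at random and never trained.
        self.W = self.rng.normal(size=(n_features, self.n_hidden)) / np.sqrt(n_features)
        self.b = self.rng.normal(size=self.n_hidden)
        H = self._hidden(X)
        # Closed-form ridge solution for the output weights.
        A = H.T @ H + self.reg * np.eye(self.n_hidden)
        self.beta = np.linalg.solve(A, H.T @ T)
        return self

    def predict(self, X):
        return self._hidden(X) @ self.beta


def two_stage_predict(first, second, X):
    """Soft scores from the first ELM, hard 0/1 labels from the second."""
    soft = first.predict(X)
    hard = (second.predict(soft) > 0).astype(int)  # assumed threshold at 0
    return soft, hard


# Synthetic stand-in for SCAE feature vectors (assumption: 16-dim features).
rng = np.random.default_rng(42)
X = rng.normal(size=(200, 16))
# Two toy binary labels per sample (multi-label targets).
Y = np.stack([X[:, 0] > 0, X[:, 1] + X[:, 2] > 0], axis=1).astype(int)
T = 2 * Y - 1  # encode labels as -1/+1 for regression

first = ELM(n_hidden=64, seed=0).fit(X, T)
second = ELM(n_hidden=32, seed=1).fit(first.predict(X), T)

soft, hard = two_stage_predict(first, second, X)
train_acc = (hard == Y).mean()
print(f"training label accuracy: {train_acc:.3f}")
```

Only the output weights are solved for, which is why training reduces to a single linear solve per network; the second network sees all soft scores at once, so in principle it can exploit correlations between labels.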