Into the heart of Bangla speech: advancing speech sentiment recognition with semi-supervised multimodal machine learning model leveraging an iterative SHAP-based feature selection

This thesis is submitted in partial fulfillment of the requirements for the degree of Master of Science in Computer Science, 2024.

Bibliographic details
Main author:	Shruti, Abanti Chakraborty
Other authors:	Alam, Md. Golam Robiul
Format:	Thesis
Language:	English
Published:	Brac University 2024
Subjects:	Sentiment analysis; Machine learning; Bengali speech; Decision tree; Random forest regressor; CNN; BanglaBERT; LSTM; Automatic speech recognition > Bengali; Natural language processing (Computer science); Speech processing systems; Computational linguistics; Sentiment analysis > Data processing
Online access:	http://hdl.handle.net/10361/24034

id	10361-24034
record_format	dspace
spelling	10361-24034. 2024-09-09T21:04:34Z. Into the heart of Bangla speech: advancing speech sentiment recognition with semi-supervised multimodal machine learning model leveraging an iterative SHAP-based feature selection. Shruti, Abanti Chakraborty; Alam, Md. Golam Robiul. Department of Computer Science and Engineering, Brac University. This thesis is submitted in partial fulfillment of the requirements for the degree of Master of Science in Computer Science, 2024. Cataloged from the PDF version of the thesis. Includes bibliographical references (pages 76-81).
Automatic sentiment recognition from speech is crucial for many applications. As AI has grown in popularity, speech sentiment analysis has become more important, along with the volume of speech data in every industry. Bengali is the seventh most spoken language in the world, yet research on speech sentiment analysis in this language is lacking. This thesis investigates novel techniques to enhance speech sentiment recognition in under-resourced languages such as Bengali. We explore the efficacy of both unimodal (speech only) and multimodal (speech, image, and text) approaches with different fusion techniques. We propose a semi-supervised Random Forest model that achieved consistent and robust performance across different modality combinations. The model reached high accuracy with fewer features, showing the efficiency and effectiveness of SHAP-based semi-supervised learning in handling unlabeled data. Eight feature extraction techniques were used to extract acoustic features, and VGG19 and Bangla Word2Vec were used to extract image and text features, respectively.
This study also experimented with modality-specific models such as LSTM, CNN, and BanglaBERT. The experiments used the BanglaSER, SUBESCO, and KBES datasets. Among the models tested, early fusion proved the most effective, achieving an accuracy of up to 83% when combining the speech and text modalities with LSTM classifiers, while the proposed semi-supervised model reached its highest accuracy, 77%, on the combined audio, text, and image modalities. In contrast, late fusion showed reduced performance, though including the speech and image modalities improved accuracy to 62%. Detailed performance comparisons for unimodal systems indicate that traditional Random Forest models perform well with fully labeled datasets, but our semi-supervised model works comparatively well with only 20% labeled data. Moreover, our proposed semi-supervised AdaBoost model, using only 20 features and SHAP-based feature importance, outperformed the traditional model trained with 50 features. Notably, the proposed Random Forest model trained with 20% labeled and 80% unlabeled data achieved over 70% accuracy across different feature selection methods, with the weighted feature selection technique achieving the highest accuracy of 72%. We believe this thesis will contribute significantly to Bangla speech sentiment recognition by providing a robust, efficient, and interpretable framework.
Abanti Chakraborty Shruti. M.Sc. in Computer Science. 2024-09-09T06:44:42Z. ©2024. 2024-06. Thesis ID 22366034. http://hdl.handle.net/10361/24034. en. Brac University theses are protected by copyright. They may be viewed from this source for any purpose, but reproduction or distribution in any format is prohibited without written permission. 97 pages. application/pdf. Brac University.
institution	Brac University
collection	Institutional Repository
language	English
topic	Sentiment analysis; Machine learning; Bengali speech; Decision tree; Random forest regressor; CNN; BanglaBERT; LSTM; Automatic speech recognition--Bengali; Natural language processing (Computer science); Speech processing systems; Computational linguistics; Sentiment analysis--Data processing
description	This thesis is submitted in partial fulfillment of the requirements for the degree of Master of Science in Computer Science, 2024.
author2	Alam, Md. Golam Robiul
author_facet	Alam, Md. Golam Robiul Shruti, Abanti Chakraborty
format	Thesis
author	Shruti, Abanti Chakraborty
author_sort	Shruti, Abanti Chakraborty
title	Into the heart of Bangla speech: advancing speech sentiment recognition with semi-supervised multimodal machine learning model leveraging an iterative SHAP-based feature selection
publisher	Brac University
publishDate	2024
url	http://hdl.handle.net/10361/24034
