Identifying hate speech of Bangla language text using natural language processing

This thesis is submitted in partial fulfillment of the requirements for the degree of Bachelor of Science in Computer Science and Engineering, 2024.

Bibliographic details
Main authors: Rahman, Mushfiqur; Jui, Razia Sultana; Sakib, Chowdhury Nazmuz; Ridoy, Fahim Alavi; Ananya, Taskiea Tabassum
Other authors: Rasel, Annajiat Alim
Format: Thesis
Language: English
Published: Brac University 2024
Subjects:
Online access: http://hdl.handle.net/10361/22864
id 10361-22864
record_format dspace
spelling 10361-22864 2024-05-19T21:05:32Z
Identifying hate speech of Bangla language text using natural language processing
Rahman, Mushfiqur; Jui, Razia Sultana; Sakib, Chowdhury Nazmuz; Ridoy, Fahim Alavi; Ananya, Taskiea Tabassum
Rasel, Annajiat Alim; Hossain, Muhammad Iqbal; Karim, Dewan Ziaul
Department of Computer Science and Engineering, Brac University
Bangla language; Natural language processing; Machine learning; Deep learning; Offensive language; Natural language processing (Computer science); Automatic speech recognition; Deep learning (Machine learning)
This thesis is submitted in partial fulfillment of the requirements for the degree of Bachelor of Science in Computer Science and Engineering, 2024.
Cataloged from PDF version of thesis.
Includes bibliographical references (pages 22-24).
In the internet era, sharing information through social media has brought significant benefits. People can easily observe others' lifestyles and work and comment on or share thoughts about them. However, this practice also brings challenges, such as hate comments, abusive online criticism, and the toxicity they spread. The internet's flexibility and anonymity have created a culture in which users find it easy to express themselves aggressively. As the amount of hate speech increases, a method for detecting it automatically is needed. To address this concern, recent research has used diverse feature engineering methods and machine learning algorithms to identify hate speech messages automatically across various datasets. Since the problem falls within Natural Language Processing (NLP), our goal is to use NLP to detect hate speech and to demonstrate how deep learning and machine learning can be applied to it. Of the more than 7,100 languages spoken throughout the world, we have chosen Bengali as our dataset language. We train our models with machine learning and deep learning to detect hate speech automatically. We use Multinomial Naive Bayes, RNN, Random Forest, Logistic Regression, Decision Tree Classifier, a CNN-LSTM hybrid, and multilingual Bidirectional Encoder Representations from Transformers (mBERT) to compare results. Among these, mBERT achieved the highest accuracy for binary classification, 90.00%. For multiclass classification, the CNN-LSTM hybrid achieved the highest accuracy, 64%, followed by mBERT at 62%. We are committed to further improving these results.
Mushfiqur Rahman; Razia Sultana Jui; Chowdhury Nazmuz Sakib; Fahim Alavi Ridoy; Taskiea Tabassum Ananya
B.Sc in Computer Science and Engineering
2024-05-19T06:21:47Z 2024-05-19T06:21:47Z ©2024 2024-01
Thesis
ID: 18301121 ID: 18301021 ID: 18301109 ID: 19301071 ID: 19301192
http://hdl.handle.net/10361/22864
en
Brac University theses are protected by copyright. They may be viewed from this source for any purpose, but reproduction or distribution in any format is prohibited without written permission.
34 pages
application/pdf
Brac University
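The abstract names Multinomial Naive Bayes as one of the classifiers compared for binary hate speech detection. As a rough illustration of that one technique, the following is a minimal sketch in plain Python with Laplace smoothing. It is not the thesis's implementation: the function names, the whitespace tokenizer, the label strings, and the tiny placeholder corpus (made-up tokens such as `bad_a` and `ok_a` standing in for real Bangla text) are all assumptions made for this example.

```python
import math
from collections import Counter, defaultdict


def train_multinomial_nb(docs, labels, alpha=1.0):
    """Fit multinomial Naive Bayes on whitespace-tokenized text (illustrative only)."""
    class_doc_counts = Counter(labels)
    token_counts = defaultdict(Counter)  # label -> token -> count
    vocab = set()
    for text, label in zip(docs, labels):
        tokens = text.split()
        token_counts[label].update(tokens)
        vocab.update(tokens)

    total_docs = len(docs)
    # Class priors in log space.
    log_prior = {c: math.log(n / total_docs) for c, n in class_doc_counts.items()}

    # Per-class token likelihoods with add-alpha (Laplace) smoothing.
    log_likelihood = {}
    for c in class_doc_counts:
        denom = sum(token_counts[c].values()) + alpha * len(vocab)
        log_likelihood[c] = {
            t: math.log((token_counts[c][t] + alpha) / denom) for t in vocab
        }
    return log_prior, log_likelihood, vocab


def predict(model, text):
    """Return the label with the highest posterior log score."""
    log_prior, log_likelihood, vocab = model
    scores = {}
    for c in log_prior:
        score = log_prior[c]
        for t in text.split():
            if t in vocab:  # tokens never seen in training are ignored
                score += log_likelihood[c][t]
        scores[c] = score
    return max(scores, key=scores.get)


def accuracy(model, docs, labels):
    """Fraction of documents whose predicted label matches the given label."""
    correct = sum(predict(model, d) == y for d, y in zip(docs, labels))
    return correct / len(docs)


# Hypothetical placeholder corpus; NOT data from the thesis.
train_docs = ["bad_a bad_b ok_a", "bad_b bad_c", "ok_a ok_b", "ok_b ok_c ok_a"]
train_labels = ["hate", "hate", "not_hate", "not_hate"]
model = train_multinomial_nb(train_docs, train_labels)
```

Calling `predict(model, "bad_a bad_c")` scores both labels and returns the more probable one. An accuracy figure comparable in kind to those quoted in the abstract would come from applying `accuracy` to a held-out test split rather than to the training documents.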
institution Brac University
collection Institutional Repository
language English
topic Bangla language
Natural language processing
Machine learning
Deep learning
Offensive language
Natural language processing (Computer science)
Automatic speech recognition
Deep learning (Machine learning)
author2 Rasel, Annajiat Alim
format Thesis
author Rahman, Mushfiqur
Jui, Razia Sultana
Sakib, Chowdhury Nazmuz
Ridoy, Fahim Alavi
Ananya, Taskiea Tabassum
title Identifying hate speech of Bangla language text using natural language processing
publisher Brac University
publishDate 2024
url http://hdl.handle.net/10361/22864