Enhanced hate speech detection in social media using transformer-based models
This thesis is submitted in partial fulfillment of the requirements for the degree of Bachelor of Science in Computer Science, 2024.
Main Authors: | Tabasshum, Anika, Ashrafi, Fairuz Tasnim, Afreen, Sadia |
---|---|
Other Authors: | Alam, Md Golam Rabiul |
Format: | Thesis |
Language: | English |
Published: | Brac University, 2024 |
Subjects: | |
Online Access: | http://hdl.handle.net/10361/22841 |
id |
10361-22841 |
---|---|
record_format |
dspace |
spelling |
10361-22841 2024-10-21T05:54:34Z Enhanced hate speech detection in social media using transformer-based models Tabasshum, Anika Ashrafi, Fairuz Tasnim Afreen, Sadia Alam, Md Golam Rabiul Offensive language Neural network Machine learning Social media CNN Comment classification Neural networks (Computer science). Online social networks--Security measures. Social media. Natural language processing (Computer science). This thesis is submitted in partial fulfillment of the requirements for the degree of Bachelor of Science in Computer Science, 2024. Cataloged from PDF version of thesis. Includes bibliographical references (pages 61-66). Hate speech on social media can escalate into “cyber conflict,” detrimentally impacting social life. With the exponential growth of Internet users and media content, identifying abusive language in audio and video content has become increasingly challenging. The nuances of human communication mean that individuals might use seemingly non-hateful language in derogatory ways, often accompanied by specific voice tones and gestures that aren’t captured when multimedia is converted into text. This research examines hate speech detection, aiming to automatically identify harmful content across various social media platforms. Initially focused on text, our study used remote supervision to create an automatically labeled dataset and employed word embeddings with a bias toward hate. We analyzed datasets from Twitter, testing various machine-learning models to gauge the representation of hate speech and abusive language. Any tweet or online post exhibiting racist or sexist sentiments was categorized as “hate speech.” Our objective was to systematically classify such messages for better content moderation. As our research advanced, we extended our detection capabilities to audio content. 
By leveraging simple feed-forward neural networks, RNNs, and CNNs, we can now discern hate speech patterns in audio with enhanced accuracy. However, the vastness of content on social media platforms means not every piece can be manually moderated. This underscores the importance of automated hate speech detection, especially when dealing with content in linguistically challenging languages. Our study provides a unique transformer-based methodology for detecting hate speech in social media. The proposed model uses Natural Language Processing (NLP) approaches to assess text and audio input. To increase the accuracy of hate speech identification, we use sophisticated deep learning architectures such as attention mechanisms and transformers. Our model is trained on a large dataset of tweets and audio recordings, and its performance is measured using a variety of criteria. According to the results, our transformer-based approach outperforms existing state-of-the-art hate speech identification methods. Our study makes an essential addition to the field of computer science and engineering by addressing the critical issue of hate speech on social media and proposing an effective solution based on modern machine learning techniques. Anika Tabasshum Fairuz Tasnim Ashrafi Sadia Afreen B.Sc. in Computer Science 2024-05-15T08:26:26Z 2024-05-15T08:26:26Z ©2024 2024-01 Thesis ID: 19201106 ID: 19201035 ID: 19201105 http://hdl.handle.net/10361/22841 en Brac University theses are protected by copyright. They may be viewed from this source for any purpose, but reproduction or distribution in any format is prohibited without written permission. 76 pages application/pdf Brac University |
institution |
Brac University |
collection |
Institutional Repository |
language |
English |
topic |
Offensive language Neural network Machine learning Social media CNN Comment classification Neural networks (Computer science). Online social networks--Security measures. Social media. Natural language processing (Computer science). |
spellingShingle |
Offensive language Neural network Machine learning Social media CNN Comment classification Neural networks (Computer science). Online social networks--Security measures. Social media. Natural language processing (Computer science). Tabasshum, Anika Ashrafi, Fairuz Tasnim Afreen, Sadia Enhanced hate speech detection in social media using transformer-based models |
description |
This thesis is submitted in partial fulfillment of the requirements for the degree of Bachelor of Science in Computer Science, 2024. |
author2 |
Alam, Md Golam Rabiul |
author_facet |
Alam, Md Golam Rabiul Tabasshum, Anika Ashrafi, Fairuz Tasnim Afreen, Sadia |
format |
Thesis |
author |
Tabasshum, Anika Ashrafi, Fairuz Tasnim Afreen, Sadia |
author_sort |
Tabasshum, Anika |
title |
Enhanced hate speech detection in social media using transformer-based models |
title_short |
Enhanced hate speech detection in social media using transformer-based models |
title_full |
Enhanced hate speech detection in social media using transformer-based models |
title_fullStr |
Enhanced hate speech detection in social media using transformer-based models |
title_full_unstemmed |
Enhanced hate speech detection in social media using transformer-based models |
title_sort |
enhanced hate speech detection in social media using transformer-based models |
publisher |
Brac University |
publishDate |
2024 |
url |
http://hdl.handle.net/10361/22841 |
work_keys_str_mv |
AT tabasshumanika enhancedhatespeechdetectioninsocialmediausingtransformerbasedmodels AT ashrafifairuztasnim enhancedhatespeechdetectioninsocialmediausingtransformerbasedmodels AT afreensadia enhancedhatespeechdetectioninsocialmediausingtransformerbasedmodels |
_version_ |
1814308572667838464 |