Enhanced hate speech detection in social media using transformer-based models

This thesis is submitted in partial fulfillment of the requirements for the degree of Bachelor of Science in Computer Science, 2024.

Bibliographic Details
Main Authors:	Tabasshum, Anika; Ashrafi, Fairuz Tasnim; Afreen, Sadia
Other Authors:	Alam, Md Golam Rabiul
Format:	Thesis
Language:	English
Published:	Brac University 2024
Subjects:	Offensive language; Neural network; Machine learning; Social media; CNN; Comment classification; Neural networks (Computer science); Online social networks > Security measures; Social media; Natural language processing (Computer science)
Online Access:	http://hdl.handle.net/10361/22841

id	10361-22841
record_format	dspace
spelling	10361-22841 2024-10-21T05:54:34Z
notes	Cataloged from PDF version of thesis. Includes bibliographical references (pages 61-66).
abstract	Hate speech on social media can escalate into "cyber conflict," detrimentally impacting social life. With the exponential growth of Internet users and media content, identifying abusive language in audio and video content has become increasingly challenging. The nuances of human communication mean that individuals might employ seemingly non-hateful language in derogatory ways, often accompanied by specific voice tones and gestures that are not captured when converting multimedia into text. This research investigates hate speech detection, aiming to automatically identify harmful content across various social media platforms. Initially focused on text, our study utilized remote supervision for automatically labeled dataset creation and employed word embeddings with a bias toward hate. We analyzed datasets from Twitter, testing various machine-learning models to gauge the representation of hate speech and abusive language. Any tweet or online post exhibiting racist or sexist sentiments was categorized as "hate speech." Our objective was to classify such messages systematically for better content moderation. With advancements in our research, we have extended our detection capabilities to audio content. By leveraging simple feed-forward neural networks, RNNs, and CNNs, we can now discern hate speech patterns in audio with enhanced accuracy. However, the vastness of content on social media platforms means that not every piece can be manually moderated, so hate speech must be identified automatically. This need is heightened when the content is written in linguistically challenging languages. Our study provides a unique transformer-based methodology for detecting hate speech in social media. The proposed model uses Natural Language Processing (NLP) approaches to assess text and audio input. To increase the accuracy of hate speech identification, we use sophisticated deep learning architectures such as attention mechanisms and transformers. Our model is trained on a large dataset of tweets and audio recordings, and its performance is measured using a variety of criteria. According to the results, our transformer-based approach outperforms existing state-of-the-art hate speech identification methods. Our study makes an essential addition to the field of computer science and engineering by addressing the critical issue of hate speech on social media and proposing an effective solution based on modern machine learning techniques.
degree	B.Sc. in Computer Science
dates	2024-05-15T08:26:26Z; 2024-05-15T08:26:26Z; ©2024; 2024-01
student_ids	19201106; 19201035; 19201105
rights	Brac University theses are protected by copyright. They may be viewed from this source for any purpose, but reproduction or distribution in any format is prohibited without written permission.
extent	76 pages
mimetype	application/pdf
institution	Brac University
collection	Institutional Repository
language	English
topic	Offensive language; Neural network; Machine learning; Social media; CNN; Comment classification; Neural networks (Computer science); Online social networks--Security measures; Social media; Natural language processing (Computer science)
description	This thesis is submitted in partial fulfillment of the requirements for the degree of Bachelor of Science in Computer Science, 2024.
author2	Alam, Md Golam Rabiul
author_facet	Alam, Md Golam Rabiul; Tabasshum, Anika; Ashrafi, Fairuz Tasnim; Afreen, Sadia
format	Thesis
author	Tabasshum, Anika; Ashrafi, Fairuz Tasnim; Afreen, Sadia
author_sort	Tabasshum, Anika
title	Enhanced hate speech detection in social media using transformer-based models
publisher	Brac University
publishDate	2024
url	http://hdl.handle.net/10361/22841
