Comparative study of toxic comments classification using machine learning algorithms

This thesis is submitted in partial fulfillment of the requirements for the degree of Bachelor of Science in Computer Science and Engineering, 2021.

Bibliographic details
Main authors:	Razzak, Razia; Sadril, Md.; Shakil, Mahmudul Hasan; Rahman, Mahfuzur; Taki, Sabiha Tul Omman
Other authors:	Chakrabarty, Amitabha
Format:	Thesis
Language:	en_US
Published:	Brac University 2021
Subjects:	Cyberbullying; Natural Language Processing; Word Embedding; Convolutional Neural Networks; XGBoost; Support Vector Machine
Links:	http://hdl.handle.net/10361/14810

id	10361-14810
record_format	dspace
spelling	10361-14810; 2022-01-26T10:15:49Z. Comparative study of toxic comments classification using machine learning algorithms. Razzak, Razia; Sadril, Md.; Shakil, Mahmudul Hasan; Rahman, Mahfuzur; Taki, Sabiha Tul Omman; Chakrabarty, Amitabha. Department of Computer Science and Engineering, Brac University. Cyberbullying; Natural Language Processing; Word Embedding; Convolutional Neural Networks; XGBoost; Support Vector Machine.
This thesis is submitted in partial fulfillment of the requirements for the degree of Bachelor of Science in Computer Science and Engineering, 2021. Cataloged from PDF version of thesis. Includes bibliographical references (pages 54-56).
Information technology has grown rapidly in recent years, and social media has undergone a disruptive transformation. Websites such as Facebook, Twitter, and Instagram, where people can express their thoughts or feelings by posting text, photos, or videos, have become incredibly popular. Unfortunately, they have also become places for hateful activity, abusive language, cyberbullying, and anonymous threats. Many existing works address this problem, but none has yet achieved a satisfactory level of accuracy. In this work, we employ natural language processing (NLP) with a convolutional neural network (CNN), extreme gradient boosting (XGBoost), and a support vector machine (SVM) to first identify toxic comments and then classify them into six types, using a large collection of documents provided by Kaggle from Wikipedia's talk page edits. On this dataset, the Hamming score is 89% for the CNN model, 87% for the XGBoost model, and 84% for the SVM model.
Razia Razzak; Md. Sadril; Mahmudul Hasan Shakil; Mahfuzur Rahman; Sabiha Tul Omman Taki. B. Computer Science. 2021-07-15T06:18:46Z; 2021-07-15T06:18:46Z; 2021; 2021-01. Thesis. ID: 16101291; ID: 16301032; ID: 16301026; ID: 16101206; ID: 17101519. http://hdl.handle.net/10361/14810. en_US. Brac University theses are protected by copyright. They may be viewed from this source for any purpose, but reproduction or distribution in any format is prohibited without written permission. 56 pages. application/pdf. Brac University.
institution	Brac University
collection	Institutional Repository
language	en_US
topic	Cyberbullying; Natural Language Processing; Word Embedding; Convolutional Neural Networks; XGBoost; Support Vector Machine
description	This thesis is submitted in partial fulfillment of the requirements for the degree of Bachelor of Science in Computer Science and Engineering, 2021.
author2	Chakrabarty, Amitabha
author_facet	Chakrabarty, Amitabha; Razzak, Razia; Sadril, Md.; Shakil, Mahmudul Hasan; Rahman, Mahfuzur; Taki, Sabiha Tul Omman
format	Thesis
author	Razzak, Razia; Sadril, Md.; Shakil, Mahmudul Hasan; Rahman, Mahfuzur; Taki, Sabiha Tul Omman
author_sort	Razzak, Razia
title	Comparative study of toxic comments classification using machine learning algorithms
publisher	Brac University
publishDate	2021
url	http://hdl.handle.net/10361/14810
