Performance analysis of machine learning algorithms for Malware classification
This thesis is submitted in partial fulfillment of the requirements for the degree of Bachelor of Science in Computer Science and Engineering, 2022.
Main authors: | Bushra, Raisa Hasan, Alam, Md Taukir, Saha, Aniruddho, Fahim, Nazmus Sakib, Binty, Nabila Mourium |
---|---|
Other authors: | |
Format: | Thesis |
Language: | English |
Published: |
Brac University
2023
|
Subjects: | |
Online link: | http://hdl.handle.net/10361/21825 |
id |
10361-21825 |
---|---|
record_format |
dspace |
spelling |
10361-21825 2023-10-15T21:07:34Z Performance analysis of machine learning algorithms for Malware classification Bushra, Raisa Hasan Alam, Md Taukir Saha, Aniruddho Fahim, Nazmus Sakib Binty, Nabila Mourium Chakrabarty, Amitabha Rodoshi, Ahanaf Hassan Department of Computer Science and Engineering, Brac University Machine learning Trojan Adware Ransomware Classification Malware Zero-day Naïve Bayes Stochastic gradient descent Random forest Decision tree AdaBoost XGBoost Logistic regression Multi-layer perceptron K-nearest neighbour Support vector machine Regression analysis Computer algorithms This thesis is submitted in partial fulfillment of the requirements for the degree of Bachelor of Science in Computer Science and Engineering, 2022. Cataloged from PDF version of thesis. Includes bibliographical references (pages 32-36). Malware detection research has been popular over the years as the variety and complexity of malware attacks increase daily. Using various supervised and unsupervised machine learning algorithms to detect, identify, or classify malware attacks has proven to be a very effective technique in recent years. Some common and widely concerning malware attacks are Trojan, Adware, Ransomware, and Zero-day. In this paper, we used ten ML algorithms, namely AdaBoost, Stochastic Gradient Descent (SGD), Naïve Bayes (NB), Decision Tree (DT), Random Forest (RF), XGBoost, Logistic Regression (LR), Multi-Layer Perceptron (MLP), K-Nearest Neighbour (KNN), and Support Vector Machine (SVM), to classify software-based Trojan attacks, Ransomware, Adware, and Zero-day attacks. This research was conducted on a dataset with a total of 12863 malware samples, covering the malware categories mentioned above, to extract features and learn patterns. We also compare these ML methods and analyze how they classify these popular malware types after testing each classifier on the selected dataset.
After implementation, RF achieved the highest accuracy of 86.97%, and Gaussian NB achieved the lowest accuracy of 47.84%. MLP, XGBoost, KNN, DT, AdaBoost, SVM, LR, and SGD achieved 83.60%, 82.59%, 80.68%, 79.63%, 73.30%, 73.22%, 67.08%, and 64.40% accuracy, respectively. Beyond overall accuracy, our analysis was based on the per-class accuracy, precision, F1-score, TPR, TNR, FPR, and FNR of the malware classes for each ML classifier. Raisa Hasan Bushra Md Taukir Alam Aniruddho Saha Nazmus Sakib Fahim Nabila Mourium Binty B.Sc. in Computer Science 2023-10-15T10:39:29Z 2023-10-15T10:39:29Z ©2022 2022-09-29 Thesis ID 18301064 ID 18301277 ID 18201117 ID 18201166 ID 19101082 http://hdl.handle.net/10361/21825 en Brac University theses are protected by copyright. They may be viewed from this source for any purpose, but reproduction or distribution in any format is prohibited without written permission. 47 pages application/pdf Brac University |
institution |
Brac University |
collection |
Institutional Repository |
language |
English |
topic |
Machine learning Trojan Adware Ransomware Classification Malware Zero-day Naïve Bayes Stochastic gradient descent Random forest Decision tree AdaBoost XGBoost Logistic regression Multi-layer perceptron K-nearest neighbour Support vector machine Regression analysis Computer algorithms |
spellingShingle |
Machine learning Trojan Adware Ransomware Classification Malware Zero-day Naïve Bayes Stochastic gradient descent Random forest Decision tree AdaBoost XGBoost Logistic regression Multi-layer perceptron K-nearest neighbour Support vector machine Regression analysis Computer algorithms Bushra, Raisa Hasan Alam, Md Taukir Saha, Aniruddho Fahim, Nazmus Sakib Binty, Nabila Mourium Performance analysis of machine learning algorithms for Malware classification |
description |
This thesis is submitted in partial fulfillment of the requirements for the degree of Bachelor of Science in Computer Science and Engineering, 2022. |
author2 |
Chakrabarty, Amitabha |
author_facet |
Chakrabarty, Amitabha Bushra, Raisa Hasan Alam, Md Taukir Saha, Aniruddho Fahim, Nazmus Sakib Binty, Nabila Mourium |
format |
Thesis |
author |
Bushra, Raisa Hasan Alam, Md Taukir Saha, Aniruddho Fahim, Nazmus Sakib Binty, Nabila Mourium |
author_sort |
Bushra, Raisa Hasan |
title |
Performance analysis of machine learning algorithms for Malware classification |
title_sort |
performance analysis of machine learning algorithms for malware classification |
publisher |
Brac University |
publishDate |
2023 |
url |
http://hdl.handle.net/10361/21825 |
work_keys_str_mv |
AT bushraraisahasan performanceanalysisofmachinelearningalgorithmsformalwareclassification AT alammdtaukir performanceanalysisofmachinelearningalgorithmsformalwareclassification AT sahaaniruddho performanceanalysisofmachinelearningalgorithmsformalwareclassification AT fahimnazmussakib performanceanalysisofmachinelearningalgorithmsformalwareclassification AT bintynabilamourium performanceanalysisofmachinelearningalgorithmsformalwareclassification |
_version_ |
1814309589868347392 |
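
The abstract in this record reports per-class accuracy, precision, F1-score, TPR, TNR, FPR, and FNR for each classifier. As a rough illustration of how such per-class figures can be derived from a multi-class confusion matrix using a one-vs-rest reading, here is a minimal Python sketch. The function names (`safe_div`, `per_class_rates`) and the confusion-matrix counts are hypothetical and chosen only for illustration; they are not taken from the thesis, and the thesis may compute these metrics differently.

```python
from typing import Dict, List


def safe_div(num: float, den: float) -> float:
    """Return num / den, or 0.0 when the denominator is zero."""
    return num / den if den else 0.0


def per_class_rates(confusion: List[List[int]],
                    labels: List[str]) -> Dict[str, Dict[str, float]]:
    """One-vs-rest metrics per class.

    confusion[i][j] is the number of samples whose true class is
    labels[i] and whose predicted class is labels[j].
    """
    total = sum(sum(row) for row in confusion)
    report: Dict[str, Dict[str, float]] = {}
    for k, label in enumerate(labels):
        tp = confusion[k][k]                          # class k, predicted k
        fn = sum(confusion[k]) - tp                   # class k, predicted other
        fp = sum(row[k] for row in confusion) - tp    # other class, predicted k
        tn = total - tp - fn - fp                     # everything else
        report[label] = {
            "accuracy": safe_div(tp + tn, total),
            "precision": safe_div(tp, tp + fp),
            "f1": safe_div(2 * tp, 2 * tp + fp + fn),
            "tpr": safe_div(tp, tp + fn),   # true positive rate (recall)
            "tnr": safe_div(tn, tn + fp),   # true negative rate
            "fpr": safe_div(fp, fp + tn),   # false positive rate
            "fnr": safe_div(fn, fn + tp),   # false negative rate
        }
    return report


# Hypothetical counts for the four malware classes named in the abstract.
labels = ["Trojan", "Adware", "Ransomware", "Zero-day"]
confusion = [
    [8, 1, 1, 0],
    [2, 6, 0, 2],
    [0, 1, 9, 0],
    [1, 0, 1, 8],
]

report = per_class_rates(confusion, labels)

# Overall accuracy is the diagonal sum divided by the sample count.
overall_accuracy = safe_div(
    sum(confusion[i][i] for i in range(len(labels))),
    sum(sum(row) for row in confusion),
)

for label in labels:
    print(label, {name: round(value, 4) for name, value in report[label].items()})
print("overall accuracy:", round(overall_accuracy, 4))
```

In this reading, TPR and FNR sum to one for each class, as do TNR and FPR, so reporting all four is a convenience rather than additional information.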