Urban sound classification using convolutional Neural Network and long short term memory based on multiple features

This thesis is submitted in partial fulfillment of the requirements for the degree of Bachelor of Science in Computer Science and Engineering, 2020.

Bibliographic Details
Main Authors:	Das, Joy Krishan, Ghosh, Arka, Pal, Abhijit Kumar, Dutta, Sumit
Other Authors:	Chakraborty, Amitabha
Format:	Thesis
Language:	English
Published:	Brac University 2021
Subjects:	Sound classification Spectrograms Urbansound8k CNN LSTM LibROSA Neural networks (Computer science)
Online Access:	http://dspace.bracu.ac.bd/xmlui/handle/10361/14444

id	10361-14444
record_format	dspace
spelling	10361-144442022-01-26T10:19:57Z Urban sound classification using convolutional Neural Network and long short term memory based on multiple features Das, Joy Krishan Ghosh, Arka Pal, Abhijit Kumar Dutta, Sumit Chakraborty, Amitabha Department of Computer Science and Engineering, Brac University Sound classification Spectrograms Urbansound8k CNN LSTM LibROSA Neural networks (Computer science) This thesis is submitted in partial fulfillment of the requirements for the degree of Bachelor of Science in Computer Science and Engineering, 2020. Cataloged from PDF version of thesis. Includes bibliographical references (pages 43-46). There are many sounds all around us, and our brain can easily and clearly identify them. Furthermore, our brain processes the received sound signals continuously and provides us with relevant environmental knowledge. Although not up to the level of accuracy of the brain, there are some smart devices which can extract necessary information from an audio signal with the help of different algorithms. And as the days pass by, more and more research is being conducted to ensure that the accuracy level of this information extraction increases. Over the years, several models like the CNN, ANN, RCNN and many machine learning techniques have been adopted to classify sound accurately, and these have shown promising results in recent years in distinguishing spectro-temporal pictures. For our research, we use seven features: Chromagram, Mel-spectrogram, Spectral contrast, Tonnetz, MFCC, Chroma CENS and Chroma cqt. We have employed two models, LSTM and CNN, to classify audio signals, and the dataset used for the research is UrbanSound8K. The novelty of the research lies in showing that the LSTM achieves better classification accuracy than the CNN when the MFCC feature is used. 
Furthermore, we have augmented the UrbanSound8K dataset to ensure that the accuracy of the LSTM is higher than that of the CNN on both the original dataset and the augmented one. Moreover, we have tested the accuracy of the models based on the features used. This has been done by using each of the features separately on each of the models, in addition to the two forms of feature stacking that we have performed. The first form of feature stacking contains the features Chromagram, Mel-spectrogram, Spectral contrast, Tonnetz and MFCC, while the second form of feature stacking contains MFCC, Mel-spectrogram, Chroma cqt and Chroma stft. Likewise, we have stacked features using different combinations to expand our research. In this way, our LSTM model reached an accuracy of 98.80%, which is state-of-the-art performance. Joy Krishan Das Arka Ghosh Abhijit Kumar Pal Sumit Dutta B. Computer Science 2021-05-29T10:04:59Z 2021-05-29T10:04:59Z 2020 2020-04 Thesis ID 17301218 ID 16201007 ID 16301148 ID 16301104 http://dspace.bracu.ac.bd/xmlui/handle/10361/14444 en Brac University theses are protected by copyright. They may be viewed from this source for any purpose, but reproduction or distribution in any format is prohibited without written permission. 47 pages application/pdf Brac University
institution	Brac University
collection	Institutional Repository
language	English
topic	Sound classification Spectrograms Urbansound8k CNN LSTM LibROSA Neural networks (Computer science)
spellingShingle	Sound classification Spectrograms Urbansound8k CNN LSTM LibROSA Neural networks (Computer science) Das, Joy Krishan Ghosh, Arka Pal, Abhijit Kumar Dutta, Sumit Urban sound classification using convolutional Neural Network and long short term memory based on multiple features
description	This thesis is submitted in partial fulfillment of the requirements for the degree of Bachelor of Science in Computer Science and Engineering, 2020.
author2	Chakraborty, Amitabha
author_facet	Chakraborty, Amitabha Das, Joy Krishan Ghosh, Arka Pal, Abhijit Kumar Dutta, Sumit
format	Thesis
author	Das, Joy Krishan Ghosh, Arka Pal, Abhijit Kumar Dutta, Sumit
author_sort	Das, Joy Krishan
title	Urban sound classification using convolutional Neural Network and long short term memory based on multiple features
title_short	Urban sound classification using convolutional Neural Network and long short term memory based on multiple features
title_full	Urban sound classification using convolutional Neural Network and long short term memory based on multiple features
title_fullStr	Urban sound classification using convolutional Neural Network and long short term memory based on multiple features
title_full_unstemmed	Urban sound classification using convolutional Neural Network and long short term memory based on multiple features
title_sort	urban sound classification using convolutional neural network and long short term memory based on multiple features
publisher	Brac University
publishDate	2021
url	http://dspace.bracu.ac.bd/xmlui/handle/10361/14444
work_keys_str_mv	AT dasjoykrishan urbansoundclassificationusingconvolutionalneuralnetworkandlongshorttermmemorybasedonmultiplefeatures AT ghosharka urbansoundclassificationusingconvolutionalneuralnetworkandlongshorttermmemorybasedonmultiplefeatures AT palabhijitkumar urbansoundclassificationusingconvolutionalneuralnetworkandlongshorttermmemorybasedonmultiplefeatures AT duttasumit urbansoundclassificationusingconvolutionalneuralnetworkandlongshorttermmemorybasedonmultiplefeatures
_version_	1814308964294197248
