Bengali isolated speech recognition : a comparative analysis of the effects of data augmentation on HMM and DNN based acoustic models
This thesis report is submitted in partial fulfilment of the requirements for the degree of Bachelor of Science in Computer Science and Engineering, 2017.
Main Authors: | Rashid, Warida; Reza, Mohi |
---|---|
Other Authors: | Mostakim, Moin |
Format: | Thesis |
Language: | English |
Published: | BRAC University, 2018 |
Subjects: | Data augmentation; Speech recognition |
Online Access: | http://hdl.handle.net/10361/9059 |
id |
10361-9059 |
---|---|
record_format |
dspace |
spelling |
10361-9059 2022-01-26T10:13:18Z Bengali isolated speech recognition : a comparative analysis of the effects of data augmentation on HMM and DNN based acoustic models Rashid, Warida Reza, Mohi Mostakim, Moin Department of Computer Science and Engineering, BRAC University Data augmentation Speech recognition This thesis report is submitted in partial fulfilment of the requirements for the degree of Bachelor of Science in Computer Science and Engineering, 2017. Cataloged from PDF version of thesis report. Includes bibliographical references (pages 31-33). We have created an isolated-word dataset, Prodorshok I, which consists of 34 Bengali words related to navigation with 1011 voice samples. The word set is intended to help design speaker-dependent and speaker-independent, voice-command-driven automated speech recognition (ASR) systems that can potentially improve human-computer interaction. This paper presents the results of an objective analysis that was undertaken using a subset of words from Prodorshok I to help assess its reliability in ASR systems that utilize Hidden Markov Models (HMM) with Gaussian emissions and Deep Neural Networks (DNN). The results show that simple data augmentation involving a small pitch shift can make tangible improvements to accuracy levels in speech recognition, even when working with small datasets. Prodorshok I will be expanded upon and made publicly available for others to use under an Open Database License (ODbL). Warida Rashid Mohi Reza B. Computer Science and Engineering 2018-01-15T05:23:18Z 2018-01-15T05:23:18Z 2017 2017 Thesis ID 14301026 ID 14101040 http://hdl.handle.net/10361/9059 en BRAC University theses are protected by copyright. They may be viewed from this source for any purpose, but reproduction or distribution in any format is prohibited without written permission. 34 pages application/pdf BRAC University |
institution |
Brac University |
collection |
Institutional Repository |
language |
English |
topic |
Data augmentation Speech recognition |
spellingShingle |
Data augmentation Speech recognition Rashid, Warida Reza, Mohi Bengali isolated speech recognition : a comparative analysis of the effects of data augmentation on HMM and DNN based acoustic models |
description |
This thesis report is submitted in partial fulfilment of the requirements for the degree of Bachelor of Science in Computer Science and Engineering, 2017. |
author2 |
Mostakim, Moin |
author_facet |
Mostakim, Moin Rashid, Warida Reza, Mohi |
format |
Thesis |
author |
Rashid, Warida Reza, Mohi |
author_sort |
Rashid, Warida |
title |
Bengali isolated speech recognition : a comparative analysis of the effects of data augmentation on HMM and DNN based acoustic models |
title_sort |
bengali isolated speech recognition : a comparative analysis of the effects of data augmentation on hmm and dnn based acoustic models |
publisher |
BRAC University |
publishDate |
2018 |
url |
http://hdl.handle.net/10361/9059 |
work_keys_str_mv |
AT rashidwarida bengaliisolatedspeechrecognitionacomparativeanalysisoftheeffectsofdataaugmentationonhmmanddnnbasedacousticmodels AT rezamohi bengaliisolatedspeechrecognitionacomparativeanalysisoftheeffectsofdataaugmentationonhmmanddnnbasedacousticmodels |
_version_ |
1814308204055625728 |