Real time bengali speech to text conversion using CMU sphinx

This thesis report is submitted in partial fulfilment of the requirements for the degree of Bachelor of Science in Computer Science and Engineering, 2017.

Bibliographic details
Main authors: Kabir, Humayun; Ahmed, Ruhan; Nasib, Abdullah Umar
Other authors: Uddin, Dr. Jia
Format: Thesis
Language: English
Published: BRAC University 2018
Online access: http://hdl.handle.net/10361/9546
Department: Department of Computer Science and Engineering, BRAC University
Keywords: Speech-to-text technology; Bengali UNICODE; Sphinx 4; Carnegie Mellon University; Baum-Welch; Signal-to-noise ratio
Notes: Cataloged from PDF version of thesis report. Includes bibliographical references (pages 35-37).

Abstract:
This paper aims to demonstrate the use of speech-to-text technology to convert Bangla, spoken naturally and continuously, into Bengali Unicode text with good accuracy. To achieve this we used Sphinx 4, the open-source framework created by Carnegie Mellon University (CMU), which is written in Java and provides the tools needed to develop an acoustic model for a custom language such as Bangla. It relies on algorithms such as Baum-Welch to create an acoustic model from training data, which we gathered ourselves. Our main objective was to ensure that the system was adequately trained on a word-by-word basis from various speakers so that it could recognize new speakers fluently. We used Audacity, a free digital audio workstation (DAW), to process the collected recordings with techniques such as continuous frequency profiling to improve the signal-to-noise ratio (SNR), vocal levelling, normalization, and syllable splitting and merging, to ensure an error-free 1:1 word mapping between each utterance and the text of its matching transcription file. The result is a speech-to-text recognition system with an acceptable accuracy of around 75%, trained on recorded speech from 10 individual speakers, both male and female, using custom transcript files that we wrote.

Degree: B. Computer Science and Engineering
Type: Thesis
IDs: 14141004, 14101042, 17341001
Dates: 2017-12 (thesis); 2018-02-25 (repository record)
Extent: 37 pages (application/pdf)
Rights: BRAC University thesis reports are protected by copyright. They may be viewed from this source for any purpose, but reproduction or distribution in any format is prohibited without written permission.
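The abstract mentions noise reduction, the signal-to-noise ratio (SNR), and normalization as preprocessing steps. As a minimal illustration of what those two quantities mean, the sketch below computes SNR in decibels from RMS amplitudes and applies simple peak normalization. It is not the authors' pipeline (they worked in Audacity): the function names, the 16 kHz sample rate, and the synthetic tone and noise are all assumptions chosen for the example.

```python
import math


def rms(samples):
    """Root-mean-square amplitude of a non-empty sequence of samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def snr_db(signal, noise):
    """SNR in decibels, defined here as 20 * log10(rms(signal) / rms(noise))."""
    return 20.0 * math.log10(rms(signal) / rms(noise))


def peak_normalize(samples, target_peak=1.0):
    """Scale samples so that the largest absolute value equals target_peak."""
    peak = max(abs(s) for s in samples)
    if peak == 0:
        # Pure silence: there is nothing to scale.
        return list(samples)
    gain = target_peak / peak
    return [s * gain for s in samples]


if __name__ == "__main__":
    # Hypothetical test data: a 440 Hz tone sampled at 16 kHz (assumed rate)
    # and a low-level alternating "noise" sequence of the same length.
    rate = 16000
    tone = [0.5 * math.sin(2 * math.pi * 440 * n / rate) for n in range(1600)]
    noise = [0.05 if n % 2 == 0 else -0.05 for n in range(1600)]
    quieter_noise = [0.5 * s for s in noise]

    print("SNR before noise reduction (dB):", snr_db(tone, noise))
    print("SNR after halving the noise (dB):", snr_db(tone, quieter_noise))
    print("Peak after normalization:", max(abs(s) for s in peak_normalize(tone)))
```

Under this definition, lowering the noise level while leaving the speech untouched raises the SNR, which is the purpose of a noise-reduction pass; peak normalization then brings recordings from different speakers to a comparable level.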