Real time bengali speech to text conversion using CMU sphinx
This thesis report is submitted in partial fulfilment of the requirements for the degree of Bachelor of Science in Computer Science and Engineering, 2017.
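Main authors: | Kabir, Humayun; Ahmed, Ruhan; Nasib, Abdullah Umar |
---|---|
Other authors: | Uddin, Dr. Jia |
Format: | Thesis |
Language: | English |
Published: | BRAC University, 2018 |
Subjects: | Speech-to-text technology; Bengali UNICODE; Sphinx 4; Carnegie Mellon University; Baum-Welch; Signal-to-noise-ratio |
Online access: | http://hdl.handle.net/10361/9546 |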
id |
10361-9546 |
---|---|
record_format |
dspace |
spelling |
10361-9546 2022-01-26T10:15:44Z
Real time bengali speech to text conversion using CMU sphinx
Kabir, Humayun; Ahmed, Ruhan; Nasib, Abdullah Umar
Uddin, Dr. Jia
Department of Computer Science and Engineering, BRAC University
Speech-to-text technology; Bengali UNICODE; Sphinx 4; Carnegie Mellon University; Baum-Welch; Signal-to-noise-ratio
This thesis report is submitted in partial fulfilment of the requirements for the degree of Bachelor of Science in Computer Science and Engineering, 2017. Cataloged from PDF version of thesis report. Includes bibliographical references (pages 35-37).
This paper demonstrates the use of speech-to-text technology to convert naturally and continuously spoken Bangla into Bengali Unicode text with good accuracy. This required the open-source framework Sphinx 4, created by Carnegie Mellon University (CMU), which is written in Java and provides the tools needed to develop an acoustic model for a custom language such as Bangla. It relies on algorithms such as Baum-Welch to create an acoustic model from training data that we gathered ourselves. Our main objective was to ensure that the system was adequately trained, on a word-by-word basis and on various speakers, so that it could recognize speech from new speakers. We used Audacity, a free digital audio workstation (DAW), to process the collected recordings with techniques such as continuous frequency profiling to improve the signal-to-noise ratio (SNR), vocal levelling, normalization, and syllable splitting and merging, to ensure an error-free 1:1 word mapping between each utterance and its corresponding transcription text. The result is a speech-to-text recognition system with an acceptable accuracy of around 75%, trained on recorded speech from 10 individual speakers, both male and female, using custom transcript files that we wrote.
Humayun Kabir; Ruhan Ahmed; Abdullah Umar Nasib
B. Computer Science and Engineering
2018-02-25T08:00:00Z 2018-02-25T08:00:00Z 2017 2017-12
Thesis
ID 14141004 ID 14101042 ID 17341001
http://hdl.handle.net/10361/9546
en
BRAC University thesis reports are protected by copyright. They may be viewed from this source for any purpose, but reproduction or distribution in any format is prohibited without written permission.
37 pages application/pdf
BRAC University |
institution |
Brac University |
collection |
Institutional Repository |
language |
English |
topic |
Speech-to-text technology; Bengali UNICODE; Sphinx 4; Carnegie Mellon University; Baum-Welch; Signal-to-noise-ratio |
description |
This thesis report is submitted in partial fulfilment of the requirements for the degree of Bachelor of Science in Computer Science and Engineering, 2017. |
author2 |
Uddin, Dr. Jia |
author_facet |
Uddin, Dr. Jia; Kabir, Humayun; Ahmed, Ruhan; Nasib, Abdullah Umar |
format |
Thesis |
author |
Kabir, Humayun; Ahmed, Ruhan; Nasib, Abdullah Umar |
author_sort |
Kabir, Humayun |
title |
Real time bengali speech to text conversion using CMU sphinx |
publisher |
BRAC University |
publishDate |
2018 |
url |
http://hdl.handle.net/10361/9546 |