A double metaphone encoding for approximate name searching and matching in Bangla

Includes bibliographical references (page 6).

Bibliographic Details
Main Authors: UzZaman, Naushad; Khan, Mumit
Other Authors: Center for Research on Bangla Language Processing (CRBLP), BRAC University
Format: Article
Language: English
Published: BRAC University 2010
Subjects: Name searching; Name encoding; Phonetic encoding; Double metaphone encoding; Bangla language
Online Access: http://hdl.handle.net/10361/312
timestamps 2019-09-29T05:27:18Z; 2010-10-04T05:22:33Z
date 2005
extent 6 pages
file_format application/pdf
abstract Almost any word can be a Bangali name, and the name in turn is often spelled in many different ways, all of which are considered correct and interchangeable. The reason for the spelling complication is two-fold: (1) there is a large gap between the script and pronunciation in Bangla, largely attributed to the large scale Sanskritization process that started in the 12th century and continued throughout the middle ages, and (2) typical Bangla names have very different origins, from the indigenous names derived primarily from Sanskrit, to the imported Muslim names from Persian and Arabic, Christian names from Portuguese, and even the names from popular Western TV soap-operas. However, there is always a large degree of phonetic similarity in the spelling variants of a name, which is the key to searching and matching names in records. We present a Double Metaphone encoding for Bangla names, taking into account the various spelling and phonetic rules in use, which can be used by applications to search for and match names. We encode the spelling variants of a large number of names found in the literature to demonstrate that the encoding does indeed show that the variants of a name are equivalent. A name searching algorithm may employ various figures of merit to narrow the list of possibilities when searching for similar names; we demonstrate one such figure of merit using name encoding and edit distance that has shown good promise.
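The abstract mentions ranking similar names with a figure of merit built from a name encoding and edit distance. The record does not give the paper's encoding rules or scoring formula, so the following is only a minimal sketch of the general idea under stated assumptions: `toy_key` is a hypothetical, heavily simplified phonetic key for romanized names (it is not the Double Metaphone encoding for Bangla described in the paper), and the ranking tuple in `rank_candidates` is an illustrative choice, not the authors' figure of merit.

```python
def toy_key(name: str) -> str:
    """Hypothetical phonetic key for romanized names (NOT the paper's rule set)."""
    s = name.lower()
    # Collapse a few spelling variants that tend to sound alike (illustrative only).
    for old, new in (("ph", "f"), ("sh", "s"), ("ou", "u"), ("oo", "u"), ("ee", "i")):
        s = s.replace(old, new)
    key = []
    prev = ""
    for ch in s:
        if not ch.isalpha():
            continue  # ignore spaces, hyphens, digits
        # Skip doubled letters; keep the first letter, then drop vowels, 'h' and 'y'.
        if ch != prev and (not key or ch not in "aeiouhy"):
            key.append(ch)
        prev = ch
    return "".join(key).upper()


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance computed row by row."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # deletion
                           cur[j - 1] + 1,             # insertion
                           prev[j - 1] + (ca != cb)))  # substitution
        prev = cur
    return prev[-1]


def rank_candidates(query: str, names: list[str]) -> list[str]:
    """Order names by (distance between keys, distance between spellings)."""
    qkey = toy_key(query)
    scored = [
        (edit_distance(qkey, toy_key(n)), edit_distance(query.lower(), n.lower()), n)
        for n in names
    ]
    return [n for _, _, n in sorted(scored)]


if __name__ == "__main__":
    print(rank_candidates("Mumit", ["Mamun", "Moumit", "Mumeet", "Naushad"]))
```

Comparing keys first groups spelling variants that encode identically, and the distance between the raw spellings then breaks ties within a group. A real system would replace `toy_key` with the paper's encoding applied to Bangla script.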
institution BRAC University
collection Institutional Repository
language English
topic Name searching
Name encoding
Phonetic encoding
Double metaphone encoding
Bangla language
author2 Center for Research on Bangla Language Processing (CRBLP), BRAC University
format Article
author UzZaman, Naushad
Khan, Mumit
title A double metaphone encoding for approximate name searching and matching in Bangla
publisher BRAC University
publishDate 2010
url http://hdl.handle.net/10361/312