History (Forward N-Gram) or future (Backward N-Gram)? Which model to consider for N-Gram analysis in Bangla?

Includes bibliographical references (page 5).

Bibliographic details
Main authors: Khan, Naira; Habib, Md. Tarek; Alam, Md. Jahangir; Rahman, Rajib; UzZaman, Naushad; Khan, Mumit
Other authors: Center for Research on Bangla Language Processing, BRAC University
Format: Article
Language: English
Published: BRAC University 2010
Link: http://hdl.handle.net/10361/627
Abstract: This paper presents a directional advantage of n-gram modeling in terms of backward or forward n-gram modeling in Bangla. The most commonly used n-gram analysis is predominantly a forward n-gram. However, in Bangla it appears that a backward n-gram is repeatedly more successful and yields more grammatical results than a forward n-gram. This paper hypothesizes that the rationale behind this success is the syntactic ordering of constituents in Bangla. Bangla is a head-final specifier-initial language as opposed to English, which is head-initial specifier-initial. Hence in Bangla, the head comes after its argument in a phrase. If an n-gram analysis begins with a head and moves backwards it will stretch to its own argument but if you move forwards then you'll probably grab the argument of another head. As probability of occurrence of heads is higher, probability of depending on a head is also higher and hence a backward n-gram will probably have a greater chance of yielding grammatical results. We carried out several experiments to compare different directional results in different applications with an advantage in the backward direction. This will prove a useful linguistic insight in terms of n-gram based analysis depending upon variations of constituent analysis.

Subject: N-Gram analysis
Dates in record: 2006; 2010-10-24T04:28:50Z
Extent: 5 pages (application/pdf)
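The forward/backward distinction described in the abstract can be illustrated with a minimal bigram sketch. This is not the authors' implementation: the function names `train_bigrams` and `prob`, the unsmoothed maximum-likelihood estimate, and the placeholder tokens are all assumptions made for illustration. A forward model conditions each word on the word before it; a backward model conditions each word on the word after it.

```python
from collections import Counter, defaultdict


def train_bigrams(sentences, direction="forward"):
    """Count bigrams over tokenized sentences (hypothetical helper).

    direction="forward":  counts[previous_word][word]
    direction="backward": counts[next_word][word]
    """
    if direction not in ("forward", "backward"):
        raise ValueError("direction must be 'forward' or 'backward'")
    counts = defaultdict(Counter)
    for tokens in sentences:
        padded = ["<s>"] + list(tokens) + ["</s>"]
        if direction == "backward":
            # Reversing the sentence makes the *following* word the context.
            padded = padded[::-1]
        for context, word in zip(padded, padded[1:]):
            counts[context][word] += 1
    return counts


def prob(counts, context, word):
    """Unsmoothed maximum-likelihood estimate of P(word | context)."""
    total = sum(counts[context].values())
    return counts[context][word] / total if total else 0.0


# Placeholder tokens only; these are not data from the paper.
corpus = [["a", "b", "c"], ["a", "b", "d"], ["e", "b", "c"]]
forward = train_bigrams(corpus, "forward")
backward = train_bigrams(corpus, "backward")

print(prob(forward, "b", "c"))   # P(c | previous word is b)
print(prob(backward, "b", "a"))  # P(a | next word is b)
```

The same counting code serves both directions; only the reading order of each sentence changes, which is why the two models can be compared on identical data.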
Institution: BRAC University
Collection: Institutional Repository