Sentiment analysis for Bangla microblog posts

This thesis report is submitted in partial fulfillment of the requirements for the degree of Bachelor of Science in Computer Science and Engineering, 2014.

Bibliographic Details
Authors: Shaika, Chowdhury; Chowdhury, Wasifa
Other Authors: Department of Computer Science and Engineering, BRAC University
Format: Thesis
Language: English
Published: BRAC University 2014
Subjects: Computer science and engineering
Online Access: http://hdl.handle.net/10361/2902
id 10361-2902
record_format dspace
timestamp 2022-01-26T10:21:44Z
notes Cataloged from PDF version of thesis report. Includes bibliographical references (page 47).
degree B. Computer Science and Engineering
dates 2014-01-29T06:58:08Z, 2014-01-29T06:58:08Z, 2014-01
identifiers ID 10101037; ID 10101038
extent 47 pages
media_type application/pdf
rights BRAC University theses are protected by copyright. They may be viewed from this source for any purpose, but reproduction or distribution in any format is prohibited without written permission.
abstract Sentiment analysis has received great attention recently due to the huge amount of user-generated information on microblogging sites such as Twitter [1], which is used for applications such as product review mining and predicting future events, for example election results. Most research on sentiment analysis has targeted English, but resources and tools for other languages are a growing need because microblog posts are written in many languages. Work on Bangla (Bengali) is necessary as it is one of the most spoken languages, ranked seventh in the world [13]. In this paper, we aim to automatically extract the sentiments or opinions conveyed by users in Bangla microblog posts and then identify the overall polarity of each text as either negative or positive. We use a semi-supervised bootstrapping approach to develop the training corpus, which avoids the need for labor-intensive manual annotation. For classification, we use Support Vector Machines (SVM) and Maximum Entropy (MaxEnt) and compare the performance of these two machine learning algorithms by experimenting with combinations of various feature sets. We also construct a Twitter-specific Bangla sentiment lexicon, which is used by the rule-based classifier and as a binary feature in the classifiers.
For our work, we choose Twitter as the microblogging site because it is one of the most popular microblogging platforms in the world.
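The abstract mentions a sentiment lexicon used in two ways: by a rule-based classifier and as a binary feature for the learned classifiers. The following is a minimal sketch of that idea, not the thesis's implementation. The four lexicon entries, the whitespace tokenizer, the tie-handling rule, and all function names are illustrative assumptions.

```python
import unicodedata

# Hypothetical toy lexicon; the thesis's actual lexicon entries are not given in this record.
POSITIVE = {unicodedata.normalize("NFC", w) for w in ("ভালো", "সুন্দর")}  # "good", "beautiful"
NEGATIVE = {unicodedata.normalize("NFC", w) for w in ("খারাপ", "বাজে")}   # "bad", "lousy"

# Punctuation stripped from token edges, including the Bangla danda.
_PUNCT = ".,!?।\"'"


def tokenize(post):
    """Split on whitespace and strip edge punctuation (a simplifying assumption)."""
    normalized = unicodedata.normalize("NFC", post)
    tokens = (t.strip(_PUNCT) for t in normalized.split())
    return [t for t in tokens if t]


def rule_based_polarity(post):
    """Return 'positive' or 'negative' by counting lexicon hits, or None on a tie."""
    tokens = tokenize(post)
    score = sum(t in POSITIVE for t in tokens) - sum(t in NEGATIVE for t in tokens)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return None  # no lexicon evidence either way


def lexicon_features(post):
    """Binary lexicon features that could be appended to a classifier's feature vector."""
    tokens = set(tokenize(post))
    return {
        "has_positive_word": int(bool(tokens & POSITIVE)),
        "has_negative_word": int(bool(tokens & NEGATIVE)),
    }


if __name__ == "__main__":
    print(rule_based_polarity("আজকের দিনটা খুব সুন্দর!"))
    print(lexicon_features("খাবারটা ভালো"))
```

In a bootstrapping setup of the kind the abstract describes, posts that the rule-based step labels with confidence could seed the training corpus, while posts that return `None` would be left for the learned classifier. That pairing is an assumption about how the pieces might fit together, not a description of the thesis's pipeline.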
institution Brac University
collection Institutional Repository
language English
topic Computer science and engineering