Quality assessment of extracted information from newspaper comment sections using natural language processing
This thesis is submitted in partial fulfillment of the requirements for the degree of Bachelor of Science in Computer Science, 2024.
Main Authors: | Arnob Deb, Maidul Islam, Sadab Sifar Hossain, Farjana Alam |
---|---|
Other Authors: | Farig Yousuf Sadeque |
Format: | Thesis |
Language: | English |
Published: |
Brac University
2024
|
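Subjects: | Natural language processing; Information extraction; S-BERT; RoBERTa; Similarity; Natural language processing (Computer science) |
Online Access: | http://hdl.handle.net/10361/22782 |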
id |
10361-22782 |
---|---|
record_format |
dspace |
spelling |
10361-22782 2024-05-09T21:03:17Z Quality assessment of extracted information from newspaper comment sections using natural language processing Deb, Arnob Islam, Maidul Hossain, Sadab Sifar Alam, Farjana Sadeque, Farig Yousuf Department of Computer Science and Engineering, Brac University Natural language processing Information extraction S-BERT RoBERTa Similarity Natural language processing (Computer science) This thesis is submitted in partial fulfillment of the requirements for the degree of Bachelor of Science in Computer Science, 2024. Cataloged from PDF version of thesis. Includes bibliographical references (pages 39-40). Newspaper comment sections, where readers can leave their opinions, can be an excellent source of supplementary information if used properly. Although there is a risk of fake news and misinformation being spread through the comment section, quality information that may supplement the original news can also be extracted from these comments. According to recent research, a comment can range from irrelevant to informative, and in our thesis we aim to identify informative news comments that can then be used to supplement the original news article. We also identify the level of informativeness of a newspaper comment to determine whether the task of assigning the Editor’s Pick flag (which is currently done by hand at every large news outlet) can be carried out with the help of state-of-the-art natural language processing and information extraction techniques. We evaluated the similarity between comments and their respective news articles using transformer models such as Sentence-BERT. Furthermore, we checked for logical entailment between a comment and its article using different models, from Simple RNN and LSTM to more advanced ones such as RoBERTa and larger models such as ELECTRA.
The final model for the textual entailment task (RoBERTa) outperformed all the other models with an accuracy of 88.60%, and the final model for the textual similarity task (SBERT) outperformed all the other similarity models with an accuracy of 68.49%. Arnob Deb Maidul Islam Sadab Sifar Hossain Farjana Alam B.Sc. in Computer Science 2024-05-09T03:23:08Z 2024-05-09T03:23:08Z ©2024 2024-01 Thesis ID: 23241076 ID: 20101309 ID: 23341064 ID: 20101022 http://hdl.handle.net/10361/22782 en Brac University theses are protected by copyright. They may be viewed from this source for any purpose, but reproduction or distribution in any format is prohibited without written permission. 52 pages application/pdf Brac University |
institution |
Brac University |
collection |
Institutional Repository |
language |
English |
topic |
Natural language processing; Information extraction; S-BERT; RoBERTa; Similarity; Natural language processing (Computer science) |
spellingShingle |
Natural language processing Information extraction S-BERT RoBERTa Similarity Natural language processing (Computer science) Deb, Arnob Islam, Maidul Hossain, Sadab Sifar Alam, Farjana Quality assessment of extracted information from newspaper comment sections using natural language processing |
description |
This thesis is submitted in partial fulfillment of the requirements for the degree of Bachelor of Science in Computer Science, 2024. |
author2 |
Sadeque, Farig Yousuf |
author_facet |
Sadeque, Farig Yousuf; Deb, Arnob; Islam, Maidul; Hossain, Sadab Sifar; Alam, Farjana |
format |
Thesis |
author |
Deb, Arnob; Islam, Maidul; Hossain, Sadab Sifar; Alam, Farjana |
author_sort |
Deb, Arnob |
title |
Quality assessment of extracted information from newspaper comment sections using natural language processing |
publisher |
Brac University |
publishDate |
2024 |
url |
http://hdl.handle.net/10361/22782 |