Automated reference validation for scholarly publications using NLP

This thesis is submitted in partial fulfillment of the requirements for the degree of Bachelor of Science in Computer Science and Engineering, 2024.

Bibliographic Details
Main Authors: Khan, A S M Nasim, Khan, Mohammad Nasif Sadique, Howlader, MD. Adnan, Roy, Ayan
Other Authors: Alam, Md. Golam Rabiul
Format: Thesis
Language: English
Published: Brac University 2024
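Subjects: Automated referencing validation; Natural language processing; Context similarity; Scholarly publications; XLNet; NER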
Online Access: http://hdl.handle.net/10361/22863
id 10361-22863
record_format dspace
spelling 10361-22863 2024-05-19T21:04:57Z
Title: Automated reference validation for scholarly publications using NLP
Authors: Khan, A S M Nasim; Khan, Mohammad Nasif Sadique; Howlader, MD. Adnan; Roy, Ayan
Contributors: Alam, Md. Golam Rabiul; Sadeque, Farig Yousuf; Rahman, Rafeed
Department: Department of Computer Science and Engineering, Brac University
Keywords: Automated referencing validation; Natural language processing; Context similarity; Scholarly publications; XLNet; NER; Natural language processing (Computer science)
Notes: This thesis is submitted in partial fulfillment of the requirements for the degree of Bachelor of Science in Computer Science and Engineering, 2024. Cataloged from PDF version of thesis. Includes bibliographical references (pages 41-43).
Abstract: Accurate references in scholarly publications are a crucial aspect of scientific writing. The manual validation of references can be a time-consuming and error-prone process. This research introduces an updated version of the automated referencing validation model that makes the peer review process more efficient. The proposed model uses Natural Language Processing to generate sentence embeddings with an efficient algorithm. The model first breaks the scholarly article down into sections and uses topic modeling to group each section according to its context. It then generates sentence embeddings for each section. Sets of these embeddings are used to calculate the semantic similarity between the query and the referred article. Additionally, the methodology handles references that are valid in non-contextual scenarios, such as those sharing common named entities. Lastly, strategic feature engineering is used for better performance. We have created a dataset of scholarly papers with manually verified references to evaluate the efficiency and accuracy of our model.
This improved version of the referencing validation model aims to outperform traditional models such as Document-BERT, BERT, and SBERT in efficiency and accuracy. The model can be used in interactive real-time systems, providing quick and reliable feedback to peer reviewers. This study aims to contribute to the field of automated referencing validation in scholarly publications. The model offers a solution to the limitations of manual validation, which makes it a valuable tool for peer reviewers and researchers.
A S M Nasim Khan; Mohammad Nasif Sadique Khan; MD. Adnan Howlader; Ayan Roy
B.Sc in Computer Science and Engineering
2024-05-19T05:49:46Z 2024-05-19T05:49:46Z ©2024 2024-01
Thesis
ID: 19101623; ID: 19201084; ID: 19201076; ID: 19201043
http://hdl.handle.net/10361/22863
en
Brac University theses are protected by copyright. They may be viewed from this source for any purpose, but reproduction or distribution in any format is prohibited without written permission.
51 pages
application/pdf
Brac University
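The validation steps described in the abstract (split the referred article into sections, embed each section, score semantic similarity against the citing sentence, then fall back to shared named entities) can be sketched roughly as follows. This is a minimal illustration only, not the thesis's implementation: the bag-of-words `embed` function is a toy stand-in for a learned sentence encoder, the uppercase-token heuristic in `named_entities` is a stand-in for a real NER model, and all function names and the 0.3 threshold are hypothetical choices made for this example.

```python
import math
import re
from collections import Counter


def split_sections(text):
    """Split on blank lines (assumed stand-in for real section detection)."""
    return [s.strip() for s in re.split(r"\n\s*\n", text) if s.strip()]


def embed(text):
    """Toy bag-of-words vector; a real system would use a sentence encoder."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def named_entities(text):
    """Naive heuristic: tokens with two or more capitals (e.g. XLNet, BERT)."""
    pattern = r"\b[A-Za-z]*[A-Z][A-Za-z]*[A-Z][A-Za-z]*\b"
    return set(re.findall(pattern, text))


def validate_reference(citing_sentence, cited_text, threshold=0.3):
    """Return (verdict, best_score) for one citing sentence and one cited text.

    verdict is "contextual" when some section is similar enough,
    "entity" when only a shared named entity supports the reference,
    and "unsupported" otherwise. The threshold is an assumed value.
    """
    query = embed(citing_sentence)
    scores = [cosine(query, embed(sec)) for sec in split_sections(cited_text)]
    best = max(scores, default=0.0)
    if best >= threshold:
        return "contextual", best
    if named_entities(citing_sentence) & named_entities(cited_text):
        return "entity", best
    return "unsupported", best


if __name__ == "__main__":
    cited = (
        "Sentence embeddings capture semantic similarity between documents.\n\n"
        "We evaluate XLNet on citation data."
    )
    for sentence in (
        "semantic similarity between documents",
        "Results with XLNet were strong.",
        "Bananas are yellow.",
    ):
        print(sentence, "->", validate_reference(sentence, cited))
```

Scoring per section rather than against the whole article mirrors the abstract's description of grouping sections by context before comparison; the entity fallback corresponds to the "non-contextual scenarios" it mentions.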
institution Brac University
collection Institutional Repository
language English
topic Automated referencing validation
Natural language processing
Context similarity
Scholarly publications
XLNet
NER
Natural language processing (Computer science)
author2 Alam, Md. Golam Rabiul
format Thesis
author Khan, A S M Nasim
Khan, Mohammad Nasif Sadique
Howlader, MD. Adnan
Roy, Ayan
author_sort Khan, A S M Nasim
title Automated reference validation for scholarly publications using NLP
publisher Brac University
publishDate 2024
url http://hdl.handle.net/10361/22863