BnText2Table – dataset and Text-to-Table generation in Bangla

This thesis is submitted in partial fulfillment of the requirements for the degree of Bachelor of Science in Computer Science, 2024.

Bibliographic Details
Main Authors: Zariyat, Tahreema Rahman; Ahmed, Fahim Irfan; Oishi, Tahsina Tajrim; Morshed, Maruf
Other Authors: Islam, Md Saiful
Format: Thesis
Language: English
Published: Brac University 2024
Subjects: Bangla NLP; Text2Table; Summarizer; mBART; Transformer; Information extraction; T5; mT5; Computation and Language
Online Access: http://hdl.handle.net/10361/23795
id 10361-23795
record_format dspace
spelling 10361-23795 2024-08-19T21:05:03Z
BnText2Table – dataset and Text-to-Table generation in Bangla
Zariyat, Tahreema Rahman; Ahmed, Fahim Irfan; Oishi, Tahsina Tajrim; Morshed, Maruf
Islam, Md Saiful
Department of Computer Science and Engineering, Brac University
Bangla NLP; Text2Table; Summarizer; mBART; Transformer; Information extraction; T5; mT5; Computation and Language
This thesis is submitted in partial fulfillment of the requirements for the degree of Bachelor of Science in Computer Science, 2024.
Cataloged from PDF version of thesis.
Includes bibliographical references (pages 32-34).
"In this fast-paced world, everyone relies on technology to get their work done quickly and efficiently, since using technology greatly simplifies every task that needs to be done. The majority of the publications are lengthy and packed with crucial data. However, in many instances, extra words are also added to boost the word count, which causes a number of difficulties when trying to get the desired information. For the English language, numerous tools are available to summarize the text and present it in tabular form. However, it is not the same for our mother tongue, Bangla. Despite being the 5th most-spoken native language in the world, there is no tool available to ease the workload in Bengali language. Our research will assist in such circumstances by summarizing the given information in tabular form within the shortest possible time. Since there is no dataset available that will be suitable for our research, we have prepared the dataset ourselves. Then, we have used the mBART-50-large, mT5-base, mT5-m2m-CrossSum and BanglaT5 models for the implementation. Finding the appropriate table headers in light of the context and order of the data is the most important task in this study. To sum up, our main goal is to develop a benchmark dataset for a text-to-table model for the betterment of the NLP research community."
Tahreema Rahman Zariyat; Fahim Irfan Ahmed; Tahsina Tajrim Oishi; Maruf Morshed
B.Sc. in Computer Science
2024-08-19T06:13:34Z; 2024-08-19T06:13:34Z; 2024; 2024-01
Thesis
ID 20101433; ID 20101508; ID 20101394; ID 20101299
http://hdl.handle.net/10361/23795
en
Brac University theses are protected by copyright. They may be viewed from this source for any purpose, but reproduction or distribution in any format is prohibited without written permission.
34 pages
application/pdf
Brac University
institution Brac University
collection Institutional Repository
language English
topic Bangla NLP
Text2Table
Summarizer
mBART
Transformer
Information extraction
T5
mT5
Computation and Language
description This thesis is submitted in partial fulfillment of the requirements for the degree of Bachelor of Science in Computer Science, 2024.
author2 Islam, Md Saiful
author_facet Islam, Md Saiful
Zariyat, Tahreema Rahman
Ahmed, Fahim Irfan
Oishi, Tahsina Tajrim
Morshed, Maruf
format Thesis
author Zariyat, Tahreema Rahman
Ahmed, Fahim Irfan
Oishi, Tahsina Tajrim
Morshed, Maruf
author_sort Zariyat, Tahreema Rahman
title BnText2Table – dataset and Text-to-Table generation in Bangla
publisher Brac University
publishDate 2024
url http://hdl.handle.net/10361/23795