Automatic Bengali image captioning using efficientNet-transformer network and vision transformer

This thesis is submitted in partial fulfillment of the requirements for the degree of Bachelor of Science in Computer Science, 2023.

Bibliographic Details
Main Authors:	Kabir, Muhammad Khubayeeb; Labonno, Anindita; Amin, Sofia; Tahsin, Fariha
Other Authors:	Rahman, Md. Khalilur
Format:	Thesis
Language:	English
Published:	Brac University 2023
Subjects:	Image captioning; Image encoders; EfficientNet; Vision transformer; BanglaLekhaImageCaptions; BLEU; Transformer architecture; Image analysis; Image processing > Digital techniques
Online Access:	http://hdl.handle.net/10361/19358

id	10361-19358
record_format	dspace
spelling	10361-19358 2023-08-08T21:02:11Z Automatic Bengali image captioning using efficientNet-transformer network and vision transformer Kabir, Muhammad Khubayeeb Labonno, Anindita Amin, Sofia Tahsin, Fariha Rahman, Md. Khalilur Mostakim, Moin Department of Computer Science and Engineering, Brac University Image captioning Image encoders EfficientNet Vision transformer BanglaLekhaImageCaptions BLEU Transformer architecture Image analysis. Image processing--Digital techniques. This thesis is submitted in partial fulfillment of the requirements for the degree of Bachelor of Science in Computer Science, 2023. Cataloged from PDF version of thesis. Includes bibliographical references (pages 35-38). Image captioning is the task of generating textual descriptions for images. The technology benefits a wide range of applications, such as assisting people with visual impairments, monitoring surveillance systems, content generation, image indexing, and automatic annotation of images to produce training data for AI-based image generation models. Much of the research in this domain, especially work using transformer models, has focused on the English language, and relatively little has been dedicated to Bengali. This study addresses that gap and proposes a novel approach to automatic image captioning: a multi-modal, transformer-based, end-to-end model with an encoder-decoder architecture. Our approach utilizes a pre-trained EfficientNet-Transformer network. To evaluate the effectiveness of our approach, we compare our model with a Vision Transformer that utilizes a non-convolutional encoder pre-trained on ImageNet. The two models were tested on the BanglaLekhaImageCaptions dataset and evaluated using BLEU metrics. Muhammad Khubayeeb Kabir Anindita Labonno Sofia Amin Fariha Tahsin B. Computer Science 2023-08-08T05:48:24Z 2023-08-08T05:48:24Z 2023 2023-01 Thesis ID: 19101168 ID: 19101149 ID: 19101232 ID: 19101170 http://hdl.handle.net/10361/19358 en Brac University theses are protected by copyright. They may be viewed from this source for any purpose, but reproduction or distribution in any format is prohibited without written permission. 38 pages application/pdf Brac University
institution	Brac University
collection	Institutional Repository
language	English
topic	Image captioning Image encoders EfficientNet Vision transformer BanglaLekhaImageCaptions BLEU Transformer architecture Image analysis. Image processing--Digital techniques.
description	This thesis is submitted in partial fulfillment of the requirements for the degree of Bachelor of Science in Computer Science, 2023.
author2	Rahman, Md. Khalilur
author_facet	Rahman, Md. Khalilur Kabir, Muhammad Khubayeeb Labonno, Anindita Amin, Sofia Tahsin, Fariha
format	Thesis
author	Kabir, Muhammad Khubayeeb Labonno, Anindita Amin, Sofia Tahsin, Fariha
author_sort	Kabir, Muhammad Khubayeeb
title	Automatic Bengali image captioning using efficientNet-transformer network and vision transformer
publisher	Brac University
publishDate	2023
url	http://hdl.handle.net/10361/19358
