AIT Asian Institute of Technology

Multi-modal person retrieval : bridging text, images, and re-identification
Author	Ati Tesakulsiri
Call Number	AIT Thesis no.DSAI-24-04
Subject(s)	Pattern recognition systems Computer vision
Note	A thesis submitted in partial fulfillment of the requirements for the degree of Master of Science in Data Science and Artificial Intelligence
Publisher	Asian Institute of Technology
Abstract	Person Re-Identification (Re-ID) attempts to recognize the same subject across many cameras despite changes in lighting, posture, and point of view. Motivated by the remarkable results of CLIP (Contrastive Language-Image Pre-training), we investigate the possibility of using contrastive learning in conjunction with a blend of text encoders and Re-ID models. With the Locked-image text Tuning (LiT) method, the resources and time needed to train the model are not an issue. With 5x less training time and 3x to 18x lower resource consumption (GPU RAM), we are able to train the model to learn the attributes of a person, reaching 83.98% 4/5-match accuracy (2616 classes) on an unseen dataset, which is higher than the current SOTA model for person retrieval. We also explore multilingual retrieval in this study. However, this network combination is unable to learn changes in view, posture, and lighting, so the approach lags behind on person retrieval benchmarks.
Year	2024
Type	Thesis
School	School of Engineering and Technology
Department	Department of Information and Communications Technologies (DICT)
Academic Program/FoS	Data Science and Artificial Intelligence (DSAI)
Chairperson(s)	Mongkol Ekpanyapong;
Examination Committee(s)	Huynh, Trung Luong;Chaklam Silpasuwanchai;
Scholarship Donor(s)	Royal Thai Government Fellowship;
Degree	Thesis (M. Sc.) - Asian Institute of Technology, 2024