
Comparative evaluation of transformer-based language models for Nepali language

Author: Tamrakar, Suyogya Ratna
Call Number: AIT RSPR no. CS-22-06
Subject(s): Natural language processing (Computer science); Artificial intelligence; Machine learning
Note: A research study submitted in partial fulfillment of the requirements for the degree of Master of Engineering in Computer Science
Publisher: Asian Institute of Technology
Abstract: Large pre-trained transformer models using self-supervised learning have achieved state-of-the-art performance in various NLP tasks. However, for a low-resource language like Nepali, pre-training monolingual models remains a problem due to a lack of training data and of well-designed, balanced benchmark datasets. Furthermore, several multilingual pre-trained models such as mBERT and XLM-RoBERTa have been released, but their performance on the Nepali language remains unknown. Nepali monolingual pre-trained transformer models were compared with multilingual models on a Nepali text classification dataset as a downstream task, across different numbers of classes and data sizes, with machine learning (ML) and deep learning (DL) algorithms as baselines. Under-representation of the Nepali language in mBERT resulted in overall poor performance, but XLM-RoBERTa, which has a larger vocabulary, produced state-of-the-art performance relatively similar to that of Nepali DistilBERT and DeBERTa, all of which outperformed the baseline algorithms. Among the baselines, Bi-LSTM and SVM also performed very well in a variety of settings. Moreover, to assess cross-language knowledge transfer for cases when monolingual models are not available, HindiRoBERTa, a monolingual Indian-language model, was also evaluated on the Nepali text dataset. This research mainly contributes to the Nepali NLP community through the creation of a news classification dataset with 20 classes and over 200,000 articles, and through a performance evaluation of various pre-trained monolingual Nepali transformers against multilingual transformers and DL and ML algorithms.
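
For illustration only, the following is a minimal sketch (not the report's actual code) of the downstream fine-tuning setup the abstract describes: adapting the multilingual XLM-RoBERTa checkpoint to Nepali news classification with the Hugging Face Transformers library. The file names, column names, and hyperparameters are hypothetical placeholders.

from datasets import load_dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

MODEL_NAME = "xlm-roberta-base"  # multilingual model; swap in a monolingual
NUM_CLASSES = 20                 # Nepali checkpoint for the comparison

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForSequenceClassification.from_pretrained(
    MODEL_NAME, num_labels=NUM_CLASSES)

# Hypothetical CSV splits with a "text" (article body) and "label" (class id)
# column, standing in for the study's 20-class news dataset.
dataset = load_dataset("csv", data_files={"train": "train.csv",
                                          "test": "test.csv"})

def tokenize(batch):
    # Truncate and pad each article to a fixed sequence length.
    return tokenizer(batch["text"], truncation=True,
                     padding="max_length", max_length=256)

dataset = dataset.map(tokenize, batched=True)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="out", num_train_epochs=3,
                           per_device_train_batch_size=16),
    train_dataset=dataset["train"],
    eval_dataset=dataset["test"],
)
trainer.train()            # fine-tune on the classification task
print(trainer.evaluate())  # evaluation loss on the held-out split

Replacing MODEL_NAME with a Nepali monolingual checkpoint (or with HindiRoBERTa for the cross-language transfer case) would reproduce the comparisons the abstract outlines.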
Year: 2022
Type: Research Study Project Report (RSPR)
School: School of Engineering and Technology
Department: Department of Information and Communications Technologies (DICT)
Academic Program/FoS: Computer Science (CS)
Chairperson(s): Chaklam Silpasuwanchai
Examination Committee(s): Dailey, Matthew N.; Mongkol Ekpanyapong
Scholarship Donor(s): AIT Partial Scholarship
Degree: Research Studies Project Report (M. Eng.) - Asian Institute of Technology, 2022

