Title | Comparative evaluation of transformer-based language models for Nepali language |
Author | Tamrakar, Suyogya Ratna |
Call Number | AIT RSPR no.CS-22-06 |
Subject(s) | Natural language processing (Computer science); Artificial intelligence; Machine learning |
Note | A research study submitted in partial fulfillment of the requirements for the degree of Master of Engineering in Computer Science |
Publisher | Asian Institute of Technology |
Abstract | Large pre-trained transformer models using self-supervised learning have achieved state-of-the-art performance in various NLP tasks. However, for a low-resource language like Nepali, pre-training of monolingual models remains a problem due to the lack of training data and of well-designed, balanced benchmark datasets. Furthermore, several multilingual pre-trained models such as mBERT and XLM-RoBERTa have been released, but their performance on the Nepali language remains unknown. Nepali monolingual pre-trained transformer models were compared with multilingual models to determine their performance on a Nepali text classification dataset as a downstream task, across different numbers of classes and data sizes, taking machine learning (ML) and deep learning (DL) algorithms as baselines. Under-representation of the Nepali language in mBERT resulted in overall poor performance, but XLM-RoBERTa, which has a larger vocabulary size, produced state-of-the-art performance relatively similar to that of Nepali DistilBERT and DeBERTa, which outperformed all of the baseline algorithms. Bi-LSTM and SVM from the baselines also performed very well in a variety of settings. Moreover, to assess cross-language knowledge transfer for cases when monolingual models are not available, HindiRoBERTa, a monolingual Indian-language model, was also evaluated on the Nepali text dataset. This research mainly contributes to the Nepali NLP community through the creation of a news classification dataset with 20 classes and over 200,000 articles, and through a performance evaluation of various pre-trained monolingual Nepali transformers against multilingual transformers and DL and ML algorithms. |
Year | 2022 |
Type | Research Study Project Report (RSPR) |
School | School of Engineering and Technology |
Department | Department of Information and Communications Technologies (DICT) |
Academic Program/FoS | Computer Science (CS) |
Chairperson(s) | Chaklam Silpasuwanchai |
Examination Committee(s) | Dailey, Mathew N.; Mongkol Ekpanyapong |
Scholarship Donor(s) | AIT Partial Scholarship |
Degree | Research Studies Project Report (M. Eng.) - Asian Institute of Technology, 2022 |
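
The abstract above describes evaluating pre-trained transformers such as XLM-RoBERTa on a Nepali news classification dataset as the downstream task. Below is a minimal illustrative sketch of that kind of fine-tuning with the Hugging Face Transformers Trainer API; the checkpoint, hyperparameters, and CSV file names are assumptions for illustration, not the report's actual configuration.

# Hedged sketch: fine-tune XLM-RoBERTa for 20-class Nepali news classification.
# The CSV files ("text" and "label" columns) and hyperparameters are assumed.
from datasets import load_dataset
from transformers import (AutoTokenizer, AutoModelForSequenceClassification,
                          Trainer, TrainingArguments)

NUM_CLASSES = 20  # the report's news dataset has 20 classes

# Hypothetical dataset files; each row holds a Nepali article and a label in 0..19.
dataset = load_dataset("csv", data_files={"train": "nepali_news_train.csv",
                                          "test": "nepali_news_test.csv"})

tokenizer = AutoTokenizer.from_pretrained("xlm-roberta-base")
model = AutoModelForSequenceClassification.from_pretrained(
    "xlm-roberta-base", num_labels=NUM_CLASSES)

def tokenize(batch):
    # Truncate long articles to the model's maximum sequence length.
    return tokenizer(batch["text"], truncation=True, max_length=512)

dataset = dataset.map(tokenize, batched=True)

args = TrainingArguments(
    output_dir="xlmr-nepali-news",
    num_train_epochs=3,
    per_device_train_batch_size=16,
    learning_rate=2e-5,
)

# Passing the tokenizer lets the Trainer pad each batch dynamically.
trainer = Trainer(model=model, args=args,
                  train_dataset=dataset["train"],
                  eval_dataset=dataset["test"],
                  tokenizer=tokenizer)
trainer.train()
print(trainer.evaluate())

The same loop applies to the other models compared in the report (e.g., mBERT, Nepali DistilBERT, DeBERTa, HindiRoBERTa) by swapping the checkpoint name.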