AIT Asian Institute of Technology

Topic modeling and sentiment analysis on hotel reviews : a comparison of machine learning models for sentiment

Author: G.C., Dipesh Dhoj
Call Number: AIT Thesis no. DSAI-22-08
Subject(s): Sentiment analysis; Natural language processing (Computer science); Hotels--Marketing--Data processing
Note: A thesis submitted in partial fulfillment of the requirements for the degree of Master of Science in Data Science and Artificial Intelligence
Publisher: Asian Institute of Technology
Abstract: The enormous amount of user-generated content has created opportunities to extract valuable knowledge from it. The hotel industry is one of the industries with a high volume of such content in the form of reviews. Extracting the topics and sentiments expressed in these reviews can be beneficial to hoteliers. Thus, the main objective of this study is to compare different topic modeling techniques and sentiment analysis models to find the best-performing models on hotel reviews and use them to make inferences. The best-performing models are used to build a system that finds the topics and sentiments of hotel reviews. The methodology has six steps: data collection, data preprocessing and labeling, topic modeling, model design, classification, and system development. Two topic modeling algorithms, LDA (Latent Dirichlet Allocation) and BERTopic, and three machine learning models, BiLSTM (Bidirectional Long Short-Term Memory), BiGRU (Bidirectional Gated Recurrent Unit), and BERT (Bidirectional Encoder Representations from Transformers), are compared on hotel reviews from Nepal. BERTopic produced qualitatively better, more contextual topics, and DistilBERT worked best for sentiment analysis with 88.65% test accuracy. The BERTopic and DistilBERT models are implemented in the backend of a web application developed using the Flask micro web framework for the backend and HTML/CSS for the frontend. The application is evaluated on user input to check that it produces the desired output. The application accepts a single review, a file, or a link as input and returns either the sentiment of the review or a dashboard of plots, depending on the input given by the user.
Year: 2022
Type: Thesis
School: School of Engineering and Technology
Department: Department of Information and Communications Technologies (DICT)
Academic Program/FoS: Data Science and Artificial Intelligence (DSAI)
Chairperson(s): Vatcharaporn Esichaikul
Examination Committee(s): Dailey, Matthew N.; Chaklam Silpasuwanchai
Scholarship Donor(s): AIT Scholarships
Degree: Thesis (M. Sc.) - Asian Institute of Technology, 2022

