Topic modeling and sentiment analysis on hotel reviews : a comparison of machine learning models for sentiment | |
Author | G.C., Dipesh Dhoj |
Call Number | AIT Thesis no.DSAI-22-08 |
Subject(s) | Sentiment analysis; Natural language processing (Computer science); Hotels--Marketing--Data processing |
Note | A thesis submitted in partial fulfillment of the requirements for the degree of Master of Science in Data Science and Artificial Intelligence |
Publisher | Asian Institute of Technology |
Abstract | The enormous amount of user-generated content creates opportunities to extract valuable knowledge from it. The hotel industry is one of the industries with a high volume of such content in the form of reviews. Extracting the topics and sentiments expressed in the reviews can be beneficial to hoteliers. Thus, the main objective of this study is to compare different topic modeling techniques and sentiment analysis models to find the best-performing models on hotel reviews and use them to make inferences. The best-performing models are used to build a system that finds the topics and sentiments of hotel reviews. The methodology has six steps: data collection, data preprocessing and labeling, topic modeling, model design, classification, and system development. Two topic modeling algorithms, LDA (Latent Dirichlet Allocation) and BERTopic, and three machine learning models, BiLSTM (Bidirectional Long Short-Term Memory), BiGRU (Bidirectional Gated Recurrent Unit), and BERT (Bidirectional Encoder Representations from Transformers), are compared on data of hotel reviews based in Nepal. BERTopic generated more qualitative contextual topics, and DistilBERT worked best for sentiment analysis with 88.65% test accuracy. The BERTopic and DistilBERT models are implemented in the backend of a web application developed using the Flask micro web framework for the backend and HTML/CSS for the frontend. The application is evaluated with user input to test whether the desired output is obtained. The application accepts input in the form of a single review, a file, or a link, and returns either the sentiment of the review or a dashboard containing different plots, depending on the input given by the user. |
Year | 2022 |
Type | Thesis |
School | School of Engineering and Technology |
Department | Department of Information and Communications Technologies (DICT) |
Academic Program/FoS | Data Science and Artificial Intelligence (DSAI) |
Chairperson(s) | Vatcharaporn Esichaikul |
Examination Committee(s) | Dailey, Matthew N.; Chaklam Silpasuwanchai |
Scholarship Donor(s) | AIT Scholarships |
Degree | Thesis (M. Sc.) - Asian Institute of Technology, 2022 |