AIT Asian Institute of Technology

Development of a machine learning model for urban flood warning using accumulated rainfall data and social media report

Author: Yanisa Chanprasert
Call Number: AIT Thesis no. RS-25-05
Subject(s): Flood warning systems--Bangkok--Thailand
Machine learning
Note: A thesis submitted in partial fulfillment of the requirements for the degree of Master of Science in Remote Sensing and Geographic Information Systems
Publisher: Asian Institute of Technology
Abstract: This study aims to develop a machine learning-based model for urban flood warning by utilizing accumulated rainfall data and social media reports, specifically from Twitter, within the Bangkok metropolitan area. The study consists of three main components: (1) identifying flood-related tweets, (2) geolocating flood events using partial address information, and (3) developing and evaluating machine learning models to determine the most suitable algorithm for flood prediction. The researchers collected a total of 44,048 tweets and applied four identification approaches: manual labeling (used as the reference), ChatGPT, Gemini, and keyword matching. ChatGPT achieved the highest accuracy (0.9806) but demonstrated relatively low precision, while keyword matching, although simple, achieved the highest recall (1.0). For geocoding, the Google Geocoding API outperformed both ChatGPT and Gemini, achieving an accuracy of 94.94%, compared to less than 18% for the latter two methods. The study selected five districts with more than 20 flood-related tweets for analysis: Bang Kapi, Chatuchak, Bang Khae, Wang Thonglang, and Bang Na. Only tweets posted between 08:00 and 23:00 that could be matched with rainfall data from nearby stations were used, resulting in a final labeled dataset of 28 tweets. The researchers used rainfall data from 2019–2020 to train four machine learning models: Support Vector Machine (SVM), Random Forest (RF), Decision Tree (DT), and Extreme Gradient Boosting (XGBoost). The models were validated using 2021 data. During training, RF achieved the highest accuracy (0.97), whereas SVM demonstrated the highest prediction accuracy (0.96) when tested on unseen data from 2021. These results indicate that, although RF performed well on the training set, SVM generalized more effectively to real-world conditions.
Overall, the findings suggest that Twitter can serve as a valuable source for real-time flood detection and, when integrated with rainfall data, can enhance the accuracy and responsiveness of urban flood warning systems.
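The abstract reports that one tweet-identification approach reached very high accuracy with relatively low precision, while keyword matching reached a recall of 1.0. The following is a minimal sketch of how these three metrics can diverge when flood-related tweets are a small minority of the collection. The function name `classification_metrics` and all confusion-matrix counts are hypothetical and chosen for illustration only; they are not taken from the thesis.

```python
def classification_metrics(tp, fp, fn, tn):
    """Return (accuracy, precision, recall) from confusion-matrix counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return accuracy, precision, recall


# Hypothetical counts (NOT from the thesis): 1,000 tweets, 20 truly flood-related.

# A selective classifier: most tweets are correctly rejected, so accuracy is
# high even though many of its positive calls are wrong.
conservative = classification_metrics(tp=12, fp=10, fn=8, tn=970)

# A keyword matcher that flags every tweet containing a flood term: it misses
# nothing (recall 1.0) but raises many false alarms.
keyword = classification_metrics(tp=20, fp=150, fn=0, tn=830)

for name, (acc, prec, rec) in [("conservative", conservative), ("keyword", keyword)]:
    print(f"{name}: accuracy={acc:.3f} precision={prec:.3f} recall={rec:.3f}")
```

With counts like these, accuracy is dominated by the large number of correctly rejected non-flood tweets, which is why precision and recall are worth reading alongside it for an imbalanced dataset.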
Year: 2025
Type: Thesis
School: School of Engineering and Technology
Department: Department of Information and Communications Technologies (DICT)
Academic Program/FoS: Remote Sensing and Geographic Information Systems (RS)
Chairperson(s): Sarawut Ninsawat
Examination Committee(s): Tripathi, Nitin Kumar; Sanit Arunplod
Scholarship Donor(s): Royal Thai Government Fellowship
Degree: Thesis (M.Sc.) - Asian Institute of Technology, 2025

