AIT Asian Institute of Technology

Development of a machine learning model for urban flood warning using accumulated rainfall data and social media report
Author	Yanisa Chanprasert
Call Number	AIT Thesis no.RS-25-05
Subject(s)	Flood warning systems--Bangkok--Thailand; Machine learning
Note	A thesis submitted in partial fulfillment of the requirements for the degree of Master of Science in Remote Sensing and Geographic Information Systems
Publisher	Asian Institute of Technology
Abstract	This study aims to develop a machine learning-based model for urban flood warning by utilizing accumulated rainfall data and social media reports, specifically from Twitter, within the Bangkok metropolitan area. The study consists of three main components: (1) identifying flood-related tweets, (2) geolocating flood events using partial address information, and (3) developing and evaluating machine learning models to determine the most suitable algorithm for flood prediction. The researchers collected a total of 44,048 tweets and applied four identification approaches: manual labeling (used as reference), ChatGPT, Gemini, and keyword matching. ChatGPT achieved the highest accuracy (0.9806) but demonstrated relatively low precision, while keyword matching, although simple, achieved the highest recall (1.0). For geocoding, the Google Geocoding API outperformed both ChatGPT and Gemini, achieving an accuracy of 94.94%, compared to less than 18% for the latter two methods. The study selected five districts with more than 20 flood-related tweets for analysis: Bang Kapi, Chatuchak, Bang Khae, Wang Thonglang, and Bang Na. Only tweets posted between 08:00 and 23:00 that could be matched with rainfall data from nearby stations were used, resulting in a final labeled dataset of 28 tweets. The researchers used rainfall data from 2019–2020 to train four machine learning models—Support Vector Machine (SVM), Random Forest (RF), Decision Tree (DT), and Extreme Gradient Boosting (XGBoost)—and validated the models using 2021 data. During training, RF achieved the highest accuracy (0.97), whereas SVM demonstrated the highest prediction accuracy (0.96) when tested on unseen data from 2021. These results indicate that, although RF performed well on the training set, SVM generalized more effectively to real-world conditions. 
Overall, the findings suggest that Twitter can serve as a valuable source for real-time flood detection and, when integrated with rainfall data, can enhance the accuracy and responsiveness of urban flood warning systems.
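The abstract reports that keyword matching reached a recall of 1.0 while a method can still show high accuracy with low precision. The sketch below is only an illustration of how those three metrics relate for a keyword-based tweet filter; it is not the thesis's code. The keyword list, the example tweets, and the function names `keyword_match` and `metrics` are all hypothetical.

```python
# Illustrative only: toy keyword filter and metric calculation.
# Keywords and tweets are invented for this sketch, not taken from the thesis.

KEYWORDS = ["flood", "น้ำท่วม"]  # hypothetical keyword list


def keyword_match(text, keywords=KEYWORDS):
    """Return True if any keyword occurs in the text (case-insensitive)."""
    lowered = text.lower()
    return any(k in lowered for k in keywords)


def metrics(y_true, y_pred):
    """Compute accuracy, precision, and recall from boolean labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t and p)
    fp = sum(1 for t, p in pairs if not t and p)
    fn = sum(1 for t, p in pairs if t and not p)
    tn = sum(1 for t, p in pairs if not t and not p)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }


# (tweet text, manual reference label) pairs; all hypothetical.
tweets = [
    ("Heavy flood on the main road right now", True),
    ("น้ำท่วม near the market in Bang Kapi", True),
    ("My inbox is flooded with emails", False),  # keyword hit, not a flood
    ("Sunny morning in Chatuchak", False),
    ("Traffic jam at Bang Na", False),
]

labels = [label for _, label in tweets]
predictions = [keyword_match(text) for text, _ in tweets]
scores = metrics(labels, predictions)
print(scores)
```

In this toy set the filter catches every real flood tweet, so recall is perfect, but the figurative use of "flooded" counts as a false positive and lowers precision. That is the usual trade-off for a broad keyword filter.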
Year	2025
Type	Thesis
School	School of Engineering and Technology
Department	Department of Information and Communications Technologies (DICT)
Academic Program/FoS	Remote Sensing and Geographic Information Systems (RS)
Chairperson(s)	Sarawut Ninsawat
Examination Committee(s)	Tripathi, Nitin Kumar; Sanit Arunplod
Scholarship Donor(s)	Royal Thai Government Fellowship
Degree	Thesis (M.Sc.) - Asian Institute of Technology, 2025