Comparing selective masking methods for depression detection in social media | |
Author | Chanapa Pananookooln |
Call Number | AIT Thesis no.DSAI-22-04 |
Subject(s) | Social media--Data processing;Machine learning;Neural networks (Computer science) |
Note | A thesis submitted in partial fulfillment of the requirements for the degree of Master of Science in Data Science and Artificial Intelligence |
Publisher | Asian Institute of Technology |
Abstract | Identifying those at risk for depression is a crucial issue, and social media provides an excellent platform for examining the linguistic patterns of depressed individuals. A significant challenge in depression classification is ensuring that the prediction model is not overly dependent on keywords, such that it fails to predict when keywords are unavailable. One promising approach is masking: by selectively masking important words and asking the model to predict the masked words, the model is forced to learn the context rather than the keywords. This study evaluates seven masking techniques, such as random masking, log-odds ratio, and the use of attention scores. In addition, whether to predict the masked words during the pre-training or fine-tuning phase was also examined. Last, six class imbalance ratios were compared to determine the robustness of the masked selection methods. Key findings demonstrated that selective masking generally outperforms random masking in terms of classification accuracy. In addition, the most accurate and robust models were identified. Our research also indicated that reconstructing the masked words during the pre-training phase is more advantageous than during the fine-tuning phase. Further discussion and implications are provided. This is the first study to comprehensively compare masking selection methods, which has broad implications for the field of depression classification and NLP in general. Our code can be found at: https://github.com/chanapapan/Depression-Detection |
Year | 2022 |
Type | Thesis |
School | School of Engineering and Technology |
Department | Department of Information and Communications Technologies (DICT) |
Academic Program/FoS | Data Science and Artificial Intelligence (DSAI) |
Chairperson(s) | Chaklam Silpasuwanchai |
Examination Committee(s) | Dailey, Matthew N.;Mongkol Ekpanyapong |
Scholarship Donor(s) | His Majesty the King’s Scholarships (Thailand) |
Degree | Thesis (M. Sc.) - Asian Institute of Technology, 2022 |
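
The abstract names the log-odds ratio as one way to choose which words to mask. The following is a minimal illustrative sketch of that general idea, not the thesis code (see the repository linked in the abstract for the author's implementation). The function names `log_odds_scores` and `selective_mask`, the whitespace tokenization, the add-one smoothing, the 15% masking ratio, and the toy sentences are all assumptions made for this example.

```python
import math
from collections import Counter

MASK = "[MASK]"  # placeholder token; the actual token depends on the model used


def log_odds_scores(pos_docs, neg_docs, alpha=1.0):
    """Smoothed log-odds ratio of each word for the positive class.

    Assumes the combined vocabulary has at least two words, so that no
    smoothed probability equals 1.
    """
    pos = Counter(w for d in pos_docs for w in d.lower().split())
    neg = Counter(w for d in neg_docs for w in d.lower().split())
    vocab = set(pos) | set(neg)
    n_pos, n_neg, v = sum(pos.values()), sum(neg.values()), len(vocab)
    scores = {}
    for w in vocab:
        p = (pos[w] + alpha) / (n_pos + alpha * v)  # smoothed P(w | positive)
        q = (neg[w] + alpha) / (n_neg + alpha * v)  # smoothed P(w | negative)
        scores[w] = math.log(p / (1 - p)) - math.log(q / (1 - q))
    return scores


def selective_mask(text, scores, ratio=0.15):
    """Mask the highest-scoring tokens instead of random ones."""
    tokens = text.split()
    k = max(1, round(ratio * len(tokens)))  # always mask at least one token
    ranked = sorted(
        range(len(tokens)),
        key=lambda i: scores.get(tokens[i].lower(), 0.0),
        reverse=True,
    )
    chosen = set(ranked[:k])
    return [MASK if i in chosen else t for i, t in enumerate(tokens)]


# Toy data, invented purely for illustration.
pos_docs = ["i feel hopeless and tired", "so hopeless today"]
neg_docs = ["i feel great today", "lovely day and great food"]
scores = log_odds_scores(pos_docs, neg_docs)
masked = selective_mask("I feel hopeless today", scores, ratio=0.25)
```

Random masking would instead draw the `k` positions uniformly at random; the sketch differs only in ranking positions by a class-association score first.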