AIT Asian Institute of Technology

An attention and concept hierarchy-based approach to dataset category and tag recommendation

Author: Natnaree Sornkongdang
Call Number: AIT Thesis no. DSAI-22-05
Subject(s): Machine learning; Information retrieval; Recommender systems (Information filtering)

Note: A thesis submitted in partial fulfillment of the requirements for the degree of Master of Science in Data Science and Artificial Intelligence
Publisher: Asian Institute of Technology
Abstract: Tags on the ThOGD portal are poorly organized because data providers assign the data categories and tags themselves when publishing a dataset. They are currently guided only by the portal's autocomplete function, which suggests categories and tags predicted from the historical data entered by previous data providers. As a result, the portal contains many datasets with similar content that are labeled with different tags of similar meaning. In addition, data consumers who filter by data category and tag cannot find information that matches their preferences. To overcome these challenges, this study makes two main contributions: an attention-based categorical identifier and topic hierarchy-based categorical concept hierarchies. In the identifier, Attentive Deep Supervision applies a weighting to the loss optimization. In the Topic Hierarchy component, Latent Dirichlet Allocation (LDA) topic modeling extracts candidate tag terms, Heterogeneous Evidences are exploited to identify relations, and Anytree is used to construct the hierarchy. With these approaches, the macro-averaged precision and F1-score of the attention-based identifier improve by 0.6640 % and 0.5570 %, respectively, and the micro-averaged precision and F1-score improve by 0.8060 % and 0.6980 %, respectively. Meanwhile, the categorical concept hierarchies can provide comprehensive tags related to a dataset to be published, because the recommendation strategy assigns tags that share the same highest importance weight to the same rank.
Year: 2022
Type: Thesis
School: School of Engineering and Technology
Department: Department of Information and Communications Technologies (DICT)
Academic Program/FoS: Data Science and Artificial Intelligence (DSAI)
Chairperson(s): Chutiporn Anutariya
Examination Committee(s): Dailey, Matthew N.; Nuttapong Sanglerdsinlapachai
Scholarship Donor(s): AIT Scholarships
Degree: Thesis (M. Sc.) - Asian Institute of Technology, 2022

