AIT Asian Institute of Technology

An empirical evaluation of some data mining techniques

Author: Ronnachai Lorjaroenmongkol
Call Number: AIT Thesis no. CS-97-30
Subject(s): Data mining

Note: A thesis submitted in partial fulfillment of the requirements for the degree of Master of Science
Publisher: Asian Institute of Technology
Abstract: Empirical comparisons and evaluations of data mining techniques and algorithms are often useful for selecting appropriate tools for specific tasks. One particularly attractive methodology in KDD and data mining is the theory of rough sets, a relatively new research direction concerned with the analysis and modeling of classification and decision problems involving vague and uncertain information. However, there is a need to evaluate and characterize the types of practical problems, especially in large databases, for which the rough sets approach is suitable. Comparative experiments were performed on both real and public repository data sets. Two experimental series were conducted to evaluate empirically the behavior and performance of the algorithms. The first experiment compared a rough-set-based data mining system (Rosetta) with two other well-known methods, CN2 and PEBLS. The second experiment compared various algorithms for rule extraction within the framework of rough sets: the Local covering, All global covering, Local certain and possible rules, and All rules methods. In addition, two sampling strategies, simple random sampling and stratified simple random sampling, were employed to produce the training data sets. Statistical hypothesis testing was used to determine whether there was a significant difference between the estimated accuracies of the algorithms. The extracted rules were also manually investigated and reported. Finally, the strengths and weaknesses of the algorithms being compared were pointed out, and the conditions under which a particular algorithm could be expected to outperform or underperform the others were indicated.
Year: 1997
Type: Thesis
School: School of Advanced Technologies (SAT)
Department: Department of Information and Communications Technologies (DICT)
Academic Program/FoS: Computer Science (CS)
Chairperson(s): Phan, Minh Dung
Examination Committee(s): Batanov, Dentcho N.; Devadason, Francis J.
Scholarship Donor(s): Asian Institute of Technology (Partial)
Degree: Thesis (M.Sc.) - Asian Institute of Technology, 1997