AIT Asian Institute of Technology

Evaluating the effectiveness of truncation and extractive approaches in text summarization

Author: Pranisaa Charnparttaravanit
Call Number: AIT Thesis no. DSAI-22-06
Subject(s): Natural language processing (Computer science); Computational linguistics; Semantics--Data processing

Note: A thesis submitted in partial fulfillment of the requirements for the degree of Master of Engineering in Data Science and Artificial Intelligence
Publisher: Asian Institute of Technology
Abstract: Transformer-based models still struggle to accommodate long inputs because of the high memory requirement of the full self-attention mechanism. Such long inputs are normally truncated, on the assumption that the important information sits at a particular location in the document, an assumption that does not hold in general. One promising alternative is document extraction, i.e. selecting important sentences based on lexical or semantic overlap. This study evaluates several extraction approaches, namely Luhn's algorithm, latent semantic analysis, TextRank, and k-means clustering on sentence embeddings, and compares the results with truncation approaches. In addition, we investigated whether these approaches remain robust when the order of sentences in the document is randomly shuffled. The results showed that extraction approaches outperformed truncation approaches in the shuffled condition. Among the extraction approaches, TextRank and Luhn achieved the best performance. In contrast, truncation approaches generally outperformed extraction approaches in the unshuffled condition, suggesting that truncation may be suitable when the location of important information is known a priori. Further discussion and implications are provided. This study comprehensively evaluates and compares extraction approaches that can be applied to existing summarization systems.
Year: 2022
Type: Thesis
School: School of Engineering and Technology
Department: Department of Information and Communications Technologies (DICT)
Academic Program/FoS: Data Science and Artificial Intelligence (DSAI)
Chairperson(s): Chaklam Silpasuwanchai
Examination Committee(s): Dailey, Matthew N.; Mongkol Ekpanyapong
Scholarship Donor(s): Royal Thai Government Fellowship
Degree: Thesis (M. Eng.) - Asian Institute of Technology, 2022
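
As a rough illustration of the contrast the abstract draws between truncation and extraction, the Python sketch below shortens a document in two ways: by keeping its first k sentences, and by keeping the k sentences whose words are most frequent across the document. It is not the thesis's implementation. The frequency scorer is a simplified stand-in, only loosely in the spirit of word-frequency methods such as Luhn's, and the function names, stopword list, and sample sentences are all hypothetical.

```python
import random
import re
from collections import Counter

# Hypothetical, deliberately tiny stopword list for this illustration only.
STOPWORDS = {"the", "a", "an", "of", "in", "on", "and", "is", "to", "it",
             "was", "for", "with", "at", "from"}


def tokenize(sentence):
    """Lowercase a sentence and return its alphabetic, non-stopword tokens."""
    return [w for w in re.findall(r"[a-z]+", sentence.lower())
            if w not in STOPWORDS]


def truncate_head(sentences, k):
    """Truncation baseline: keep the first k sentences as they appear."""
    return sentences[:k]


def extract_by_frequency(sentences, k):
    """Simplified extractive selection (an assumption, not Luhn's exact method).

    Each sentence is scored by the average document-level frequency of its
    tokens; the k best-scoring sentences are returned in input order.
    """
    freq = Counter(w for s in sentences for w in tokenize(s))

    def score(sentence):
        words = tokenize(sentence)
        return sum(freq[w] for w in words) / len(words) if words else 0.0

    ranked = sorted(range(len(sentences)),
                    key=lambda i: score(sentences[i]), reverse=True)
    return [sentences[i] for i in sorted(ranked[:k])]


# Made-up sample document: three on-topic sentences and two distractors.
sentences = [
    "Transformers struggle with long documents.",
    "The weather was pleasant yesterday.",
    "Long documents exceed the memory of transformers.",
    "Lunch is served at noon.",
    "Extraction keeps sentences from long documents for transformers.",
]

# Shuffle a copy with a fixed seed to mimic the shuffled condition.
shuffled = sentences[:]
random.Random(0).shuffle(shuffled)

print("truncation, original order:", truncate_head(sentences, 2))
print("truncation, shuffled order:", truncate_head(shuffled, 2))
print("extraction, original order:", extract_by_frequency(sentences, 2))
print("extraction, shuffled order:", extract_by_frequency(shuffled, 2))
```

Because this scorer looks only at word counts, the set of sentences it selects does not depend on sentence order (barring ties at the cutoff), whereas head truncation returns whatever happens to come first. That is the property the shuffled condition in the abstract is designed to probe.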

