A Thai syntactic analyzer | |
Author | Ampai Pornprasertsakul |
Call Number | AIT Diss. no.CS-94-1 |
Subject(s) | Thai language--Syntax;Artificial intelligence |
Note | A dissertation submitted in partial fulfillment of the requirements for the degree of Doctor of Technical Science |
Publisher | Asian Institute of Technology |
Abstract | The syntax of Thai sentences and its features are first expressed compactly in Generalized Phrase Structure Grammar (GPSG). The Thai syntactic rules are expressed in 31 Immediate Dominance (ID) rules, 4 Linear Precedence (LP) rules, and 6 Feature Specification Defaults (FSD). Since implementing an efficient parser based on GPSG is difficult, and a conversion algorithm from GPSG to PATR grammar (a unification-based grammar) is available, the Thai GPSG syntactic rules are converted into PATR grammar, which also allows semantics to be declared in syntactic rules. Two parsers are implemented. The first is a modified chart parser; the second is a Linear Exception Logic (LEL) parser. The former comprises three modules: sequential rule processing, pre-rule processing, and feature-based parsing. The modified chart parser produces several ambiguous parse trees, since it generates all possible structures. The semantics of sixteen verbal subcategories, extracted from more than 2,000 Thai verbs, are added to resolve syntactic ambiguities; the parser nevertheless still generates syntactic ambiguities. With the introduction of LEL, which employs Integer Linear Programming to minimize the total weight of exceptions expressed in the objective function, parse trees are obtained that depend entirely on the exception weights. To determine these weights for the Thai syntactic rules, a log-likelihood ratio statistical approach is applied to a corpus of more than 1,000 Thai sentences. A test on several Thai sentences shows that a Thai syntactic parser with these improvements successfully parses and disambiguates simple sentences. Moreover, applying the final parser to complex and compound sentences yields the best three structures, which always include a correct one. |
Year | 1994 |
Type | Dissertation |
School | School of Engineering and Technology (SET) |
Department | Department of Information and Communications Technologies (DICT) |
Academic Program/FoS | Computer Science (CS) |
Chairperson(s) | Vilas Wuwongse |
Examination Committee(s) | Huynh Ngoc Phien;Weber, Karl;Kanchit Malaivongs;Peansiri E. Vongvipanond |
Degree | Thesis (Ph.D.) - Asian Institute of Technology, 1994 |