AIT Asian Institute of Technology


Recognition of syllables in tone languages
Author	Tanee Demeechai
Call Number	AIT Diss. no.TC-00-01
Subject(s)	Tone (Phonetics) Speech perception Speech processing systems
Note	A dissertation submitted in partial fulfilment of the requirements for the Degree of Doctor of Engineering, School of Engineering and Technology
Publisher	Asian Institute of Technology
Series Statement	Dissertation ; no. TC-00-01
Abstract	Speech recognition of tone languages requires detection of the tone in addition to detection of the consonants and vowels of a syllable. Two approaches for recognition of tonal syllables have been proposed in the literature: joint detection and sequential detection. In joint detection, recognition is done by employing a hidden Markov model (HMM) of connected tonal syllables, in which the pitch and its time derivative are included in the feature vector in addition to the phonetic features. In sequential detection, base syllables (syllables ignoring their tones) are recognized by using an HMM of connected base syllables only; the estimated syllable boundaries are then used for subsequent tone recognition in a separate HMM of tones. Joint detection performs better than sequential detection, but its computational complexity is higher. In this thesis, a new approach called linked detection is proposed to achieve performance close to that of joint detection with computational complexity close to that of sequential detection. In linked detection, the recognition in the HMM of connected base syllables is modified to periodically take into account the tonal likelihood computed from an HMM of tones. Linked detection can provide performance that is comparable to that of joint detection and superior to that of sequential detection. For a large vocabulary task, the computational complexity of linked detection is much lower than that of joint detection, while it is only slightly higher than that of sequential detection. In our experiment on recognition of 173 Thai-language syllables, the worst recognition rate obtained from sequential detection is 72%. This result is comparable to results of the IBM Mandarin Call Home system (Liu et al., 1996), where a syllable recognition rate of 50% is reported for an experiment in which the speech data are conversational telephone speech and a word-pair language model is applied so that the perplexity is 15.
Year	2000
Corresponding Series Added Entry	Asian Institute of Technology. Dissertation ; no. TC-00-01
Type	Dissertation
School	School of Engineering and Technology
Department	Department of Information and Communications Technologies (DICT)
Academic Program/FoS	Telecommunications (TC)
Chairperson(s)	Makelainen, Kimmo;
Examination Committee(s)	Sadanada, Ramakoti;Ahmed, Kazi M.;Rajatheva, R. M. A. P.;Lee, Chin-Hui;
Scholarship Donor(s)	Royal Thai Government (RTG);
Degree	Thesis (Ph.D.) - Asian Institute of Technology