AIT Asian Institute of Technology

Tone recognition of speech using hidden Markov models

Author: Hasan, Md. Khairul
Call Number: AIT Thesis no. TC-97-15
Subject(s): Voice frequency; Automatic speech recognition; Markov processes
Note: A thesis submitted in partial fulfillment of the requirements for the degree of Master of Engineering.
Publisher: Asian Institute of Technology
Series Statement: Thesis ; no. TC-97-15
Abstract: The variation of fundamental frequency (F0) over the duration of a syllable is commonly referred to as tone. In tone languages, a complete speech recognition system requires a tone recognition subsystem to correctly recognize the meaning of a syllable. In this thesis, a tone recognition system based on hidden Markov models (HMMs) is successfully implemented. Using the subharmonic summation algorithm, the pitch frequency and the corresponding peak of the subharmonic sum are calculated from the speech signal every 10 ms. The peak of the subharmonic sum is considerably higher in the voiced portion of a syllable than in the unvoiced portion, so the pitch contour of the voiced portion is separated from the unvoiced part by using this peak. Z-score normalization is applied to the resulting pitch contour to remove inter- and intra-speaker variation in the pitch values. A sequence of observation vectors is generated from the pitch contour and fed to the trained HMMs for final tone identification. As a case study, five HMMs are trained, one for each of the five tones of Thai. Semi-continuous HMMs are used as a compromise between accuracy and computational complexity. Six isolated syllables, each with five tones and spoken by two male and two female speakers, are used to train the reference models. The recognition performance of these reference models is evaluated on twelve different syllables, each with five tones, spoken by four speakers other than those who provided the training data. A comparison is made between 3-state, 4-state, and 5-state HMMs. The initial test in the MATLAB environment shows recognition accuracies of 97.3%, 99%, and 97.8% for the 3-, 4-, and 5-state HMMs, respectively. Finally, the complete tone recognition system is implemented in a real-time environment using a TMS320C30 digital signal processor.
Year: 1997
Corresponding Series Added Entry: Asian Institute of Technology. Thesis ; no. TC-97-15
Type: Thesis
School: School of Engineering and Technology (SET)
Department: Department of Information and Communications Technologies (DICT)
Academic Program/FoS: Telecommunications (TC)
Chairperson(s): Makelainen, Kimmo
Examination Committee(s): Ahmed, Kazi Mohiuddin; Rajatheva, RM.AP.
Scholarship Donor(s): Government of Japan
Degree: Thesis (M.Eng.) - Asian Institute of Technology, 1997
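
Illustrative note on the abstract: the two preprocessing steps it describes (keeping only voiced frames by thresholding the peak of the subharmonic sum, then z-score normalizing the pitch contour) can be sketched roughly as follows. This is a minimal sketch and not the thesis implementation. The function names, the threshold value, and the sample frame data are all hypothetical, and computing the mean and standard deviation over a single contour is an assumption; the thesis may compute these statistics differently (for example, per speaker).

```python
from statistics import mean, pstdev


def voiced_pitch(pitch_hz, shs_peak, threshold):
    """Keep pitch values from frames whose subharmonic-sum peak exceeds
    the threshold (assumed here to mark voiced frames)."""
    return [f0 for f0, peak in zip(pitch_hz, shs_peak) if peak > threshold]


def z_score(contour):
    """Normalize a pitch contour to zero mean and unit variance."""
    mu = mean(contour)
    sigma = pstdev(contour)  # population standard deviation
    if sigma == 0:
        # A perfectly flat contour carries no variation to scale.
        return [0.0 for _ in contour]
    return [(f0 - mu) / sigma for f0 in contour]


# Hypothetical per-frame values, one frame every 10 ms.
pitch_hz = [0.0, 118.0, 120.0, 125.0, 131.0, 140.0, 300.0]
shs_peak = [0.1, 0.9, 1.1, 1.2, 1.0, 0.8, 0.05]

contour = voiced_pitch(pitch_hz, shs_peak, threshold=0.5)  # hypothetical threshold
normalized = z_score(contour)
print(contour)
print([round(z, 3) for z in normalized])
```

In the system the abstract describes, observation vectors derived from a normalized contour like this would then be scored against each of the five trained tone HMMs.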

