Asian Institute of Technology

Building speaker recognition module for VNPT ecabinet using deep learning technique

Author: Vu Minh Duc
Call Number: AIT Project no. PMDS-23-02
Subject(s): Automatic speech recognition; Deep learning (Machine learning)
Note: A project study submitted in partial fulfillment of the requirements for the degree of Professional Master in Data Science and Artificial Intelligence Applications
Publisher: Asian Institute of Technology
Abstract: Speaker recognition adds value to a paperless meeting system such as VNPT eCabinet, where it can relieve help desk staff of listening to recordings in order to recognize and tag speaker names in the recorded content of meetings. Moreover, in the AI era, smart features like this are key to impressing customers and winning bids in Vietnam. This project applies deep learning techniques to build a value-added speaker recognition module for VNPT eCabinet. We compared the accuracy of two DNN models, the original RNN and LSTM, using Mel-frequency cepstral coefficients (MFCC) to extract features from speech for the speaker recognition task. An experiment was conducted with the VIVOS Corpus, a free Vietnamese speech dataset of 65 speakers with 12420 utterances. Our key finding is that a bidirectional LSTM, with an equal error rate (EER) of 0.136, is better suited to this problem than the original RNN (EER = 0.210). Our work could help VNPT leverage machine learning models to reduce help desk effort and increase the competitiveness of VNPT eCabinet against other paperless meeting systems in Vietnam.
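The abstract compares the two models by equal error rate (EER), the operating point at which the false acceptance rate and the false rejection rate coincide. The following is a minimal sketch of how such a figure can be computed from verification scores, not the project's own evaluation code: the function name `equal_error_rate` is hypothetical, the scores are made up for illustration, and the sketch assumes that higher scores mean "same speaker".

```python
def equal_error_rate(genuine, impostor):
    """Approximate the EER from two lists of similarity scores.

    genuine:  scores for trials where both utterances share a speaker
    impostor: scores for trials where the speakers differ
    Assumption: a trial is accepted when its score >= threshold.
    """
    # Candidate thresholds: every observed score.
    thresholds = sorted(set(genuine) | set(impostor))
    best_gap, eer = None, None
    for t in thresholds:
        # False acceptance rate: impostor trials wrongly accepted.
        far = sum(s >= t for s in impostor) / len(impostor)
        # False rejection rate: genuine trials wrongly rejected.
        frr = sum(s < t for s in genuine) / len(genuine)
        gap = abs(far - frr)
        # Keep the threshold where the two error rates are closest.
        if best_gap is None or gap < best_gap:
            best_gap, eer = gap, (far + frr) / 2
    return eer


# Illustrative scores only (not taken from the VIVOS experiment).
genuine_scores = [0.9, 0.8, 0.7, 0.4]
impostor_scores = [0.6, 0.3, 0.2, 0.1]
eer = equal_error_rate(genuine_scores, impostor_scores)
```

A lower EER indicates better separation between same-speaker and different-speaker trials, which is the sense in which the reported 0.136 improves on 0.210.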
Year: 2023
Type: Project
School: School of Engineering and Technology
Department: Department of Information and Communications Technologies (DICT)
Academic Program/FoS: Professional Master in Data Science and Artificial Intelligence Applications (PMDS)
Chairperson(s): Chaklam Silpasuwanchai
Examination Committee(s): Vatcharaporn Esichaikul; Chantri Polprasert
Degree: Professional Master in Data Science and Artificial Intelligence Applications - Asian Institute of Technology, 2023

