Low resource multi-ASR speech command recognition

dc.contributor.author: Mohamed, I
dc.contributor.author: Thayasivam, U
dc.contributor.editor: Rathnayake, M
dc.contributor.editor: Adhikariwatte, V
dc.contributor.editor: Hemachandra, K
dc.date.accessioned: 2022-10-27T08:39:40Z
dc.date.available: 2022-10-27T08:39:40Z
dc.date.issued: 2022-07
dc.description.abstract: Spoken language understanding (SLU) has several applications, such as topic identification and intent detection. One of the primary underlying components used in SLU studies is automatic speech recognition (ASR). In recent years, ASR systems have improved considerably at recognizing spoken utterances. However, ASR remains challenging for low-resource languages because training an ASR model requires hundreds of hours of audio. To overcome this issue, recent studies have used transfer learning techniques. However, the errors produced by ASR models significantly affect the downstream natural language understanding (NLU) models used for intent or topic identification. In this work, we propose a multi-ASR setup to overcome this issue. We show that combining outputs from multiple ASR models yields significantly higher accuracy on low-resource speech-command transfer-learning tasks than using the output of a single ASR model. We present CNN-based setups that can utilize outputs from pre-trained ASR models such as DeepSpeech2 and Wav2Vec 2.0. The experimental results show an 8% increase in accuracy over the current state-of-the-art phoneme-based speech intent classification methodology for low-resource speech commands. [en_US]
dc.identifier.citation: I. Mohamed and U. Thayasivam, "Low Resource Multi-ASR Speech Command Recognition," 2022 Moratuwa Engineering Research Conference (MERCon), 2022, pp. 1-6, doi: 10.1109/MERCon55799.2022.9906230. [en_US]
dc.identifier.conference: Moratuwa Engineering Research Conference 2022 [en_US]
dc.identifier.department: Engineering Research Unit, University of Moratuwa [en_US]
dc.identifier.doi: 10.1109/MERCon55799.2022.9906230 [en_US]
dc.identifier.email: jazeem.20@cse.mrt.ac.lk
dc.identifier.email: rtuthaya@cse.mrt.ac.lk
dc.identifier.faculty: Engineering [en_US]
dc.identifier.place: Moratuwa, Sri Lanka [en_US]
dc.identifier.proceeding: Proceedings of Moratuwa Engineering Research Conference 2022 [en_US]
dc.identifier.uri: http://dl.lib.uom.lk/handle/123/19270
dc.identifier.year: 2022 [en_US]
dc.language.iso: en [en_US]
dc.publisher: IEEE [en_US]
dc.relation.uri: https://ieeexplore.ieee.org/document/9906230 [en_US]
dc.subject: Speech Intent Classification [en_US]
dc.subject: Low-Resource [en_US]
dc.subject: DeepSpeech2 [en_US]
dc.subject: Wav2Vec2.0 [en_US]
dc.subject: Tamil [en_US]
dc.title: Low resource multi-ASR speech command recognition [en_US]
dc.type: Conference-Full-text [en_US]
