NSU Research Contributions
Title : Bangla Short Speech Commands Recognition Using Convolutional Neural Networks
Authors : Shakil Ahmed Sumon, Joydip Chowdhury, Sujit Debnath, Nabeel Mohammed, Sifat Momen
Abstract : Although Bangla is one of the most widely spoken languages in the world, no significant efforts have been made in Bangla speech recognition. Speech recognition is a difficult task, particularly when it must work in noisy real-life conditions. This study reports a Bangla short speech commands data set in which all samples were recorded in real-life settings. Three different convolutional neural network (CNN) architectures were designed to recognize these short speech commands. In one approach, Mel-frequency cepstral coefficient (MFCC) features are extracted from the audio files; in another, the CNN operates on the raw audio alone. In the third, a model pre-trained on a large English short speech commands data set is fine-tuned by retraining on the Bangla data set. Experimental results show that the MFCC model achieves higher accuracy in recognizing Bangla short speech commands, while, surprisingly, the model operating on raw audio is very competitive. The models are proficient at identifying single-syllable words but have difficulty recognizing multi-syllable commands.
Year : 2018
City : Sylhet
Proceeding Title : 2018 International Conference on Bangla Speech and Language Processing (ICBSLP)