FEATURE SELECTION FOR ARABIC MISPRONUNCIATION DETECTION BASED ON SEQUENTIAL FLOATING FORWARD SELECTION AND DATA MINING CLASSIFIERS
DOI:
https://doi.org/10.57041/vol68iss4pp%25p
Keywords:
Sequential Floating Forward Selection and Mispronunciation detection, Acoustic-phonetic features, Feature selection
Abstract
Feature selection is used to reduce the length of the feature vector and to identify the most discriminative features. Several acoustic-phonetic features, including Mel-Frequency Cepstral Coefficients (MFCC), energy, pitch, zero-crossing, and spectrum, were tested individually for Arabic mispronunciation detection using three classifiers: Random Forest, a Bayesian classifier, and a Bagged Support Vector Machine (SVM). Bagged SVM gave better results than the other two classifiers. The three individual features with the highest accuracies were identified for each isolated Arabic consonant. To validate the results, a modified form of the Sequential Floating Forward Selection (SFFS) process was used. The results showed that MFCC along with its first and second derivatives, energy, spectrum, and zero-crossing were the most suitable acoustic features for an Arabic mispronunciation detection system. The proposed approach achieved an average accuracy of 94.9%, which was better than the previous best of 92.95% for Arabic consonants.
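For readers unfamiliar with SFFS, the following is a minimal sketch of the standard (textbook) algorithm, not the modified form used in the paper, which the abstract does not describe. The function names `sffs` and `toy_score` and the weights inside `toy_score` are hypothetical and purely illustrative; in a real system the score would more likely be a cross-validated classifier accuracy on the candidate feature subset.

```python
from typing import Callable, Dict, FrozenSet, Iterable, Tuple

Subset = FrozenSet[str]


def sffs(features: Iterable[str],
         score: Callable[[Subset], float],
         k: int) -> Tuple[Subset, float]:
    """Standard SFFS: greedily grow a subset to size k, dropping a
    feature whenever that beats the best smaller subset seen so far."""
    features = list(features)
    if not 0 < k <= len(features):
        raise ValueError("k must be between 1 and the number of features")

    selected: Subset = frozenset()
    best: Dict[int, Tuple[float, Subset]] = {}  # size -> (score, subset)

    while len(selected) < k:
        # Forward step: add the single feature that helps the most.
        candidates = [selected | {f} for f in features if f not in selected]
        selected = max(candidates, key=score)
        size, value = len(selected), score(selected)
        if size not in best or value > best[size][0]:
            best[size] = (value, selected)
        else:
            selected = best[size][1]  # keep the better subset of this size

        # Floating (backward) step: remove a feature only while doing so
        # strictly improves on the best known subset of the smaller size.
        while len(selected) > 2:
            reduced = max((selected - {f} for f in sorted(selected)), key=score)
            value = score(reduced)
            if value > best[len(reduced)][0]:
                best[len(reduced)] = (value, reduced)
                selected = reduced
            else:
                break

    return best[k][1], best[k][0]


def toy_score(subset: Subset) -> float:
    """Hypothetical score: made-up weights, with a penalty when two
    features are assumed (for illustration only) to be redundant."""
    weights = {"mfcc": 5, "energy": 3, "pitch": 1,
               "zero_crossing": 2, "spectrum": 4}
    total = sum(weights[f] for f in subset)
    if {"mfcc", "spectrum"} <= subset:
        total -= 6
    return total


chosen, value = sffs(["mfcc", "energy", "pitch", "zero_crossing", "spectrum"],
                     toy_score, k=3)
print(sorted(chosen), value)
```

The toy score carries no information about the real features; it only exercises the forward and backward steps. Replacing `toy_score` with a function that trains and evaluates a classifier on the given subset would turn the sketch into a wrapper-style feature selector.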

