Abstract:
Automatic Speech Recognition (ASR) technology has made significant advances,
approaching human-level performance. However,
developing ASR models for all languages, especially Low
Resource Languages (LRLs), presents challenges due to limited
resources. Recent studies on intent classification in LRLs employ
transfer learning techniques, leveraging state-of-the-art English
ASR models to improve performance in these domains. Additionally,
the emerging self-supervised learning paradigm has proven
advantageous for developing sophisticated ASR models that require
less labeled data to achieve high performance. In our research, we
utilized a self-supervised ASR model to classify intent in an LRL,
specifically the Tamil language. We compared two adaptation methods
applied to the same ASR framework: a fine-tuning approach and a transfer
learning approach, both using a limited amount of labeled Tamil data.
Our findings indicate that the fine-tuning method outperforms
the transfer learning technique. Moreover, the model exhibited
a noteworthy increase in accuracy compared to the established
phoneme-based speech intent classification methodology in Tamil.
This study represents a significant step forward in enhancing
speech recognition capabilities for LRLs.
Citation:
I. Tharmakulasingham and U. Thayasivam, "Speech Command Recognition Using Self-Supervised ASR in Tamil," 2023 Moratuwa Engineering Research Conference (MERCon), Moratuwa, Sri Lanka, 2023, pp. 137-142, doi: 10.1109/MERCon60487.2023.10355402.
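
To illustrate the two adaptation strategies the abstract contrasts, the sketch below shows how a pretrained self-supervised speech encoder could be adapted for intent classification either by full fine-tuning or by freezing the encoder and training only a classification head. This is a minimal sketch, not the paper's implementation: the checkpoint name (facebook/wav2vec2-xls-r-300m), the Hugging Face Transformers API, and the number of intent classes are assumptions for illustration.

```python
# Hypothetical sketch of fine-tuning vs. transfer learning with a
# self-supervised speech encoder; details are assumptions, not the paper's setup.
import numpy as np
import torch
from transformers import AutoFeatureExtractor, AutoModelForAudioClassification

NUM_INTENTS = 6  # hypothetical number of Tamil intent/command classes

checkpoint = "facebook/wav2vec2-xls-r-300m"  # assumed multilingual SSL checkpoint
extractor = AutoFeatureExtractor.from_pretrained(checkpoint)
model = AutoModelForAudioClassification.from_pretrained(
    checkpoint, num_labels=NUM_INTENTS
)

# Strategy 1: fine-tuning -- every encoder weight is updated on the Tamil data.
for p in model.parameters():
    p.requires_grad = True

# Strategy 2: transfer learning -- freeze the self-supervised encoder and train
# only the newly added classification head on the limited labeled Tamil data.
# for p in model.wav2vec2.parameters():
#     p.requires_grad = False

# One training step on a dummy 1-second, 16 kHz waveform.
waveform = np.random.randn(16000).astype("float32")
inputs = extractor(waveform, sampling_rate=16000, return_tensors="pt")
labels = torch.tensor([0])  # dummy intent label
outputs = model(**inputs, labels=labels)
outputs.loss.backward()
```

In this sketch the only difference between the two strategies is which parameters receive gradients; the abstract reports that updating the full encoder (fine-tuning) outperformed training the classifier alone (transfer learning) on limited Tamil data.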