Speech to intent mapping system for low resourced languages

Karunanayake Y

dc.contributor.advisor	Thayasivam U
dc.contributor.author	Karunanayake Y
dc.date.accessioned	2020
dc.date.available	2020
dc.date.issued	2020
dc.identifier.uri	http://dl.lib.mrt.ac.lk/handle/123/16188
dc.description.abstract	Content-based speech classification now has many use cases, including speech topic identification and speech command recognition. Among these, speech command-based user interfaces are becoming popular because they allow humans to interact with digital devices using natural language. Such interfaces can identify the intent of a given query. Automatic Speech Recognition (ASR) underlies all of these applications, converting speech into text. However, creating an ASR system for a language is a resource-intensive task. Even though there are more than 6000 languages in the world, these speech-related applications are limited to the most widely used languages, such as English, because of the high data requirements of ASR. Some past research has looked into classifying speech while addressing data scarcity, but all of these methods have their limitations. This study presents a direct speech intent identification method for low-resource languages that uses a transfer learning mechanism. It makes use of three different audio-based feature generation techniques that can represent the semantic information present in the speech: unsupervised acoustic unit features, character features, and phoneme features. The proposed method is evaluated using Sinhala and Tamil language datasets in the banking domain. Among these, phoneme-based features extracted from Automatic Speech Recognizers (ASRs) yield the best results in intent identification. The experimental results show that this method can achieve more than 80% accuracy with a limited speech dataset of 0.5 hours in both languages.	en_US
dc.language.iso	en	en_US
dc.subject	COMPUTER SCIENCE AND ENGINEERING-Dissertations	en_US
dc.subject	LANGUAGE AND LANGUAGES-Low-Resourced Languages	en_US
dc.subject	SPEECH-Recognition	en_US
dc.subject	SPEECH-Intent Identification	en_US
dc.subject	NATURAL LANGUAGE PROCESSING	en_US
dc.title	Speech to intent mapping system for low resourced languages	en_US
dc.type	Thesis-Full-text	en_US
dc.identifier.faculty	Engineering	en_US
dc.identifier.degree	MSc in Computer Science and Engineering by research	en_US
dc.identifier.department	Department of Computer Science & Engineering	en_US
dc.date.accept	2020
dc.identifier.accno	TH4168	en_US