dc.contributor.advisor |
Sooriyaarachchi S |
|
dc.contributor.author |
Mihiranga JPM |
|
dc.date.accessioned |
2021 |
|
dc.date.available |
2021 |
|
dc.date.issued |
2021 |
|
dc.identifier.citation |
Mihiranga, J.P.M. (2021). Acoustic event detection in polyphonic environments using artificial neural networks [Master's thesis, University of Moratuwa]. Institutional Repository University of Moratuwa. http://dl.lib.uom.lk/handle/123/22278 |
|
dc.identifier.uri |
http://dl.lib.uom.lk/handle/123/22278 |
|
dc.description.abstract |
Our environment is a mixture of hundreds of sounds emitted by different sound
sources. These sounds overlap in both the time and frequency domains in an unstructured
manner, forming a polyphonic environment. Identification of acoustic events
in a polyphonic environment is an emerging topic with many applications
such as surveillance, context-aware computing, automatic audio indexing, health care
monitoring and bioacoustics monitoring.
Polyphonic acoustic event detection is a challenging task that aims to detect and
label multiple sound events that overlap at a particular time instant.
It requires a large amount of training data and a complex machine learning
architecture, making it a highly resource-consuming task. Hence, the accuracy achieved
in this research area is still not at a satisfactory level.
This study presents a neural network-based classifier architecture with data augmentation
and post-processing methods to improve accuracy. Two neural network architectures,
a multi-label architecture and a combined single-label architecture, are implemented and
compared in the study. Previous literature shows that Mel-frequency cepstral coefficients and log
Mel-band energies are widely used features in state-of-the-art research in the area.
Different data augmentation methods were used so that the neural networks are
trained on even slight variations of the environmental sounds. A novel binarization
method based on the signal energy is proposed to calculate the threshold value for binarizing
the source presence predictions. Finally, median filter-based post-processing
was implemented to smooth the detection results. The experimental results show
that the proposed binarization method improved the detection accuracy, which reached a
maximum of 62.5% when combined with data augmentation and post-processing. |
en_US |
dc.language.iso |
en |
en_US |
dc.subject |
POLYPHONIC ACOUSTIC EVENT DETECTION |
en_US |
dc.subject |
DYNAMIC THRESHOLD BINARIZATION |
en_US |
dc.subject |
DEEP NEURAL NETWORKS |
en_US |
dc.subject |
COMPUTER SCIENCE & ENGINEERING - Dissertation |
en_US |
dc.subject |
COMPUTER SCIENCE- Dissertation |
en_US |
dc.title |
Acoustic event detection in polyphonic environments using artificial neural networks |
en_US |
dc.type |
Thesis-Abstract |
en_US |
dc.identifier.faculty |
Engineering |
en_US |
dc.identifier.degree |
MSc in Computer Science & Engineering By research |
en_US |
dc.identifier.department |
Department of Computer Science & Engineering |
en_US |
dc.date.accept |
2021 |
|
dc.identifier.accno |
H4864 |
en_US |
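
The abstract names two post-processing steps: binarizing the source presence predictions and smoothing the result with a median filter. The sketch below illustrates that general pipeline for a single sound class. It is not the thesis's implementation: the abstract does not specify how the energy-based threshold is computed, so a fixed placeholder threshold is used here, and the function names `binarize` and `median_smooth`, the window length, and the edge-padding rule are all assumptions made for illustration.

```python
from statistics import median


def binarize(probabilities, threshold=0.5):
    """Turn frame-wise presence probabilities into 0/1 decisions.

    The fixed threshold is a placeholder assumption; the thesis proposes
    a threshold derived from signal energy, which is not reproduced here.
    """
    return [1 if p >= threshold else 0 for p in probabilities]


def median_smooth(activity, window=3):
    """Smooth a 0/1 activity sequence with a sliding median filter.

    The sequence is padded by repeating its first and last values so that
    every frame sees a full, odd-length window (an assumed edge rule).
    """
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd integer")
    if not activity:
        return []
    half = window // 2
    padded = [activity[0]] * half + list(activity) + [activity[-1]] * half
    # For an odd number of 0/1 values the median is itself 0 or 1.
    return [int(median(padded[i:i + window])) for i in range(len(activity))]


# Hypothetical frame-wise predictions for one sound class.
frame_probabilities = [0.1, 0.2, 0.9, 0.1, 0.8, 0.9, 0.7, 0.2, 0.9, 0.8, 0.1, 0.1]
raw_activity = binarize(frame_probabilities)
smoothed_activity = median_smooth(raw_activity, window=3)
```

With a window of three frames, the isolated one-frame detection at the third frame is removed and the one-frame gap inside the longer event is filled, which is the kind of smoothing a median filter is typically used for in this setting.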