Abstract:
There are numerous scenarios where similar acoustic events occur multiple times.
Acoustic monitoring of migratory birds is an ideal example: during migration, birds make a type
of call known as a flight call. A flight call can be considered an acoustic event because it is a
short, intuitively distinct sound. Identifying multiple occurrences of such extremely short acoustic
events in real-world recordings is challenging, mainly because of background noise and complex
acoustic environments, and the classification techniques involved demand considerable
computational power. This research aims to develop a classification model that reduces the effect
of background noise, extracts regions of interest (ROIs) from continuous recordings, extracts
suitable features of flight calls, and detects multiple occurrences of flight calls. An improved
feature extraction algorithm has been developed in this research by combining the well-known
Maximally Stable Extremal Regions (MSER) technique with state-of-the-art traditional techniques,
namely Spectral and Temporal Features (SATF) and a combination of SATF and Spectrogram-based
Image Frequency Statistics (SIFS). This novel algorithm is named Spectrogram-based Maximally
Stable Extremal Regions (SMSER).
Three distinct feature sets were formed: Featureset-1 was created using SATF, Featureset-2 is a
blend of SATF and SIFS, and Featureset-3 is a combination of SATF, SIFS, and SMSER. The kNN, RF,
SVM, and DNN classification techniques were evaluated on a real-world dataset using the extracted
feature sets, and several tests were carried out to find the best-performing combination of
classification model and feature set. The results showed that flight call detection accuracy
increased as the number of features increased, although the high computational power requirement
is a disadvantage. The SMSER feature set performed best with almost every classification
technique, most likely because it contains the largest number of features. Classifying the SMSER
feature set with the DNN classifier yielded the highest accuracy of 87.67%.
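
The abstract does not spell out the SMSER implementation, but the core idea it names, applying the
MSER region detector to a spectrogram image and turning the detected regions into features, can be
illustrated with a short sketch. The Python code below is only a minimal, assumed illustration
using librosa and OpenCV; the function name smser_like_features and the particular summary
statistics are placeholders, not the feature definitions used in the thesis.

    # Illustrative sketch only: apply MSER to a spectrogram image and
    # summarise the detected regions as a fixed-length feature vector.
    # The exact SMSER features in the thesis may differ.
    import cv2                 # OpenCV, provides the MSER detector
    import librosa             # audio loading and spectrogram computation
    import numpy as np

    def smser_like_features(wav_path, sr=22050, n_fft=1024, hop_length=256):
        """Return a small feature vector derived from MSER regions of a spectrogram."""
        # Load the recording and compute a log-scaled mel spectrogram.
        y, sr = librosa.load(wav_path, sr=sr)
        mel = librosa.feature.melspectrogram(y=y, sr=sr, n_fft=n_fft, hop_length=hop_length)
        log_mel = librosa.power_to_db(mel, ref=np.max)

        # Rescale to an 8-bit grayscale image so OpenCV's MSER can operate on it.
        img = cv2.normalize(log_mel, None, 0, 255, cv2.NORM_MINMAX).astype(np.uint8)

        # Detect maximally stable extremal regions (candidate flight-call ROIs).
        mser = cv2.MSER_create()
        regions, bboxes = mser.detectRegions(img)
        if len(bboxes) == 0:
            return np.zeros(5, dtype=np.float32)

        bboxes = np.asarray(bboxes, dtype=np.float32)  # columns: x, y, width, height
        widths, heights = bboxes[:, 2], bboxes[:, 3]
        areas = widths * heights

        # Summary statistics over the detected regions; a real pipeline would
        # likely compute richer per-region descriptors before classification.
        return np.array([
            len(bboxes),          # number of stable regions
            widths.mean(),        # mean region duration (spectrogram frames)
            heights.mean(),       # mean region bandwidth (mel bins)
            areas.mean(),         # mean region area
            img[img > 0].mean(),  # overall spectrogram intensity
        ], dtype=np.float32)

In principle, such vectors could be concatenated with SATF and SIFS features and passed to any of
the evaluated classifiers (kNN, RF, SVM, DNN). The two lines below show a hypothetical kNN fit with
scikit-learn, where wav_paths and labels are placeholder variables.

    from sklearn.neighbors import KNeighborsClassifier
    X = np.stack([smser_like_features(p) for p in wav_paths])  # wav_paths: list of recordings
    clf = KNeighborsClassifier(n_neighbors=5).fit(X, labels)   # labels: call / no-call
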
Citation:
Egodage, D. (2021). Automatic classification of multiple acoustic events using artificial neural networks [Master's thesis, University of Moratuwa]. Institutional Repository University of Moratuwa. http://dl.lib.uom.lk/handle/123/22276