Music generation for scene emotion using generative and CNN model

Jayawardena DIDD

URI: http://dl.lib.mrt.ac.lk/handle/123/15811

Abstract:

Generating music from the emotional semantics of an image is quite a challenging task because of the complexity of extracting emotional features from an image and generating music that matches the emotion. This paper proposes an enhanced deep neural network, backed by a generative adversarial network (GAN), for scene emotion categorization, and an LSTM-based music generator for music generation. The developed system functions in three parts. First, we generate fake images that closely resemble real images using the generator of the GAN, which enriches the dataset and increases its size. Our dataset contains three main emotion categories (Happy, Angry, Sad). The second part of the system is an image classifier developed using a convolutional neural network (CNN) and trained on the enhanced image dataset of scene emotions. The image classifier identifies the emotion probabilities of the input scene, which are fed into the music generator to create a training music dataset for each uploaded scene. The third and last part of the system is the music generator, which is developed using a convolutional neural network with a long short-term memory (LSTM) model. With the LSTM model, the deep neural network gains the ability to remember and predict the next step. A MIDI dataset was created from raw music files of songs for each category to train the music generator. Since music composition is a human-centric task, the best way to evaluate the system is with musicians, so we tested the system with two musicians and a single listener. We also compared the image classifier trained on the dataset with GAN-generated images against the classifier trained without them.
After improving the dataset with GAN-generated images, we achieved 80% categorical accuracy and 85% validation accuracy in image classification. Based on the musicians' evaluation of the generated sounds, more than 50% of the sounds were of good quality, and the musicians confirmed that the music was appealing to hear.
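
The abstract does not say how the classifier's probabilities shape the per-scene training music dataset. One possible reading, offered only as a sketch, is that MIDI files are drawn from each emotion category in proportion to the predicted probability. The function name `allocate_by_probability`, the proportional rule, and the largest-remainder rounding are all assumptions for illustration, not the thesis's method.

```python
from typing import Dict


def allocate_by_probability(probs: Dict[str, float], total: int) -> Dict[str, int]:
    """Split `total` training files across emotion categories in proportion
    to the classifier's probabilities (hypothetical rule, not from the thesis)."""
    norm = sum(probs.values())
    if norm <= 0:
        raise ValueError("probabilities must sum to a positive value")

    # Ideal (fractional) share of files for each category.
    exact = {name: total * p / norm for name, p in probs.items()}

    # Start from the whole-number part of each share.
    counts = {name: int(share) for name, share in exact.items()}

    # Hand out the files lost to truncation, largest fractional part first.
    leftover = total - sum(counts.values())
    by_remainder = sorted(exact, key=lambda n: exact[n] - counts[n], reverse=True)
    for name in by_remainder[:leftover]:
        counts[name] += 1
    return counts


if __name__ == "__main__":
    # Illustrative probabilities for one uploaded scene (made-up values).
    scene_probs = {"Happy": 0.5, "Angry": 0.25, "Sad": 0.25}
    print(allocate_by_probability(scene_probs, total=8))
```

Under this assumption, a scene classified as mostly Happy would yield a training set dominated by Happy MIDI files while still including some material from the other two categories.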

Citation:

Jayawardena, D.I.D.D. (2019). Music generation for scene emotion using generative and CNN model [Master’s theses, University of Moratuwa]. Institutional Repository University of Moratuwa. http://dl.lib.mrt.ac.lk/handle/123/15811
