dc.contributor.advisor Rodrigo, R
dc.contributor.author Ramasinghe, SC
dc.date.accessioned 2019-01-31T01:27:54Z
dc.date.available 2019-01-31T01:27:54Z
dc.identifier.uri http://dl.lib.mrt.ac.lk/handle/123/13876
dc.description.abstract In this study, we investigate the problem of automatic action recognition and classification in videos. First, we present a convolutional neural network architecture that takes both motion and static information as inputs in a single stream. We show that the network is able to treat motion and static information as different feature maps and extract features from them, even though they are stacked together. Our results justify the use of optical flow as the raw representation of motion. We demonstrate that our network surpasses state-of-the-art hand-engineered feature methods. Furthermore, the effect of providing static information to the network in the task of action recognition is also studied and compared. Then, a novel pipeline is proposed to recognize complex actions. A complex activity is a temporal composition of sub-events, and a sub-event typically consists of several low-level micro-actions, such as body movements, performed by different actors. Extracting these micro-actions explicitly is beneficial for complex activity recognition due to actor selectivity, higher discriminative power, and motion clutter suppression. Moreover, considering both static and motion features is vital for activity recognition. However, how to optimally control the contribution from each feature domain remains uninvestigated. In this work, we extract motion features at the micro level, preserving actor identity, to later obtain a high-level motion descriptor using a probabilistic model. Furthermore, we propose two novel schemes for combining static and motion features: one based on the Cholesky transformation and one based on entropy. The former allows the contribution ratio to be controlled precisely, while the latter determines the optimal ratio mathematically. The ratio given by the entropy-based method matches well with the experimental values obtained by the Cholesky transformation-based method.
This analysis also provides the ability to characterize a dataset according to its richness in motion information. Finally, we study the effectiveness of modeling the temporal evolution of sub-events using an LSTM network. Experimental results demonstrate that the proposed technique outperforms the state of the art when tested on two popular datasets. en_US
dc.language.iso en en_US
dc.subject Human action recognition en_US
dc.subject Convolutional Neural Networks (CNN) en_US
dc.subject Recurrent Neural Networks (RNN) en_US
dc.subject Long Short-Term Memory (LSTM) en_US
dc.subject Dense trajectories en_US
dc.subject Bag of Visual Words (BoVW) en_US
dc.title Activity recognition combined with scene context and action sequence en_US
dc.type Thesis-Full-text en_US
dc.identifier.faculty Engineering en_US
dc.identifier.degree Master of Philosophy (MPhil) en_US
dc.identifier.department Department of Electronic and Telecommunication Engineering en_US
dc.date.accept 2017-09
dc.identifier.accno TH3526 en_US

