dc.description.abstract |
The performance of an Artificial Neural Network (ANN) strongly depends on its hidden layer architecture. A solution generated by an ANN carries no guarantee that it was devised with the simplest network architecture suitable for modeling the problem at hand. This inflates the computational cost of training, deploying, and using the network. Modeling the hidden layer architecture of an ANN therefore remains a research challenge. This thesis presents a theoretically grounded approach to pruning the hidden layers of trained artificial neural networks, yielding a simpler network whose performance matches or exceeds that of the original.
The method described in this thesis is inspired by the finding from neuroscience that, although the human brain contains a neural network of nearly 100 billion neurons, our activities are performed by much simpler subnetworks with far fewer neurons. Furthermore, in biological neural networks, neurons that do not contribute significantly to the performance of the network are naturally disregarded. According to neuroplasticity, biological neural networks can also recruit neurons in the proximity of the active network to improve its performance. By the same token, it is hypothesized that for a given complex trained ANN, we can discover an ANN that is much simpler than the original architecture.
This research has developed a theory for reducing the number of hidden layers of a given ANN architecture and for eliminating non-contributing neurons from the remaining hidden layers. The procedure begins with a complex neural network architecture trained with the backpropagation algorithm and reaches the optimal solution in two phases. First, the number of hidden layers is determined using a peak search algorithm developed in this research. The resulting simpler network, with fewer hidden layers and the highest generalization power, is then considered for pruning of its hidden neurons. Pruning is theorized by identifying the neurons that contribute least to network performance; these neurons are identified by detecting their correlations with the minimization of training error. Experiments have shown that the simplified architecture generated by this approach performs as well as or better than the original large architecture, typically removing more than 80% of the neurons while increasing generalization by about 30%. As such, the proposed approach can be used to discover a simple network architecture equivalent to a given complex ANN solution. Owing to its architectural simplicity, the new architecture is computationally efficient in training, usage, and further training. |
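To make the two-phase procedure concrete, the following is a minimal Python sketch of how such a scheme could look. The stopping rule for the peak search, the correlation-based pruning criterion, and all names (train_and_score, threshold, and so on) are illustrative assumptions, not the thesis's actual algorithm.

    import numpy as np

    def peak_search_depth(train_and_score, max_layers):
        """Phase 1 (assumed form): train candidate networks with an
        increasing number of hidden layers and stop once generalization
        peaks, i.e. a deeper network no longer improves the validation
        score. `train_and_score(depth)` is a hypothetical helper that
        trains a network with `depth` hidden layers and returns its
        validation score."""
        best_depth, best_score = 1, train_and_score(1)
        for depth in range(2, max_layers + 1):
            score = train_and_score(depth)
            if score <= best_score:  # past the peak: deeper nets generalize worse
                break
            best_depth, best_score = depth, score
        return best_depth

    def prune_low_contribution_neurons(activations, errors, threshold=0.1):
        """Phase 2 (assumed form): keep only hidden neurons whose
        activations correlate noticeably with the training error; neurons
        with near-zero correlation are treated as non-contributing and
        removed. `activations` has shape (samples, neurons) and `errors`
        has shape (samples,)."""
        keep = []
        for j in range(activations.shape[1]):
            corr = np.corrcoef(activations[:, j], errors)[0, 1]
            if abs(corr) >= threshold:
                keep.append(j)
        return keep

Under these assumptions, phase 1 fixes the depth of the network and phase 2 shrinks each remaining hidden layer, so the surviving architecture is both shallower and narrower than the original.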
en_US |