Abstract:
In recent years, Deep Neural Networks (DNNs) have
been employed in many fields for tasks such as recognition,
classification, detection, and sorting. Optimizing a DNN
is therefore essential to obtain a potential solution with high
accuracy. A neural network (NN) can be optimized by optimizing
the weight values of the network. Many studies have applied
conventional optimization techniques such as Stochastic
Gradient Descent (SGD), Adam, and AdaDelta. Employing
traditional optimization approaches to train a deep neural
network, however, can result in poor performance due
to trapping in local extrema and premature convergence. As a
result, researchers have turned to Swarm Intelligence (SI) optimization
algorithms: fast and robust global optimization
methods that have gained considerable attention for their ability
to deal with complicated optimization problems. Among the different
types of SI algorithms, Particle Swarm Optimization (PSO) is
the most widely used for NN optimization, as it has few parameters
to tune and requires no derivative information. However, recent
studies have shown that the standard PSO is not the best tool
for tackling all engineering problems, since it is slow in some
contexts, such as biomedical engineering and building construction,
and can converge to local optima. Therefore, improving the
PSO algorithm is critical for obtaining feasible solutions to NN
optimization problems. Hence, the main goal of this study is to
make advanced enhancements to the PSO algorithm for optimizing
DNNs while addressing several concerns, such as minimizing
the computational cost or Graphics Processing Unit (GPU)
dependency and handling large input data in Deep Convolutional
Neural Network (DCNN) training.