Institutional Repository, University of Moratuwa

An Advisor-Based Architecture for a Sample-Efficient Training of Autonomous Navigation Agents with Reinforcement Learning


dc.contributor.author Wijesinghe, R. D.
dc.contributor.author Tissera, D.
dc.contributor.author Vithanage, M. K.
dc.contributor.author Xavier, A.
dc.contributor.author Fernando, S.
dc.contributor.author Samarawickrama, J.
dc.date.accessioned 2023-12-01T05:37:46Z
dc.date.available 2023-12-01T05:37:46Z
dc.date.issued 2023
dc.identifier.citation Wijesinghe, R. D., Tissera, D., Vithanage, M. K., Xavier, A., Fernando, S., & Samarawickrama, J. (2023). An Advisor-Based Architecture for a Sample-Efficient Training of Autonomous Navigation Agents with Reinforcement Learning. Robotics, 12(5), Article 5. https://doi.org/10.3390/robotics12050133 en_US
dc.identifier.issn 2218-6581 en_US
dc.identifier.uri http://dl.lib.uom.lk/handle/123/21862
dc.description.abstract Recent advancements in artificial intelligence have enabled reinforcement learning (RL) agents to exceed human-level performance in various gaming tasks. However, despite their state-of-the-art performance, model-free RL algorithms suffer from high sample complexity. Hence, they are rarely applied in robotics, autonomous navigation, and self-driving, where gathering many samples is impractical on real-world hardware. Developing sample-efficient learning algorithms is therefore crucial for deploying RL agents in real-world tasks without sacrificing performance. This paper presents an advisor-based learning algorithm that incorporates prior knowledge into training by modifying the deep deterministic policy gradient (DDPG) algorithm to reduce sample complexity. We also propose an effective method of employing an advisor during data collection to train autonomous navigation agents to maneuver physical platforms while minimizing the risk of collision. We analyze the performance of our methods in both simulation and physical experimental setups. Experiments reveal that incorporating an advisor into the training phase significantly reduces sample complexity without compromising the agent’s performance relative to various benchmark approaches. They also show that constant advisor involvement in data collection diminishes the agent’s performance, whereas limited involvement makes training more effective. en_US
dc.language.iso en en_US
dc.subject advisor-based architecture en_US
dc.subject autonomous agents en_US
dc.subject reinforcement learning en_US
dc.title An Advisor-Based Architecture for a Sample-Efficient Training of Autonomous Navigation Agents with Reinforcement Learning en_US
dc.type Article-Full-text en_US
dc.identifier.year 2023 en_US
dc.identifier.journal Robotics en_US
dc.identifier.issue 5 en_US
dc.identifier.volume 12 en_US
dc.identifier.database MDPI en_US
dc.identifier.pgnos 1-27 en_US
dc.identifier.doi https://doi.org/10.3390/robotics12050133 en_US
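
The abstract describes two ideas: modifying DDPG to incorporate an advisor's prior knowledge during training, and limiting the advisor's involvement in data collection. As a rough illustration of the second idea only, the following Python sketch gates each environment step between the agent and the advisor. The function names, the toy 2-D navigation dynamics, and the `advisor_rate` parameter are assumptions chosen for illustration, not details taken from the paper.

```python
import numpy as np

# Minimal sketch of advisor-gated data collection (illustrative only;
# `advisor_policy`, `agent_policy`, `advisor_rate`, and the toy dynamics
# are assumptions, not the paper's implementation).

def advisor_policy(obs: np.ndarray) -> np.ndarray:
    """Toy advisor encoding prior knowledge: steer toward the goal at the origin."""
    return np.clip(-obs, -1.0, 1.0)

def agent_policy(obs: np.ndarray, noise_scale: float = 0.1) -> np.ndarray:
    """Stand-in for the DDPG actor plus exploration noise."""
    action = np.tanh(obs)  # placeholder for the actor network's output
    return np.clip(action + noise_scale * np.random.randn(*obs.shape), -1.0, 1.0)

def collect_episode(advisor_rate: float, horizon: int = 50):
    """Collect one episode, letting the advisor act with probability
    `advisor_rate` at each step; every transition is stored for training
    regardless of who chose the action."""
    replay_buffer = []
    obs = np.random.uniform(-1.0, 1.0, size=2)  # toy 2-D navigation state
    for _ in range(horizon):
        if np.random.rand() < advisor_rate:
            action = advisor_policy(obs)   # advisor intervenes
        else:
            action = agent_policy(obs)     # agent explores on its own
        next_obs = obs + 0.1 * action      # toy transition dynamics
        reward = -np.linalg.norm(next_obs) # closer to the goal is better
        replay_buffer.append((obs, action, reward, next_obs))
        obs = next_obs
    return replay_buffer

# Limited involvement (e.g., advisor_rate=0.2) corresponds to the regime the
# abstract reports as more effective than constant advisor control (rate 1.0).
buffer = collect_episode(advisor_rate=0.2)
```

In this reading, the advisor's role is to seed the replay buffer with safer, higher-quality transitions early on, while a low intervention rate still leaves the agent enough of its own experience to learn from, consistent with the abstract's finding that constant advisor control hurts final performance.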

