A Decentralized Multi-Agent Reinforcement Learning Framework for Cooperative UAV Search: Navigating Non-Stationarity via Reward Shaping

Authors

DOI:

https://doi.org/10.63944/e1p7kr68

Keywords:

Multi-Agent Reinforcement Learning, Cooperative Search, Decentralized Partially Observable Markov Decision Process, Reward Shaping, Non-Stationarity

Abstract

Collaborative search by multi-UAV swarms in unknown environments is a formidable challenge in autonomous systems, because high-dimensional state spaces are tightly coupled with the stringent constraints of local perception. Early algorithmic attempts often assumed stationary targets or perfect communication networks, but the inherent non-stationarity of real-world dynamic environments renders traditional independent learning paradigms highly inefficient and sometimes leads to complete divergence during training. To overcome these bottlenecks, this paper explores a Decentralized Partially Observable Markov Decision Process (Dec-POMDP) framework and introduces specific Multi-Agent Reinforcement Learning (MARL) methods under a Centralized Training with Decentralized Execution (CTDE) architecture. Fully validating and elucidating the emergent collaborative behaviors will require extensive verification in real-world physical environments, as well as further research specifically addressing communication-constrained settings.

Published

2026-04-02

Issue

Section

Research Articles

Categories

How to cite

A Decentralized Multi-Agent Reinforcement Learning Framework for Cooperative UAV Search: Navigating Non-Stationarity via Reward Shaping. (2026). International Journal of Computer Science and Engineering, 1(03), 101-109. https://doi.org/10.63944/e1p7kr68