A Graph-Based PPO Approach in Multi-UAV Navigation for Communication Coverage
DOI: https://doi.org/10.15837/ijccc.2023.6.5505

Keywords: UAV Swarm Intelligence, Communication Coverage, Graph Learning, Multi-Agent Reinforcement Learning

Abstract
Multi-Agent Reinforcement Learning (MARL) is widely used to solve a variety of real-world problems. In a multi-agent reinforcement learning task, multiple agents act in a shared environment. The existing Proximal Policy Optimization (PPO) algorithm can be applied to such tasks, but it does not address communication between agents. To resolve this issue, we propose a Graph-based PPO algorithm. This approach handles inter-agent communication, improves the agents' exploration efficiency in the environment, and speeds up the learning process. We apply the proposed algorithm to the task of multi-UAV navigation for communication coverage to verify its functionality and performance.
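The abstract does not detail the network architecture, so the sketch below is only a minimal, hypothetical illustration (in PyTorch) of how a graph-based actor-critic could feed PPO's clipped objective: each UAV encodes its own observation, exchanges attention-weighted messages with its neighbours over an adjacency matrix, and the resulting embeddings drive per-agent policy and value heads. The class names (GraphCommLayer, GraphActorCritic), tensor shapes, and coefficients (clip ratio, value and entropy weights) are illustrative assumptions, not the paper's actual design.

# Minimal, hypothetical sketch of a graph-based actor-critic trained with PPO.
# Not the authors' implementation; shapes and hyper-parameters are assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F


class GraphCommLayer(nn.Module):
    """One attention-weighted message-passing step over the UAV graph."""

    def __init__(self, dim):
        super().__init__()
        self.query = nn.Linear(dim, dim)
        self.key = nn.Linear(dim, dim)
        self.value = nn.Linear(dim, dim)

    def forward(self, h, adj):
        # h: (n_agents, dim); adj: (n_agents, n_agents), 1 for neighbours.
        # adj should include self-loops so every row has at least one entry.
        scores = self.query(h) @ self.key(h).t() / h.size(-1) ** 0.5
        scores = scores.masked_fill(adj == 0, float("-inf"))
        attn = torch.softmax(scores, dim=-1)
        return F.relu(h + attn @ self.value(h))  # residual aggregation


class GraphActorCritic(nn.Module):
    """Shared graph encoder followed by per-agent policy and value heads."""

    def __init__(self, obs_dim, act_dim, hidden=64):
        super().__init__()
        self.encoder = nn.Linear(obs_dim, hidden)
        self.comm = GraphCommLayer(hidden)
        self.pi = nn.Linear(hidden, act_dim)   # action logits per agent
        self.v = nn.Linear(hidden, 1)          # state value per agent

    def forward(self, obs, adj):
        h = F.relu(self.encoder(obs))
        h = self.comm(h, adj)                  # inter-agent communication
        dist = torch.distributions.Categorical(logits=self.pi(h))
        return dist, self.v(h).squeeze(-1)


def ppo_loss(model, obs, adj, actions, old_logp, advantages, returns, clip=0.2):
    """Standard clipped PPO surrogate computed on graph-encoded observations."""
    dist, value = model(obs, adj)
    logp = dist.log_prob(actions)
    ratio = torch.exp(logp - old_logp)
    policy_loss = -torch.min(ratio * advantages,
                             torch.clamp(ratio, 1 - clip, 1 + clip) * advantages).mean()
    value_loss = F.mse_loss(value, returns)
    return policy_loss + 0.5 * value_loss - 0.01 * dist.entropy().mean()


if __name__ == "__main__":
    n_uavs, obs_dim, act_dim = 4, 8, 5
    model = GraphActorCritic(obs_dim, act_dim)
    obs = torch.randn(n_uavs, obs_dim)
    adj = torch.ones(n_uavs, n_uavs)           # fully connected, incl. self-loops
    dist, value = model(obs, adj)
    print(dist.sample().shape, value.shape)    # torch.Size([4]) torch.Size([4])

The design point is that only the policy/value network changes: the graph layer lets each UAV condition its action on its neighbours' states, while the PPO update itself is left untouched.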
License
Copyright (c) 2023 Zhiling Jiang, Guanghua Song, Bowei Yang, Yining Chen, Ke Wang
This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.