Tampere University of Technology

TUTCRIS Research Portal

Reinforcement learning for improved UAV-based integrated access and backhaul operation

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › Scientific › peer-review


Original language: English
Title of host publication: 2020 IEEE International Conference on Communications Workshops, ICC Workshops 2020 - Proceedings
Number of pages: 7
ISBN (Electronic): 9781728174402
ISBN (Print): 978-1-7281-7441-9
Publication status: Published - 2020
Publication type: A4 Article in a conference publication
Event: IEEE International Conference on Communications Workshops - Dublin, Ireland
Duration: 7 Jun 2020 to 11 Jun 2020

Publication series

Name: IEEE/CIC international conference on communications in China - workshops
ISSN (Print): 2474-9133
ISSN (Electronic): 2474-9141


Conference: IEEE International Conference on Communications Workshops


There is strong interest in using commercial cellular networks to support unmanned aerial vehicles (UAVs), both for sending control commands and for carrying heavy traffic. Cellular networks are well suited to offering reliable and secure connections to UAVs and to supporting traffic management systems that enhance safe operation. However, the full-scale integration of UAVs that perform critical and high-risk tasks requires more advanced solutions for improving wireless connectivity in mobile networks. In this context, integrated access and backhaul (IAB) is an attractive approach for UAVs to enhance connectivity and traffic forwarding. In this paper, we study a novel approach to dynamic associations based on reinforcement learning at the edge of the network and compare it to alternative association algorithms. Our results indicate that the reinforcement learning methods improve the average achievable data rate. The optimal parameters of the introduced algorithm are highly sensitive to the densities of donor next generation Node Bs (DgNBs) and UAV IAB nodes, and need to be identified beforehand or estimated via a stateful search. Nevertheless, in dense deployments of DgNBs, its performance nearly converges to that of the ideal scheme with full knowledge of the data rates.