JCM 2025 Vol.20(4): 446-456
DOI: 10.12720/jcm.20.4.446-456

Deep Reinforcement Learning for Dynamic Routing, Modulation, and Spectrum Assignment in Elastic Optical Networks

Mohammed B. S. Jaber and Omar Y. Shaaban *
Department of Information and Communications Engineering, Al-Khwarizmi College of Engineering, University of Baghdad, Baghdad, Iraq
Email: mohammed.abd2203@kecbu.uobaghdad.edu.iq (M.B.S.J.); omar.yousif@kecbu.uobaghdad.edu.iq (O.Y.S.)
*Corresponding author

Manuscript received February 8, 2025; revised March 25, 2025; accepted May 7, 2025; published August 1, 2025.

Abstract—Routing, Modulation, and Spectrum Assignment (RMSA) is a pivotal problem in Elastic Optical Networks (EONs), where efficient resource allocation is essential for enhancing spectral efficiency and minimizing blocking probability. This paper presents an online Deep Reinforcement Learning (DRL)-RMSA framework that enables adaptive decision-making in dynamic and unpredictable network environments. Unlike traditional heuristics and optimization-based methods, which often rely on static policies, the proposed DRL-RMSA framework continuously learns and adjusts in real time by interacting with the network. Three DRL agents were used: Deep Q Network (DQN), Advantage Actor-Critic (A2C), and Proximal Policy Optimization (PPO). The agents learn to select the route, Spectrum Allocation (SA), and modulation format without requiring pre-collected training data. The approach dynamically adapts to varying traffic patterns and network conditions, ensuring robust and efficient performance. Simulation results demonstrate superior adaptability, a reduced Service Blocking Ratio (SBR), and improved resource utilization. Evaluations on the National Science Foundation Network (NSFNET) and Underwater Sensor Network Simulation Tool (USNET) show SBR reductions exceeding 51.12% compared to traditional methods. The proposed DRL-RMSA framework has promising applications in real-world optical networks, such as metro, data center, and backbone networks, where dynamic and unpredictable traffic demands require intelligent resource management.
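To make the reported metric concrete, the following minimal Python sketch computes a service blocking ratio and the relative reduction between two policies. It assumes the common definition of SBR as blocked requests divided by total requests, and it assumes the reported percentage is a relative reduction against a baseline; the abstract states neither explicitly. The function names and the request counts are hypothetical and are not results from the paper.

```python
def service_blocking_ratio(blocked: int, total: int) -> float:
    """Fraction of service requests that were blocked (assumed SBR definition)."""
    if total <= 0:
        raise ValueError("total must be positive")
    if not 0 <= blocked <= total:
        raise ValueError("blocked must lie between 0 and total")
    return blocked / total


def relative_reduction_percent(baseline_sbr: float, new_sbr: float) -> float:
    """Percentage by which new_sbr is lower than baseline_sbr."""
    if baseline_sbr <= 0:
        raise ValueError("baseline_sbr must be positive")
    return (baseline_sbr - new_sbr) / baseline_sbr * 100.0


# Hypothetical counts chosen only to illustrate the arithmetic.
baseline = service_blocking_ratio(blocked=200, total=1000)  # heuristic baseline
agent = service_blocking_ratio(blocked=100, total=1000)     # learned policy
reduction = relative_reduction_percent(baseline, agent)
print(f"baseline SBR={baseline:.3f}, agent SBR={agent:.3f}, reduction={reduction:.2f}%")
```

Under this reading, a reduction above 51.12% would mean the learned policy blocks fewer than half as many requests as the baseline at the same offered load.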

Keywords—elastic optical networks, routing, modulation and spectrum assignment, deep reinforcement learning, proactive defragmentation, service blocking ratio, Deep Q Network (DQN), Advantage Actor-Critic (A2C), Proximal Policy Optimization (PPO)

Cite: Mohammed B. S. Jaber and Omar Y. Shaaban, “Deep Reinforcement Learning for Dynamic Routing, Modulation, and Spectrum Assignment in Elastic Optical Networks,” Journal of Communications, vol. 20, no. 4, pp. 446-456, 2025.


Copyright © 2025 by the authors. This is an open access article distributed under the Creative Commons Attribution License which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited (CC BY 4.0).
