Articles | Open Access | https://doi.org/10.55640/ijiot-05-01-10

Automated Firewall Policy Generation with Reinforcement Learning

Ashutosh Chandra Jha, Network Security Engineer, New York, USA

Abstract

Firewalls are a core component of network security, controlling traffic flow through rule-based policies. Configuring and managing firewall rules manually, however, is error-prone: rule sets grow overly complex, human mistakes occur, and cyber threats continue to evolve. This work investigates a reinforcement learning (RL)-driven method for automated firewall policy generation, with the aim of increasing adaptability and reducing administrative overhead. The proposed system uses RL agents that learn an optimal policy from real-time network traffic and dynamically update firewall rules to maximize security while minimizing false positives and latency. The key contributions are a novel system architecture that integrates RL with existing firewall frameworks, together with methodologies for data collection, feature engineering, and reward function design. The system is evaluated in simulated network environments and on benchmark datasets. The results indicate that the RL-based system achieves higher threat-detection accuracy than traditional static or heuristic approaches, along with improved policy effectiveness and network performance. Limitations in computational cost, explainability, exploration risk, and model generalization are discussed, and future research directions in transfer learning, multi-agent coordination, and integration with broader security frameworks are outlined. This work moves the field closer to real-time, intelligent, and adaptive firewalls that can handle today's cybersecurity challenges, and it motivates further exploration of more secure, interpretable, and production-ready RL-driven security solutions.
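
To make the idea of a reward function that trades security against false positives concrete, the following is a minimal illustrative sketch, not the architecture evaluated in this work. It assumes a toy traffic model in which each flow is described only by a destination port and a rate bucket, and it uses a one-step value update with discount 0 (effectively a contextual bandit) instead of a full RL pipeline. The reward weights, the malicious-traffic probabilities, and the names `REWARDS`, `MALICIOUS_PROB`, `train`, and `derive_rules` are all hypothetical.

```python
import random

ACTIONS = ("allow", "block")

# Hypothetical reward weights (not taken from the paper): letting an attack
# through is penalized more heavily than blocking benign traffic.
REWARDS = {
    ("allow", False): 1.0,   # benign flow allowed
    ("block", True): 1.0,    # malicious flow blocked
    ("block", False): -1.0,  # benign flow blocked (false positive)
    ("allow", True): -5.0,   # malicious flow allowed (missed threat)
}

# Toy traffic model: state = (destination port, rate bucket), mapped to an
# assumed probability that a flow in that state is malicious.
MALICIOUS_PROB = {
    (443, "low"): 0.01,
    (443, "high"): 0.50,
    (23, "low"): 0.60,
    (23, "high"): 0.95,
}


def train(episodes=20000, epsilon=0.1, seed=0):
    """Estimate action values per state with epsilon-greedy exploration."""
    rng = random.Random(seed)
    states = list(MALICIOUS_PROB)
    q = {(s, a): 0.0 for s in states for a in ACTIONS}
    visits = {(s, a): 0 for s in states for a in ACTIONS}
    for _ in range(episodes):
        state = rng.choice(states)
        if rng.random() < epsilon:
            action = rng.choice(ACTIONS)  # explore
        else:
            action = max(ACTIONS, key=lambda a: q[(state, a)])  # exploit
        malicious = rng.random() < MALICIOUS_PROB[state]
        reward = REWARDS[(action, malicious)]
        # Flows are treated as independent, so there is no next-state term;
        # the update is an incremental sample average of observed rewards.
        visits[(state, action)] += 1
        step = 1.0 / visits[(state, action)]
        q[(state, action)] += step * (reward - q[(state, action)])
    return q


def derive_rules(q):
    """Turn learned action values into one allow/block rule per state."""
    return {s: max(ACTIONS, key=lambda a: q[(s, a)]) for s in MALICIOUS_PROB}


if __name__ == "__main__":
    for state, action in derive_rules(train()).items():
        print(state, "->", action)
```

Under these assumed weights, the expected reward of allowing a flow with malicious probability p is 1 - 6p and that of blocking it is 2p - 1, so allowing is preferable only when p < 0.25. The learned rules should therefore allow only the low-rate port 443 traffic in this toy setting. Raising the missed-threat penalty lowers that threshold, which trades additional false positives for fewer missed threats.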

Keywords

Firewall automation, Reinforcement learning, Network security, Adaptive policies, Cyber threat detection

How to Cite

Automated Firewall Policy Generation with Reinforcement Learning. (2025). International Journal of IoT, 5(01), 190-211. https://doi.org/10.55640/ijiot-05-01-10