Multi-agent cooperative control of a three-tank liquid system using reinforcement learning algorithms

Lincoln Nobert Munhenzwa (1), Adlen Kerboua (2), Godfrey Murairidzi Gotora (3)
(1) Agriculture and Biosystems Engineering, University of Zimbabwe, Harare, Zimbabwe
(2) Department of Petrochemistry, LGMM Laboratory, University of Skikda, Skikda, Algeria
(3) Electrical and Electronics Engineering, University of Zimbabwe, Harare, Zimbabwe

Abstract

This paper presents a multi-agent cooperative reinforcement learning approach for controlling the highly nonlinear dynamics of a three-tank liquid system. The system's complexity arises from the interdependence of the tanks, where precise control is required to maintain stable fluid levels. We first deploy two twin-delayed deep deterministic policy gradient (TD3) agents to manage the inflow valves, followed by two proximal policy optimization (PPO) agents tasked with controlling the valves. The main goal is to compare the performance of these two types of agents against traditional proportional-integral-derivative (PID) controllers. To promote effective collaboration between agents, a cooperative reward structure is implemented, encouraging agents to work together to maintain balanced fluid levels in all three tanks. The reward function penalizes deviations from target levels, accounting for both local performance and system-wide stability. The proposed method also addresses key challenges in multi-agent systems, such as non-stationarity and coordination in decentralized control, by integrating a centralized critic during training with decentralized execution. Experimental results show that the TD3 agents outperform the PID controllers in terms of settling time, rise time, and robustness, demonstrating their ability to handle the nonlinear nature of the system with minimal tuning.
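
The cooperative reward described in the abstract can be illustrated with a minimal sketch. The abstract does not give the actual reward formula, so everything specific here is an assumption made for illustration: the function name cooperative_rewards, the use of absolute error, the local_weight blending factor, and the mapping of the two agents to tanks 1 and 3 (indices 0 and 2). The sketch only shows the general idea of penalizing each agent for both its own tank's deviation and the average deviation across all three tanks.

```python
import numpy as np


def cooperative_rewards(levels, targets, agent_tanks=(0, 2), local_weight=0.5):
    """Return one reward per agent, blending local and system-wide error.

    Hypothetical illustration, not the paper's reward function.
    levels, targets: tank levels and setpoints, one entry per tank.
    agent_tanks: index of the tank each agent is mainly responsible for
                 (assumed mapping: agent 0 -> tank 1, agent 1 -> tank 3).
    local_weight: share of the penalty taken from the agent's own tank;
                  the remainder comes from the mean error over all tanks.
    """
    levels = np.asarray(levels, dtype=float)
    targets = np.asarray(targets, dtype=float)

    # Absolute deviation of every tank from its target level.
    abs_error = np.abs(levels - targets)

    # System-wide term shared by all agents (this is what makes it cooperative).
    global_penalty = abs_error.mean()

    rewards = []
    for tank in agent_tanks:
        local_penalty = abs_error[tank]
        penalty = local_weight * local_penalty + (1.0 - local_weight) * global_penalty
        rewards.append(-penalty)
    return np.array(rewards)


if __name__ == "__main__":
    # Tank 1 is above its setpoint; tanks 2 and 3 are on target.
    print(cooperative_rewards([0.5, 0.3, 0.2], [0.4, 0.3, 0.2]))
```

With this weighting, the agent whose own tank is off target receives the larger penalty, but the other agent is penalized as well through the shared term, which gives both agents an incentive to restore balance.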


How to Cite

Munhenzwa, L. N., Kerboua, A., & Gotora, G. M. (2026). Multi-agent cooperative control of a three-tank liquid system using reinforcement learning algorithms. International Journal of Applied Resilience and Sustainability, 2(2), 1130-1145. https://doi.org/10.70593/deepsci.0202046
