Combined use of dynamic inversion and reinforcement learning for optimal adaptive control of supersonic transport airplane motion

Abstract

We consider the problem of aircraft motion control under uncertainties caused by incomplete and inaccurate knowledge of the aircraft characteristics, as well as by abnormal in-flight situations that affect the properties of the aircraft as a controlled object. One effective tool for solving problems of this kind, which allows aircraft control algorithms to be adjusted to the changed dynamics, is reinforcement learning in its Approximate Dynamic Programming (ADP) variant, combined with artificial neural networks. Over the last decade, a family of methods known as Adaptive Critic Design (ACD) has been actively developed within the ADP approach for controlling the behavior of complex dynamical systems. The paper discusses the application of one variant of the ACD approach, the Single Network Adaptive Critic (SNAC), and its further development through combined use with the dynamic inversion method. This approach makes it possible to form an optimal adaptive control law for aircraft motion. Its effectiveness is demonstrated using the example of longitudinal motion control for a supersonic transport (SST) airplane.
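
As a reading aid for the SNAC idea mentioned in the abstract, the sketch below shows, under simplifying assumptions, how a single critic that maps the current state x_k to the costate λ_{k+1} can be fitted to the discrete costate equation, with the control then recovered from the stationarity condition of the Hamiltonian. This is a minimal illustrative sketch, not the authors' implementation: the discrete-time short-period model (matrices A, B), the weights Q and R, the linear-in-parameters critic, and all numerical values are assumptions chosen for compactness, and the combination with dynamic inversion discussed in the paper is not reproduced here.

```python
# Minimal SNAC sketch: a critic mapping x_k to the costate lambda_{k+1} is
# refitted to the discrete costate equation, and the control is recovered
# from the costate. Illustrative only: the model (A, B), the weights Q, R
# and the linear-in-parameters critic are assumptions, not taken from the paper.
import numpy as np

rng = np.random.default_rng(0)

# Assumed discrete-time short-period approximation, x = [alpha, q], u = elevator.
A = np.array([[0.90, 0.08],
              [-0.10, 0.85]])
B = np.array([[0.01],
              [0.10]])
Q = np.diag([1.0, 0.5])           # state weighting in the quadratic cost
R = np.array([[1.0]])             # control weighting
Rinv = np.linalg.inv(R)

W = np.zeros((2, 2))              # critic parameters: lambda_{k+1} = W @ x_k

for it in range(500):
    X = rng.uniform(-1.0, 1.0, size=(2, 256))   # random training states x_k
    Lam1 = W @ X                                # critic output, lambda_{k+1}
    U = -Rinv @ B.T @ Lam1                      # optimal control from the costate
    X1 = A @ X + B @ U                          # propagate to x_{k+1}
    Lam2 = W @ X1                               # critic evaluated at x_{k+1}
    Target = Q @ X1 + A.T @ Lam2                # costate equation: target for lambda_{k+1}
    W_new = Target @ np.linalg.pinv(X)          # least-squares refit of the critic
    if np.linalg.norm(W_new - W) < 1e-10:
        break
    W = W_new

K = Rinv @ B.T @ W                # implied state feedback, u_k = -K x_k
print("iterations:", it, "  SNAC gain K =", K)

# Closed-loop check: regulate an initial angle-of-attack disturbance to zero.
x = np.array([[0.2], [0.0]])
for k in range(50):
    x = A @ x + B @ (-K @ x)
print("final state:", x.ravel())
```

For a linear model with quadratic cost, this fixed-point refit of a linear critic plays the role that neural network training plays in the general nonlinear SNAC scheme described in the paper.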

About the authors

G. Dhiman

Moscow Aviation Institute (National Research University)

Author for correspondence.
Email: gd9617@mail.ru
Russian Federation, Moscow

Yu. V. Tiumentsev

Moscow Aviation Institute (National Research University)

Email: yutium@gmail.com
Russian Federation, Moscow

R. A. Tskhay

Moscow Aviation Institute (National Research University)

Email: romantskhai106@yandex.ru
Russian Federation, Moscow

Supplementary files

Fig. 1. Generalized scheme of reinforcement learning.
Fig. 2. General structure of the ACD algorithm for adaptive control of a dynamical system.
Fig. 3. Training scheme of the critic neural network in the SNAC approach to motion control.
Fig. 4. Scheme of the joint operation of SNAC and DI (ϑ_зад is the commanded pitch angle value).
Fig. 5. Tracking a 5° pitch angle setpoint when SNAC and DI are used together.
Fig. 6. Tracking a multistep pitch angle reference signal when SNAC and DI are used together.
Fig. 7. Stabilization of the trim (balancing) angle of attack when SNAC and DI are used together (α_bal denotes the trim value of the angle of attack).
Fig. 8. Comparison of different variants of the DI + SNAC scheme when simulating a system failure (v is the auxiliary input signal given by relation (2.5)).

Copyright (c) 2025 Russian Academy of Sciences