[1] F. Y. Hadaegh, S. J. Chung, and H. M. Manohara, “On development of 100-gram-class spacecraft for swarm applications,” IEEE Syst. J., vol. 10, no. 2, pp. 673–684, 2016, doi: 10.1109/JSYST.2014.2327972.
[2] D. Wang, B. Wu, and E. K. Poh, Satellite Formation Flying: Relative Dynamics, Formation Design, Fuel Optimal Maneuvers and Formation Maintenance, vol. 87. Singapore: Springer Singapore, 2017.
[3] K. T. Alfriend, S. R. Vadali, P. Gurfil, J. P. How, and L. S. Breger, Spacecraft Formation Flying: Dynamics, Control and Navigation. Elsevier, 2009.
[4] H. Cho, “Energy-optimal reconfiguration of satellite formation flying in the presence of uncertainties,” Adv. Space Res., vol. 67, no. 5, pp. 1454–1467, Mar. 2021, doi: 10.1016/j.asr.2020.11.036.
[5] K. Dharmarajan and G. B. Palmerini, “Optimal Reconfiguration Manoeuvres in Formation Flying Missions,” in 2021 IEEE Aerospace Conference (50100), Mar. 2021, pp. 1–9, doi: 10.1109/AERO50100.2021.9438285.
[6] X. Bai, Y. He, and M. Xu, “Low-Thrust Reconfiguration Strategy and Optimization for Formation Flying Using Jordan Normal Form,” IEEE Trans. Aerosp. Electron. Syst., vol. 57, no. 5, pp. 3279–3295, Oct. 2021, doi: 10.1109/TAES.2021.3074204.
[7] G. Di Mauro, D. Spiller, S. F. Rafano Carnà, and R. Bevilacqua, “Minimum-fuel control strategy for spacecraft formation reconfiguration via finite-time maneuvers,” J. Guid. Control. Dyn., vol. 42, no. 4, pp. 752–768, 2019, doi: 10.2514/1.G003822.
[8] G. Di Mauro, D. Spiller, R. Bevilacqua, and S. D’Amico, “Spacecraft formation flying reconfiguration with extended and impulsive maneuvers,” J. Franklin Inst., vol. 356, no. 6, pp. 3474–3507, Apr. 2019, doi: 10.1016/j.jfranklin.2019.02.012.
[9] H. M. Pari and H. Bolandi, “Discrete time multiple spacecraft formation flying attitude optimal control in presence of relative state constraints,” Chinese J. Aeronaut., vol. 34, no. 4, pp. 293–305, 2021.
[10] D. Wang, B. Wu, and E. K. Poh, Satellite Formation Flying, vol. 87. Singapore: Springer Singapore, 2017.
[11] H. Rouzegar, A. Khosravi, and P. Sarhadi, “Spacecraft formation flying control under orbital perturbations by state-dependent Riccati equation method in the presence of on–off actuators,” Proc. Inst. Mech. Eng. Part G J. Aerosp. Eng., vol. 233, no. 8, pp. 2853–2867, Jun. 2019, doi: 10.1177/0954410018787417.
[12] X. Liu, P. Lu, and B. Pan, “Survey of convex optimization for aerospace applications,” Astrodynamics, vol. 1, no. 1, pp. 23–40, Sep. 2017, doi: 10.1007/s42064-017-0003-8.
[13] D. Parente, D. Spiller, and F. Curti, “Time-suboptimal satellite formation maneuvers using inverse dynamics and differential evolution,” J. Guid. Control. Dyn., vol. 41, no. 5, pp. 1108–1121, 2018, doi: 10.2514/1.G003110.
[14] W. Wang, G. Mengali, A. A. Quarta, and J. Yuan, “Distributed adaptive synchronization for multiple spacecraft formation flying around Lagrange point orbits,” Aerosp. Sci. Technol., vol. 74, pp. 93–103, 2018, doi: 10.1016/j.ast.2018.01.007.
[15] B. Shasti, A. Alasty, and N. Assadian, “Robust distributed control of spacecraft formation flying with adaptive network topology,” Acta Astronaut., vol. 136, pp. 281–296, Jul. 2017, doi: 10.1016/j.actaastro.2017.03.001.
[16] H. Liu, Y. Tian, F. L. Lewis, Y. Wan, and K. P. Valavanis, “Robust formation flying control for a team of satellites subject to nonlinearities and uncertainties,” Aerosp. Sci. Technol., vol. 95, p. 105455, Dec. 2019, doi: 10.1016/j.ast.2019.105455.
[17] Y. Guo, J. Zhou, and Y. Liu, “Distributed RISE control for spacecraft formation reconfiguration with collision avoidance,” J. Franklin Inst., vol. 356, no. 10, pp. 5332–5352, Jul. 2019, doi: 10.1016/j.jfranklin.2019.05.003.
[18] D. Lee, “Nonlinear disturbance observer-based robust control for spacecraft formation flying,” Aerosp. Sci. Technol., vol. 76, pp. 82–90, May 2018, doi: 10.1016/j.ast.2018.01.027.
[19] G. Gaias and S. D’Amico, “Impulsive Maneuvers for Formation Reconfiguration Using Relative Orbital Elements,” J. Guid. Control. Dyn., vol. 38, no. 6, pp. 1036–1049, Jun. 2015, doi: 10.2514/1.G000189.
[20] M. Chernick and S. D’Amico, “New Closed-Form Solutions for Optimal Impulsive Control of Spacecraft Relative Motion,” J. Guid. Control. Dyn., vol. 41, no. 2, pp. 301–319, Feb. 2018, doi: 10.2514/1.G002848.
[21] B. L. Wu, D. W. Wang, and E. K. Poh, “Energy-optimal low-thrust satellite formation manoeuvre in presence of J2 perturbation,” Proc. Inst. Mech. Eng. Part G J. Aerosp. Eng., vol. 225, no. 9, pp. 961–968, 2011, doi: 10.1177/0954410011408659.
[22] A. D. Ogundele, “Approximate analytic solution of nonlinear Riccati spacecraft formation flying dynamics in terms of orbit element differences,” Aerosp. Sci. Technol., vol. 113, p. 106686, 2021, doi: 10.1016/j.ast.2021.106686.
[23] C. J. C. H. Watkins and P. Dayan, “Q-learning,” Mach. Learn., vol. 8, no. 3, pp. 279–292, 1992.
[24] R. S. Sutton and A. G. Barto, Reinforcement learning: An introduction. MIT press, 2018.
[25] L. Buşoniu, T. de Bruin, D. Tolić, J. Kober, and I. Palunko, “Reinforcement learning for control: Performance, stability, and deep approximators,” Annu. Rev. Control, vol. 46, pp. 8–28, 2018, doi: 10.1016/j.arcontrol.2018.09.005.
[26] S. G. Khan, G. Herrmann, F. L. Lewis, T. Pipe, and C. Melhuish, “Reinforcement learning and optimal adaptive control: An overview and implementation examples,” Annu. Rev. Control, vol. 36, no. 1, pp. 42–59, 2012, doi: 10.1016/j.arcontrol.2012.03.004.
[27] D. Bertsekas, Dynamic Programming and Optimal Control, Vol. I. Athena Scientific, 2012.
[28] D. Bertsekas, Reinforcement learning and optimal control. Athena Scientific, 2019.
[29] Y. Yang, Y. Wan, J. Zhu, and F. L. Lewis, “H∞ Tracking Control for Linear Discrete-Time Systems: Model-Free Q-Learning Designs,” IEEE Control Syst. Lett., vol. 5, no. 1, pp. 175–180, Jan. 2021, doi: 10.1109/LCSYS.2020.3001241.
[30] C. Chen, H. Modares, K. Xie, F. L. Lewis, Y. Wan, and S. Xie, “Reinforcement Learning-Based Adaptive Optimal Exponential Tracking Control of Linear Systems with Unknown Dynamics,” IEEE Trans. Automat. Contr., vol. 64, no. 11, pp. 4423–4438, Nov. 2019, doi: 10.1109/TAC.2019.2905215.
[31] N. Li, I. Kolmanovsky, and A. Girard, “LQ control of unknown discrete-time linear systems—A novel approach and a comparison study,” Optim. Control Appl. Methods, vol. 40, no. 2, pp. 265–291, 2019, doi: 10.1002/oca.2477.
[32] S. A. A. Rizvi and Z. Lin, “Output Feedback Q-Learning Control for the Discrete-Time Linear Quadratic Regulator Problem,” IEEE Trans. Neural Networks Learn. Syst., vol. 30, no. 5, pp. 1523–1536, 2019, doi: 10.1109/TNNLS.2018.2870075.
[33] S. A. A. Rizvi and Z. Lin, “Model-Free Global Stabilization of Discrete-Time Linear Systems with Saturating Actuators Using Reinforcement Learning,” in Proc. IEEE Conf. Decision and Control (CDC), 2018, pp. 5276–5281, doi: 10.1109/CDC.2018.8618941.
[34] S. A. A. Rizvi and Z. Lin, “Output Feedback Optimal Tracking Control Using Reinforcement Q-Learning,” in Proc. Amer. Control Conf. (ACC), 2018, pp. 3423–3428, doi: 10.23919/ACC.2018.8430997.
[35] B. Kiumarsi, K. G. Vamvoudakis, H. Modares, and F. L. Lewis, “Optimal and Autonomous Control Using Reinforcement Learning: A Survey,” IEEE Trans. Neural Networks Learn. Syst., vol. 29, no. 6, pp. 2042–2062, 2018, doi: 10.1109/TNNLS.2017.2773458.
[36] X. Li, L. Xue, and C. Sun, “Linear quadratic tracking control of unknown discrete-time systems using value iteration algorithm,” Neurocomputing, vol. 314, pp. 86–93, 2018, doi: 10.1016/j.neucom.2018.05.111.
[37] M. Zheng, Y. Wu, and C. Li, “Reinforcement learning strategy for spacecraft attitude hyperagile tracking control with uncertainties,” Aerosp. Sci. Technol., vol. 119, p. 107126, Dec. 2021, doi: 10.1016/j.ast.2021.107126.
[38] X. Wang, P. Shi, C. Wen, and Y. Zhao, “Design of Parameter-self-tuning Controller Based on Reinforcement Learning for Tracking Non-cooperative Targets in Space,” IEEE Trans. Aerosp. Electron. Syst., early access, 2020, doi: 10.1109/TAES.2020.2988170.
[39] J. Broida and R. Linares, “Spacecraft rendezvous guidance in cluttered environments via reinforcement learning,” Adv. Astronaut. Sci., vol. 168, pp. 1777–1788, 2019.
[40] F. Sun and K. Turkoglu, “Reinforcement learning based continuous-time on-line spacecraft dynamics control: Case study of NASA SPHERES spacecraft,” in AIAA Guidance, Navigation, and Control Conference, Jan. 2018, pp. 1–11, doi: 10.2514/6.2018-0859.
[41] S. Silvestrini and M. R. Lavagna, “Spacecraft Formation Relative Trajectories Identification for Collision-Free Maneuvers using Neural-Reconstructed Dynamics,” in AIAA Scitech 2020 Forum, Jan. 2020, pp. 1–14, doi: 10.2514/6.2020-1918.
[42] M. Shirobokov, S. Trofimov, and M. Ovchinnikov, “Survey of machine learning techniques in spacecraft control design,” Acta Astronaut., vol. 186, pp. 87–97, Sep. 2021, doi: 10.1016/j.actaastro.2021.05.018.
[43] A. Scorsoglio, A. D’Ambrosio, L. Ghilardi, B. Gaudet, F. Curti, and R. Furfaro, “Image-Based Deep Reinforcement Meta-Learning for Autonomous Lunar Landing,” J. Spacecr. Rockets, pp. 1–13, 2021.
[44] F. L. Lewis, D. Vrabie, and V. L. Syrmos, Optimal control. John Wiley & Sons, 2012.
[45] J. Sullivan, S. Grimberg, and S. D’Amico, “Comprehensive survey and assessment of spacecraft relative motion dynamics models,” J. Guid. Control. Dyn., vol. 40, no. 8, pp. 1837–1859, 2017, doi: 10.2514/1.G002309.
[46] S. A. Schweighart and R. J. Sedwick, “High-Fidelity Linearized J2 Model for Satellite Formation Flight,” J. Guid. Control. Dyn., vol. 25, no. 6, pp. 1073–1080, Nov. 2002, doi: 10.2514/2.4986.
[47] F. L. Lewis, D. Vrabie, and K. G. Vamvoudakis, “Reinforcement learning and feedback control: Using natural decision methods to design optimal adaptive controllers,” IEEE Control Syst. Mag., vol. 32, no. 6, pp. 76–105, 2012.
[48] D. Vrabie, K. G. Vamvoudakis, and F. L. Lewis, Optimal Adaptive Control and Differential Games by Reinforcement Learning Principles. London, U.K.: IET, 2012.
[49] P. Werbos, “Approximate dynamic programming for real-time control and neural modeling,” in Handbook of Intelligent Control: Neural, Fuzzy, and Adaptive Approaches, 1992, pp. 493–525.
[50] A. Al-Tamimi, F. L. Lewis, and M. Abu-Khalaf, “Model-free Q-learning designs for linear discrete-time zero-sum games with application to H-infinity control,” Automatica, vol. 43, no. 3, pp. 473–481, 2007, doi: 10.1016/j.automatica.2006.09.019.