Analysis of Classical and Advanced Control Techniques Tuned with Reinforcement Learning
Control Theory. Reinforcement Learning. Offline Loop. Differential Games. Riccati Equation.
Control theory is used to stabilize systems and to obtain a specific response for each type of process. Classical controllers, such as the PID and IMC used in this research, are widespread in industry because their topologies are well studied in the literature and they are easily implemented on microcontrollers. Advanced controllers, such as the GMV, GPC and LQR also used in this work, meet some resistance in common applications in base industries but are widely used in energy, aerospace and robotic systems, since their complexity and structure provide robustness and achieve satisfactory performance for processes that are difficult to control. In this work, these methods are studied and evaluated with a tuning approach based on reinforcement learning. Two tuning methods are applied to the controllers: the offline loop method and the differential games method. The first uses offline iterations in which the agent is the control technique itself and the environment is a set of performance and robustness indices that measure how the process is evolving; from these, an adjustment policy is built for the controller, which rewards the weighting factor until the stopping criterion (the desired response) is reached. The second method uses reinforcement strategies that reward the controller as the response changes, so that the LQR learns the ideal control policies and adapts to changes in the environment; this allows better performance by recalculating the traditional gains obtained with the Riccati equation for tuning the regulator. In this method, differential games serve as a framework to model and analyze dynamic systems with multiple agents. To validate the proposal, the Tachogenerator Motor (TGM) and the AR.Drone are used. The TGM is modeled using least squares estimation with an ARX-SISO topology, in order to evaluate the first tuning method. The AR.Drone is modeled using the Kalman estimator with an ARX-MIMO structure to evaluate the second tuning method.
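As an illustration of the least squares identification step mentioned for the SISO plant, the following is a minimal sketch, not the model used in this work. It assumes a hypothetical first-order, noise-free ARX structure y[k] = a*y[k-1] + b*u[k-1]; the coefficient values, the square-wave input, and the function names simulate_arx1 and fit_arx1 are illustrative assumptions only.

```python
def simulate_arx1(a, b, u, y0=0.0):
    """Simulate the assumed model y[k] = a*y[k-1] + b*u[k-1] without noise."""
    y = [y0]
    for k in range(1, len(u)):
        y.append(a * y[k - 1] + b * u[k - 1])
    return y


def fit_arx1(u, y):
    """Least squares estimate of (a, b) via the 2x2 normal equations."""
    s_yy = s_yu = s_uu = r_y = r_u = 0.0
    for k in range(1, len(y)):
        py, pu = y[k - 1], u[k - 1]   # regressors: past output, past input
        s_yy += py * py
        s_yu += py * pu
        s_uu += pu * pu
        r_y += py * y[k]
        r_u += pu * y[k]
    det = s_yy * s_uu - s_yu * s_yu
    if abs(det) < 1e-12:
        raise ValueError("regressors are (nearly) collinear; input is not exciting enough")
    # Cramer's rule on [[s_yy, s_yu], [s_yu, s_uu]] @ [a, b] = [r_y, r_u]
    a_hat = (r_y * s_uu - r_u * s_yu) / det
    b_hat = (r_u * s_yy - r_y * s_yu) / det
    return a_hat, b_hat


# Hypothetical experiment: square-wave excitation of an assumed plant.
u = [1.0 if (k // 5) % 2 == 0 else -1.0 for k in range(100)]
y = simulate_arx1(0.8, 0.5, u)
a_hat, b_hat = fit_arx1(u, y)
print(a_hat, b_hat)
```

Because the simulated data contain no noise, the estimate recovers the assumed coefficients up to floating-point error; with measured data the same normal equations give the best fit in the squared-error sense, and higher-order ARX structures simply extend the regressor vector.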