If you scroll down to the "Validate Trained Agent" section, you will observe that the RL agent returns a set of fixed values for the proportional and integral gains.
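For context, here is a minimal sketch (in Python rather than the MATLAB of the referenced example) of what "fixed gains" means in practice: the trained agent hands back a single pair of numbers, Kp and Ki, and the controller uses them unchanged at every time step. The tank model, its parameters, and the gain values below are illustrative assumptions, not the values from the actual example.

```python
import math

# Illustrative first-order tank model (assumed, not the MATLAB example's parameters):
# dh/dt = (b*u - a*sqrt(h)) / A, where u is the pump/valve command and h the level.
A, a, b = 2.0, 0.5, 1.0

def tank_step(h, u, dt):
    """Advance the water level h by one time step under control input u."""
    dhdt = (b * u - a * math.sqrt(max(h, 0.0))) / A
    return max(h + dt * dhdt, 0.0)

# Fixed gains, as an agent in this setup would return them (values are made up).
Kp, Ki = 4.0, 1.5

def simulate(setpoint=10.0, h0=1.0, dt=0.1, t_end=50.0):
    h, integral = h0, 0.0
    for _ in range(int(t_end / dt)):
        error = setpoint - h
        integral += error * dt
        u = Kp * error + Ki * integral      # PI law with constant gains
        u = min(max(u, 0.0), 20.0)          # actuator saturation
        h = tank_step(h, u, dt)
    return h

print(f"Water level after 50 s: {simulate():.2f}")
```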
Comparison to Fuzzy PID Controllers:
In the design of a Fuzzy PID controller, the control gains can change in real time, depending on the architecture of the controller. For example, human designers can encode their tuning knowledge as fuzzy rules that adjust the PID parameters during operation. A common architecture keeps fixed base gains and adds corrections computed by the fuzzy inference system:

$$K_p = K_p^0 + \Delta K_p, \qquad K_i = K_i^0 + \Delta K_i, \qquad K_d = K_d^0 + \Delta K_d,$$

where $K_p^0$, $K_i^0$, and $K_d^0$ are fixed values, and $\Delta K_p$, $\Delta K_i$, and $\Delta K_d$ are produced in real time by the fuzzy rules from the tracking error and its rate of change.
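The sketch below illustrates this gain-scheduling idea: fixed base gains plus corrections that a small rule table derives from the current error and its rate of change. The rule table is a toy stand-in for a real fuzzy inference system (triangular membership functions and a handful of Mamdani-style rules); all numerical ranges and rule outputs are assumptions chosen only to make the example run.

```python
def tri(x, a, b, c):
    """Triangular membership function with peak at b and support [a, c]."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x < b else (c - x) / (c - b)

def fuzzy_gain_corrections(error, d_error):
    """Toy Mamdani-style inference: fuzzify e and de, fire rules,
    defuzzify (weighted average) into gain increments (dKp, dKi)."""
    # Memberships for error: negative / zero / positive (assumed ranges).
    e_neg, e_zero, e_pos = tri(error, -10, -5, 0), tri(error, -5, 0, 5), tri(error, 0, 5, 10)
    de_neg, de_pos = tri(d_error, -2, -1, 0), tri(d_error, 0, 1, 2)

    # Illustrative rule base: far from the setpoint -> raise Kp; near it -> soften Kp, nudge Ki.
    rules = [
        (min(e_pos, de_neg), (+0.5, +0.1)),   # below setpoint, level rising
        (min(e_neg, de_pos), (+0.5, +0.1)),   # above setpoint, level falling
        (e_zero,             (-0.2, +0.05)),  # near setpoint
    ]
    total = sum(w for w, _ in rules)
    if total == 0.0:
        return 0.0, 0.0
    d_kp = sum(w * out[0] for w, out in rules) / total
    d_ki = sum(w * out[1] for w, out in rules) / total
    return d_kp, d_ki

# Fixed base gains (the "fixed values" above); the applied gains vary in real time.
KP0, KI0 = 4.0, 1.5
d_kp, d_ki = fuzzy_gain_corrections(error=3.0, d_error=-0.5)
print(f"Kp = {KP0 + d_kp:.2f}, Ki = {KI0 + d_ki:.2f}")
```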
Comparison to Online-Tuning Algorithms:
Online-tuning algorithms typically adjust the parameters of a controller (such as its gains), which in turn determine the control action under dynamic operating conditions. The gains often change continuously or at preset intervals during operation. The algorithm observes the current error from the setpoint in real time and decides whether to update a parameter (for example, by increasing or decreasing a gain) to improve future performance. The controller then uses the updated values to compute the control action. Some optimization algorithms, however, adjust the control signals directly (such as thrust magnitude and angle in interplanetary transfer missions) when an explicit control law is unavailable or overly complex.
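As a sketch of that online idea (again in Python, reusing the same assumed tank model with made-up numbers): the controller runs with its current gains, a supervisory loop watches the recent tracking error, and at preset intervals it nudges a gain up or down, so the gains evolve while the plant is operating. The adjustment rule here is a deliberately crude hill-climbing heuristic, not any specific published algorithm.

```python
import math

def tank_step(h, u, dt, A=2.0, a=0.5, b=1.0):
    """Illustrative first-order tank: dh/dt = (b*u - a*sqrt(h)) / A."""
    return max(h + dt * (b * u - a * math.sqrt(max(h, 0.0))) / A, 0.0)

def run_with_online_tuning(setpoint=10.0, dt=0.1, t_end=200.0, window=100):
    kp, ki = 1.0, 0.2                      # deliberately poor starting gains
    h, integral = 1.0, 0.0
    abs_err_window, prev_cost, kp_step = 0.0, None, 0.5
    for k in range(int(t_end / dt)):
        error = setpoint - h
        integral += error * dt
        u = min(max(kp * error + ki * integral, 0.0), 20.0)
        h = tank_step(h, u, dt)
        abs_err_window += abs(error) * dt

        # At preset intervals, compare recent performance with the previous
        # interval and nudge Kp in the direction that reduced the error.
        if (k + 1) % window == 0:
            if prev_cost is not None and abs_err_window > prev_cost:
                kp_step = -kp_step         # last change made things worse: reverse it
            kp = max(kp + kp_step, 0.1)
            prev_cost, abs_err_window = abs_err_window, 0.0
    return kp, ki, h

kp, ki, h = run_with_online_tuning()
print(f"Gains after online adjustment: Kp={kp:.2f}, Ki={ki:.2f}, level={h:.2f}")
```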
In the example where the PI controller for the water tank is tuned by an RL agent, an offline optimization approach is employed because the system operates under static conditions (the size of the water tank does not change over time, and the water level setpoint is typically fixed). The offline algorithm conducts a test (such as a step response) to determine the "best" set of gains in a simulated environment. Once identified, these gains are fixed and used for standard operation until a human operator or a new trigger event initiates another tuning session.
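The offline counterpart can be sketched the same way: run a step-response simulation for each candidate gain pair, score it, and keep the best pair as the fixed gains used from then on. The grid search below is only a stand-in for the RL training loop of the actual example, and the model, gain ranges, and cost function (integral of absolute error) are assumptions.

```python
import math
from itertools import product

def step_response_cost(kp, ki, setpoint=10.0, dt=0.1, t_end=60.0):
    """Simulate a step response on the illustrative tank model and return the
    integral of absolute error (IAE); lower is better."""
    h, integral, iae = 1.0, 0.0, 0.0
    for _ in range(int(t_end / dt)):
        error = setpoint - h
        integral += error * dt
        u = min(max(kp * error + ki * integral, 0.0), 20.0)
        h = max(h + dt * (1.0 * u - 0.5 * math.sqrt(max(h, 0.0))) / 2.0, 0.0)
        iae += abs(error) * dt
    return iae

# Offline search over candidate gains; in the referenced example an RL agent
# plays this role, but the outcome is the same kind of object: one fixed pair
# of gains that is then used unchanged during normal operation.
candidates = product([0.5, 1.0, 2.0, 4.0, 8.0], [0.1, 0.5, 1.0, 2.0])
best_kp, best_ki = min(candidates, key=lambda g: step_response_cost(*g))
print(f"Fixed gains chosen offline: Kp={best_kp}, Ki={best_ki}")
```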