RL Agent External action not properly used in SAC
I am using the external action input of a Simulink RL Agent block at the beginning of training to guide the agent.
When using PPO, this was enough to let the agent also learn from those forced external actions.
When using SAC with the same setup, the agent seems to learn only to output 0. I eventually found that also connecting the last_action input fixed this. With PPO, this seems to be handled internally.
This workaround is sufficient for me, so there is no need for an immediate solution. I just thought I would report this unexpected behavior. The documentation says that the external action is used for learning, so I think the way it works with PPO is the desired outcome.
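
To illustrate why the recorded action could matter for an off-policy agent such as SAC, here is a minimal Python sketch. It is a toy model, not Reinforcement Learning Toolbox code: the names `Transition`, `collect`, `record_applied`, and `best_action_estimate` are hypothetical, the reward function is invented, and whether this mechanism is the actual cause of the behavior described above is only an assumption.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Transition:
    observation: float
    action: float  # the action recorded for learning
    reward: float


def collect(policy_action: float, external_action: Optional[float],
            record_applied: bool) -> Transition:
    """Step a toy environment once and build the stored transition."""
    # The environment receives the external action whenever one is supplied.
    applied = external_action if external_action is not None else policy_action
    reward = -abs(applied - 1.0)  # toy reward: the best action is 1.0
    # Hypothetical switch: store the applied action or the policy's own output.
    stored = applied if record_applied else policy_action
    return Transition(observation=0.0, action=stored, reward=reward)


def best_action_estimate(buffer: List[Transition]) -> float:
    """Return the stored action with the highest reward (stand-in for a critic)."""
    return max(buffer, key=lambda t: t.reward).action


external_actions = (0.5, 1.0, 1.5)
policy_output = 0.0  # the untrained policy always proposes 0

# Case 1: the buffer records the action that was actually applied.
guided = [collect(policy_output, a, record_applied=True) for a in external_actions]

# Case 2: the buffer records the policy's own output, ignoring the override.
unguided = [collect(policy_output, a, record_applied=False) for a in external_actions]

print(best_action_estimate(guided))
print(best_action_estimate(unguided))
```

In the second case every stored action is the policy's own output, so the rewards produced by the external actions get attributed to an action that was never applied. In this toy model, the learner then has no way to discover the guided behavior.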