I’ve tried this. I still don’t see the values going beyond [–1, 1]. However, I might be able to answer your question. If you have a look at the helper functions createTD3Agent.m and createDDPGAgent.m, you will notice the ‘agentoptions’ object. The parameters called ‘ExplorationModel’ or ‘NoiseModel’ specify details about the kind of noise added to the predicted action. This can either be an ‘OrnsteinUhlenbeckActionNoise’ object or a ‘GaussianActionNoise’ object each with their own set of parameters. Have a more detailed look at the Noise Options here: rlDDPGAgentoptions and rlTD3AgentOptions. This noise is added to encourage the agent to explore the environment.
The output action from the tanhLayer in the ‘actorNetwork’ will still be in the range of [–1, 1]. Once the noise is added, the new action values will be saturated to the limits specified in the ‘ActorInfo’. These limits will be [-Inf, Inf] by default and won’t saturate your action values when not mentioned.