MATLAB Answers

Incorrect tanhLayer output in RL agent

The last layer in my actor network is a tanhLayer. However, the RL Agent block produces outputs that go above 1 or below -1. Is this normal behavior for the RL agent?

  4 Comments

Mohammad Ashraful Islam on 8 Apr 2020
If you remove the upper and lower limits from rlNumericSpec, that is, replace:
actInfo = rlNumericSpec([numAct 1],'LowerLimit',-1,'UpperLimit',1);
with:
actInfo = rlNumericSpec([numAct 1]);
and observe the agent's output action, you will see it going above 1 or below -1. However, when I use a tanh layer manually in the Command Window, I get the expected output between -1 and 1 without needing lower or upper limits,
such as:
layer = [tanhLayer];
predict (layer, 5.0)
ans =
0.9999
Asvin Kumar on 11 Apr 2020
I am unable to reproduce the error. Here's what I got:
Each view corresponds to a leg of the bipedal robot. The three signals are the normalized torques applied to the ankle, knee and hip.
Would you mind sharing your model so I can have a look?
Mohammad Ashraful Islam on 13 Apr 2020
I am just making sure, did you make the following change?
replace:
actInfo = rlNumericSpec([numAct 1],'LowerLimit',-1,'UpperLimit',1);
with
actInfo = rlNumericSpec([numAct 1]);

Accepted Answer

Asvin Kumar on 13 Apr 2020
I’ve tried this, and I still don’t see the values going beyond [-1, 1]. However, I might be able to answer your question. If you look at the helper functions createTD3Agent.m and createDDPGAgent.m, you will notice the ‘agentOptions’ object. Its ‘ExplorationModel’ (TD3) and ‘NoiseOptions’ (DDPG) properties specify the kind of noise added to the predicted action. The noise model can be either an ‘OrnsteinUhlenbeckActionNoise’ object or a ‘GaussianActionNoise’ object, each with its own set of parameters. For more detail, see the noise options in the documentation for rlDDPGAgentOptions and rlTD3AgentOptions. This noise is added to encourage the agent to explore the environment.
The output of the tanhLayer in the ‘actorNetwork’ will still be in the range [-1, 1]. Once the noise is added, the resulting action values are saturated to the limits given in the action specification (‘actInfo’). These limits are [-Inf, Inf] by default, so when you do not specify them, the noisy action values are not saturated and can fall outside [-1, 1].
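To make the explanation above concrete, here is a minimal sketch in plain MATLAB. It does not call any Reinforcement Learning Toolbox functions and is not the agent's actual implementation. The pre-activation values, the Gaussian noise, the standard deviation of 0.3, and the `saturate` helper are all assumptions chosen for illustration.

```matlab
% Illustrative sketch only: plain MATLAB, no Reinforcement Learning Toolbox calls.
rng(0);                                  % fixed seed so the run is repeatable
preActivation = [-5; -0.5; 0; 0.5; 5];   % hypothetical inputs to the final tanh layer
actorOut = tanh(preActivation);          % actor output, always within [-1, 1]

noiseStd = 0.3;                          % hypothetical exploration-noise standard deviation
noisyAction = actorOut + noiseStd * randn(size(actorOut));   % can leave [-1, 1]

% Hypothetical helper that saturates an action to the given limits.
saturate = @(a, lo, hi) min(max(a, lo), hi);
actionWithLimits    = saturate(noisyAction, -1, 1);      % LowerLimit = -1, UpperLimit = 1
actionDefaultLimits = saturate(noisyAction, -Inf, Inf);  % default limits: values unchanged

% Columns: tanh output, noisy action, saturated action, action with default limits.
disp([actorOut, noisyAction, actionWithLimits, actionDefaultLimits]);
```

With limits of -1 and 1 the noisy action is clipped back into range. With the default limits of -Inf and Inf it passes through unchanged, which would be consistent with what you see after removing 'LowerLimit' and 'UpperLimit' from rlNumericSpec.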

  1 Comment

Mohammad Ashraful Islam on 20 Apr 2020
That makes sense. Thanks for the explanation!
