Reinforcement Learning Toolbox: Discount factor issue
I am trying to apply some RL algorithms from the RL toolbox, such as the actor-critic algorithm, to a problem where the reward at each step of an episode is discounted. However, in the training manager window the episode reward appears to be the cumulative reward rather than the discounted sum of rewards. I wonder if this is a bug, as it seems confusing.
Ajay Pattassery on 26 Aug 2019
Edited: Ajay Pattassery on 26 Aug 2019
In the Episode Manager, the value labeled Episode Reward should be the discounted sum of rewards over the time steps of each episode, provided you have set a discount factor in rlACAgentOptions as below.
opt = rlACAgentOptions('DiscountFactor',0.95)
If the reward shown for each episode is not the discounted sum of rewards, please reply with your env, critic, actor, and trainOpts (or the code you used) so that the issue can be reproduced.
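To tell which of the two quantities a plotted value corresponds to, it can help to compute both by hand from the per-step rewards of one episode. The following is only a minimal sketch in base MATLAB: the reward vector is a made-up example (not output from the toolbox), and gamma is assumed to match the DiscountFactor set above.

```matlab
% Hypothetical per-step rewards from a single episode (illustrative values only)
rewards = [1 1 1 1];
gamma   = 0.95;                     % assumed equal to the DiscountFactor option

% Undiscounted cumulative reward: plain sum over the time steps
cumulativeReward = sum(rewards);

% Discounted return: sum of gamma^t * r_t, with t starting at 0
t = 0:numel(rewards)-1;
discountedReturn = sum((gamma .^ t) .* rewards);

fprintf('cumulative = %.4f, discounted = %.4f\n', cumulativeReward, discountedReturn);
% prints: cumulative = 4.0000, discounted = 3.7099
```

Comparing these two numbers against the value shown for the same episode should indicate whether the plot is discounted or not.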