The reward is minimized when using a DQN agent

I built a DQN reinforcement learning agent to solve a problem of my own. Specifically, the task is to determine the locations of electric vehicle charging stations in a transportation network, and I defined a negative reward for each step. However, the agent seems to look for the worst solution. I have used the RL toolbox many times and have never encountered a problem like this. If I change the reward signal to a positive value, the agent maximizes the episode reward instead, which still gives the worst solution.
Thank you for your help!

1 Comment

I'm having the same issue with the PPO agent. Did you find out the cause of the problem?


Answers (0)

Asked on 28 Sep 2021
