Hmm, I've been training him for quite a while using DoubleQLearningV2, but for some reason he just doesn't want to correct his mistakes…
This is his BuildModel function.
His environmentVector.
His reward and respawning logic (he is penalised and respawned for touching sidewalks).
What’s wrong with this agent??