DataPredict™ [Release 2.4] (Mature + Maintenance Mode) - Machine Learning And Deep Learning Library - 40+ Models + Deep Reinforcement Learning!

Oof, very tricky stuff there. But let’s try anyway. I still don’t have enough context, but we’ll go on regardless.

You may want to change the model to the AdvantageActorCritic model because Double Q Learning isn’t quite suitable for that task. I really don’t want to explain what the differences between the models are, so just trust me.

We will also split the location-based abilities and movement into 2 separate models. Otherwise, I expect training will be longer and harder.

For location-based abilities, take the three outputs from the first model as your vector. Only call reinforce() once the location-based ability has been deployed for a few seconds, so that you can calculate the reward. You may also want to set return original output to “true” and use those values instead.
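
Here is a rough sketch of the "wait a few seconds, then reinforce" timing. It is written in Python rather than Luau just to show the logic, and it is not DataPredict code: `DelayedReinforcer`, `compute_reward`, the `reinforce` callback signature, and the 3-second delay are all placeholder assumptions, so swap in your own model call and reward calculation.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence


@dataclass
class PendingAbility:
    features: Sequence[float]  # what the model saw when the ability was chosen
    target: Sequence[float]    # the three outputs used as the target vector
    deployed_at: float         # time (in seconds) the ability was deployed


class DelayedReinforcer:
    """Holds deployed abilities and reinforces only after a delay (hypothetical helper)."""

    def __init__(
        self,
        reinforce: Callable[[Sequence[float], float], None],  # stand-in for your model's reinforce call
        compute_reward: Callable[[PendingAbility], float],    # your own reward calculation
        delay_seconds: float = 3.0,                           # assumed "few seconds"
    ) -> None:
        self.reinforce = reinforce
        self.compute_reward = compute_reward
        self.delay_seconds = delay_seconds
        self.pending: List[PendingAbility] = []

    def deploy(self, features: Sequence[float], target: Sequence[float], now: float) -> None:
        # Record the deployment; do NOT reinforce yet, the outcome is not known.
        self.pending.append(PendingAbility(features, target, now))

    def update(self, now: float) -> int:
        # Split pending abilities into those old enough to score and those still waiting.
        ready = [p for p in self.pending if now - p.deployed_at >= self.delay_seconds]
        self.pending = [p for p in self.pending if now - p.deployed_at < self.delay_seconds]
        for ability in ready:
            reward = self.compute_reward(ability)
            self.reinforce(ability.features, reward)
        return len(ready)  # number of reinforce calls made on this update
```

You would call `deploy(...)` when the ability is used and `update(now)` on every step of your game loop; the reinforce call only fires once the delay has passed.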

For movement, just reuse the same code that I gave in the sword fighting AIs (version 2), though you may want to adjust it.