Do you have any advice on how to modify the layers, configuration, or senses functions of my model to make it better? I am working on creating some fighting NPCs. This is my code: NPCSenses.lua · GitHub. I can’t really tell if the model is improving during training.
I am just training them out in the open like this:
Something I noticed is that the parameter weights all look the same, and they slowly keep getting more negative the longer I train them:
Start:
I’ll list the issues from highest importance to lowest.
Issue 1: Bad inputs.
You’re using the current position. Remove it, since it does not contain useful information and will likely make the NPC learn much more slowly.
Remove the Y rotation or change how it is calculated. Currently it calculates the NPC’s own rotation, not its rotation relative to the target. That is useless information for the NPC and slows down learning.
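To make the point about relative inputs concrete, here is a minimal sketch in plain Lua (no Roblox types) that turns two positions and a facing angle into a distance and a relative yaw. The helper names, the `{x, z}` table layout, and the angle convention are assumptions for illustration, not something taken from NPCSenses.lua.

```lua
-- Lua 5.1 and Luau provide math.atan2; newer Lua folds it into math.atan(y, x).
local atan2 = math.atan2 or math.atan

-- Wraps an angle in radians into the range [-pi, pi).
local function wrapAngle(angle)
    return (angle + math.pi) % (2 * math.pi) - math.pi
end

-- Hypothetical helper: describes the target relative to the NPC on the ground plane.
-- npc and target are tables {x = ..., z = ...}. npcYaw is the NPC's facing angle in
-- radians, measured the same way as atan2(dz, dx) (an assumed convention).
local function relativeInputs(npc, npcYaw, target)
    local dx, dz = target.x - npc.x, target.z - npc.z
    local distance = math.sqrt(dx * dx + dz * dz)
    local angleToTarget = atan2(dz, dx)
    local relativeYaw = wrapAngle(angleToTarget - npcYaw)
    return distance, relativeYaw
end

local distance, relativeYaw = relativeInputs({x = 0, z = 0}, 0, {x = 0, z = 5})
print(distance, relativeYaw)
```

Both values stay the same wherever the pair stands on the map, which is why they tend to be more useful inputs than an absolute position or the NPC’s own rotation.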
Issue 2: Reward values probably need a little bit of tweaking.
Rarer events = higher reward value. Right now the model is biased toward negative values, probably because the NPC encounters negative rewards far too often.
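One rough way to check for that bias is to multiply each reward by how often its event fires in an episode and add up the results. The sketch below uses made-up event names, reward values, and counts; none of them come from the original code.

```lua
-- Hypothetical reward table and per-episode event counts (all values are made up).
local rewards = {hitTarget = 10, gotHit = -5, idle = -0.1}
local eventCounts = {hitTarget = 2, gotHit = 10, idle = 200}

-- Sums count * reward over all events to show which way an episode's total is pushed.
local function netReward(rewardTable, counts)
    local total = 0
    for event, count in pairs(counts) do
        total = total + count * (rewardTable[event] or 0)
    end
    return total
end

print(netReward(rewards, eventCounts))
```

If the total is strongly negative even for a reasonably played episode, the frequent penalties are probably drowning out the rare positive reward, and raising the rare reward or shrinking the frequent penalties may help.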
Issue 3: Wrong algorithm.
Switch from DQN to Advantage Actor Critic / Proximal Policy Optimization.
Currently the issue is that DQN only takes into account the change from one state to the next, not the whole timeline. You’re expecting the NPC to learn the connection between two states, but in reality the actions rely on the whole timeline, since fighting requires future planning.
I just recently added information to the documentation related to this kind of stuff, explaining why those inputs are chosen. It seems like a common mistake I keep seeing here.
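As a rough illustration of what “the whole timeline” means, here is a plain Lua sketch of the discounted return that episode-based methods compute from a full list of rewards. The function name and the discount factor are assumptions for illustration; this is not the library’s implementation.

```lua
-- Computes the discounted return for every step of one episode, working backwards
-- so that each step's value includes all rewards that came after it.
local function discountedReturns(rewards, gamma)
    local returns = {}
    local running = 0
    for t = #rewards, 1, -1 do
        running = rewards[t] + gamma * running
        returns[t] = running
    end
    return returns
end

-- A reward that only arrives on the last step still credits the earlier steps.
local returns = discountedReturns({0, 0, 1}, 0.5)
print(returns[1], returns[2], returns[3])
```

With a discount factor of 0.5, the first step receives a quarter of the final reward, so actions that set up a later hit still get some credit.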
Could you explain what the classesList for the critic model is supposed to mean? I was just following the example in the sword fighting code but that was one thing that confused me.
This is my code now; I tried to make the improvements you suggested:
The NPCs are definitely behaving differently now; they seem to prefer doing one action over and over.
Well, technically the classesList for the critic model is pretty much useless right now; I just added it there to avoid unseen bugs. We’re only using the raw values from the critic model, not the selected class.
As for the bias toward one action over and over, I think you can try adding negative rewards when it chooses certain actions under certain conditions. I think that action is being chosen over and over again because it gives the highest reward. Or you can just wait it out and see if it evolves to choose different actions.
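The difference between a selected class and a raw value can be sketched in plain Lua. This is a generic illustration with made-up outputs and labels, not the library’s internal code.

```lua
-- Returns the label whose output is largest, which is what a class selection does.
local function selectClass(outputs, classesList)
    local bestIndex = 1
    for i = 2, #outputs do
        if outputs[i] > outputs[bestIndex] then
            bestIndex = i
        end
    end
    return classesList[bestIndex]
end

-- Actor: several outputs, one per action, so the labels matter.
local actorOutputs = {0.1, 0.7, 0.2}
local actionLabels = {"Attack", "Block", "Retreat"} -- hypothetical labels
print(selectClass(actorOutputs, actionLabels))

-- Critic: a single output that is read directly as a value estimate,
-- so whatever label sits in its classes list is never looked at.
local criticOutputs = {-1.3}
print(criticOutputs[1])
```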
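One possible form of such a negative reward is a small penalty that grows the longer the same action is repeated. The sketch below is a hypothetical helper in plain Lua; the thresholds and names are assumptions, and whether it helps depends on the environment.

```lua
-- Builds a reward-shaping function that subtracts a penalty once the same action
-- has been chosen more than freeRepeats times in a row.
local function newRepeatPenalty(freeRepeats, penaltyPerRepeat)
    local lastAction, streak = nil, 0
    return function(action, reward)
        if action == lastAction then
            streak = streak + 1
        else
            lastAction, streak = action, 1
        end
        local extra = math.max(0, streak - freeRepeats)
        return reward - extra * penaltyPerRepeat
    end
end

local shapeReward = newRepeatPenalty(2, 0.5)
for _, action in ipairs({"Attack", "Attack", "Attack", "Block"}) do
    print(action, shapeReward(action, 1))
end
```

Keeping the penalty small compared with the main rewards matters here, otherwise the NPC may learn to switch actions at random just to avoid it.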
Recently I made some minor updates to the REINFORCENeuralNetwork model. If you are planning to use or are currently using the model, I recommend updating to the current version immediately. Otherwise, you can update whenever you get the chance.
Also, you might have noticed the lack of updates to this library. That’s because this library is considered feature-complete. There are plenty more deep reinforcement learning algorithms that I have not added, but either there was no demand for them or they don’t give significant advantages when added.
That being said, I’m moving on to a new project, but I’m currently unsure which path to take. There are two projects I’m thinking of:
There is the “DataPredict Neural” library under development, but it was paused due to a lack of demand for computer vision and sequential models. I’m not even sure if there are any useful use cases anyway. It is supposed to be similar to TensorFlow and PyTorch (which are pure deep learning libraries), but I thought the current neural network here is enough to get the job done.
Another project is to create a platform that allows cross-server training for DataPredict by sending data / model parameters through Roblox’s HTTPService. However, people might have to subscribe to this service since I need to bear the cost of server hosting. Might add free tiers for small developers, but I can’t guarantee that.
So, I’m putting a poll here to see which project you want me to develop. Note that even if an option receives a majority vote, it won’t mean I will choose that project to develop.
So, which project do you prefer? You can only choose one option.
Hiya, so I have this code here that gets up to a certain number of parts and puts their size, distance and direction in the environment vector. Just wondering why I’m getting this incompatibility error.
local InputLayers = 4 + (DetectionMaxInstances * 7)
local DNet = DataPredict.Models.QLearningNeuralNetwork.new()
DNet:createLayers({InputLayers, 6, 6})
-- ...
function GetState()
    local EFV = {1, V3toTuple(char.Torso.Position), hrp.Orientation.Y}
    local parts = workspace:GetPartBoundsInRadius(hrp.Position, 24, params)
    for i, v in parts do
        table.insert(EFV, (v.Position-hrp.Position).Magnitude)
        table.insert(EFV, v.Size.X)
        table.insert(EFV, v.Size.Y)
        table.insert(EFV, v.Size.Z)
        local lv = lookVec(v.Position, hrp.Position)
        table.insert(EFV, lv.X)
        table.insert(EFV, lv.Y)
        table.insert(EFV, lv.Z)
    end
    local len = #EFV
    print(len)
    if len < InputLayers then
        for i = #EFV, InputLayers do
            EFV[i] = 9e9
        end
    end
    print(#EFV, InputLayers)
    return {EFV}
end
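A few things in GetState could produce a vector whose length does not match InputLayers. The search may return more parts than DetectionMaxInstances, and nothing caps the loop. The padding loop starts at #EFV, so it overwrites the last real value. If V3toTuple returns three numbers, Lua keeps only the first one, because a multi-value call in the middle of a table constructor is cut to a single value. It is also worth checking that the leading values (bias, position, rotation) add up to what the 4 in InputLayers assumes. Below is a minimal sketch in plain Lua of building a fixed-length vector; buildState, its arguments, and the part layout are hypothetical and stand in for the Roblox calls.

```lua
local VALUES_PER_PART = 7

-- header: fixed leading values. parts: array of 7-number arrays (assumed layout:
-- distance, size x/y/z, direction x/y/z). maxParts: cap on how many parts are encoded.
local function buildState(header, parts, maxParts, padValue)
    local state = {}
    for _, value in ipairs(header) do
        state[#state + 1] = value
    end
    -- Encode at most maxParts parts so the vector can never grow past its fixed size.
    for i = 1, math.min(#parts, maxParts) do
        for j = 1, VALUES_PER_PART do
            state[#state + 1] = parts[i][j]
        end
    end
    -- Pad from the first free index so no real value is overwritten.
    local expectedLength = #header + maxParts * VALUES_PER_PART
    for i = #state + 1, expectedLength do
        state[i] = padValue
    end
    return state
end

-- A multi-value call is cut to one value unless it is last in the constructor.
local function xyz()
    return 10, 20, 30
end
print(#{1, xyz(), 45}) -- 3 values, not 5
local x, y, z = xyz()
print(#{1, x, y, z, 45}) -- 5 values

local state = buildState({1, x, y, z, 45}, {{3, 1, 1, 1, 0, 0, 1}}, 2, 0)
print(#state)
```

Padding with a small constant such as 0 instead of 9e9 is also worth trying, since very large input values can push the weights around a lot.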
On version 2 (source rbxl), I am getting this error in the output when I run it:
16:09:34.013 Success! - Server - MainScript:71
16:09:34.564 Something unexpectedly tried to set the parent of Lord_BradyRocks to NPCFolder while trying to set the parent of Lord_BradyRocks. Current parent is Workspace. - Studio
16:09:35.438 Success! (x3) - Server - MainScript:71
Also, maybe switch out the sword for one that you have to swing to do damage. With this one you can (and I assume they can) do damage without swinging, but no biggie.