hey, I have trained for 1 hour again and the bots are still not fighting efficiently.
training model 1.rbxl (219.5 KB)
I also changed from a single view ray to multiple rays spread over an angle, for an angular view radius (rough sketch below).
Can you help?
edit: updated the Senses script
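For context, the multi-ray change looks roughly like this. It's a simplified sketch, not the exact Senses script; the ray count, angle, and distance values are placeholders:

-- Simplified sketch of the multi-ray view: fan several rays across an angle
-- instead of casting one straight ray. The ray count, angle, and distance
-- values here are placeholders, not the exact ones in the Senses script.
local RAY_COUNT = 7
local FOV_ANGLE = 90 -- total horizontal spread in degrees
local VIEW_DISTANCE = 60

local function castViewRays(character)
    local rootPart = character:FindFirstChild("HumanoidRootPart")

    if not rootPart then return nil end

    local raycastParams = RaycastParams.new()
    raycastParams.FilterType = Enum.RaycastFilterType.Exclude
    raycastParams.FilterDescendantsInstances = {character}

    for i = 1, RAY_COUNT do
        -- spread the rays evenly from -FOV_ANGLE/2 to +FOV_ANGLE/2
        local angle = math.rad(-FOV_ANGLE / 2 + (i - 1) * (FOV_ANGLE / (RAY_COUNT - 1)))
        local direction = (rootPart.CFrame * CFrame.Angles(0, angle, 0)).LookVector * VIEW_DISTANCE
        local result = workspace:Raycast(rootPart.Position, direction, raycastParams)

        if result and result.Instance.Parent and result.Instance.Parent:FindFirstChildOfClass("Humanoid") then
            return result.Instance.Parent, result.Distance -- enemy seen and how far away
        end
    end

    return nil
end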
I'll deal with it later; I'm at work right now.
Oh, does it change the color too?
Also, the kill reward was never being awarded and the previous enemy health was never detected or counted, so because of that I made an additional kill/death detection for awarding points.
training model 1.rbxl (326.9 KB)
But the thing is, that additional kill-awarding detection uses the default env; roughly, it works like the sketch below.
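-- Simplified sketch of the extra kill/death detection; the point values and
-- the scoreBoard table are placeholders for how the reward actually gets fed
-- back into training.
local scoreBoard = {}

local function awardPoints(character, points)
    scoreBoard[character.Name] = (scoreBoard[character.Name] or 0) + points
end

local function watchForKills(character, enemyCharacter)
    local humanoid = character:FindFirstChildOfClass("Humanoid")
    local enemyHumanoid = enemyCharacter:FindFirstChildOfClass("Humanoid")

    if not (humanoid and enemyHumanoid) then return end

    -- award the kill the moment the enemy dies, instead of relying on
    -- previousEnemyHealth being read correctly inside the reward function
    enemyHumanoid.Died:Once(function()
        awardPoints(character, 40)
    end)

    -- penalize the bot when it dies
    humanoid.Died:Once(function()
        awardPoints(character, -40)
    end)
end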
Why can you no longer call setExperienceReplay?
edit: it seems setExperienceReplay was removed in 1.14
edit2: I added a chase class.
edit3: I changed the view from raycasts to an AOE FOV instead (rough sketch below).
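The AOE FOV check is roughly this (simplified sketch, not the exact Senses script; the radius and angle values are placeholders):

-- Simplified sketch of the AOE FOV: instead of raycasting, test every other
-- character against a view radius and a view angle. Values are placeholders.
local VIEW_RADIUS = 60
local VIEW_ANGLE = math.rad(60) -- half-angle of the view cone

local function findEnemyInFOV(character)
    local rootPart = character:FindFirstChild("HumanoidRootPart")

    if not rootPart then return nil end

    for _, other in ipairs(workspace:GetChildren()) do
        if other ~= character and other:FindFirstChildOfClass("Humanoid") then
            local otherRoot = other:FindFirstChild("HumanoidRootPart")

            if otherRoot then
                local offset = otherRoot.Position - rootPart.Position
                local distance = offset.Magnitude

                -- must be inside the radius and inside the cone in front of the bot
                if (distance > 0) and (distance <= VIEW_RADIUS) then
                    local angle = math.acos(math.clamp(rootPart.CFrame.LookVector:Dot(offset.Unit), -1, 1))

                    if angle <= VIEW_ANGLE then return other, distance end
                end
            end
        end
    end

    return nil
end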
I’m surprised that you read the version history documentation…
Round 1200 and still not getting an efficient fighting bot.
Man, it's been 3 hours of training and the bots still can't fight each other.
edit: they can fight and chase within a short range, like ~10 studs (magnitude); otherwise they can't.
How about you take a break until I look over the code and see what's going on here. As much as I like your persistence, I'm a bit busy with other things until the weekend.
I made some changes to the scripts. Replace the code with these versions.
For the main script:
local function buildActorModel(ID)
    local Model = DataPredict.Models.NeuralNetwork.new(1)
    Model:setModelParametersInitializationMode("LeCunUniform")

    Model:addLayer(5, true, 'LeakyReLU', 0.01)
    Model:addLayer(3, true, 'LeakyReLU', 0.01)
    Model:addLayer(7, false, 'StableSoftmax', 0.01)

    Model:setClassesList({'A','D','W','S','jump','useWeapon', "none"})

    local ModelParameters = loadModelParameters(ActorModelDataStore, ID)
    Model:setModelParameters(ModelParameters)

    table.insert(ActorModelArray, Model)

    return Model
end

local function buildCriticModel(ID)
    local Model = DataPredict.Models.NeuralNetwork.new(1)
    Model:setModelParametersInitializationMode("LeCunUniform")

    Model:addLayer(5, true, 'LeakyReLU', 0.01)
    Model:addLayer(3, true, 'LeakyReLU', 0.01)
    Model:addLayer(1, false, 'Sigmoid', 0.01)

    Model:setClassesList({1, 2})

    local ModelParameters = loadModelParameters(CriticModelDataStore, ID)
    Model:setModelParameters(ModelParameters)

    table.insert(CriticModelArray, Model)

    return Model
end

local function buildModel(ID)
    local classesList = {'A','D','W','S','jump','useWeapon', "none"}

    local MainModel = DataPredict.Models.AdvantageActorCritic.new(60, 0.05, 1)

    local AModel = buildActorModel(ID)
    local CModel = buildCriticModel(ID)

    MainModel:setActorModel(AModel)
    MainModel:setCriticModel(CModel)

    local MainModelQuickSetup = DataPredict.Others.ReinforcementLearningQuickSetup.new(120, 0.05, 1)

    MainModelQuickSetup:setModel(MainModel)
    MainModelQuickSetup:setPrintReinforcementOutput(false)
    MainModelQuickSetup:setClassesList(classesList)

    table.insert(ModelArray, MainModelQuickSetup)

    return MainModelQuickSetup
end
For the Senses script:
local function getRewardValue(orientationDifference)
    local currentHealth = Humanoid.Health
    local currentLocation = Character:GetPivot().Position
    local currentRotationY = Character:GetPivot().Rotation.Y
    local healthChange = currentHealth - previousHealth

    local closestEnemy, damageDealt, distanceDifference, distanceToEnemy = getEnemyStatus()
    local isSeeingEnemy, viewingDistance = getCurrentView()

    local noEnemy = (closestEnemy == nil)
    local idlePunishment = (noEnemy and -0.1) or 0

    local isEnemyDead = (previousEnemyHealth == 0)
    local enemyDeathReward = (isEnemyDead and 1) or 0

    local isEnemyReward = (isSeeingEnemy and 10) or 0

    local isLookingAtTheWall = ((viewingDistance <= 3) and not isSeeingEnemy)
    local isLookingAtTheWallValue = (isLookingAtTheWall and 1) or 0
    local isLookingAtTheWallReward = (isLookingAtTheWall and -40) or 0

    local heightChange = (currentLocation.Y - previousLocation.Y)
    local heightChangeValue = ((math.abs(heightChange / lastDeltaTime) > 1) and 0) or 1

    local isInRange = false
    local isTooNear = false

    if distanceToEnemy then -- only compare when an enemy distance actually exists (was "if not distanceToEnemy")
        isTooNear = (distanceToEnemy <= 2)
        isInRange = (distanceToEnemy <= 10)
    end

    local isWeaponUsed = (lastAction == "useWeapon")
    local isDamageDealt = (damageDealt > 0)

    local isPlayerHittingEnemyReward = (isWeaponUsed and isInRange and isSeeingEnemy and 30) or (isWeaponUsed and isInRange and 3) or (isWeaponUsed and not isInRange and -0.1) or 0

    local movedDistance = (currentLocation - previousLocation).Magnitude

    local isWalkingForwardValue = ((lastAction == "W") and 10) or 0
    local hasPlayerMovedValue = ((movedDistance / lastDeltaTime > 2) and 1) or 0

    local damageDealtRatio = (damageDealt / maxHealth)
    local healthPercentage = currentHealth / maxHealth

    local healReward = (distanceToEnemy and (1 - healthPercentage) * math.log(distanceToEnemy)) or 0 -- guard against there being no enemy in range

    local isRotated = (lastAction == "rotateRight") or (lastAction == "rotateLeft")
    local rotatingForNoReasonValue = (not isLookingAtTheWall and isRotated and -30) or 0 -- was checking the reward number, which is always truthy

    local distanceChangeReward = (isTooNear and -3) or (heightChangeValue * hasPlayerMovedValue * distanceDifference * 10 * isWalkingForwardValue * isEnemyReward) --math.sign(distanceDifference) * math.log(math.abs(distanceDifference))

    local rewardValue

    if isLookingAtTheWall then
        rewardValue = -10
    elseif ((lastAction == "S") or (lastAction == "A") or (lastAction == "D") or (lastAction == "AW") or (lastAction == "DW") or (lastAction == "AS") or (lastAction == "DS")) and (not isTooNear) then
        rewardValue = -40
    else
        rewardValue = healReward + damageDealtRatio + enemyDeathReward + distanceChangeReward + isPlayerHittingEnemyReward + isLookingAtTheWallReward + orientationDifference + isEnemyReward --+ changeInOrientationReward
    end

    return rewardValue
end
ReinforcementLearningQuickSetup or UniformExperienceReplay? I currently use UniformExperienceReplay.
ReinforcementLearningQuickSetup. Experience Replay is supposed to be used inside the ReinforcementLearningQuickSetup, but it isn’t recommended in this case.
The video is private…
Also, I don’t want to get too deep into the reinforcement learning topic, so let’s just say ExperienceReplay is not that compatible with the current model you are using. If you really want to use it, variants of Q-Learning should do the trick. Otherwise, it will make learning much slower.
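If you really do want to go that route, the wiring would look roughly like this. Treat it as a sketch only: buildQLearningModel is a placeholder, and I'm assuming the UniformExperienceReplay constructor and the setExperienceReplay setter exist under these names in the version you're on, so double-check the documentation:

-- Rough sketch only. buildQLearningModel is a placeholder for however a
-- Q-Learning variant would be constructed, and the names
-- DataPredict.ExperienceReplays.UniformExperienceReplay and
-- setExperienceReplay are assumptions to verify against your version's docs.
local QLearningModel = buildQLearningModel(ID)

local ExperienceReplay = DataPredict.ExperienceReplays.UniformExperienceReplay.new()

local QuickSetup = DataPredict.Others.ReinforcementLearningQuickSetup.new(60, 0.05, 1)

QuickSetup:setModel(QLearningModel)
QuickSetup:setExperienceReplay(ExperienceReplay)
QuickSetup:setClassesList({'A', 'D', 'W', 'S', 'jump', 'useWeapon', "none"})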
Can you check again? I don't think it was private.
Now I can access it, but I'm guessing you're using quite a lot of classes in your classesList, which causes slow training. Also, try resetting the model parameters.
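Resetting just means not feeding the saved parameters back in. In buildActorModel and buildCriticModel, for example (a sketch; clearing the DataStore keys would work too):

-- Sketch: skip the DataStore load so the model starts from its fresh
-- "LeCunUniform" initialization, the same as on a first run.
local ModelParameters = nil -- instead of loadModelParameters(ActorModelDataStore, ID)

Model:setModelParameters(ModelParameters)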
Hey, you messed up with the new 1.14: AdvantageActorCritic.new() only accepts discountFactor, a single argument, but you passed 3 arguments there.
Ah, sorry, maybe that's why the training is so slow. I must have been so fatigued that I missed it. Here's the new code. The value of that argument should not be more than one or less than zero; I made it nil since it already has a default value recommended by researchers.
local function buildModel(ID)
    local classesList = {'A','D','W','S','jump','useWeapon', "none"}

    local MainModel = DataPredict.Models.AdvantageActorCritic.new()

    local AModel = buildActorModel(ID)
    local CModel = buildCriticModel(ID)

    MainModel:setActorModel(AModel)
    MainModel:setCriticModel(CModel)

    local MainModelQuickSetup = DataPredict.Others.ReinforcementLearningQuickSetup.new(60, 0.05, 1)

    MainModelQuickSetup:setModel(MainModel)
    MainModelQuickSetup:setPrintReinforcementOutput(false)
    MainModelQuickSetup:setClassesList(classesList)

    table.insert(ModelArray, MainModelQuickSetup)

    return MainModelQuickSetup
end
Did you delete the modules on Roblox? I can't find them anymore, and in my favorites folder most of them just say "Content deleted". The only one I can find (which is the right one, but I don't know the version) is from someone called X32Gex5.
I've moved them to GitHub to avoid legal loopholes. Sorry about that. You now have to take them directly from GitHub. Also, could you give me the link for that X32Gex5 thing?
Hey, my bot is finally getting better now after some changes to the point awarding and some fixes, such as the view FOV.
This was trained for less than 5-10 minutes. By the way, I removed jump from the inputs (rough sketch below).
I like the bot's movement right now; it rarely goes and stares at a wall anymore.
But I believe the Senses script still needs to be fixed.
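Removing jump just meant shrinking the action list and the actor's output layer by one, roughly (based on the earlier buildActorModel / buildModel code):

-- Sketch only: the action set without 'jump', matching the earlier build functions.
local classesList = {'A', 'D', 'W', 'S', 'useWeapon', "none"}

-- the actor's final layer drops from 7 output neurons to 6
Model:addLayer(6, false, 'StableSoftmax', 0.01)
Model:setClassesList(classesList)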