DataPredict [Release 1.21] - General Purpose Machine Learning And Deep Learning Library (Learning AIs, Generative AIs, and more!)

If you make the learning rate higher, training will be faster, but there is a greater risk of "untraining" the model. I recommend keeping it below 0.5 if you really want to increase it.
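For reference, here is a rough sketch of where the learning rate goes. I'm assuming the NeuralNetwork constructor's second argument is the learning rate (the threads below only pass the first argument), so double-check the parameter order against the API documentation:

```lua
-- Hypothetical sketch: assumes NeuralNetwork.new(maxNumberOfIterations, learningRate).
-- Verify the argument order against the DataPredict API docs before using.
local Model = DataPredict.Models.NeuralNetwork.new(1, 0.1) -- keep the learning rate well below 0.5
```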

To create an optimizer, have a look at the optimizers that have already been written. All of them inherit from the BaseOptimizer class.
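The inheritance pattern itself is plain Lua metatables. The sketch below is illustrative only: the stub BaseOptimizer and the method names (`new`, `calculate`) are my assumptions, so copy the real structure from an existing optimizer in the library instead.

```lua
-- Hypothetical sketch of the metatable inheritance pattern the built-in
-- optimizers use. BaseOptimizer here is a stub; in practice you would
-- require the library's real BaseOptimizer and mirror an existing optimizer.

local BaseOptimizer = {}
BaseOptimizer.__index = BaseOptimizer

function BaseOptimizer.new(optimizerName)
	local self = setmetatable({}, BaseOptimizer)
	self.optimizerName = optimizerName
	return self
end

-- Custom optimizer inheriting from BaseOptimizer.
local MyOptimizer = setmetatable({}, {__index = BaseOptimizer})
MyOptimizer.__index = MyOptimizer

function MyOptimizer.new(decayRate)
	local self = setmetatable(BaseOptimizer.new("MyOptimizer"), MyOptimizer)
	self.decayRate = decayRate or 0.9
	return self
end

-- Illustrative update rule: scale the cost derivative by a decay factor.
function MyOptimizer:calculate(learningRate, costDerivative)
	return learningRate * self.decayRate * costDerivative
end
```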

do i have to increase that layer size for LeakyReLU when adding more entries to classesList?

Yep. The value must be equal to the length of the classesList table.
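To keep the final layer and classesList from drifting apart, you can derive the layer size from the table itself. This is just a sketch following the layer-argument pattern used elsewhere in this thread:

```lua
-- Deriving the output layer size from classesList avoids a size mismatch.
-- Layer arguments follow the (size, hasBias, activation, learningRate)
-- pattern used in the scripts below; check the docs for the exact signature.
local classesList = {'A', 'D', 'W', 'S', 'jump', 'useWeapon', 'none'}

Model:addLayer(#classesList, false, 'StableSoftmax', 0.01) -- #classesList == 7

Model:setClassesList(classesList)
```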

hey, i have trained for 1 hour again and they're still not fighting efficiently.
training model 1.rbxl (219.5 KB)

i also changed from a single view ray to multiple rays for an angled view radius.

can u help?

edit: updated sense script

I’ll deal with it later, currently at work right now.

Oh, does it change the color too?

also, the kill reward is never awarded and the previous enemy health is never detected or counted, so because of that i made an additional kill/death detection for awarding pts.

training model 1.rbxl (326.9 KB)

but the thing is, that additional kill-award detection uses the default env.

why can u no longer use setExperienceReplay?

edit: seems like setExperienceReplay was removed in 1.14
edit2: i added a chase class.
edit3: i changed view from raycast to AOE fov instead.

I’m surprised that you read the version history documentation…

round 1200 and still not getting a bot that fights efficiently.

man, it's been 3 hours of training and the bots still can't fight each other.

edit: they can fight and chase within a short range, like ~10 studs of distance; otherwise they can't.

How about you take a break until I look over the code and see what's going on here? As much as I like your persistence, I'm a bit busy with other things until the weekend.

I made some changes to the scripts. Replace the code with the versions below.

For main script:


local function buildActorModel(ID)
	
	local Model = DataPredict.Models.NeuralNetwork.new(1)
	
	Model:setModelParametersInitializationMode("LeCunUniform")

	Model:addLayer(5, true, 'LeakyReLU', 0.01)

	Model:addLayer(3, true, 'LeakyReLU', 0.01)

	Model:addLayer(7, false, 'StableSoftmax', 0.01)

	Model:setClassesList({'A','D','W','S','jump','useWeapon', "none"})
	
	local ModelParameters = loadModelParameters(ActorModelDataStore, ID)
	
	Model:setModelParameters(ModelParameters)
	
	table.insert(ActorModelArray, Model)
	
	return Model
	
end

local function buildCriticModel(ID)
	
	local Model = DataPredict.Models.NeuralNetwork.new(1)
	
	Model:setModelParametersInitializationMode("LeCunUniform")

	Model:addLayer(5, true, 'LeakyReLU', 0.01)

	Model:addLayer(3, true, 'LeakyReLU', 0.01)

	Model:addLayer(1, false, 'Sigmoid', 0.01)

	Model:setClassesList({1, 2})
	
	local ModelParameters = loadModelParameters(CriticModelDataStore, ID)
	
	Model:setModelParameters(ModelParameters)
	
	table.insert(CriticModelArray, Model)
	
	return Model
	
end

local function buildModel(ID)
	
	local classesList = {'A','D','W','S','jump','useWeapon', "none"}
	
	local MainModel = DataPredict.Models.AdvantageActorCritic.new(60, 0.05, 1)
	
	local AModel = buildActorModel(ID)
	
	local CModel = buildCriticModel(ID)
	
	MainModel:setActorModel(AModel)
	
	MainModel:setCriticModel(CModel)
	
	local MainModelQuickSetup = DataPredict.Others.ReinforcementLearningQuickSetup.new(120, 0.05, 1)
	
	MainModelQuickSetup:setModel(MainModel)
	
	MainModelQuickSetup:setPrintReinforcementOutput(false)
	
	MainModelQuickSetup:setClassesList(classesList)
	
	table.insert(ModelArray, MainModelQuickSetup)
	
	return MainModelQuickSetup
	
end

For Senses script:


local function getRewardValue(orientationDifference)
	
	local currentHealth = Humanoid.Health
	
	local currentLocation = Character:GetPivot().Position

	local currentRotationY = Character:GetPivot().Rotation.Y

	local healthChange = currentHealth - previousHealth

	local closestEnemy, damageDealt, distanceDifference, distanceToEnemy = getEnemyStatus()

	local isSeeingEnemy, viewingDistance = getCurrentView()

	local noEnemy = (closestEnemy == nil)

	local idlePunishment = (noEnemy and -0.1) or 0

	local isEnemyDead = (previousEnemyHealth == 0)

	local enemyDeathReward = (isEnemyDead and 1) or 0

	local isEnemyReward = (isSeeingEnemy and 10) or 0
	
	local isLookingAtTheWall = ((viewingDistance <= 3) and not isSeeingEnemy)
	
	local isLookingAtTheWallValue = (isLookingAtTheWall and 1) or 0
	
	local isLookingAtTheWallReward = (isLookingAtTheWall and -40) or 0
	
	local heightChange = (currentLocation.Y - previousLocation.Y)
	
	local heightChangeValue = ((math.abs(heightChange/lastDeltaTime) > 1) and 0) or 1
	
	local isInRange = false
	
	local isTooNear = false
	
	if distanceToEnemy then -- only compare when an enemy distance exists; comparing nil would error
		
		isTooNear = (distanceToEnemy <= 2)
		
		isInRange = (distanceToEnemy <= 10)
		
	end
	
	local isWeaponUsed = (lastAction == "useWeapon") 
	
	local isDamageDealt = (damageDealt > 0)
	
	local isPlayerHittingEnemyReward = (isWeaponUsed and isInRange and isSeeingEnemy and 30) or (isWeaponUsed and isInRange and 3) or (isWeaponUsed and not isInRange and -0.1) or 0
	
	local movedDistance = (currentLocation - previousLocation).Magnitude
	
	local isWalkingForwardValue = ((lastAction == "W") and 10) or 0
	
	local hasPlayerMovedValue = ((movedDistance/lastDeltaTime > 2) and 1) or 0

	local damageDealtRatio = (damageDealt / maxHealth)

	local healthPercentage = currentHealth / maxHealth

	local healReward = (distanceToEnemy and ((1 - healthPercentage) * math.log(distanceToEnemy))) or 0 -- guard against nil distanceToEnemy when no enemy is present
	
	local isRotated = (lastAction == "rotateRight") or (lastAction == "rotateLeft")
	
	local rotatingForNoReasonValue = (not isLookingAtTheWall and isRotated and -30) or 0 -- use the boolean; isLookingAtTheWallReward is a number, so "not" on it is always false

	local distanceChangeReward = (isTooNear and -3) or (heightChangeValue * hasPlayerMovedValue * distanceDifference * 10 * isWalkingForwardValue * isEnemyReward) --math.sign(distanceDifference) * math.log(math.abs(distanceDifference))

	local rewardValue
	
	if isLookingAtTheWall then
		
		rewardValue = -10
		
	elseif ((lastAction == "S") or (lastAction == "A") or (lastAction == "D") or (lastAction == "AW") or (lastAction == "DW") or (lastAction == "AS") or (lastAction == "DS")) and (not isTooNear) then
		
		rewardValue = -40
		
	else
		
		rewardValue = healReward + damageDealtRatio + enemyDeathReward + distanceChangeReward + isPlayerHittingEnemyReward + isLookingAtTheWallReward + orientationDifference + isEnemyReward --+ changeInOrientationReward
		
	end
	
	return rewardValue
	
end

ReinforcementLearningQuickSetup or UniformExperienceReplay? i currently use UniformExperienceReplay.

ReinforcementLearningQuickSetup. Experience Replay is supposed to be used inside the ReinforcementLearningQuickSetup, but it isn’t recommended in this case.

why?

Video (sorry for the low res; I reduced it to cut the file size)

The video is private…

Also, I don’t want to get too deep into the reinforcement learning topic, so let’s just say ExperienceReplay is not that compatible with the current model you are using. If you really want to use it, variants of Q-Learning should do the trick. Otherwise, it will make learning much slower.
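If you do go the Q-Learning route, the wiring would look roughly like this. This is only a sketch under my assumptions: the DeepQLearning model name, the UniformExperienceReplay constructor arguments (batch size, data limit), and calling setExperienceReplay on the quick setup are all things to verify against the API reference:

```lua
-- Sketch: pairing experience replay with an off-policy Q-Learning variant.
-- Constructor arguments and the setExperienceReplay location are assumptions;
-- verify them against the DataPredict documentation.
local Model = DataPredict.Models.DeepQLearning.new()

local MainModelQuickSetup = DataPredict.Others.ReinforcementLearningQuickSetup.new(60, 0.05, 1)

MainModelQuickSetup:setModel(Model)

local ExperienceReplay = DataPredict.ExperienceReplays.UniformExperienceReplay.new(32, 100)

MainModelQuickSetup:setExperienceReplay(ExperienceReplay)
```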

can u check again? i don't think it was private.

Now I can access it. I'm guessing you're using quite a large classesList, which causes slow training. Also, try resetting the model parameters.

hey, u messed up with the new 1.14, where AdvantageActorCritic.new() only accepts discountFactor (a single argument), but u put 3 there.

Ah sorry, maybe that's why the training is so slow. I must have been too fatigued and missed that. Here's the new code. The discountFactor argument should not be more than one or less than zero; I made it nil since it already has a default value recommended by researchers.


local function buildModel(ID)
	
	local classesList = {'A','D','W','S','jump','useWeapon', "none"}
	
	local MainModel = DataPredict.Models.AdvantageActorCritic.new()
	
	local AModel = buildActorModel(ID)
	
	local CModel = buildCriticModel(ID)
	
	MainModel:setActorModel(AModel)
	
	MainModel:setCriticModel(CModel)
	
	local MainModelQuickSetup = DataPredict.Others.ReinforcementLearningQuickSetup.new(60, 0.05, 1)
	
	MainModelQuickSetup:setModel(MainModel)
	
	MainModelQuickSetup:setPrintReinforcementOutput(false)
	
	MainModelQuickSetup:setClassesList(classesList)
	
	table.insert(ModelArray, MainModelQuickSetup)
	
	return MainModelQuickSetup
	
end