DataPredict [Release 1.21] - General Purpose Machine Learning And Deep Learning Library (Learning AIs, Generative AIs, and more!)

MYOriginsWorkshop · March 27, 2024, 3:57am

A higher learning rate makes training faster, but it also raises the risk of “untraining” the model. If you really want to increase it, I prefer keeping it below 0.5.
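To make the trade-off concrete, here is a toy sketch in plain Lua that does not use DataPredict at all. It minimises f(x) = x^2 with basic gradient descent; the function name `train` and the chosen rates are purely illustrative, and the exact point where a rate becomes "too high" depends on the model and data.

```lua
-- Toy example: minimise f(x) = x^2 with plain gradient descent.
-- The gradient of x^2 is 2x, so each step does x = x - learningRate * 2x.
local function train(learningRate, steps)
	local x = 5 -- arbitrary starting point
	for _ = 1, steps do
		local gradient = 2 * x
		x = x - learningRate * gradient
	end
	return x * x -- final loss
end

local smallRateLoss = train(0.1, 20) -- shrinks towards 0
local largeRateLoss = train(1.1, 20) -- overshoots further on every step

print(smallRateLoss, largeRateLoss)
```

With the small rate the loss keeps shrinking; with the large one every step overshoots the minimum by more than the last, which is roughly what "untraining" looks like.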

To create an optimizer, have a look at the optimizers that already exist. All of them inherit from the BaseOptimizer class.
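As a rough illustration of the inheritance pattern only, here is a self-contained sketch. `StandInBaseOptimizer` and its `calculate` method are hypothetical stand-ins written for this example and are not DataPredict's actual BaseOptimizer API, so check the existing optimizers in the library for the real method names.

```lua
-- Hypothetical stand-in base class (NOT DataPredict's BaseOptimizer).
local StandInBaseOptimizer = {}
StandInBaseOptimizer.__index = StandInBaseOptimizer

function StandInBaseOptimizer.new(name)
	local self = setmetatable({}, StandInBaseOptimizer)
	self.name = name
	return self
end

-- Default behaviour: a plain gradient descent step.
function StandInBaseOptimizer:calculate(learningRate, gradient)
	return learningRate * gradient
end

-- A momentum optimizer that inherits from the stand-in base class.
local MomentumOptimizer = setmetatable({}, { __index = StandInBaseOptimizer })
MomentumOptimizer.__index = MomentumOptimizer

function MomentumOptimizer.new(decayRate)
	local self = setmetatable(StandInBaseOptimizer.new("Momentum"), MomentumOptimizer)
	self.decayRate = decayRate or 0.9
	self.velocity = 0
	return self
end

-- Override: keep a running velocity so past gradients carry over.
function MomentumOptimizer:calculate(learningRate, gradient)
	self.velocity = self.decayRate * self.velocity + learningRate * gradient
	return self.velocity
end

local optimizer = MomentumOptimizer.new(0.5)
local firstStep = optimizer:calculate(0.1, 2)  -- 0.5 * 0 + 0.1 * 2
local secondStep = optimizer:calculate(0.1, 2) -- 0.5 * firstStep + 0.1 * 2
```

The child class only overrides what differs and leaves everything else to the base class.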

Eternity_Devs · March 27, 2024, 4:18am

Do I have to increase that LeakyReLU layer when adding more classes to the classesList?

MYOriginsWorkshop · March 27, 2024, 5:00am

Yep. The value must be equal to the length of the classesList table.
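A quick way to catch a mismatch before training is a small check like the one below. It is plain Lua; `checkOutputLayerSize` is a hypothetical helper name, not part of DataPredict, and the extra 'chase' class is only an example.

```lua
-- Hypothetical helper: compares the output layer's neuron count
-- with the number of entries in classesList.
local function checkOutputLayerSize(outputNeuronCount, classesList)
	local numberOfClasses = #classesList
	if outputNeuronCount ~= numberOfClasses then
		return false, string.format(
			"Output layer has %d neurons but classesList has %d entries",
			outputNeuronCount, numberOfClasses
		)
	end
	return true
end

-- Eight classes, so the final layer would need 8 neurons, not 7.
local classesList = {'A', 'D', 'W', 'S', 'jump', 'useWeapon', 'none', 'chase'}
local isValid, message = checkOutputLayerSize(7, classesList)

print(isValid, message)
```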

Eternity_Devs · March 27, 2024, 5:29am

Hey, I trained for another hour and the bots are still not fighting efficiently.
training model 1.rbxl (219.5 KB)

I also changed from a single view ray to multiple rays covering a view angle.

Can you help?

Edit: updated the senses script.

MYOriginsWorkshop · March 27, 2024, 6:12am

I’ll deal with it later, currently at work right now.

Misterx113 · March 27, 2024, 7:39am

Oh, does it change the color too?

Eternity_Devs · March 27, 2024, 12:44pm

Also, the kill reward is never awarded, and the previous enemy health is never detected or counted. Because of that, I added extra kill/death detection for awarding points.

training model 1.rbxl (326.9 KB)

The thing is, that additional kill-award detection uses the default env.

Why can't you set an experience replay with setExperienceReplay anymore?

Edit: it seems setExperienceReplay was removed in 1.14.
Edit 2: I added a chase class.
Edit 3: I changed the view from raycasts to an AOE FOV instead.

MYOriginsWorkshop · March 28, 2024, 3:39am

I’m surprised that you read the version history documentation…

Eternity_Devs · March 28, 2024, 5:44am

Round 1200 and I still don't have a bot that fights efficiently.

Eternity_Devs · March 28, 2024, 2:05pm

Man, it's been 3 hours of training and the bots still can't fight each other.

Edit: they can fight and chase within short range, at a distance (magnitude) of about 10; beyond that they can't.

MYOriginsWorkshop · March 28, 2024, 2:15pm

How about you take a break until I look over the code and see what’s going on here? As much as I like your persistence, I’m a bit busy with other things until the weekend.

MYOriginsWorkshop · March 28, 2024, 2:31pm

I made some changes to the scripts. Replace your code with the following.

For main script:


local function buildActorModel(ID)
	
	local Model = DataPredict.Models.NeuralNetwork.new(1)
	
	Model:setModelParametersInitializationMode("LeCunUniform")

	Model:addLayer(5, true, 'LeakyReLU', 0.01)

	Model:addLayer(3, true, 'LeakyReLU', 0.01)

	Model:addLayer(7, false, 'StableSoftmax', 0.01)

	Model:setClassesList({'A','D','W','S','jump','useWeapon', "none"})
	
	local ModelParameters = loadModelParameters(ActorModelDataStore, ID)
	
	Model:setModelParameters(ModelParameters)
	
	table.insert(ActorModelArray, Model)
	
	return Model
	
end

local function buildCriticModel(ID)
	
	local Model = DataPredict.Models.NeuralNetwork.new(1)
	
	Model:setModelParametersInitializationMode("LeCunUniform")

	Model:addLayer(5, true, 'LeakyReLU', 0.01)

	Model:addLayer(3, true, 'LeakyReLU', 0.01)

	Model:addLayer(1, false, 'Sigmoid', 0.01)

	Model:setClassesList({1, 2})
	
	local ModelParameters = loadModelParameters(CriticModelDataStore, ID)
	
	Model:setModelParameters(ModelParameters)
	
	table.insert(CriticModelArray, Model)
	
	return Model
	
end

local function buildModel(ID)
	
	local classesList = {'A','D','W','S','jump','useWeapon', "none"}
	
	local MainModel = DataPredict.Models.AdvantageActorCritic.new(60, 0.05, 1)
	
	local AModel = buildActorModel(ID)
	
	local CModel = buildCriticModel(ID)
	
	MainModel:setActorModel(AModel)
	
	MainModel:setCriticModel(CModel)
	
	local MainModelQuickSetup = DataPredict.Others.ReinforcementLearningQuickSetup.new(120, 0.05, 1)
	
	MainModelQuickSetup:setModel(MainModel)
	
	MainModelQuickSetup:setPrintReinforcementOutput(false)
	
	MainModelQuickSetup:setClassesList(classesList)
	
	table.insert(ModelArray, MainModelQuickSetup)
	
	return MainModelQuickSetup
	
end

For Senses script:


local function getRewardValue(orientationDifference)
	
	local currentHealth = Humanoid.Health
	
	local currentLocation = Character:GetPivot().Position

	local currentRotationY = Character:GetPivot().Rotation.Y

	local healthChange = currentHealth - previousHealth

	local closestEnemy, damageDealt, distanceDifference, distanceToEnemy = getEnemyStatus()

	local isSeeingEnemy, viewingDistance = getCurrentView()

	local noEnemy = (closestEnemy == nil)

	local idlePunishment = (noEnemy and -0.1) or 0

	local isEnemyDead = (previousEnemyHealth == 0)

	local enemyDeathReward = (isEnemyDead and 1) or 0

	local isEnemyReward = (isSeeingEnemy and 10) or 0
	
	local isLookingAtTheWall = ((viewingDistance <= 3) and not isSeeingEnemy)
	
	local isLookingAtTheWallValue = (isLookingAtTheWall and 1) or 0
	
	local isLookingAtTheWallReward = (isLookingAtTheWall and -40) or 0
	
	local heightChange = (currentLocation.Y - previousLocation.Y)
	
	local heightChangeValue = ((math.abs(heightChange/lastDeltaTime) > 1) and 0) or 1
	
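	local isInRange = false
	
	local isTooNear = false
	
	-- Only compare distances when an enemy distance is available; comparing nil with a number raises an error.
	if distanceToEnemy then
		
		isTooNear = (distanceToEnemy <= 2)
		
		isInRange = (distanceToEnemy <= 10)
		
	end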
	
	local isWeaponUsed = (lastAction == "useWeapon") 
	
	local isDamageDealt = (damageDealt > 0)
	
	local isPlayerHittingEnemyReward = (isWeaponUsed and isInRange and isSeeingEnemy and 30) or (isWeaponUsed and isInRange and 3) or (isWeaponUsed and not isInRange and -0.1) or 0
	
	local movedDistance = (currentLocation - previousLocation).Magnitude
	
	local isWalkingForwardValue = ((lastAction == "W") and 10) or 0
	
	local hasPlayerMovedValue = ((movedDistance/lastDeltaTime > 2) and 1) or 0

	local damageDealtRatio = (damageDealt / maxHealth)

	local healthPercentage = currentHealth / maxHealth

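	-- math.log errors on nil, so fall back to 0 when there is no enemy distance.
	local healReward = (distanceToEnemy and ((1 - healthPercentage) * math.log(distanceToEnemy))) or 0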
	
	local isRotated = (lastAction == "rotateRight") or (lastAction == "rotateLeft")
	
	-- Use the boolean flag here: the reward is a number, and numbers (even 0) are always truthy in Lua.
	local rotatingForNoReasonValue = (not isLookingAtTheWall and isRotated and -30) or 0

	local distanceChangeReward = (isTooNear and -3) or (heightChangeValue * hasPlayerMovedValue * distanceDifference * 10 * isWalkingForwardValue * isEnemyReward) --math.sign(distanceDifference) * math.log(math.abs(distanceDifference))

	local rewardValue
	
	if isLookingAtTheWall then
		
		rewardValue = -10
		
	elseif ((lastAction == "S") or (lastAction == "A") or (lastAction == "D") or (lastAction == "AW") or (lastAction == "DW") or (lastAction == "AS") or (lastAction == "DS")) and (not isTooNear) then
		
		rewardValue = -40
		
	else
		
		rewardValue = healReward + damageDealtRatio + enemyDeathReward + distanceChangeReward + isPlayerHittingEnemyReward + isLookingAtTheWallReward + orientationDifference + isEnemyReward --+ changeInOrientationReward
		
	end
	
	return rewardValue
	
end
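A Lua detail worth keeping in mind for reward code like this: only nil and false are falsy, so a number (even 0) is truthy, and comparing nil with a number raises an error. A minimal sketch in plain Lua, with hypothetical variable names that are not taken from the Senses script:

```lua
-- A numeric reward is truthy even when it is 0.
local wallPenalty = 0
local negated = not wallPenalty -- false, because 0 is truthy in Lua

-- Hypothetical helper: guard against nil before comparing.
local function isWithin(distance, limit)
	return (distance ~= nil) and (distance <= limit)
end

-- Without the guard, the comparison itself fails at runtime.
local nilComparisonSucceeded = pcall(function()
	local missingDistance = nil
	return missingDistance <= 10
end)

print(negated, isWithin(nil, 10), isWithin(5, 10), nilComparisonSucceeded)
```

So flags should be tested with the boolean itself, and any value that can be nil (such as a distance when no enemy exists) needs a check before it is compared or passed to math functions.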

Eternity_Devs · March 28, 2024, 3:15pm

ReinforcementLearningQuickSetup or UniformExperienceReplay? i currently use UniformExperienceReplay.

MYOriginsWorkshop · March 28, 2024, 3:29pm

ReinforcementLearningQuickSetup. Experience Replay is supposed to be used inside the ReinforcementLearningQuickSetup, but it isn’t recommended in this case.

Eternity_Devs · March 28, 2024, 4:06pm

why?

Video (sorry for the low resolution, it's to reduce the file size)

MYOriginsWorkshop · March 28, 2024, 4:32pm

The video is private…

Also, I don’t want to get too deep into the reinforcement learning topic, so let’s just say ExperienceReplay is not that compatible with the model you are currently using. If you really want to use it, switch to a Q-Learning variant. With your current model, it will make learning much slower.
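For context, the usual reasoning is that a replay buffer hands back transitions collected under older versions of the policy. Off-policy methods such as Q-Learning can learn from those, while on-policy actor-critic methods generally expect fresh experience. Below is a minimal uniform replay buffer in plain Lua; `ReplayBuffer` and its methods are hypothetical names for this sketch and not DataPredict's UniformExperienceReplay.

```lua
-- Hypothetical uniform replay buffer (illustration only).
local ReplayBuffer = {}
ReplayBuffer.__index = ReplayBuffer

function ReplayBuffer.new(maxSize)
	return setmetatable({ maxSize = maxSize, experiences = {} }, ReplayBuffer)
end

-- Store one transition and drop the oldest once the buffer is full.
function ReplayBuffer:add(state, action, reward, nextState)
	table.insert(self.experiences, {
		state = state, action = action, reward = reward, nextState = nextState,
	})
	if #self.experiences > self.maxSize then
		table.remove(self.experiences, 1)
	end
end

-- Pick transitions uniformly at random (with replacement).
function ReplayBuffer:sample(batchSize)
	local batch = {}
	for _ = 1, math.min(batchSize, #self.experiences) do
		table.insert(batch, self.experiences[math.random(#self.experiences)])
	end
	return batch
end

local buffer = ReplayBuffer.new(3)
for step = 1, 5 do
	buffer:add(step, "W", 1, step + 1)
end
local batch = buffer:sample(2) -- may contain transitions from any stored step
```

Every sampled transition may come from behaviour the current policy would no longer produce, which is the mismatch mentioned above.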

Eternity_Devs · March 28, 2024, 4:35pm

Can you check again? I don't think it was private.

MYOriginsWorkshop · March 28, 2024, 4:38pm

Now I can access it. I’m guessing you’re using quite a large classesList, which is slowing down training. Also, try resetting the model parameters.

Eternity_Devs · March 28, 2024, 5:08pm

Hey, you made a mistake with the new 1.14 API: AdvantageActorCritic.new only accepts one argument, discountFactor, but you passed three.

MYOriginsWorkshop · March 28, 2024, 5:47pm

Ah sorry, maybe that’s why the training is so slow. I must have been so fatigued that I missed it. Here’s the new code. The value of that argument must be between zero and one. I left it as nil since it already has a default value recommended by researchers.


local function buildModel(ID)
	
	local classesList = {'A','D','W','S','jump','useWeapon', "none"}
	
	local MainModel = DataPredict.Models.AdvantageActorCritic.new()
	
	local AModel = buildActorModel(ID)
	
	local CModel = buildCriticModel(ID)
	
	MainModel:setActorModel(AModel)
	
	MainModel:setCriticModel(CModel)
	
	local MainModelQuickSetup = DataPredict.Others.ReinforcementLearningQuickSetup.new(60, 0.05, 1)
	
	MainModelQuickSetup:setModel(MainModel)
	
	MainModelQuickSetup:setPrintReinforcementOutput(false)
	
	MainModelQuickSetup:setClassesList(classesList)
	
	table.insert(ModelArray, MainModelQuickSetup)
	
	return MainModelQuickSetup
	
end
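On why the discount factor has to stay between zero and one: it weights how much future rewards count compared with immediate ones. Here is a small plain-Lua sketch of a discounted return; `discountedReturn` is a hypothetical helper for illustration and not a DataPredict function.

```lua
-- Hypothetical helper: sum of rewards where each later reward
-- is scaled by one more factor of discountFactor.
local function discountedReturn(rewards, discountFactor)
	assert(discountFactor >= 0 and discountFactor <= 1, "discountFactor must be within [0, 1]")
	local total = 0
	-- Work backwards so each step folds in the discounted future.
	for index = #rewards, 1, -1 do
		total = rewards[index] + discountFactor * total
	end
	return total
end

local rewards = {1, 1, 1}
local nearSighted = discountedReturn(rewards, 0)  -- only the first reward counts: 1
local farSighted = discountedReturn(rewards, 0.5) -- 1 + 0.5 + 0.25 = 1.75

print(nearSighted, farSighted)
```

A value such as 60 would make later rewards count ever more heavily instead of less, which is why a number outside that range is rejected in this sketch.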