@Cffex and @ImNotKevPlayz, let me give you guys some tips. If you are planning to do long-term training, take advantage of BaseModel’s getModelParameters() and setModelParameters() functions, which are inherited by the reinforcement learning neural networks.
getModelParameters() returns the tables of matrices that hold all the weight parameters used by the neural network model, and setModelParameters() loads a saved set of those matrices back into a model.
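As a rough sketch of what I mean (the constructor arguments and require path below just mirror the example further down this thread; only the getModelParameters() / setModelParameters() calls are the point here):
--// Hedged sketch: reuse trained weights across sessions instead of retraining from scratch.
local DataPredict = require(game:GetService("ServerScriptService"):WaitForChild("DataPredict - Release Version 1.2"))

local Model = DataPredict.Models.QLearningNeuralNetwork.new(100, 0.1, 1000, 10, 0.45, 0.1, 0.945)

--// ...add your layers and run the training episodes here...

--// Grab the tables of weight matrices once training is done (or on a timer while it runs).
local ModelParameters = Model:getModelParameters()

--// Later, after rebuilding a model with the same layer structure,
--// load the saved matrices back in so you don't start from random weights again.
Model:setModelParameters(ModelParameters)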
You can encourage the agent to fight enemies instead of idling around by doing two things (a rough code sketch follows below):
Punish the agent with very small negative values for each period of time it spends idling. The idle time can be quite lengthy, so using large negative values wouldn’t be appropriate here.
Reward the agent with very large values when it damages enemies. The values are large because hurting an enemy is a rare occurrence, so this should encourage the agent to seek out enemies instead of idling around.
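In code, that shaping could look something like this (the magnitudes, the environment vector and the damagedEnemy flag are placeholders; the model is the one built in the example further down):
--// Hedged sketch of the reward shaping described above; values and variables are illustrative.
--// QLearningNeuralNetwork here is the model constructed as in the example further down this thread.
local IDLE_PENALTY = -0.01  --// small, because the agent idles on almost every step
local DAMAGE_REWARD = 10    --// large, because landing a hit is a rare event

local damagedEnemy = false  --// set this to true from your damage-dealing code

while task.wait(0.1) do
	local environmentVector = {{0, 0, 0}} --// replace with your real observations

	local reward = IDLE_PENALTY
	if damagedEnemy then
		reward = DAMAGE_REWARD
		damagedEnemy = false
	end

	--// Feed the shaped reward to the model on every step.
	QLearningNeuralNetwork:reinforce(environmentVector, reward)
end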
local RagdollModule = {}

local ReplicatedStorage = game:GetService("ReplicatedStorage")
local Players = game:GetService("Players")
local PhysicsService = game:GetService("PhysicsService")

local Events = ReplicatedStorage:WaitForChild("Events")
local RemoteEvent = Events.Ragdoll

--// Joints that get swapped between Motor6D and BallSocketConstraint when ragdolling
local Table = {"Left Shoulder", "Right Shoulder", "Left Hip", "Right Hip", "Neck"}

function RagdollModule:Ragdoll(Character: Model, bool: boolean)
	local Humanoid: Humanoid = Character.Humanoid
	local Player = Players:GetPlayerFromCharacter(Character)

	if Humanoid.Health > 0 and (Character:FindFirstChild("Torso") and not Character.Torso.Neck.Enabled) and bool == false then
		--// Unragdoll: hand control back to the client for players, or restore the NPC directly
		if Player then
			RemoteEvent:FireClient(Player, false)
		else
			Character.Animate.Disabled = false
			Humanoid:SetStateEnabled(Enum.HumanoidStateType.GettingUp, true)
			Humanoid:ChangeState(Enum.HumanoidStateType.GettingUp)
		end

		for _, v: Instance in ipairs(Character:GetDescendants()) do
			if v:IsA("Motor6D") then
				v.Enabled = true
			elseif v:IsA("BallSocketConstraint") then
				v.Enabled = false
			end
		end
	else
		--// Ragdoll: disable the listed Motor6Ds and enable the ball socket constraints
		if Player then
			RemoteEvent:FireClient(Player, true)
		else
			Humanoid:SetStateEnabled(Enum.HumanoidStateType.GettingUp, false)
			Humanoid:ChangeState(Enum.HumanoidStateType.Ragdoll)

			for _, v in ipairs(Humanoid.Animator:GetPlayingAnimationTracks()) do
				v:Stop(0)
			end

			Character.Animate.Disabled = true
		end

		for _, v: Instance in ipairs(Character:GetDescendants()) do
			if v:IsA("Motor6D") and RagdollModule:Check(v.Name) then
				v.Enabled = false
			elseif v:IsA("BallSocketConstraint") then
				v.Enabled = true
			end
		end
	end
end

function RagdollModule:Check(Name: string)
	return table.find(Table, Name) ~= nil
end

function RagdollModule:Joints(Character: Model)
	local Humanoid: Humanoid = Character.Humanoid
	Humanoid.BreakJointsOnDeath = false
	Humanoid.RequiresNeck = false

	for _, v: Instance in ipairs(Character:GetDescendants()) do
		if v:IsA("Motor6D") and RagdollModule:Check(v.Name) then
			--// Build a disabled BallSocketConstraint that mirrors each listed Motor6D joint
			local BallSocketConstraint = Instance.new("BallSocketConstraint")
			local Attachment0 = Instance.new("Attachment")
			local Attachment1 = Instance.new("Attachment")

			Attachment0.CFrame = v.C0
			Attachment1.CFrame = v.C1
			Attachment0.Parent = v.Part0
			Attachment1.Parent = v.Part1

			BallSocketConstraint.Attachment0 = Attachment0
			BallSocketConstraint.Attachment1 = Attachment1
			BallSocketConstraint.LimitsEnabled = true
			BallSocketConstraint.TwistLimitsEnabled = true
			BallSocketConstraint.Enabled = false
			BallSocketConstraint.Parent = v.Parent
		elseif v:IsA("BasePart") then
			v.CollisionGroup = "RagdollA"

			if v.Name == "HumanoidRootPart" then
				v.CollisionGroup = "RagdollB"
			elseif v.Name == "Head" then
				v.CanCollide = true
			end
		end
	end
end

return RagdollModule
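In case it helps, here is a hypothetical way to use that module from a server script (the module location and the collision group setup are assumptions on my part):
--// Hypothetical usage sketch; the module path and the group rules are assumed.
local PhysicsService = game:GetService("PhysicsService")
local ServerScriptService = game:GetService("ServerScriptService")

local RagdollModule = require(ServerScriptService:WaitForChild("RagdollModule"))

--// The module assigns these collision groups, so they need to be registered somewhere.
PhysicsService:RegisterCollisionGroup("RagdollA")
PhysicsService:RegisterCollisionGroup("RagdollB")
--// ...set whatever collidability rules you want between the two groups here...

local function setupCharacter(character: Model)
	RagdollModule:Joints(character) --// build the ball socket constraints once per character

	character:WaitForChild("Humanoid").Died:Connect(function()
		RagdollModule:Ragdoll(character, true) --// true = ragdoll, false = stand back up
	end)
end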
I’ve already tried that, but it doesn’t really seem to solve the sparse reward issue. The agent still collapsed to a single action. Maybe I did something wrong?
Oddly enough, the AI learns how to walk toward the enemy AI if the enemy AI is frozen, so the code seems to be working. But when the enemy AI is allowed to move, the learning agent can’t learn, and the reward graph looks random. Maybe this is because the environment is stochastic in a sense. I’ve heard that DQN is not able to learn stochastic policies.
I should try your library; I just haven’t gotten around to it yet. It looks a bit more complicated than the other library I tried, so I thought it might take some time to learn. If you can link some tutorials and code examples, that would help greatly.
I’ll just put the link to the sample source code of the model and its environment here.
Also it isn’t really that hard to use to be honest. You can see the example code here:
local DataPredict = require(game:GetService("ServerScriptService"):WaitForChild("DataPredict - Release Version 1.2"))
local QLearningNeuralNetwork = DataPredict.Models.QLearningNeuralNetwork.new(100, 0.1, 1000, 10, 0.45, 0.1, 0.945)
QLearningNeuralNetwork:addLayer(2, true, 'LeakyReLU') --// input
QLearningNeuralNetwork:addLayer(6, true, 'LeakyReLU') --// hidden 1
QLearningNeuralNetwork:addLayer(5, true, 'LeakyReLU') --// hidden 2
QLearningNeuralNetwork:addLayer(3, false, 'sigmoid') --// output
QLearningNeuralNetwork:setClassesList({"A", "B", "C"}) --// output's neuron class
QLearningNeuralNetwork:setPrintReinforcementOutput(false)
local prediction, probability = QLearningNeuralNetwork:reinforce({{1, 2, 3}}, -0.01, false)
print(prediction, probability)
You can have a look at the functions in the API documentation. Make sure you read the NeuralNetwork page as well, since all the reinforcement learning neural networks inherit their properties from it.
Wait, looking at the code, you’re using sigmoid for the output layer of the DQN, right? Aren’t you supposed to leave the output layer without an activation function? A DQN outputs the expected discounted cumulative reward for each action, essentially treating RL as a regression problem, so a sigmoid would squash those values into the 0 to 1 range.
Perhaps the data you are feeding it isn’t the data you want. A more consistent pattern would be the enemy’s offset relative to the bot’s current position. That way, instead of getting some seemingly random position in space, you get a much more consistent stream of pattern data.
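A tiny sketch of what I mean (bot and enemy stand in for your two character models; the helper name is made up):
--// Illustrative only: feed the enemy's offset from the bot rather than raw world positions.
local function getEnemyOffset(botCharacter: Model, enemyCharacter: Model): Vector3
	--// The same relative situation now produces the same input,
	--// no matter where on the map the fight happens.
	return enemyCharacter.HumanoidRootPart.Position - botCharacter.HumanoidRootPart.Position
end

local offset = getEnemyOffset(bot, enemy) --// bot/enemy are your two character models
local environmentVector = {{offset.X, offset.Y, offset.Z}} --// pass this to reinforce()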
I’m already doing that. That isn’t necessarily the issue. The issue in my opinion is either a bug in the code that only appears when switching to a more complex environment (enemy AI allowed to move) or an issue with the reward structure. I’m thinking of recoding the entire project using this library to prevent bugs and simplify the code.
The bot definitely has to train to learn, and by nature it will make a lot of mistakes until it has enough data.
It’s important that you save your model’s data.
As your model grows in size, keep in mind that each data store save key has a maximum size of 4 MB, so compacting your data structure in any way you can is best.
function SaveDataPairs(Datastorkey, ChatBotData, Querie)
	-- Create an empty table to store the data pairs
	local DataPairs = {}

	-- Iterate over the children of the Queries folder
	for _, child in ipairs(Querie:GetChildren()) do
		-- Check if the child is a string value object
		if child:IsA("StringValue") then
			-- Add the child's name and value to the table as a key-value pair
			DataPairs[child.Name] = child.Value
		end
	end

	-- Save the table to the data store under the given key
	local success, result = pcall(function()
		ChatBotData:SetAsync(Datastorkey, DataPairs)
	end)

	if success then
		print("Data pairs saved successfully")
	else
		warn("Data pairs failed to save: " .. result)
	end
end
datachange = true
local datachangehandle = {}

function LoadDataPairs(Datastorkey, ChatBotData, Querie)
	-- Load the table from the data store with the same key that was used to save it
	local success, result = pcall(function()
		return ChatBotData:GetAsync(Datastorkey)
	end)

	if success then
		if result then
			print("Data pairs loaded successfully")

			-- Iterate over the table
			for name, value in pairs(result) do
				-- Create a new string value object in the Queries folder with the name and value from each pair
				if Querie:FindFirstChild(name) == nil then
					local datachanged = true
					local StringValue = Instance.new("StringValue", Querie)
					StringValue.Name = name
					StringValue.Value = value

					datachangehandle[Datastorkey] = function()
						if datachanged == true then
							datachanged = false
							return Datastorkey, ChatBotData, Querie
						else
							return false
						end
					end

					-- Learning function: flag the data as changed whenever the value updates
					StringValue:GetPropertyChangedSignal("Value"):Connect(function()
						datachanged = true
						datachange = true
					end)
				else
					Querie:FindFirstChild(name).Value = value

					Querie:FindFirstChild(name):GetPropertyChangedSignal("Value"):Connect(function()
						datachanged = true
						datachange = true
					end)
				end
			end
		else
			print("No data pairs found for this key")
		end
	else
		warn("Data pairs failed to load: " .. result)
	end
end
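A hypothetical way to wire those two functions up (the data store name, the key, and the Queries folder location are placeholders):
--// Hypothetical usage; the names below are placeholders.
local DataStoreService = game:GetService("DataStoreService")
local ServerStorage = game:GetService("ServerStorage")

local ChatBotData = DataStoreService:GetDataStore("ChatBotData")
local Queries = ServerStorage:WaitForChild("Queries")

--// Restore the previous session's pairs on startup...
LoadDataPairs("ChatBotQueries", ChatBotData, Queries)

--// ...then periodically write back whenever something was flagged as changed.
task.spawn(function()
	while task.wait(60) do
		if datachange then
			datachange = false
			SaveDataPairs("ChatBotQueries", ChatBotData, Queries)
		end
	end
end)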
You can convert the model’s matrix tables to a string using a table-to-string serialization method before saving them.
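For instance, a rough sketch that combines getModelParameters() with JSON serialization and a data store (the store name and key are placeholders; keep the 4 MB per-key limit mentioned above in mind):
--// Rough sketch; "ModelStore" and "BotModel" are placeholder names.
local HttpService = game:GetService("HttpService")
local DataStoreService = game:GetService("DataStoreService")

local ModelStore = DataStoreService:GetDataStore("ModelStore")

local function saveModel(model)
	--// getModelParameters() returns the tables of weight matrices; JSONEncode turns them into one string.
	local serialized = HttpService:JSONEncode(model:getModelParameters())

	local success, errorMessage = pcall(function()
		ModelStore:SetAsync("BotModel", serialized)
	end)
	if not success then warn("Model save failed: " .. tostring(errorMessage)) end
end

local function loadModel(model)
	local success, saved = pcall(function()
		return ModelStore:GetAsync("BotModel")
	end)
	if success and saved then
		model:setModelParameters(HttpService:JSONDecode(saved))
	end
end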
The bot simply doesn’t learn at all. I’ve trained it for 3000+ rounds before, which is more than enough data, and yet it failed to learn anything beyond taking the same action again and again. This points to an issue with the implementation, the hyperparameters, or the reward structure; maybe it’s all of them. A DQN shouldn’t take that long to converge on a policy, and if it keeps taking the same action over and over, something is wrong. Plus, the loss never showed signs of increasing or fluctuating the way a normal DQN’s would.
Yeah, that’s what I thought as well. The loss looked like an exponential curve dipping down and never really spiked at all. But like I said earlier, the strange thing is that the loss looked healthy when I restarted training with the enemy AI frozen. It was even able to learn how to walk toward the enemy.