DataPredict [Release 1.17] - General Purpose Machine And Deep Learning Library (Learning AIs, Generative AIs, and more!)

Well, there are technically multiple ways you can do this. Here are some below:

Population-Based Self-Play

The sword-fighting self-learning AI code uses this method: every once in a while, it picks the AI that received the highest reward and copies it to the others. That should, in theory, eliminate any useless AIs. A rough sketch of the idea is shown below.
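Here is a minimal sketch of that idea (not the exact sword-fighting code). It assumes each agent’s model exposes the usual getModelParameters() / setModelParameters() accessors and that you track each agent’s accumulated reward yourself; the table layout, placeholder models, and 60-second interval are purely illustrative:

local agents = {
	{model = Model1, totalReward = 0}, -- Model1, Model2, Model3 are placeholder agent models
	{model = Model2, totalReward = 0},
	{model = Model3, totalReward = 0},
}

local function copyBestAgentToOthers()
	-- Pick the agent with the highest accumulated reward.
	local bestAgent = agents[1]
	for _, agent in ipairs(agents) do
		if (agent.totalReward > bestAgent.totalReward) then bestAgent = agent end
	end

	-- Copy its parameters into every other agent, then reset the scores.
	local bestModelParameters = bestAgent.model:getModelParameters()
	for _, agent in ipairs(agents) do
		if (agent ~= bestAgent) then agent.model:setModelParameters(bestModelParameters) end
		agent.totalReward = 0
	end
end

task.spawn(function()
	while true do
		task.wait(60) -- "every once in a while"; tune this to your game
		copyBestAgentToOthers()
	end
end)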

Use A Single Neural Network, But Share It Between Different Reinforcement Learning Algorithms

There’s a clear separation between the Neural Network code and the Reinforcement Learning Algorithm code. You can create a single neural network and have it used by multiple RL algorithms, each handling its own sequence of data at any given time. Also, each of the RL algorithms must have its own ReinforcementLearningQuickSetup if you want to use them.

You might initialize it something like this:


local NeuralNetwork = DataPredict.Models.NeuralNetwork.new() -- Let's just assume we have already set up the layers.

local AdvantageActorCritic = DataPredict.Models.AdvantageActorCritic

local A2C_1 = AdvantageActorCritic.new()
local A2C_2 = AdvantageActorCritic.new()

local ReinforcementLearningQuickSetup = DataPredict.Others.ReinforcementLearningQuickSetup

local RLQS_1 = ReinforcementLearningQuickSetup.new()
local RLQS_2 = ReinforcementLearningQuickSetup.new()

A2C_1:setModel(NeuralNetwork) -- Two different reinforcement learning algorithms using the same neural network.
A2C_2:setModel(NeuralNetwork)

RLQS_1:setModel(A2C_1) -- Each reinforcement learning algorithm has its own "Quick Setup".
RLQS_2:setModel(A2C_2)

Then there we go!
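At runtime, each quick setup is then driven separately with its own environment vector and reward, while both share the single neural network underneath. A rough usage sketch of the quick setup's reinforce() call (the variable names here are just placeholders):

local action1 = RLQS_1:reinforce(environmentVector1, rewardValue1) -- agent 1's state and reward
local action2 = RLQS_2:reinforce(environmentVector2, rewardValue2) -- agent 2's state and reward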

Shared Gradients

You can create a copy of the generated neural network model parameters (which we will call the parent model parameters) and ensure that all the neural networks use those specific model parameters. When a neural network “updates”, you send its gradients to the central (parent) model parameters. Every once in a while, you upload the parent model parameters back to the individual neural networks.

It’s a bit complicated to set up and I’m not sure if it is worth writing the whole thing here, but a rough sketch of the idea is below. Let me know if you are interested in this approach once you’ve tried the others.
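A very rough sketch of the idea. I'm assuming the model parameters are an array of weight matrices and that the models expose getModelParameters() / setModelParameters(); the helper names below are made up for illustration and the gradient handling is heavily simplified:

local function deepCopyParameters(modelParameters)
	local copy = {}
	for layerIndex, weightMatrix in ipairs(modelParameters) do
		copy[layerIndex] = {}
		for rowIndex, row in ipairs(weightMatrix) do
			copy[layerIndex][rowIndex] = table.clone(row)
		end
	end
	return copy
end

-- The "parent" model parameters that every worker neural network starts from.
-- NeuralNetwork here is any one of your networks; its parameters are the starting point.
local parentModelParameters = deepCopyParameters(NeuralNetwork:getModelParameters())

-- Call this right after a worker "updates", passing its parameters from before and after
-- the update; the change (new minus old) is pushed into the parent.
local function pushDeltaToParent(oldParameters, newParameters)
	for layerIndex, weightMatrix in ipairs(newParameters) do
		for rowIndex, row in ipairs(weightMatrix) do
			for columnIndex, value in ipairs(row) do
				local delta = value - oldParameters[layerIndex][rowIndex][columnIndex]
				parentModelParameters[layerIndex][rowIndex][columnIndex] += delta
			end
		end
	end
end

-- Every once in a while, upload the parent parameters back into each worker.
local function syncWorkers(workerNeuralNetworks)
	for _, workerNeuralNetwork in ipairs(workerNeuralNetworks) do
		workerNeuralNetwork:setModelParameters(deepCopyParameters(parentModelParameters))
	end
end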


Thank you for the very detailed response. I am just confused about the second option: the AdvantageActorCritic model has no setModel function. Currently I am setting a critic model and an actor model. Do you mean that I should use the same critic and actor models for all the AdvantageActorCritic models?

Ah, sorry. I completely forgot about that.

All critic models will use a single neural network, and all the actor models will use another single neural network.
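Something like this, assuming the setters you are already calling are setActorModel and setCriticModel (adjust the names to whatever your version of the model actually exposes):

local ActorNeuralNetwork = DataPredict.Models.NeuralNetwork.new() -- shared actor network (layers set up elsewhere)
local CriticNeuralNetwork = DataPredict.Models.NeuralNetwork.new() -- shared critic network

A2C_1:setActorModel(ActorNeuralNetwork)
A2C_1:setCriticModel(CriticNeuralNetwork)

A2C_2:setActorModel(ActorNeuralNetwork) -- the same two networks are reused by every actor-critic instance
A2C_2:setCriticModel(CriticNeuralNetwork)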


We were able to implement the single neural network method and the results are better, but we are having an issue where after some time, all the agents stop moving. We have done some debugging and found some information:

  • The output action is reported as nil after the issue begins.
  • Occasionally, inside NewAdvantageActorCriticModel:setEpisodeUpdateFunction, the advantage and actionProbabilty are nil; this leads to an error that causes the Model:getModel():reset() function to be called every 60th time rather than returning the nil output.
  • The environment vector looks normal both when it succeeds and when it fails, and also when the output is not nil.

I have a feeling the current reinforcement learning model isn’t quite suitable for this task. Change it to one of these:

  • Q-Learning (Most Recommended)

  • State Action Reward State Action

  • Expected State Action Reward State Action

You can use the variants of those models as well.


Writing API documentation for DataPredict Neural is so painful.

WHY DOES WRITING THE API REFERENCE TAKE SO LONG!

4 more folders left… :sob:


Yo. Got any more issues? Or is everything fine right now?

We were able to get them working and they are now training against hard-coded bots, so it’s just a matter of refining the reward function. There’s an issue where, after a while, they all stop moving and the output from the ReinforcementNeuralNetworkQuickSetup is reported as nil. If I restart the server it goes back to normal, even if the data is saved.

This one isn’t related to Model:reset(), but I think it’s a similar issue. We will also try Q-learning to see if it resolves it.

To be honest, I just think the neural network calculation isn’t fast enough between the reinforcement learning models. Maybe all you need is to add a pcall around it?

We have a pcall around the reinforce function; it never fails, but the output from the function is nil:

local success, output = pcall(function()
	return Model:reinforce(environmentVector, rewardValue)
end)

Hi guys. Since we don’t really have a Roblox machine learning community around here, I’ll be creating a Discord server for it.

For now, the server is a little bland, but hopefully we can throw around some ideas on how to decorate it.

Well, this isn’t really a decoration suggestion, but have you heard of KAN? It’s said to make machine learning training up to 10x faster, according to what I heard. Instead of a multitude of fixed activation functions, KAN uses B-splines (basis splines) as learnable activations, which supposedly speeds up the process, among other benefits. I think this technique is interesting and would be greatly helpful for module users if you manage to implement it. It is also said that KAN is more readable than an MLP, so users can see what the bot is actually trying to do and how it is learning, thanks to the spline structure. Something else that caught my eye was the claim that KAN beats the curse of dimensionality, which is most likely the reason for the nil values that keep popping up when people try training with your RL models, so this would improve that, I think. KAN also needs less data and converges faster than an MLP (a KAN with about 200 parameters reportedly outperforms an MLP with about 300,000).
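For context, the core structural difference, in the standard notation from the KAN paper (nothing DataPredict-specific): an MLP alternates learnable weight matrices with fixed activations, while a KAN stacks layers whose entries are themselves learnable 1-D splines:

\mathrm{MLP}(\mathbf{x}) = (W_L \circ \sigma \circ W_{L-1} \circ \cdots \circ \sigma \circ W_1)(\mathbf{x})

\mathrm{KAN}(\mathbf{x}) = (\Phi_L \circ \Phi_{L-1} \circ \cdots \circ \Phi_1)(\mathbf{x}), \quad \text{where each entry } (\Phi_l)_{i,j} = \phi_{l,i,j} \text{ is a learnable B-spline.}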
I hope you are able to look into it. :> Sorry for the lack of activity. :confused:

Not with this library. Nope.

That could be useful for the “DataPredict Neural” library. Currently it is a pain in the butt to write the API documentation for it.

Also wtf is KAN? Don’t throw in some abbreviation for no reason.

Oh, I thought you knew about KAN, the Kolmogorov-Arnold Network. It’s a mindblowing recent invention that people speculate will take over from MLPs in future AI implementations. You should read the paper on it; it’s amazing.

Hi, so I’m having trouble understanding how to make a GAN. Can you help me?

KAN is not faster, but slower. There are some cons. But yes, it can be better in some use cases.

I don’t think so. You should read the cons too. Also, the paper uses some tricks to make it look better than it is.

You change the activation function there too.

Sure. What are you attempting to do? Tell me the details of the sizes of the matrices and stuff like that.

Hi, so I decided not to use that approach, but I got an error when using the SVM. Can I get help?