Well, technically you can do this in multiple ways. Here are a few:
Population-Based Self-Play
The sword-fighting self-learning AI code uses this method: every once in a while, it picks the AI that received the highest reward and copies it over to the others. In theory, that should eliminate any useless AIs.
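If you want a rough idea of what that copy step looks like, here's a minimal sketch. The population table and the totalReward bookkeeping are hypothetical (you'd track rewards yourself in your game code); getModelParameters() and setModelParameters() are the usual DataPredict model calls.

-- Minimal sketch; "population" and "totalReward" are hypothetical bookkeeping, not part of DataPredict itself.
local function copyBestToAll(population)
	-- Find the agent with the highest accumulated reward.
	local bestAgent = population[1]
	for _, agent in ipairs(population) do
		if agent.totalReward > bestAgent.totalReward then
			bestAgent = agent
		end
	end
	-- Copy the best agent's model parameters to everyone else.
	local bestParameters = bestAgent.model:getModelParameters()
	for _, agent in ipairs(population) do
		if agent ~= bestAgent then
			agent.model:setModelParameters(bestParameters)
		end
		agent.totalReward = 0 -- Reset for the next evaluation window.
	end
end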
Use A Single Neural Network, But Share It Between Multiple Reinforcement Learning Algorithms
The code keeps a clear separation between Neural Networks and Reinforcement Learning Algorithms. You can create a single neural network and have multiple RL algorithms use it, so each algorithm can handle a different sequence of data at any time. Also, each RL algorithm must have its own ReinforcementLearningQuickSetup if you want to use them.
You might initialize it something like this:
local NeuralNetwork = DataPredict.Models.NeuralNetwork.new() -- Let's just assume we already setup the layers.
local AdvantageActorCritic = DataPredict.Models.AdvantageActorCritic
local A2C_1 = AdvantageActorCritic.new()
local A2C_2 = AdvantageActorCritic.new()
local ReinforcementLearningQuickSetup = DataPredict.Others.ReinforcementLearningQuickSetup
local RLQS_1 = ReinforcementLearningQuickSetup.new()
local RLQS_2 = ReinforcementLearningQuickSetup.new()
A2C_1:setModel(NeuralNetwork) -- Two different reinforcement learning algorithms using the same neural network
A2C_2:setModel(NeuralNetwork)
RLQS_1:setModel(A2C_1) -- Each reinforcement learning algorithm has its own “Quick Setup”
RLQS_2:setModel(A2C_2)
Then there we go!
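From there, each agent feeds its own experience through its own Quick Setup. A rough usage sketch, where the feature vectors and rewards are assumed to come from your own game logic, and assuming reinforce() is the Quick Setup call you're using:

-- Hypothetical per-agent step; featureVector_1/2 and reward_1/2 come from your game code.
local action_1 = RLQS_1:reinforce(featureVector_1, reward_1) -- Agent 1's step
local action_2 = RLQS_2:reinforce(featureVector_2, reward_2) -- Agent 2's step
-- Both calls train the same underlying NeuralNetwork, through A2C_1 and A2C_2.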
Shared Gradients
You can create a copy of a generated neural network's model parameters (which we will call the “parent” model parameters) and make sure all the neural networks use that specific set of model parameters. When a neural network “updates”, you send its gradients to the central parent model parameters instead. Every once in a while, you upload the parent model parameters back down to the individual neural networks.
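Here's a rough conceptual sketch of that flow. It assumes the model parameters are a plain table of weight matrices (nested Lua tables of numbers) and that you can obtain each worker's gradients yourself; the gradient step shown is just ordinary SGD written by hand, not a specific DataPredict API.

-- Rough sketch only; the parameter layout and gradient hand-off are assumptions.
local learningRate = 0.01

local parentModelParameters = NeuralNetwork:getModelParameters()

local function applyGradientsToParent(gradients)
	-- Plain SGD step on the central ("parent") parameters,
	-- assuming a table of weight matrices (nested tables of numbers).
	for layerIndex, layerMatrix in ipairs(parentModelParameters) do
		for rowIndex, row in ipairs(layerMatrix) do
			for columnIndex, value in ipairs(row) do
				row[columnIndex] = value - learningRate * gradients[layerIndex][rowIndex][columnIndex]
			end
		end
	end
end

local function syncParentToWorkers(workerNetworks)
	-- Every once in a while, push the parent parameters back down.
	-- Depending on the library version, you may want to deep-copy first.
	for _, workerNetwork in ipairs(workerNetworks) do
		workerNetwork:setModelParameters(parentModelParameters)
	end
end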
A full implementation is a bit complicated to set up, and I'm not sure it is worth writing the whole thing here. Let me know if you're interested in this approach once you've tried the others.