Why is lastLastAction declared after the lastAction variable? Also, why is it never used?
Ah that’s just a mistake lol. I was too concerned about making the training faster and didn’t clean up unused variables.
I'm using your updated version 3. Why does the bot never walk forward? Also, how do you reset a model without changing the datastore key?
You can reset it by changing the argument passed to the setModelParameters() function to “nil”, so the loaded parameters are ignored.
local function buildActorModel(ID)
	local Model = DataPredict.Models.NeuralNetwork.new(1)
	Model:setModelParametersInitializationMode("LeCunUniform")
	Model:addLayer(5, true, 'LeakyReLU', 0.001)
	Model:addLayer(3, true, 'LeakyReLU', 0.001)
	Model:addLayer(7, false, 'StableSoftmax', 0.001)
	Model:setClassesList({'A','D','W','S','jump','useWeapon', "none"})
	local ModelParameters = loadModelParameters(ActorModelDataStore, ID)
	Model:setModelParameters(nil) -- Here.
	table.insert(ActorModelArray, Model)
	return Model
end

local function buildCriticModel(ID)
	local Model = DataPredict.Models.NeuralNetwork.new(1)
	Model:setModelParametersInitializationMode("LeCunUniform")
	Model:addLayer(5, true, 'LeakyReLU', 0.001)
	Model:addLayer(3, true, 'LeakyReLU', 0.001)
	Model:addLayer(1, false, 'Sigmoid', 0.001)
	Model:setClassesList({1, 2})
	local ModelParameters = loadModelParameters(CriticModelDataStore, ID)
	Model:setModelParameters(nil) -- And here.
	table.insert(CriticModelArray, Model)
	return Model
end
Don’t forget to put ModelParameters back after you stop the game.
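If swapping the argument by hand feels error-prone, one option is a single flag that decides whether the loaded parameters are used. This is only a sketch: RESET_MODELS and chooseModelParameters are made-up names for illustration and are not part of DataPredict or the original place file.

```lua
-- Hypothetical flag: set to true for one run to start from fresh parameters,
-- then set it back to false so saved parameters are loaded again.
local RESET_MODELS = true

-- Returns nil (fresh model) when resetting, otherwise the loaded parameters.
local function chooseModelParameters(loadedModelParameters)
	if RESET_MODELS then
		return nil
	end
	return loadedModelParameters
end
```

Inside buildActorModel() and buildCriticModel(), the call would then become Model:setModelParameters(chooseModelParameters(ModelParameters)), so there is only one place to flip.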
By the way, how do you export and separate the matrix values and make a UI like this:
And what does this do?
The matrices are nothing more than a “table of tables of numbers”. Since we are using a neural network, its model parameters will be a table of matrices.
for matrixNumber = 1, #ModelParameters, 1 do
	local matrix = ModelParameters[matrixNumber]
	for i = 1, #matrix, 1 do -- Rows
		for j = 1, #matrix[1], 1 do -- Columns
			print(matrix[i][j])
		end
	end
end
You might have to create your own UI though.
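As a starting point for such a UI, here is a rough sketch that turns the table of matrices into text, one line per row. The function names matrixToLines and modelParametersToText are made up for this example, and it assumes ModelParameters really is a plain table of matrices as described above.

```lua
-- Converts one matrix (table of rows) into a list of text lines,
-- with each value formatted to 4 decimal places.
local function matrixToLines(matrix)
	local lines = {}
	for i = 1, #matrix do
		local cells = {}
		for j = 1, #matrix[i] do
			cells[j] = string.format("%.4f", matrix[i][j])
		end
		lines[i] = table.concat(cells, "  ")
	end
	return lines
end

-- Joins every matrix in the model parameters into one block of text.
local function modelParametersToText(ModelParameters)
	local blocks = {}
	for matrixNumber = 1, #ModelParameters do
		local lines = matrixToLines(ModelParameters[matrixNumber])
		table.insert(blocks, "Matrix " .. matrixNumber .. ":\n" .. table.concat(lines, "\n"))
	end
	return table.concat(blocks, "\n\n")
end
```

The returned string could be assigned to the Text property of a TextLabel, or each line could go into its own label if you want a grid.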
To answer your other question…
Model:setClassesList({1, 2})
This one is just to avoid unexpected bugs. Though, I think you can safely remove it if it bothers you that much.
Hey, I left Roblox Studio running for half an hour while the bots trained themselves, and there is no progress. They just keep walking around in circles.
Hmm, I think it is just better to use the values output from the predict() function in that case.
matrixValue = Model:predict(currentInput, true)
The columns should be the actions, and the order of the columns follows the table in setClassesList().
You can see the parameters for predict() here.
Might need to change the Model then. I did make some changes to the code because so many websites were giving me wrong information.
It’s the CriticModel, right? There’s no predict function on it.
I’ll give you the full code. This is for main script:
local function onInputReceived(ID, environmentVector, rewardValue)
	local Model = ModelArray[ID]
	local ActorModel = ActorModelArray[ID]
	pcall(function()
		local output = Model:reinforce(environmentVector, rewardValue)
		local predictedMatrix = ActorModel:predict(environmentVector, true)
		MatrixL:printMatrix(predictedMatrix)
		OutputEvent:Fire(ID, output)
	end)
end
You can pick any ID. A different ID means different model parameters, so you might want to just look at ID = 1.
Wait, is the printed matrix in the same order as the classes? For example:
The table shows 0.000002, 0.7, 1.78e-07, and so on. So do the values map to the classes in order?
Classes:
A, D, W, S, jump, attack, idle → A: 0.000002, D: 1.78e-08, W: …
Yep. And since we’re using softmax as our last layer, you can consider them as probabilities.
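To make that mapping explicit, here is a small sketch that pairs each column with its class name and picks the most likely action. It assumes the predicted matrix has a single row with one column per class, in the same order as the table given to setClassesList(). The function name labelPrediction is made up for this example.

```lua
-- Same order as the table passed to setClassesList() on the actor model.
local classesList = { 'A', 'D', 'W', 'S', 'jump', 'useWeapon', 'none' }

-- Pairs each column of the first row with its class name and
-- returns the labelled table plus the class with the highest value.
local function labelPrediction(predictedMatrix, classes)
	local row = predictedMatrix[1]
	local labelled = {}
	local bestClass, bestValue = nil, -math.huge
	for column, class in ipairs(classes) do
		local value = row[column]
		labelled[class] = value
		if value > bestValue then
			bestClass, bestValue = class, value
		end
	end
	return labelled, bestClass
end
```

You would call it as labelPrediction(predictedMatrix, classesList) right after the predict() call, then show labelled in the UI instead of the raw matrix.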
I think it’s better if you format it so that it shows fewer decimal places… That looks hard to read.
Rounded to 4 decimal places:
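For reference, rounding to a fixed number of decimal places can be done along these lines. This is only a sketch and not the code from the attached place file; roundToDecimals is a made-up name. Negatives are handled separately so that they round the same way as positives.

```lua
-- Rounds a number to the given number of decimal places,
-- rounding halves away from zero for both positive and negative values.
local function roundToDecimals(value, decimals)
	local factor = 10 ^ decimals
	if value >= 0 then
		return math.floor(value * factor + 0.5) / factor
	end
	return -math.floor(-value * factor + 0.5) / factor
end

-- For display, string.format pads with zeros to a fixed width.
local function formatProbability(value)
	return string.format("%.4f", roundToDecimals(value, 4))
end
```

With this, a tiny value like 1.78e-07 is shown as 0.0000 instead of in scientific notation.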
Reinforcement Learning Sword Fighting Version 3 (1).rbxl (214.7 KB)
edit: fixed the UI when changing the model.
edit 2: fixes, added a pause/resume button, better decimal rounding (especially for negatives), and a colored background when a value decreases/increases.
Hey, I have trained the bots (2 bots) for ~2 hours and they still can’t fight efficiently.
training model 1.rbxl (215.6 KB)
Also, about learningRate:
In the addLayer function, you set it to 0.001. What if I change it to something higher like 0.1 or 0.5? Will training be faster or slower? Also, how do you create an optimizer?