Hello guys! I made some fixes and improvements to all experience replays.
If you are using any kind of experience replay, I highly recommend replacing your current copy of Release 1.17 with the updated version as soon as possible.
Hello guys! I made some improvements to the neural networks!
I recommend updating to the latest Release 1.17 as soon as possible. Also, this should make the sword-fighting AIs train slightly faster.
Hello, thank you so much for making this! This is incredible, and I will definitely use this in my projects.
I'm making a new Anti-Cheat now, but why does the cost value just go up? I've tried different combinations of MatrixL and DataPredict versions, but nothing works. Here's the code:
local module = {}

local MT = {}
MT.__index = MT

-- Builds the feature matrix and labels from the raw dataset, then trains a
-- support vector machine on them. "Variables" is currently unused.
function module.new(Dataset, Variables, Externals, Mask)
	local self = setmetatable({}, MT)

	local NewDataset = {}
	local Labels = {}

	for _, v in pairs(Dataset) do
		-- Sum the divisor components and scale them by the session time.
		local Div = 0
		for _, v2 in pairs(v["Divs"]) do
			Div += v2
		end
		Div = Div * v["Time"]

		-- Normalise each raw stat by the divisor.
		local Statistics = {}
		for _, Stat in pairs(v["Stats"]) do
			table.insert(Statistics, Stat / Div)
		end

		-- Append the external features, filling missing ones with the mask value.
		for _, ExternalName in pairs(Externals) do
			if v["Externals"][ExternalName] then
				table.insert(Statistics, v["Externals"][ExternalName])
			else
				table.insert(Statistics, Mask)
			end
		end

		-- Label: 1 for cheaters, -1 for legitimate players.
		if v["Cheater"] == true then
			table.insert(Labels, {1})
		else
			table.insert(Labels, {-1})
		end

		table.insert(NewDataset, Statistics)
	end

	print(NewDataset) -- debug output of the processed feature matrix

	local Model = require(game.ReplicatedStorage.DataPredict.Models.SupportVectorMachine).new(100, 1, 'RadialBasisFunction')

	Model:train(NewDataset, Labels)

	self.Model = Model -- keep the trained model so it can be used after construction

	return self
end

return module
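For reference, here is roughly how I call it; the dataset values below are placeholders rather than my real telemetry, and the module name and path are just examples:

-- Placeholder calling sketch; each entry has the fields that module.new() reads
-- above (Divs, Time, Stats, Externals, Cheater). Values and paths are made up.
local AntiCheatTrainer = require(game.ServerScriptService.AntiCheatTrainer) -- the ModuleScript above

local SampleDataset = {
	{
		Divs = {12, 8},           -- divisor components, summed then scaled by Time
		Time = 60,                -- session length used to normalise the stats
		Stats = {450, 30, 5},     -- raw counters, each divided by the divisor
		Externals = {Ping = 80},  -- optional features; missing ones get the Mask value
		Cheater = false,          -- label: false -> {-1}
	},
	{
		Divs = {10, 10},
		Time = 60,
		Stats = {4000, 900, 250},
		Externals = {},
		Cheater = true,           -- label: true -> {1}
	},
}

local Trainer = AntiCheatTrainer.new(SampleDataset, nil, {"Ping"}, 0)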
Hello guys! I added two more models, which are double dueling Q-learning variants. I also made some improvements to the library.
I recommend updating to the latest Release 1.17 as soon as possible.
Good news people! You guys can now do accelerated self-learning AI training!
No more long training times!
Added the ValueScheduler class. It helps you adjust values each time the calculate() function is called.
Added setEpsilonValueScheduler() and getEpsilonValueScheduler() functions to the ReinforcementLearningQuickSetup.
Added setLearningRateValueScheduler() and getLearningRateValueScheduler() functions to the BaseOptimizer.
Removed the epsilon decay factor parameter from the ReinforcementLearningQuickSetup in favour of ValueScheduler.
Removed the timeStepToDecay parameter from the LearningRateTimeDecay optimizer.
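To give a rough idea of how these pieces fit together, here is a simplified sketch. Only the setter and getter names come from the list above; the require paths and the "TimeDecay" scheduler construction are assumptions, so check the API reference for the exact names and constructors.

-- Illustrative sketch only; paths and the scheduler type below are assumptions.
local DataPredict = game.ReplicatedStorage.DataPredict

-- Assumed scheduler type and constructor; the real name may differ.
local EpsilonValueScheduler = require(DataPredict.ValueSchedulers.TimeDecay).new()

-- Assumed location of ReinforcementLearningQuickSetup inside the library folder.
local QuickSetup = require(DataPredict.Others.ReinforcementLearningQuickSetup).new()

-- The new hooks from this release:
QuickSetup:setEpsilonValueScheduler(EpsilonValueScheduler)
print(QuickSetup:getEpsilonValueScheduler())

-- Any optimizer built on BaseOptimizer gains the matching hooks:
-- Optimizer:setLearningRateValueScheduler(scheduler) and Optimizer:getLearningRateValueScheduler()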
Hi. Does this allow reward learning?
Somewhat. Currently, the library only has Random Network Distillation if you want to do internal rewards.
I'm trying to create something like those videos where an AI learns how to walk, if you know what I mean.
Possible, but quite limited with this library. Currently, it doesn’t support continuous action spaces. Only discrete ones.
Likely it won't be implemented in this library, but rather in the "DataPredict Neural" library.
Can I make a personalized advertisement AI that recommends new products in my shop with discounts based on previous user purchases, chat history, and product viewing behavior?
You absolutely can. There are plenty of algorithms you can choose from.
Clustering Models? Yes.
Neural Networks? Yes.
Neural Networks with Reinforcement Learning? Hell Yes.
Which algorithm would you recommend for maximizing purchases? I'm planning on potentially using another AI for picking discounts too.
For example, discounts are given to increase spending, so the AI might learn specific patterns of discounts that maximize spending.
If a user never spends, they get big discounts. If they do spend but only under certain patterns, the AI learns those patterns.
Maximizing purchases? Probably stick with reinforcement learning. Those models will try to maximize the rewards they receive (in this case, the purchase amount), though it might take a while to train.
As for discounts, be careful. You might not want users to exploit the model by only ever buying discounted items and never undiscounted ones.
I might use AI for personalizing the products that maximize purchases and a simple algorithm I script myself for selecting discounts. The discount will be based on their previous purchases:
Never spent: 80-90% off
Rarely spends: 10-30%
Occasionally spends: 5-20%
Consistent spending: 15-40%
The frequency of previous purchases is tracked with a point system, with older purchases degrading in weight. Once a user makes their first purchase, they can never return to the "never spent" category.
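Roughly what I have in mind for the discount picker (the decay rate and tier cutoffs below are placeholder numbers I would tune later):

-- Rough sketch of the discount picker; decay rate and tier cutoffs are placeholders.
local DECAY_PER_DAY = 0.99 -- older purchases lose weight over time

-- purchaseHistory is an array of {daysAgo = number} entries.
local function getPurchaseScore(purchaseHistory)
	local score = 0
	for _, purchase in ipairs(purchaseHistory) do
		score += DECAY_PER_DAY ^ purchase.daysAgo
	end
	return score
end

-- Returns the minimum and maximum discount percentage for a user.
local function pickDiscountRange(purchaseHistory)
	-- Once a user has bought anything, they can never fall back into this tier.
	if #purchaseHistory == 0 then
		return 80, 90 -- never spent
	end

	local score = getPurchaseScore(purchaseHistory)

	if score < 1 then
		return 10, 30 -- rarely spends
	elseif score < 3 then
		return 5, 20 -- occasionally spends
	else
		return 15, 40 -- consistent spending
	end
end

print(pickDiscountRange({})) --> 80 90
print(pickDiscountRange({{daysAgo = 2}, {daysAgo = 40}})) --> 5 20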
This post didn’t age very well…
Heads up guys!
The next update will allow some of the algorithms to support continuous action spaces!
I’ll be updating the Beta version multiple times before releasing a stable release version for all to use!
So get ready!
Why am I doing this? Well, my plan is to make DataPredict the industry and research standard for RL in Roblox, so I'll be completing this update before I leave for my Masters.
Added DiagonalGaussianPolicy and placed it under the QuickSetups section.
Added a new parameter to the reinforce() function of the AsynchronousAdvantageActorCritic model.
Added the diagonalGaussianUpdate() function to the AsynchronousAdvantageActorCritic model.
Renamed ReinforcementLearningQuickSetup to CategoricalPolicy and placed it under the QuickSetups section. Also made some internal code changes.
ReinforcementLearningBaseModel’s and ReinforcementLearningActorCriticBaseModel’s setUpdateFunction() and update() functions have been replaced with setCategoricalUpdateFunction(), setDiagonalGaussianUpdateFunction(), categoricalUpdate() and diagonalGaussianUpdate().
Made internal code changes to all reinforcement learning algorithms in the library.
Made a few API breaking changes related to the AsynchronousAdvantageActorCritic model:
Renamed update() function to categoricalUpdate().
Renamed reset() function to resetAll().
Renamed singleReset() function to reset().
Please update the MatrixL library so that you don't run into issues when using this version of the DataPredict library. Some changes have been made to the MatrixL library, and those changes carry over to the DataPredict library.
I have also added a new tutorial to the documentation that explains discrete and continuous action spaces. Go have a look!
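To give a quick feel for the difference before you read the tutorial, here is a conceptual sketch of the two action-space types. This is illustrative only and not DataPredict API; the action names and numbers are made up.

-- Discrete (categorical): the policy assigns a probability to each action in a
-- fixed list and one action is picked from it.
local discreteActions = {"MoveLeft", "MoveRight", "Jump"}
local actionProbabilities = {0.1, 0.7, 0.2}

local r = math.random()
local cumulative = 0
local chosenAction
for i, probability in ipairs(actionProbabilities) do
	cumulative += probability
	if r <= cumulative then
		chosenAction = discreteActions[i]
		break
	end
end
print(chosenAction)

-- Continuous (diagonal Gaussian): the policy outputs a mean and a standard
-- deviation per action dimension and the action is sampled from a normal
-- distribution, so each dimension can take any real value.
local actionMeanVector = {0.35, -1.2}             -- e.g. walk speed, turn rate
local actionStandardDeviationVector = {0.05, 0.3}

local sampledActionVector = {}
for i = 1, #actionMeanVector do
	-- Box-Muller transform: turn two uniform samples into one standard normal sample.
	local u1 = 1 - math.random() -- keep u1 in (0, 1] so math.log() is safe
	local u2 = math.random()
	local standardNormal = math.sqrt(-2 * math.log(u1)) * math.cos(2 * math.pi * u2)

	sampledActionVector[i] = actionMeanVector[i] + actionStandardDeviationVector[i] * standardNormal
end
print(sampledActionVector)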