I'm trying to run some experiments and added an ExperienceReplay, but I end up with this error. I think you forgot to implement the "update" function in ReinforcementLearningBaseModel.
That should be categoricalUpdate(). Not sure why I missed that. I'll go and fix it in the original library.
ProximalPolicyOptimizationClip doesn't work with UniformExperienceReplay
local function buildActorCriticRLQSModel(ID)

	-- PPO-Clip model with separate actor and critic networks
	local MainModel = DataPredict.Models.ProximalPolicyOptimizationClip.new()

	local AModel = buildActorModel(ID)
	local CModel = buildCriticModel(ID)

	MainModel:setActorModel(AModel)
	MainModel:setCriticModel(CModel)

	-- Categorical policy quick setup plus a uniform experience replay buffer
	local MainModelQuickSetup = DataPredict.QuickSetups.CategoricalPolicy.new(60, 0, "Sample")

	local ExperienceReplay = DataPredict.ExperienceReplays.UniformExperienceReplay.new(1, 5, 30)

	MainModelQuickSetup:setModel(MainModel)
	MainModelQuickSetup:setPrintOutput(false)
	MainModelQuickSetup:setClassesList(classesList)
	MainModelQuickSetup:setExperienceReplay(ExperienceReplay)

	table.insert(ReinforcementLearningQuickSetupArray, MainModelQuickSetup)

	if includeRND then
		table.insert(RNDModelArray, buildRNDModel(ID))
	end

	return MainModelQuickSetup

end
Well technically, that is how it should be if we're following existing RL theory: PPO is an on-policy algorithm, so its updates have to come from data collected by the current policy, which is exactly what replaying old experiences breaks. I won't go into the full explanation here. So stick with the variants of Deep Q-Learning, Deep SARSA, and Deep Expected SARSA if you really want to use the experience replays; see the sketch below.
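A rough, untested sketch of that swap, reusing the quick-setup pattern from the code above. The DeepQLearning class name, its setModel() call, and the buildOffPolicyRLQSModel wrapper name are assumptions on my part, so verify them against the DataPredict documentation; everything else mirrors the snippet posted earlier.

local function buildOffPolicyRLQSModel(ID)

	-- Off-policy model, so pairing it with an experience replay is valid
	local MainModel = DataPredict.Models.DeepQLearning.new()

	MainModel:setModel(buildActorModel(ID)) -- single network instead of actor + critic

	local MainModelQuickSetup = DataPredict.QuickSetups.CategoricalPolicy.new(60, 0, "Sample")

	local ExperienceReplay = DataPredict.ExperienceReplays.UniformExperienceReplay.new(1, 5, 30)

	MainModelQuickSetup:setModel(MainModel)
	MainModelQuickSetup:setClassesList(classesList)
	MainModelQuickSetup:setExperienceReplay(ExperienceReplay)

	return MainModelQuickSetup

end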
How is your reward “nan”?!?! That could have led to the “nan” model parameters.
I would like to know what PC you’re using.
Yeah, I was wondering why you're hitting all these errors while I'm not. There could be a difference in how your device handles the numbers.
Since I'm travelling (not at home), I'm using a laptop (ASUS Vivobook S 14 OLED) with a Ryzen 5 7535HS. It doesn't have a discrete GPU, just the integrated one, with 16 GB of DDR5 RAM…
I think I had to add
return math.max(-1e3, reward)
to prevent the NaN issue.
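For reference, a small self-contained sketch of that idea: clamp the reward on both sides so a single huge value can't push the parameters to NaN. The clampReward name and the ±1e3 bounds are just placeholders to adjust.

local function clampReward(reward)

	-- NaN is the only value that is not equal to itself
	if reward ~= reward then return 0 end

	-- Keep the reward inside a sane range before feeding it to the model
	return math.clamp(reward, -1e3, 1e3)

end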
That’s very strange. You have pretty similar specs to my PC, yet you’re the one having a lot of problems… Nobody else in this thread has complained about the same thing.
Keep it the same, don’t change anything else.
I just wanted to remove the “jump” class… so they focus on attack & movement…
Okay yeah, you can do that. Just make sure the final layers stay as LeakyReLU for all models.
Why doesn’t the RND work (it breaks everything)? Whenever I turn it on and start the run, it makes the NPC models spin around…

The rewards generated by RND are likely too high, so the intrinsic (curiosity) reward drowns out the environment reward. Try scaling it down, as in the sketch below.
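A rough sketch of one way to do that: scale the RND reward before adding it to the environment reward. The combineRewards function and the 0.01 factor are placeholders, not part of the DataPredict API; tune the factor until the NPCs stop spinning.

local function combineRewards(environmentReward, rndReward)

	-- Small scale factor so the curiosity reward cannot drown out the task reward
	local intrinsicRewardScale = 0.01

	return environmentReward + intrinsicRewardScale * rndReward

end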
I have spent almost 2 days on this with no result, can you help check?
training model 3.rbxl (341.0 KB)
Also, does the frame rate affect the result? 60 FPS vs. 240 FPS…
And you can see that even when the reward isn’t NaN, the parameters can still be NaN…
Is there an adaptive learning rate? And how do I implement it here?
Yeah, there is an adaptive learning rate. You can either use the optimizers on their own or combine them with ValueSchedulers. If you choose the latter, the optimizers have the setLearningRateScheduler() function.
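A rough sketch of that second option. Only setLearningRateScheduler() is named above; the AdaptiveMomentEstimation and TimeDecay class names are guesses on my part, so check the DataPredict docs for the exact optimizer and ValueScheduler names in your version.

-- Build an optimizer and attach a learning-rate scheduler to it
local Optimizer = DataPredict.Optimizers.AdaptiveMomentEstimation.new()

local Scheduler = DataPredict.ValueSchedulers.TimeDecay.new()

Optimizer:setLearningRateScheduler(Scheduler)

-- Then pass the optimizer into whichever layers or models accept one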
Yes. I think the calculations might be too slow to keep up at 240 FPS (if we’re talking about running them on RunService.Heartbeat and the like; not too sure about the “visualization” one).
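If you want the training rate to be independent of FPS, one option is to accumulate the Heartbeat delta time and only step the model at a fixed interval. A rough sketch: RunService.Heartbeat is standard Roblox API, but stepModel() and the names here are just placeholders for your own update code.

local RunService = game:GetService("RunService")

local UPDATE_INTERVAL = 1 / 60 -- seconds between model updates, regardless of render FPS
local accumulatedTime = 0

local function stepModel()
	-- placeholder for the actual reinforcement-learning update
end

RunService.Heartbeat:Connect(function(deltaTime)

	accumulatedTime += deltaTime

	if accumulatedTime >= UPDATE_INTERVAL then

		accumulatedTime -= UPDATE_INTERVAL

		stepModel()

	end

end)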
So should I stick with 60 or 240?