DataPredict [Release 1.17] - General Purpose Machine And Deep Learning Library (Learning AIs, Generative AIs, and more!)

MYOriginsWorkshop · May 22, 2024, 1:46pm

Let’s use PPO-Clip then. You may want to set numberOfReinforcementsPerEpisode for your ReinforcementLearningQuickSetup.

noisecooldeadpool362 · May 22, 2024, 1:58pm

Any recommended settings for PPOClip, quicksetup or layers?

MYOriginsWorkshop · May 22, 2024, 2:02pm

Ah. When you use any reinforcement learning model whose name ends with NeuralNetwork, it inherits the NeuralNetwork model by default. I plan to remove this in the next version.

For PPO, please refer to the Sword fighting AIs code.

noisecooldeadpool362 · May 22, 2024, 2:03pm

Actor/Critic base, right?

MYOriginsWorkshop · May 22, 2024, 2:04pm

Yep. PPO uses the Actor/Critic ones.

noisecooldeadpool362 · May 22, 2024, 2:10pm

Errm… never mind, I don’t know how to fix it…

MYOriginsWorkshop · May 22, 2024, 2:20pm

Okay, let’s go back to advantage actor critic then…

Sorry, I can’t really help right now since I am making changes to the documentation.

noisecooldeadpool362 · May 22, 2024, 2:24pm

The :getActor() function is broken for the A2C. I don’t think I need it though; calling :reset() or not gives the same results either way.

noisecooldeadpool362 · May 22, 2024, 2:27pm

I gave it a goal to drive towards, but it still keeps shifting between throttleup and throttledown.

Same layers for the critic as well.

Is my reward function maybe not optimal?

MYOriginsWorkshop · May 22, 2024, 2:31pm

Can you try calling :episodeUpdate() from the A2C (not the ReinforcementLearningQuickSetup) whenever the car resets? I think it doesn’t update properly.

noisecooldeadpool362 · May 22, 2024, 2:41pm

This time it almost reached the curve before crashing, but other than that, not many other improvements.

Here’s a minute-long recording I took of it training. Maybe you could take a look and analyse it for any faults?

It keeps repeating these movements over and over: it either almost reaches the curve or crashes to its right or left, always in the exact same spot.

MYOriginsWorkshop · May 22, 2024, 2:49pm

Let’s try giving a reward if the lengths of the side ray casts stay the same for a period of time. Otherwise, give zero reward. Do not use punishments, because the ray casts will definitely change length when the car turns at the corners.

That should encourage the car to move straight.

noisecooldeadpool362 · May 22, 2024, 2:51pm

Each time I give it a reward, should I call :reinforce() right then (e.g. after 5 seconds with a constant raycast distance), or increase its total reward value and then :reinforce() that every 0.01 seconds in the while loop?

MYOriginsWorkshop · May 22, 2024, 2:52pm

Increase its total reward. Calling reinforce() will just update the model.
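One possible reading of this advice, as a rough sketch rather than the library's actual API: reward events only add to a running total, and the model update is called once per loop tick with that total. Here `update_model` is a hypothetical stand-in for the reinforce call (its real signature is not shown in this thread), and clearing the total at the start of every tick is also an assumption.

```python
from typing import Callable, Iterable, List


def run_loop(ticks: Iterable[List[float]],
             update_model: Callable[[float], None]) -> None:
    """Accumulate reward events per tick, then update the model once per tick.

    ticks: one list per loop tick, holding the reward amounts raised in that tick.
    update_model: hypothetical stand-in for the library's reinforce call.
    """
    for events in ticks:
        total_reward = 0.0  # assumption: the total starts fresh every tick
        for amount in events:
            total_reward += amount  # giving a reward only changes the total
        update_model(total_reward)  # the single model update for this tick
```

The point of the split is that handing out a reward and updating the model are separate steps: several reward conditions can fire in one tick, but the model still sees one value.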

noisecooldeadpool362 · May 22, 2024, 2:56pm

It wants to stay still when it gets rewarded though. What do I do about that?

noisecooldeadpool362 · May 22, 2024, 2:58pm

Already have that, distance from goal = punishment

RaterixRGL · May 22, 2024, 2:59pm

One thing I have noticed is that when you try to reinforce the AI, you need to make sure the reinforcement is provided in a truly valid area. Don’t mix in bad data.

MYOriginsWorkshop · May 22, 2024, 3:03pm

Okay, let’s modify what I said before:

Let’s try giving a reward if the lengths of the side ray casts stay the same for a period of time. Otherwise, give zero reward. Do not use punishments, because the ray casts will definitely change length when the car turns at the corners.

to something like:

Let’s try giving a reward if the lengths of the side ray casts stay the same for a period of time, provided that the car exceeds a certain throttle speed. Otherwise, give zero reward. Do not use punishments, because the ray casts will definitely change length when the car turns at the corners.
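A minimal sketch of that rule, independent of any particular library. The window size, tolerance, speed threshold, and reward value are all illustrative assumptions, not recommended settings, and the class name is made up for this example.

```python
from collections import deque


class StraightDrivingReward:
    """Reward only when both side ray casts stay steady while the car is moving."""

    def __init__(self, window=50, tolerance=0.5, min_speed=5.0, reward=1.0):
        self.window = window        # number of recent samples that must be steady
        self.tolerance = tolerance  # allowed spread in ray length over the window
        self.min_speed = min_speed  # speed the car must reach to earn the reward
        self.reward = reward
        self.left = deque(maxlen=window)
        self.right = deque(maxlen=window)

    def step(self, left_length, right_length, speed):
        """Record one sample and return the reward for this step (never negative)."""
        self.left.append(left_length)
        self.right.append(right_length)
        if len(self.left) < self.window:
            return 0.0  # not enough history yet
        steady = all(max(h) - min(h) <= self.tolerance
                     for h in (self.left, self.right))
        if steady and speed >= self.min_speed:
            return self.reward
        return 0.0  # zero reward instead of a punishment
```

Gating on speed is what keeps a parked car from collecting the reward: its ray lengths are perfectly steady, but it never passes the speed check.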

noisecooldeadpool362 · May 22, 2024, 3:04pm

So far all it does is crash at these corners, as you can see here. But I think Aqwam’s advice worked a bit: the vehicle managed to at least hit the corner twice, as you can see in the video. It has never been able to do that before. Other than that, I don’t know how else I can improve it.

noisecooldeadpool362 · May 22, 2024, 3:04pm

I’ll try this.