DataPredict [Release 1.19] - Game-Oriented Machine And Deep Learning Library (Learning AIs, Generative AIs, and more!)

I wish Roblox would support GPU acceleration in the future lol to support your work.

At least we have coroutines. We can still speed up training time with distributed training to make up for the lack of GPU acceleration.

You can also make use of the inferior cousin of GPU cores: the CPU cores. By spreading intense computations across many actors, you can run the computations in parallel and fetch the results for use. This proved effective when I tried to make my ant simulation faster.
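
Roughly, the Actor approach looks like the sketch below. This is a minimal sketch only, assuming the Script is parented under an Actor instance; the workload inside the loop is just a placeholder:

-- Sketch: run heavy math in parallel under an Actor, then sync before writing results.
local RunService = game:GetService("RunService")

RunService.Heartbeat:ConnectParallel(function()
    -- Heavy, read-only computation happens here, in parallel with other Actors.
    local total = 0
    for i = 1, 100000 do
        total += math.sin(i)
    end

    -- Switch back to serial execution before writing anywhere shared.
    task.synchronize()
    script.Parent:SetAttribute("LastResult", total)
end)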

This sounds cool, if only there was a way for us noobs to understand how to use this model to train our AI.

Explain this like I'm five!

Yeah, I'm currently thinking of making video tutorials on how to use my library. I have no other personal projects to do right now, and I'll probably have a lot of free time while doing my master's degree next January.

Let me know what you guys want to know about this library for my video tutorials.

Hi, thanks for the quick reply,

For the video, I would suggest explaining how to use your library in general.

Show us an example of how to use your library in a new baseplate project,
something simple so we can get a grasp of it.

The tutorial should answer these questions:

How do I define the goal behaviour of the NPC? (How do I tell the library what the NPC should learn to do by the end of its training?)

How do I use the library to train my NPC/AI to achieve that goal?

How do I give points and negative points?

Which of the learning functions is best for what?

Hey! Long time no see, hope you still remember me. I came back to Roblox again to experiment. I've now learnt quite a lot of RL and ML. I'd like to ask if it is possible to merge model parameters into another model with a different neural network structure. For example, train an AI to pathfind with 4 environment vector values, then transfer the model parameters to train another model with 10. Is that possible?

I'm no expert at this, but there are 2 ways that I know of:

  1. Backpropagation → This is only applicable if you have massive pre-existing data to train the model on. For example: if you want to train an AI to predict the square root of n, you would need a sqrt dataset (a small sketch of one follows this list), or if you want to train an AI to classify whether a shape is a square or a triangle, you would need an image dataset for training and testing. The dataset must have variations, or else the model will overfit to very specific scenarios.
  2. Reward System → A more versatile but slower option is to train the model in a simulation. In DQL, the model gets a reward value for every action; a higher reward means good performance, a lower reward means it can do better. This method requires you to design a reward function, which can be tricky, since the model can easily cheat/exploit the system and not end up with the desired behavior.
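
To make the first option concrete, a tiny sqrt dataset could be generated like this. The table layout here is just illustrative; adapt it to whatever your training code actually expects:

-- Sketch: generate a tiny dataset of (n, sqrt(n)) pairs for supervised training.
local featureMatrix = {}
local labelVector = {}

for i = 1, 1000 do
    local n = math.random() * 100 -- vary the inputs so the model doesn't overfit to a narrow range
    table.insert(featureMatrix, {n})
    table.insert(labelVector, {math.sqrt(n)})
end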

The swordfighting bot should be sufficient for you to learn from.

Yes, but only 10%(?)

Since we're talking about different-sized model parameters, there is no way for us to modify the values in a way that ensures both sets of model parameters contribute equally to the output (assuming both outputs are the same). However, if you still want to do this, I can think of a couple of methods:

  • You trim down the larger model parameters to fit the smaller one, and then take the average of both values.

  • You expand the smaller model parameters by padding them with zeros to fit the larger one, and then take the average of both values.
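
As a rough illustration of both ideas, assuming the model parameters are plain Lua matrices (tables of row tables); the actual DataPredict parameter format may differ:

-- Sketch: average two parameter matrices of different sizes.
-- averageWithTrim cuts the larger one down; averageWithPad extends the smaller one with zeros.
local function averageWithTrim(small, large)
    local result = {}
    for r = 1, #small do
        result[r] = {}
        for c = 1, #small[r] do
            result[r][c] = (small[r][c] + large[r][c]) / 2
        end
    end
    return result
end

local function averageWithPad(small, large)
    local result = {}
    for r = 1, #large do
        result[r] = {}
        for c = 1, #large[r] do
            local smallValue = (small[r] and small[r][c]) or 0 -- missing entries count as zero padding
            result[r][c] = (smallValue + large[r][c]) / 2
        end
    end
    return result
end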


I'm talking about the second option. I already know about that, but I don't know the library API and such, so I would like to know how to use the library for that. Someone should make a tutorial.

There already is a tutorial; it's in the documentation, and there's literally an uncopylocked place with a swordfighting bot, complete with data saving.

But what if we want to make something else? How do I tell the bot to do other things with the script?
How do I give points and negative points (with the script)?

For reinforcement learning, you just need to pass in the reward value in :reinforce(). And to tell the bot to do something else, just modify the actions, environment vector, and reward function.
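
As a bare-bones sketch (the variable names here are placeholders, and the full :reinforce() signature may take more arguments; check the documentation and the swordfighting bot place for the real usage):

-- Sketch only: reward the model after it acts.
local reward = 1 -- positive when the bot did well, negative when it made a mistake
model:reinforce(reward)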

Please just take a look here:



In the swordfight project, the only things the bot does are:

  1. Observe its surroundings
  2. Make a decision
  3. Reflect on its mistakes
  4. Repeat

Note: It’s all mathematics! No magic here.

  • How does it observe?

Try to put yourself in its shoes here. When you learn how to swordfight, you've got to observe your opponent, right? When you swordfight, you gather information about your surroundings and your enemies so that you can safely pull off a maneuver: "He is vulnerable", "He's getting close, gotta dodge", etc.

The bot in the project observes too, but in numbers. If I recall correctly, the bot gathers information such as: the distance between the bot and its enemy, the enemy's velocity, and the enemy's rotational velocity.

These numbers are called inputs / features. They are pieces of information that we gather for the bot so that it can make a good decision out of them.

Example:

local distance = (enemy.Position - bot.Position).Magnitude

Note: the bots do not gather inputs on their own, we do. We plug the inputs into their brain and then expect a decision out of it.
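
For example, gathering a few of those inputs in code might look like this, assuming enemy and bot are parts such as HumanoidRootParts (the exact features chosen are just examples):

-- Example of gathering inputs / features for the bot (names and features are illustrative).
local distance = (enemy.Position - bot.Position).Magnitude
local enemyVelocity = enemy.AssemblyLinearVelocity.Magnitude
local enemyRotationalVelocity = enemy.AssemblyAngularVelocity.Magnitude

-- These become the x values that get plugged into the brain.
local inputs = {distance, enemyVelocity, enemyRotationalVelocity}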

  • How can they even make a decision from just the inputs?

Decisions are also called outputs / predictions.

Their brain is essentially a giant function with lots of parameters. From a mathematical standpoint, it makes sense: f(x) -> y.

Think of the brain as a function, f, with input x and output y, i.e., you provide information / observations (which is x), and you get a decision (which is y).

As mentioned before, the brain is a giant function with a lot of parameters, each of which can influence the brain's decision.

For example: f(x) = m * x^2 -> y:

  1. f is the function, which we earlier called the brain
  2. x is the input, i.e., the information / observations
  3. y is the output, i.e., the decision
  4. m is the parameter here. As said before, parameters influence decisions: even if m changes ever so slightly, the outputs can change drastically. This f(x) function is a really small one; meanwhile, the swordfight bot's brain is a giant function, so choosing the right parameters inside the brain is super important.

The brain is exactly that: a function that takes in information / observations as numbers and spits out decisions, also as numbers, heavily influenced by its parameters.

So, the brain’s function looks something like:

x_1 = distance_to_enemy
x_2 = enemy_health

f(x_1, x_2) -> (y_1, y_2, y_3, y_4), where each output has its own parameters, for example y_1 = m_11 * x_1 + m_12 * x_2, y_2 = m_21 * x_1 + m_22 * x_2, and so on.

We can put in as many inputs / pieces of information / observations as we like, and we can also make the bot spit out as many outputs / decisions as we like.

We establish the architecture of the function: two inputs and four outputs. The more intricate the function, the more diverse the decisions can be.

We also set the parameters to be random at first. Then, as the bot learns, the parameters continuously change, resulting in more of the wanted behavior from the bot.
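
To make that concrete, here's a toy version of such a brain in Luau: two inputs, four outputs, and random parameters to start with. This is a sketch of the idea only, not the library's actual network:

-- Toy "brain": 2 inputs, 4 outputs, parameters m start out random.
-- Each output gets its own pair of parameters, matching the formula above.
local m = {}
for out = 1, 4 do
    m[out] = {math.random() - 0.5, math.random() - 0.5}
end

local function brain(x_1, x_2)
    local y = {}
    for out = 1, 4 do
        y[out] = m[out][1] * x_1 + m[out][2] * x_2
    end
    return y
end

-- Example observations (placeholder values), plugged in to get four decision values.
local distance_to_enemy, enemy_health = 12, 80
local outputs = brain(distance_to_enemy, enemy_health)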

The bot’s brain processes inputs like the distance to the enemy or their velocity as numerical values. But a decision isn’t just a number, it’s an action. So, how do we get from cold, hard numbers to something like, “Attack” or “Dodge”?

When the bot processes the inputs through its function, it produces outputs like (y_1, y_2, y_3, y_4). These might correspond to potential actions:

  • y_1 could represent the likelihood or intensity of an attack.
  • y_2 could represent a decision to defend.
  • y_3 might suggest dodging.
  • y_4 might indicate retreating.

Whether y_1, y_2, or any other output corresponds to these actions is your choice. Instead of y_1 representing some sort of attack, you could make it the throttle if you want the bot to drive a car. Same for the other outputs.

So, what do the outputs look like? Probably like this:

  • y_1 = 0.8 (attack)
  • y_2 = 0.2 (defend)
  • y_3 = 0.4 (dodge)
  • y_4 = 0.1 (retreat)

Why are they all non-negative numbers? Why are they all less than 1? Again, it's your choice whether to make it like that or not. People like me usually do it because it is easy to interpret.

How do you deliberately make it like that anyway?

f(x) = max(0, x) → produces non-negative numbers
f(x) = sin(x) → produces numbers in the range -1 to 1
f(x) = 1 / (1 + e^(-x)) → produces numbers in the range 0 to 1

When you have a number in range 0 - 1, you could present it as a percentage, or some kind of intensity. In code you could do something like:

if y_1 >= 0.5 then
    attack()
end

It is entirely up to you what you want to do with the outputs.

In the swordfight project, instead of attack, defend, dodge, retreat, the actions are move_forward, move_backward, move_left, move_right, swing_sword, rotate_left, rotate_right.

So, the brain takes in observations / information as numbers and, through sheer amounts of calculation, provides decisions, which are also numbers. Those numbers are then interpreted into meaningful corresponding actions.
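
One common way to turn those output numbers into a single action (not necessarily how the swordfight project itself does it) is to simply pick the output with the highest value:

-- Sketch: interpret the outputs by picking the action with the highest value.
local actions = {"attack", "defend", "dodge", "retreat"}
local outputs = {0.8, 0.2, 0.4, 0.1}

local bestIndex, bestValue = 1, outputs[1]
for i = 2, #outputs do
    if outputs[i] > bestValue then
        bestIndex, bestValue = i, outputs[i]
    end
end

print(actions[bestIndex]) -- prints "attack" in this example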

  • How does it reflect on its mistakes?

As mentioned above, m represents the parameters, which influence the decisions of the brain. So, to reflect on one's mistakes and thereby improve means to tweak the parameters such that the tweaks result in better decisions / outputs.

A mistake happens when the bot's decision leads to an unfavorable outcome. For example, say the bot decided to move forward (y_1 = 0.8) when it should have dodged (y_3 = 0.4), and that decision led to the bot being hit by the opponent's sword, resulting in death. This failure is a mistake.

To reflect on this, the bot needs feedback. The feedback usually comes in the form of a reward or punishment, indicating whether the decision was good or bad. For example:

  • Positive feedback (reward) if the bot successfully dodges or lands a hit.
  • Negative feedback (punishment) if the bot gets hit or fails to defend properly.

In the project, the bot is rewarded if the opponent’s health is deducted, since it means that the bot is winning. Conversely, the bot is punished if the bot’s health is deducted, since it means that the bot is being attacked, implying that the bot made a mistake somewhere.
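
In code, that kind of reward function could look roughly like the sketch below, assuming bot and enemy are character models with Humanoids; the actual swordfight project's reward function may be shaped differently:

-- Sketch of a health-based reward: reward damage dealt, punish damage taken.
local previousBotHealth = bot.Humanoid.Health
local previousEnemyHealth = enemy.Humanoid.Health

local function computeReward()
    local damageDealt = previousEnemyHealth - enemy.Humanoid.Health
    local damageTaken = previousBotHealth - bot.Humanoid.Health

    previousEnemyHealth = enemy.Humanoid.Health
    previousBotHealth = bot.Humanoid.Health

    return damageDealt - damageTaken -- positive means winning, negative means getting hit
end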

So, how do you change the bot's behavior? This is where the parameters, m, come into play. How do you change the parameters so that the bot improves?

Honestly, I only know a bit, plus it's really complicated and I'm not an expert in the field, since it requires college-level mathematics and I'm not in college yet!

You ought to ask the guy himself or research on Google. Sorry.

And honestly, you don’t need to understand this section to use the library. Just the first part of the section.

  • Repeat
  1. Observe surroundings (gather x_1, x_2, ..., x_n)
  2. Make a decision (f(x_1, x_2, ..., x_n) -> (y_1, y_2, ..., y_n))
  3. Reflect on mistakes, change.
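
Put together, one pass of that loop might look like this sketch. Every name here (getDistanceToEnemy, getEnemyVelocity, decide, performAction, computeReward, learnFrom) is a placeholder for your own code or for the library's actual API, such as the :reinforce() call mentioned earlier:

-- Sketch of the observe -> decide -> reflect loop; all helper names are placeholders.
while task.wait(0.1) do
    -- 1. Observe surroundings (gather the x values).
    local inputs = {getDistanceToEnemy(), getEnemyVelocity()}

    -- 2. Make a decision from the inputs and act on it.
    local action = decide(inputs)
    performAction(action)

    -- 3. Reflect on mistakes: feed the reward back so the parameters change.
    local reward = computeReward()
    learnFrom(reward) -- e.g. the :reinforce(reward) call in DataPredict
end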

I had so much free time I ended up writing a whole article.


Lol. I appreciate you writing the whole thing. I didn't have time to write a response to that guy's question since my hands are full dealing with my Master's application and some Roblox metaverse-building companies.

Thanks, that helped a lot. That is great.