It seems like it only gets the best network; it doesn’t save the whole population.
What does your “input_module” look like?
local input = {}

-- Casts #input_tbl rays in a fan across the given field of view and returns a
-- table mapping each input name to the hit distance (or -1 if nothing was hit).
function input.new(npc, fov, dist, input_tbl)
    local npc_cf = npc:GetPivot()
    local input_ = {}

    for i = 1, #input_tbl do
        local value = input_tbl[i]
        local angle = (fov / (#input_tbl - 1)) * i - (fov / 2)
        local cf = npc:GetPivot() * CFrame.Angles(0, math.rad(angle), 0)
        local origin = npc:GetPivot().Position
        local ray = Ray.new(origin, cf.LookVector * dist)
        local i_, position = workspace:FindPartOnRayWithIgnoreList(ray, {workspace.ignore, npc.Parent})
        local distance = (origin - position).Magnitude

        if npc.visualize_input.Value then
            input.visualize(npc, i, distance, position, origin, 0.05, i_ ~= nil, "visualize")
        else
            input.visualize(npc, i, distance, position, origin, 0.05, i_ ~= nil, "destroy")
        end

        if i_ then
            input_[value] = distance
        else
            input_[value] = -1
        end
    end

    return input_
end

-- Draws (or removes) a thin neon part along each ray for debugging;
-- obstructed rays are drawn red and slightly thicker.
function input.visualize(npc, id, distance, position, origin, size_x_y, obstructed, mode)
    local part

    if mode == "visualize" then
        if workspace.ignore:FindFirstChild(id .. npc.Name) then
            part = workspace.ignore:FindFirstChild(id .. npc.Name)
            part.CFrame = CFrame.lookAt(position, origin) * CFrame.new(0, 0, -distance / 2)
            part.Size = Vector3.new(size_x_y, size_x_y, distance)
        else
            part = Instance.new("Part")
            part.Anchored = true
            part.CFrame = CFrame.lookAt(position, origin) * CFrame.new(0, 0, -distance / 2)
            part.Size = Vector3.new(size_x_y, size_x_y, distance)
            part.Material = "Neon"
            part.CanCollide = false
            part.Name = id .. npc.Name
            part.Parent = workspace.ignore
            part.Transparency = 0.25
        end

        if obstructed then
            part.Color = Color3.fromRGB(252, 26, 26)
            part.Size = Vector3.new(size_x_y * 5, size_x_y * 5, distance)
            part.Transparency = 0
        else
            part.Color = Color3.fromRGB(252, 250, 255)
            part.Size = Vector3.new(size_x_y, size_x_y, distance)
            part.Transparency = 0.25
        end
    elseif mode == "destroy" then
        if workspace.ignore:FindFirstChild(id .. npc.Name) then
            part = workspace.ignore:FindFirstChild(id .. npc.Name)
            part:Destroy()
        end
    end
end

return input
By the way, I guess you looked at my older code; it’s outdated. The main thing I changed is the input:
local inputs = {
    "A", "B", "C", "D", "E",
    "F", "G", "H", "I", "J",
    "K", "L", "M", "N", "O",
    "P", "Q", "R", "S", "T",
    "U", "V", "W", "X", "Y",
    "Z", "AA", "AB", "AC", "AD",
    "AE", "AF", "AG", "AH", "AI", "AJ",
    "AK", "AL", "AM", "AN", "AO", "AP"
}

local distances = input.new(npc, 180, 200, inputs)
local output = net(distances)

local temp_net = FeedforwardNetwork.new(
    inputs, 2, 4,
    {"right", "left", "front", "back", "jump", "click"},
    feedforwardsettings
)
Hmm, okay. Thank you! This’ll help me out a ton.
Have you found a way to save the whole population? I just had this issue.
I have not achieved this, unfortunately. You only lose a little bit of progress, though: you lose the 2nd, 3rd, … best performers, but at least you still have the best network, so it’s not so bad.
Ah, I see. Well, thanks for the responses. I appreciate it a ton!
I’m new to this module, but it seems somewhat limited for deep reinforcement learning. I’m experimenting with it right now to learn how it works, and it seems like a basic DQN algorithm could be possible, though not so much some of the extensions of DQN. I’m not sure about other algorithms like policy gradient methods, actor-critics, PPO, TD3, etc.
Some issues I faced when theorizing about how to use this module to implement the DQN algorithm and its extensions:
- Prioritized experience replay: requires an importance sampling weight as an input, which you just multiply by the gradient (see the sketch after this list).
- Noisy networks: I honestly couldn’t understand the math behind this one, but from what I read in the abstract of the paper, it seems to learn parameterized noise for each weight and bias. Either way, it seems difficult to do with this module.
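For context, this is roughly what that importance sampling correction looks like when a framework does expose the per-sample update. The function names below are placeholders for illustration only and are not part of this module:

-- Hypothetical sketch of the importance sampling correction used by
-- prioritized experience replay. Nothing here is part of this module's API.
local function importanceWeight(priority, prioritySum, bufferSize, beta)
    -- Sampling probability P(i) = p_i / sum(p); weight w_i = (N * P(i))^(-beta)
    local probability = priority / prioritySum
    return (bufferSize * probability) ^ (-beta)
end

-- The per-sample gradient (equivalently, the TD error that drives it) gets
-- scaled by the weight before the update is applied. "backpropagate" is a
-- placeholder for whatever update call a framework exposes.
local function updateFromSample(net, experience, tdError, prioritySum, bufferSize, beta)
    local weight = importanceWeight(experience.priority, prioritySum, bufferSize, beta)
    net:backpropagate(experience.state, weight * tdError)
end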
It seems feasible to add some extensions, however. Some plausible additions:
- Dueling DQN
- Double DQN
- N-step learning
- Tau or “soft update”: I’ve successfully been able to implement this in a function (a sketch of the idea is below).
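For reference, the soft update is just a weighted blend of the online network’s parameters into the target network’s parameters after each training step. A minimal sketch, assuming hypothetical getWeights/setWeights accessors for flat weight tables (this module doesn’t expose them directly):

-- Soft ("Polyak") update: target = tau * online + (1 - tau) * target.
-- getWeights/setWeights are placeholder accessors, not this module's API.
local function softUpdate(onlineNet, targetNet, tau)
    local onlineWeights = onlineNet:getWeights()
    local targetWeights = targetNet:getWeights()
    for i = 1, #onlineWeights do
        targetWeights[i] = tau * onlineWeights[i] + (1 - tau) * targetWeights[i]
    end
    targetNet:setWeights(targetWeights)
end

-- Typical usage after every training step, with a small tau such as 0.005:
-- softUpdate(net, targetNet, 0.005)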
I haven’t learned about distributional DQN yet so I’m not sure if it’s a plausible addition.
Keep in mind, I’m not a Deep RL or math expert, so I could be wrong on some things.
Hey, I want to let you know that my deep/machine learning library has Q-Learning and SARSA neural networks. But if you want the versions with experience replay, you have to grab the beta version of the library (it can be either a package or a module script).
Here’s the link:
Ok, I will check it out. I’ve already started trying to implement DQN with this module, though, and I want more experience implementing it, so I’m still going to use this one.
So, after using this for an N-step D3QN deep RL snake project (not sure if it’s successful yet!), I have mixed opinions about this library.
Pros:
- API is well documented; the sample code helps a lot when learning how to use the library
- Very simple and beginner-friendly
Cons:
- A bit too simple, lacks features like gradient clipping, learning rate decay, different loss functions, and the ability to customize in general.
By “ability to customize”, I mean things like being able to interact with the costs and modify them. The prioritized experience replay algorithm requires this because of the importance sampling weight, which the gradient is multiplied by.
Not having gradient clipping is a huge negative for me. I’ve noticed that using gradient clipping (with my own neural network framework) has prevented exploding gradients, which are a common issue.
Edit: Because this library has no gradient clipping, my snake agent experienced exploding gradients with the ReLU activation function for the hidden layers.
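For anyone unfamiliar, gradient clipping caps the magnitude of the gradients before the weight update. A minimal sketch of clipping by global L2 norm, assuming the gradients are available as a flat table of numbers (this module doesn’t expose them, so this is purely illustrative):

-- Clip a flat table of gradient values so their global L2 norm does not
-- exceed maxNorm. Illustrative only; this module does not expose gradients.
local function clipByGlobalNorm(gradients, maxNorm)
    local sumOfSquares = 0
    for _, g in ipairs(gradients) do
        sumOfSquares += g * g
    end
    local norm = math.sqrt(sumOfSquares)
    if norm > maxNorm then
        local scale = maxNorm / norm
        for i = 1, #gradients do
            gradients[i] *= scale
        end
    end
    return gradients
end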
Overall thoughts: This library is very good for beginners looking to start machine learning on roblox. However, if you want more advanced features and customizability look for other libraries.
It’d be better to make your own post comparing the different neural network libraries. That should be interesting.
Eh, I don’t know. I’m not sure I really want to compare libraries; I just felt like leaving a review because this library could definitely be improved. It almost met all my needs; it just really lacks customizability and needs more features in general.
Why is there no way to save the entire population? Seems like a pretty common use case to pause training for a while and come back to it later.
Saving one network in the population can be hard enough on its own, so that’s most likely why.
I went ahead and made a basic module for saving the entire population. It seems like a lot of people on this thread wanted it, so here it is:
Save Entire Neural Net Population.rbxm (28.8 KB)
Readme with instructions inside.
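If you’d rather roll your own, the rough idea is to serialize each network in the population into a plain table and save the whole array under one key. A minimal sketch; GetWeights/LoadWeights are placeholder accessors standing in for whatever serialization your network class actually provides:

-- Rough sketch: save/load every network in a population with one DataStore key.
-- GetWeights/LoadWeights are placeholders, not this library's real API.
local DataStoreService = game:GetService("DataStoreService")
local populationStore = DataStoreService:GetDataStore("NeuralNetPopulation")

local function savePopulation(key, population)
    local serialized = {}
    for i, network in ipairs(population) do
        serialized[i] = network:GetWeights() -- placeholder accessor
    end
    populationStore:SetAsync(key, serialized)
end

local function loadPopulation(key, population)
    local serialized = populationStore:GetAsync(key)
    if serialized then
        for i, network in ipairs(population) do
            network:LoadWeights(serialized[i]) -- placeholder accessor
        end
    end
end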
i think i love you
(thank you ill try it out)
When are you going to add multithreaded training to the library, using Actors and parallel execution?