LuNeT 1.1 - Neural Network library (very) inspired by PyTorch

Hello,
Following up on my initial post, where I announced that I had created a machine learning library inspired by (or copied from) PyTorch, I want to let you know that I have added and upgraded several features.


Changelog

What I Added

Module

  • LSTM (Long Short-Term Memory).
    A module that can handle long-term sequential data. I do not recommend using it because it is very slow.

  • GRU (Gated Recurrent Unit).
    A faster and simpler version of LSTM. I recommend using it instead of LSTM because it is approximately three times faster.

  • RNN (Recurrent Neural Network).
    A slightly faster alternative to GRU; however, it has difficulty “remembering” sequences and past states.

Optimizer

  • Adam Optimizer
    Uses a different update rule than SGD, with adaptive per-parameter learning rates. I recommend it over SGD (a sketch of the update formula is shown below).
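
    For reference, here is a minimal sketch of the standard Adam update for a single scalar parameter, written in plain Lua. It only illustrates the formula; it is not LuNeT's internal code, and the hyperparameter values are just common defaults:

local beta1, beta2, eps, lr = 0.9, 0.999, 1e-8, 0.01
local m, v, t = 0, 0, 0 -- first moment, second moment, step counter

local function adamStep(param, grad)
	t = t + 1
	m = beta1 * m + (1 - beta1) * grad        -- running mean of gradients
	v = beta2 * v + (1 - beta2) * grad * grad -- running mean of squared gradients
	local mHat = m / (1 - beta1 ^ t)          -- bias correction
	local vHat = v / (1 - beta2 ^ t)
	return param - lr * mHat / (math.sqrt(vHat) + eps)
end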

Tensor

  • View Function
    Allows changing the dimensions of a tensor without altering its data. It only works if the new dimensions contain the same number of elements as the original (e.g. a tensor of shape
    [1, 1, 4, 4] has the same number of elements as a tensor of shape [1, 16] because 1*1*4*4 = 1*16). A combined usage sketch of these tensor functions is shown after this list.

  • Flatten Function
    Flattens a tensor by specifying a start dimension and an end dimension, computes the new shape, and uses the view function with the new shape.

  • Slice Function
    Extracts an element or a group of elements between specified indices along a given dimension.

  • Select Function
    Selects a specific index along a given dimension.
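
Here is a rough combined sketch of how these tensor functions can be used together. The exact argument orders for flatten, slice, and select are assumptions (start/end dimension, dimension plus start/end index, and dimension plus index respectively, all 1-based), so check the source if they differ:

local ReplicatedStorage = game:GetService("ReplicatedStorage")
local lunet = require(ReplicatedStorage.LuNeT.lunet)

-- A tensor of shape [1, 1, 4, 4] containing 16 elements, like the example above
local t = lunet.tensor({{{{1, 2, 3, 4}, {5, 6, 7, 8}, {9, 10, 11, 12}, {13, 14, 15, 16}}}})

-- view: same 16 elements, reinterpreted as shape [1, 16]
local reshaped = t:view({1, 16})

-- flatten: assumed to take a start and an end dimension; flattening dimensions 2 through 4
-- would give shape [1, 16], equivalent to the view call above
local flattened = t:flatten(2, 4)

-- slice: assumed to take a dimension plus start and end indices; keeping rows 1 to 2
-- along dimension 3 would give shape [1, 1, 2, 4]
local sliced = t:slice(3, 1, 2)

-- select: assumed to take a dimension and an index; picking index 1 along dimension 4
-- would give shape [1, 1, 4]
local selected = t:select(4, 1)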

What I Improved

Tensor

  • Improved the dot product speed, making it approximately 1.3× faster.
    If anyone has suggestions for further speed improvements, please reply to this post. I tried multi-threaded parallelism by placing a script inside an Actor, but it performed even worse: sending a message to the script and retrieving the result took longer than the dot product itself.

The Name

  • I have chosen a new name: LuNeT. “Lu” stands for Lua, “Net” for networks, and the uppercase “T” for Torch. It also sounds like the word “lunette” in French, which translates to “glasses.”

Documentation

This is not the full documentation. Please check this post for the previous version.

1. LSTM

We assume that you have already created a model with the necessary setup. It should look like this:

local Module = require(script.Parent.LuNeT.nn.Module)
local nn = require(script.Parent.LuNeT.nn.nn)

local Model = {}
Model.__index = Model
setmetatable(Model, { __index = Module })

function Model.new()
	-- Create a new module instance
	local self = Module.new()
	-- Initialize the model using the Module's initialization function
	self.init(self, Model)

	return self
end

return Model

Now we create the LSTM module with a linear layer for classification:

function Model.new()
	local self = Module.new()
	self.init(self, Model)
	
	local inputSize = 1
	local hiddenSize = 1 -- The output size of the LSTM
	local numLayer = 1 -- The number of layers the sequence will pass through
	local batchFirst = true -- Recommended: Indicates that the first dimension is the batch size, not the sequence length
	
	self.lstm = nn.LSTM(inputSize, hiddenSize, numLayer, batchFirst)
	self.main = nn.Linear(hiddenSize, 1) -- Classifier layer
	self:add_module("lstm", self.lstm)
	self:add_module("main", self.main)

	return self
end

Next, note that the LSTM returns two values. The first output (if batchFirst is true) has the shape [batch_size, seq_len, hidden_size] and contains the output of the last layer for every element in the sequence. The second is a table containing two tensors: the hidden state, with the shape [num_layer, batch_size, hidden_size], and the cell state, which is used solely to let the LSTM “remember” without recomputing the entire sequence.

We will focus on the hidden state. The hidden state represents the state of each layer at the last element of the sequence. In the forward function, we take the state of the last layer and pass it to the linear layer:

function Model:forward(x)
	-- The LSTM returns an output tensor and a table with two tensors: (hidden_state, cell_state)
	local _, h = self.lstm(x)
	local h_n = h[1] -- Retrieve the hidden state
	local c_n = h[2] -- Retrieve the cell state (not used)
	-- Pass the hidden state of the last layer to the classifier; the -1 index comes from Python conventions and works with this library
	return self.main(h_n[-1])
end

So our complete model should look like this:

local Module = require(script.Parent.LuNeT.nn.Module)
local nn = require(script.Parent.LuNeT.nn.nn)

local Model = {}
Model.__index = Model
setmetatable(Model, { __index = Module })

function Model.new()
	local self = Module.new()
	self.init(self, Model)
	
	local inputSize = 1
	local hiddenSize = 1 -- The output size of the LSTM
	local numLayer = 1 -- The number of layers the sequence will pass through
	local batchFirst = true -- Recommended: Indicates that the first dimension is the batch size, not the sequence length
	
	self.lstm = nn.LSTM(inputSize, hiddenSize, numLayer, batchFirst)
	self.main = nn.Linear(hiddenSize, 1) -- Classifier layer
	self:add_module("lstm", self.lstm)
	self:add_module("main", self.main)

	return self
end

function Model:forward(x)
	-- The LSTM returns an output tensor and a table with two tensors: (hidden_state, cell_state)
	local _, h = self.lstm(x)
	local h_n = h[1] -- Retrieve the hidden state
	-- Pass the hidden state of the last layer to the classifier; the -1 index comes from Python conventions and works with this library
	return self.main(h_n[-1])
end

return Model

2. GRU

The GRU module is similar but does not return a cell state. The forward function is:

function Model:forward(x)
	-- The GRU returns the output and the hidden state
	local _, h_n = self.lstm(x)
	-- Use the hidden state of the last layer; the -1 index comes from Python conventions and works with this library
	return self.main(h_n[-1])
end

3. RNN

The RNN module is similar to GRU, but you need to specify an activation function. For example:

local inputSize = 1
local hiddenSize = 20 -- The output size of the RNN
local numLayer = 2 -- The number of layers the sequence will pass through
local non_linearity = "tanh" -- Supported options: "tanh" and "sigmoid"
local batchFirst = true -- Recommended: Indicates that the first dimension is the batch size, not the sequence length
	
self.lstm = nn.RNN(inputSize, hiddenSize, numLayer, non_linearity, batchFirst)
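
The field is still called self.lstm here for consistency with the earlier examples. Assuming the RNN returns the same two values as the GRU (an output tensor and a hidden state), the forward function would look identical:

function Model:forward(x)
	-- Same pattern as the GRU: the RNN returns the output and the hidden state
	local _, h_n = self.lstm(x)
	return self.main(h_n[-1])
end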

4. Training

Now that we have created the model, we will train it to predict the next number in a sequence. For example, given the list [1, 2, 3, 4, 5, 6, 7, 8], the model should predict a value close to 9. For more details on how the training works, please refer to the previous documentation; the difference here is that we use Adam instead of SGD:

local ReplicatedStorage = game:GetService("ReplicatedStorage")

local Example = require(ReplicatedStorage.Example)
local lunet = require(ReplicatedStorage.LuNeT.lunet)
local nn = require(ReplicatedStorage.LuNeT.nn.nn)
local optim = require(ReplicatedStorage.LuNeT.optim) -- adjust these two paths if nn/optim live elsewhere in your copy

local myModel = Example.new()
local optimizer = optim.Adam(myModel:parameters(), 0.01)
local criterion = nn.MSELoss()
myModel:train() -- Currently, this does not affect the model

-- Training loop
for i = 1, 1000 do
	local seq_len = math.random(1, 10)
	local sequence = {}
	for j = 1, seq_len do
		table.insert(sequence, j)
	end
	
	local true_label = seq_len + 1
	
	-- Convert the sequence into a tensor with a valid shape for LSTM, GRU, or RNN
	local y_input = lunet.tensor(sequence):view({1, seq_len, 1})
	local y_true = lunet.tensor({{true_label}})
	
	-- Predict the next value
	local y_pred = myModel(y_input)
	
	-- Calculate the loss between the prediction and the true value
	local loss = criterion(y_pred, y_true)
	
	optimizer:zero_grad() -- Reset gradients
	loss:backward()     -- Compute gradients
	optimizer:step()    -- Update parameters
	
	print("Iteration: " .. tostring(i) .. " Loss: " .. tostring(loss))
	task.wait() -- Prevent freezing
end

-- Inference
myModel:eval() -- Disable training mode
local seq_len = math.random(1, 10)
local sequence = {}
for i = 1, seq_len do
	table.insert(sequence, i)
end
local true_label = seq_len + 1

local y_input = lunet.tensor(sequence):view({1, seq_len, 1})
local y_pred = myModel(y_input)

print("Sequence", sequence)
print("Predicted Value", y_pred:item())
print("True Value", true_label)

The library is very slow. I am trying to find solutions to improve its speed, but it is hard because Luau does not fully support multi-threading. While multi-threading is possible using actor objects, the overhead of creating them takes more time than simply performing a dot product. It would be more efficient if the task.spawn function could run on a different CPU core or if GPU acceleration were available.

In the next update, maybe I’ll add support for exporting and importing models, and possibly even importing models from .pth files using HTTP requests.

If you want to try the library, it is available in this game under ReplicatedStorage.

If you have any ideas for improving the speed or the library, reply to this post.


seeing the snake game AI learn was amazing! good work.
how do you plan on making the model export & import?


Thanks! For export, I can create a system to retrieve all parameters, encode them into a .pth file (the file extension PyTorch uses), and retrieve the file via an HTTP request. Or I can store all parameters in a datastore, from which you can reimport them.
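
A rough sketch of the datastore idea could look like this; note that :totable() is a hypothetical accessor for a tensor’s raw values, not a confirmed LuNeT API:

local DataStoreService = game:GetService("DataStoreService")
local HttpService = game:GetService("HttpService")
local modelStore = DataStoreService:GetDataStore("LuNeTModels")

local function saveModel(model, key)
	local serialized = {}
	for i, param in ipairs(model:parameters()) do
		serialized[i] = param:totable() -- hypothetical: raw values of the parameter tensor
	end
	modelStore:SetAsync(key, HttpService:JSONEncode(serialized))
end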

well you could make a plugin that does that without the need for all that, add me on Discord and we can figure something out :eyes: @xor25th

I had been trying to figure out reinforcement learning for so long and gave up. This is great work! The best I could understand was the REINFORCE algorithm, but nothing practically useful like DQNs and PPO.


What’s the purpose of making a plugin for my lib?

to train models easily & export models as well