A look into Object Oriented Programming and performance

roalex2008 · May 4, 2025, 5:37am

This is a follow-up on my previous post. The previous post focused on DUPCLOSURE and GETIMPORT, two pieces of the optimising puzzle in Luau.

This post is a WIP, so please share any feedback and suggestions. Cheers!

Post Co-Author: @i0rtid

Today, however, we will be focusing on one of the core pillars of ROBLOX game scripting, OOP.

OOP, also known as object-orientated programming, is simply a way of designing our game where we use ‘classes’ that represent the state of our game. Say we have a tycoon; we would create a class called Tycoon that holds the Player, the Money and the ProgressIndex of the tycoon. There are many ways to do this pattern; however, we will be focusing on the two most commonly used and shared around.

The disassembly in this post was obtained using RbxStu V4, my Roblox Studio executor, because its compiler is configured to closely match the one used by Roblox: optimisation level 2 (which RCC now uses by default on released games), mutable globals properly configured, the Vector library set, and so on. Comparable results can be obtained with external Luau compilers given the proper settings, which I have documented on RbxStu’s standard page here, since getfunctionhash requires the proper compiler configuration.
However, if you are not adventurous enough to get V4 (currently not public per se) or to build your own Luau with a custom compiler configuration for the bytecode, you can continue to use Lonegwadiator’s bytecode explorer => Luau Bytecode Explorer.

The disassembler used in RbxStu V4 is Konstant V2.1 by plusgiant6 (also known as plusgiant5), “a Luau disassembler written in Luau”. (Those are their ROBLOX usernames, if anyone is interested in their projects!)

metatables and closures.

In metatable-based OOP we use metatables to our advantage; an example OOP module using this method would be the following:

local lib = {}
lib.__index = lib
function lib.hi()
        print("hi")
end
function lib.new(...)
        local newObject = { --[[ Initialize Fields or Properties... ]] }
        -- Additional Initialization (if any)
        return setmetatable(newObject, lib)
end

This method’s main advantage is memory, because we are not storing the functions on every table we create. (This is an oversimplification; read up on DUPCLOSURE to understand this point more thoroughly. The truth is that storing a function on each object costs a new table ‘node’ each time, which is how a key-value pair is represented internally. Nodes are roughly 32 bytes in size; the closure itself is not reallocated if the DUPCLOSURE op. code is used properly!)

The catch is that this method of OOP costs us execution time. Every time we index our table, the VM can take one of two paths:

Index the table itself (fastest path).
Index the metatable’s __index metafield, which can be a function or a table and so presents the same conundrum again (slower path).

We want to maximise our fast paths, and assuming that we will be repeatedly indexing a function, it may be better to simply ‘inject’ it into our table to reduce the index time and improve performance when calling these ‘hot functions’.
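To make the two lookup paths concrete, here is a minimal sketch. The Account class is illustrative (its names are borrowed from the benchmark later in this post), and rawget is used only because it ignores metatables, so it shows where a key physically lives.

```luau
local Account = {}
Account.__index = Account

function Account.Deposit(self, amount)
    self.balance += amount
end

local account = setmetatable({ balance = 0 }, Account)

-- 'balance' is stored in the object itself, so indexing it takes the fast path.
local ownField = rawget(account, "balance")

-- 'Deposit' is NOT stored in the object; rawget finds nothing there...
local ownMethod = rawget(account, "Deposit")

-- ...so a normal index falls through to the metatable's __index (slower path).
local inheritedMethod = account.Deposit

-- 'Injecting' a hot function copies it onto the object, so later
-- lookups of account.Deposit are satisfied by the object itself.
account.Deposit = Account.Deposit
```

The injection costs one extra node on that object, which is exactly the memory trade-off discussed in this post.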

This takes us to our second method for OOP, closures. This time around, we don’t use metatables at all. Instead, each object holds all of its state as well as its functions directly in its own table. An example of closure-based OOP would be the following:

local lib = {}

function lib.new()
        local current = { --[[ Initialize Fields or Properties... ]] }
        -- Additional Initialization (if any)
        function current.hi(self) -- Define methods
            print("hi")
        end
        
        return current
end

Here we benefit from runtime performance but lose slightly on memory usage, as every object we create allocates its own table nodes for each of its methods.

Now, I’ll jump straight to my test results and what I have learnt from them so far, and what the most performant methods are (which are hilarious).

For metatables, the following is the fastest method of instantiation (x2 faster than the normal, conventional method):

local lib = {}
lib.__index = lib
local cachedTableWithStubs = { --[[ Stub fields ]] }

function lib.new(...)
        local newObject = table.clone(cachedTableWithStubs)
        -- Update the properties and fields with your custom state
        return setmetatable(newObject, lib) -- METATABLE MUST BE SET AFTER CLONING! (see remarks)
end

As for closures, the following is the fastest method of instantiation (x2.3 faster than the normal, conventional method):

local lib = {}
local cachedTableWithStubs = { --[[ Stub fields and functions here ]] }

function lib.new(...)
        local newObject = table.clone(cachedTableWithStubs)
        -- Update the properties and fields with your custom state
        return newObject
end

Remarks

Spooky remarks! Notice that the stub table we clone does not already have a metatable set. Why? Luau has the __metatable metafield, and table.clone checks it to verify whether the table may be cloned.

Because of this, calling table.clone on a table with a metatable incurs an index into that metatable to look for the metafield. This is SLOW! It is so slow, in fact, that cloning a table with its metatable already set is slower than the normal instantiation (where we just build the table with {} and set our indexes), while the method above is almost 2x faster than the normal instantiation.

Details: Optimisation 2 (inlining): Objects were instantiated 10000 times for all three tests. This screenshot is part of the benchmarks run using Benchmarker!

As you can see, this method of instantiation, where we keep a ‘cached’ table, clone it, and then overwrite its stub values with our dynamic state, is cheaper for both ‘closure’ and ‘metatable’ OOP class instantiation. (The method is named Metatable-based Creation [Optimized (Metatable set each call)]. I’m a programmer; don’t judge the naming.)

Why choose one or the other?

It may come down to your preferences, or to whether you simply side with the ‘metatable’ or ‘no-metatable’ part of the community. However, there is a clear trade-off if we look purely at the facts.

Do you value memory or runtime performance?

The average developer will likely be more interested in memory performance; it is more significant in the grand scheme of things, as you may be running on devices with high constraints, and where you’re more likely to come across an OOM (out-of-memory) crash, where you simply have no more memory to make tables or any other resource in general.

We can set this aside for now, however, because we are simply looking at which method is fastest, not at memory complexity. Measuring memory is more complicated, because we cannot just ‘spam’ call a function to get a good reading; we would have to calculate the number of bytes each method allocates and then ‘diff’ them.

That said, the equation for the bytes is roughly NodeCount * 32, which gives the approximate number of bytes your table allocates for all of its nodes. You can calculate the number of nodes by counting the key-value pairs that are not stored in the array part. This means that, yes, you will always use more memory with closure-based OOP, because it allocates more nodes for each object. So if you value memory, you want to use metatable OOP all the way: theoretically it should result in lower memory allocations, at the cost of slightly slower runtime performance.
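As a rough illustration of the NodeCount * 32 estimate, here is a minimal sketch. The function name estimateNodeBytes is hypothetical, the 32-byte figure is the approximate node size quoted above, and the array-part test is a heuristic (script code cannot see how the VM actually lays a table out), so treat the result as a ballpark, not a measurement.

```luau
local APPROX_NODE_SIZE = 32 -- bytes per node; the rough figure used in this post

-- Hypothetical helper: counts key-value pairs that are probably NOT in the
-- array part and multiplies by the approximate node size.
local function estimateNodeBytes(t)
    local arrayLength = #t
    local nodeCount = 0
    for key in pairs(t) do
        -- Heuristic: integer keys in 1..#t are assumed to live in the array part.
        local looksLikeArrayIndex = type(key) == "number"
            and key % 1 == 0
            and key >= 1
            and key <= arrayLength
        if not looksLikeArrayIndex then
            nodeCount += 1
        end
    end
    return nodeCount * APPROX_NODE_SIZE
end

-- A closure-style object carries its methods; a metatable-style one does not.
local closureObject = {
    balance = 0,
    Withdraw = function() end,
    Deposit = function() end,
    GetBalance = function() end,
}
local metatableObject = { balance = 0 }

print(estimateNodeBytes(closureObject))   -- 4 nodes * 32 = 128
print(estimateNodeBytes(metatableObject)) -- 1 node * 32 = 32
```

The real allocation can be larger than this estimate, since the hash part's capacity is rounded up to a power of two and the table header itself also takes memory.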

Now, let’s go into the graphs, the reasons, and the disassembly of the test cases to notice WHAT happens.

This output graph has some parts removed to show what I want to focus on first, the creation of the objects; later on in this post, we will be delving into the performance when using the objects!

All graphs, including the timings on the left.

This is a comparison of the ‘classic’ methods with the optimised methods we found during testing. The ‘classic methods’ are monikered [Normal], while our optimised methods are monikered [Optimized ...]. As you can see, the classic closure-based OOP instantiation is almost TWO TIMES slower than a normal metatable-based OOP instantiation. However this does not come close to the 2x and 4x instantiation performance improvements for each metatable and closure’s optimised implementations.

Part 1: Closures Normal vs Optimised.

The key difference between the two lies in the disassembly. When you create a table programmatically, each index you add is transformed into a SETTABLE op. code; Luau has an ‘optimized’ variant for constant string indexes, SETTABLEKS. The op. code, as defined in lbytecode.h, is as follows:

    // SETTABLEKS: store source register into table using constant string as a key
    // A: source register
    // B: table register
    // C: predicted slot index (based on hash)
    // AUX: constant table index
    LOP_SETTABLEKS,

If the op. code was to be translated into English, it would be Set the value in this table using this constant string as a key.

Disassembly of the normal closure method:

-- Disassembled with Konstant V2.1's disassembler, made by plusgiant5
-- Disassembled on 2025-05-03 21:23:28
-- Luau version 6, Types version 3
-- Time taken: 0.000244 seconds

[0] #1 [0x00000041]          PREPVARARGS 0                     ; -- Prepare for any number (top) of variables as ...
[1] #2 [0x00040036]          DUPTABLE 0, 4                     ; var0 = {}
[2] #3 [0x00000104]          LOADN 1, 0                        ; var1 = 0
[3] #4 [0xC9000110]          SETTABLEKS 1, 0, 201 [0]          ; var0.balance = var1
local function Withdraw() -- Line 3
	[0] #1 [0xC900020F]          GETTABLEKS 2, 0, 201 [0]          ; var2 = var0.balance
	[2] #2 [0x01020222]          SUB 2, 2, 1                       ; var2 -= var1
	[3] #3 [0xC9000210]          SETTABLEKS 2, 0, 201 [0]          ; var0.balance = var2
	[5] #4 [0x00010016]          RETURN 0, 1                       ; return
end
[5] #5 [0x00050140]          DUPCLOSURE 1, 5                   ; var1 = Withdraw
[6] #6 [0x6A000110]          SETTABLEKS 1, 0, 106 [1]          ; var0.Withdraw = var1
local function Deposit() -- Line 6
	[0] #1 [0xC900020F]          GETTABLEKS 2, 0, 201 [0]          ; var2 = var0.balance
	[2] #2 [0x01020221]          ADD 2, 2, 1                       ; var2 += var1
	[3] #3 [0xC9000210]          SETTABLEKS 2, 0, 201 [0]          ; var0.balance = var2
	[5] #4 [0x00010016]          RETURN 0, 1                       ; return
end
[8] #7 [0x00060140]          DUPCLOSURE 1, 6                   ; var1 = Deposit
[9] #8 [0x9F000110]          SETTABLEKS 1, 0, 159 [2]          ; var0.Deposit = var1
local function GetBalance() -- Line 9
	[0] #1 [0xC900010F]          GETTABLEKS 1, 0, 201 [0]          ; var1 = var0.balance
	[2] #2 [0x00020116]          RETURN 1, 2                       ; return var1->var1
end
[11] #9 [0x00070140]         DUPCLOSURE 1, 7                   ; var1 = GetBalance
[12] #10 [0xBC000110]        SETTABLEKS 1, 0, 188 [3]          ; var0.GetBalance = var1
[14] #11 [0x00020016]        RETURN 0, 2                       ; return var0->var0

Code disassembled:

local self = {
	balance = 0,
	Withdraw = function(self: { balance: number }, amount: number)
		self.balance -= amount
	end,
	Deposit = function(self: { balance: number }, amount: number)
		self.balance += amount
	end,
	GetBalance = function(self: { balance: number }): number
		return self.balance
	end,
}

return self

As you can see above, the disassembly of our source code has plenty of SETTABLEKS op. codes. This is normal for this kind of design; after all, we are adding 4 new fields to the table before returning it. However, this is slow to do repeatedly. Instead, what if we make a base object, one whose state is all just stub values, and then clone it and use the clone?

That is exactly what the optimized version does! It only pays the cost of the four SETTABLEKS op. codes once (for the initial instantiation of the base object), after which the table is cloned. Cloning a table like this is really fast, except in one edge case, which I will delve into in a bit. After cloning, only the dynamic state is set, that being balance. This way, we no longer have to deal with that many SETTABLEKS op. codes on every instantiation, but rather just one for the balance field. This puts us on roughly equal footing with metatable-based OOP creation, and in fact it is faster on creation as well. This could be because the interpreter’s DUPTABLE op. code, which duplicates a table from its constant template in the bytecode, is slower to execute than table.clone, since table.clone is much more straightforward and simply clones the table without much beating around the bush most of the time.
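One caveat worth keeping in mind with this pattern: table.clone makes a shallow copy, so a table-valued stub field would be shared by every clone unless the constructor replaces it. The sketch below illustrates this; the history field and the initialBalance parameter are hypothetical additions for the example and are not part of the benchmark code.

```luau
-- Template built once; its SETTABLEKS cost is paid a single time.
local template = {
    balance = 0,
    history = {}, -- table-valued stub: every clone would point at THIS table
}

local function new(initialBalance)
    local self = table.clone(template) -- shallow copy of the template
    self.balance = initialBalance      -- overwrite the dynamic state
    self.history = {}                  -- give each object its own history table
    return self
end

local a = new(10)
local b = new(20)
table.insert(a.history, "deposit")
-- b.history and template.history are untouched because new() replaced the stub.
```

If the self.history = {} line were removed, a.history, b.history and template.history would all be the same table.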

Part 2: Metatable Normal vs Optimised.

Disassembly of the normal metatable method:

-- Disassembled with Konstant V2.1's disassembler, made by plusgiant5
-- Disassembled on 2025-05-03 21:42:16
-- Luau version 6, Types version 3
-- Time taken: 0.000266 seconds

[0] #1 [0x00000041]          PREPVARARGS 0                     ; -- Prepare for any number (top) of variables as ...
[1] #2 [0x00010136]          DUPTABLE 1, 1                     ; var1 = {}
[2] #3 [0x00000204]          LOADN 2, 0                        ; var2 = 0
[3] #4 [0xC9010210]          SETTABLEKS 2, 1, 201 [0]          ; var1.balance = var2
[5] #5 [0x0003020C]          GETIMPORT 2, 3 [0x40200000]       ; var2 = lib
[7] #6 [0x03013D4A]          FASTCALL2 61, 1, 3                ; ... = setmetatable(var1, var2) -- Uses results from call at [11]. If successful, goto [12]
[9] #7 [0x0005000C]          GETIMPORT 0, 5 [0x40400000]       ; var0 = setmetatable
[11] #8 [0x02030015]         CALL 0, 3, 2                      ; var0 = var0(var1, var2)
::8::
[12] #9 [0x00020016]         RETURN 0, 2                       ; return var0->var0

Code disassembled:

return setmetatable({ balance = 0 }, lib)

In this sample, we create our table, then we set its field, and then we set its metatable to our lib.

Note that lib is not a global; it’s an upvalue. However, as this sample was compiled in isolation, that is not properly reflected in the bytecode!

The normal version goes through the DUPTABLE op. code, as expected. However, for some reason we benefit from reaching luaH_clone directly (via table.clone) without the interference of the Luau VM’s interpreter, something which I have not really wrapped my head around, quite honestly! In the ‘optimised’ version, we do something similar to what DUPTABLE does, except we are doing it more ‘explicitly’. Somehow, this ends up improving performance by as much as 2x!

Part 3: table.clone edge-case

table.clone has a specific edge case where it performs really slowly, so slowly in fact that it almost doubles the time it takes to execute .new. This edge case appears only on tables that have a metatable set. table.clone checks the __metatable metafield and, if it is set, prohibits the clone with an error. Because of this, every time you use table.clone on a table that has a metatable, it is slower, since it additionally indexes into the object’s metatable to check for that metafield.

I found this out during testing, and wrote a small little comment for it, perhaps this could explain it slightly better!

--[[
    table.clone makes a check on the metatable of the given argument.
    
    If the table has a metatable it will index the metatable in search of the '__metatable' field.
    If the metatable has a '__metatable' field, it will error.

    This is great, however this incurs a penalty when cloning ANY table that has a metatable set, even when its '__metatable' field is nil.

    This explains why newDupeWithMetatableSet is much slower than newDupeWithMetatableNotSet and new, because table.clone is silently checking the metatable for the '__metatable' metafield.
]]

This explains why, if we are creating an object that we will duplicate with table.clone, it is preferable for it not to have a metatable.
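The behaviour described above can be observed with a short sketch. The Class, plainTemplate and locked names are illustrative only; the sketch shows the recommended ordering (clone first, then setmetatable), that a clone keeps its source’s metatable, and that a protected metatable makes table.clone raise an error.

```luau
local Class = {}
Class.__index = Class

local plainTemplate = { balance = 0 }
local templateWithMetatable = setmetatable({ balance = 0 }, Class)

-- Recommended order: clone the metatable-free template, THEN set the metatable.
local fromPlain = setmetatable(table.clone(plainTemplate), Class)

-- This also works and the clone keeps the metatable, but table.clone first
-- has to look for '__metatable' in Class, which is the slow edge case.
local fromMeta = table.clone(templateWithMetatable)

-- With a protected metatable ('__metatable' is set), the clone is refused.
local locked = setmetatable({}, { __metatable = "locked" })
local ok, message = pcall(table.clone, locked)
print(ok, message) -- ok is false; message describes the protected metatable
```

Both fromPlain and fromMeta end up as equivalent objects; only the cost of getting there differs.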

Below is the test code for all this post, which you can test yourself with the Benchmarker plug-in:

--!optimize 2
local ClosureAccount = {}

local cachedClosureAccount = {
    balance = 0,
    Withdraw = function(self: { balance: number }, amount: number)
        self.balance -= amount
    end,
    Deposit = function(self: { balance: number }, amount: number)
        self.balance += amount
    end,
    GetBalance = function(self: { balance: number }): number
        return self.balance
    end,
}

function ClosureAccount.dupeNew()
    local self = table.clone(cachedClosureAccount)
    self.balance = 0
    return self
end

function ClosureAccount.new()
    local self = {
        balance = 0,
        Withdraw = function(self: { balance: number }, amount: number)
            self.balance -= amount
        end,
        Deposit = function(self: { balance: number }, amount: number)
            self.balance += amount
        end,
        GetBalance = function(self: { balance: number }): number
            return self.balance
        end,
    }

    return self
end

local Metatablebased = {}
Metatablebased.__index = Metatablebased
function Metatablebased.Withdraw(self: { balance: number }, amount: number)
    self.balance -= amount
end

function Metatablebased.Deposit(self: { balance: number }, amount: number)
    self.balance += amount
end

function Metatablebased.GetBalance(self: { balance: number }): number
    return self.balance
end

function Metatablebased.new()
    return setmetatable({ balance = 0 }, Metatablebased)
end

local duped2 = { balance = 0 }

function Metatablebased.newDupeWithMetatableNotSet()
    local duped = table.clone(duped2)
    duped.balance = 0
    return setmetatable(duped, Metatablebased)
end

local accountClosures = ClosureAccount.dupeNew()
local accountMetatable = Metatablebased.newDupeWithMetatableNotSet()

return {
    ParameterGenerator = function()
        return
    end,

    Functions = {
        ["Metatable-based Creation [Normal]"] = function(Profiler)
            for i = 1, 10000 do
                Metatablebased.new()
            end
        end,

        ["Metatable-based Creation [Optimized (Metatable set each call)]"] = function(Profiler)
            for i = 1, 10000 do
                Metatablebased.newDupeWithMetatableNotSet()
            end
        end,

        ["Closure-based Creation [Normal]"] = function(Profiler)
            for i = 1, 10000 do
                ClosureAccount.new()
            end
        end,

        ["Closure-based Creation [Optimized]"] = function(Profiler)
            for i = 1, 10000 do
                ClosureAccount.dupeNew()
            end
        end,

        ["Closure-based Usage"] = function(Profiler)
            for i = 1, 10000 do
                accountClosures.Deposit(accountClosures, 0)
                accountClosures.Withdraw(accountClosures, 1)
            end
        end,
      
        ["Metatable-based Usage"] = function(Profiler)
            for i = 1, 10000 do
                accountMetatable.Deposit(accountMetatable, 0)
                accountMetatable.Withdraw(accountMetatable, 1)
            end
        end,
      
        ["Closure-based Usage (Namecall)"] = function(Profiler)
            for i = 1, 10000 do
                accountClosures:Deposit(0)
                accountClosures:Withdraw(1)
            end
        end,
        
        ["Metatable-based Usage (Namecall)"] = function(Profiler)
            for i = 1, 10000 do
                accountMetatable:Deposit(0)
                accountMetatable:Withdraw(1)
            end
        end,
    },
}

Part 4: Call performance

As expected, no metatables has the best call performance, being roughly x1.25 faster to call when using dot-indexing with a known index. This detail is significant, since this test is done on the assumption that the op. code used is GETTABLEKS, not GETTABLE!

It is worth noting that there is a difference between namecall and index calls. NAMECALL is its OWN op. code, unlike what some may believe, and it takes its own path in the Luau interpreter. NAMECALL has some specific paths for __index calls, because they are really frequent on both userdata and table objects.
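For readers unsure which syntax maps to which path, here is a minimal sketch using an illustrative account object. Both calls do the same thing at the language level; the difference discussed above is only in the instructions the compiler emits for them (an index followed by a call for the dot form, NAMECALL for the colon form).

```luau
local account = {
    balance = 0,
    Deposit = function(self, amount)
        self.balance += amount
        return self.balance
    end,
}

-- Dot-indexing with a constant key: the function is fetched, then called
-- with the object passed explicitly as the first argument.
local viaDot = account.Deposit(account, 5)

-- Method-call syntax: the object is passed as 'self' implicitly.
local viaColon = account:Deposit(5)
```

This is why the benchmark above measures the two forms separately as “Usage” and “Usage (Namecall)”.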

Regardless, what we can take away from all this is simple:

Closure-based OOP is less performant memory wise, but the calling performance is slightly faster.
Metatable-based OOP is more performant on the memory aspect, but the calling performance is slightly slower.

However, remember: you must always test on your own and try to prove the post you are reading wrong! In the end, both methods have their uses; they just need to be properly managed.

The ‘optimized’ versions are likely better because of one or more of the following:

The slots for the ‘nodes’ of the cloned table are already allocated, so indexing them succeeds straight away.
Modifying an existing index on the cloned table likely does not incur a re-size.
Because the template table is built only once, the cost of all the SETTABLEKS op. codes is paid once rather than on every instantiation. The saving is most significant for closure-based OOP, where we would otherwise set a new index for each function, but it also helps whenever your class has a ‘default’ state that these stub values can provide.
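If you do not have the Benchmarker plug-in at hand, a very rough way to test things yourself is a small timing helper like the sketch below. The measure function and its parameters are hypothetical; it repeats the measurement a few times and reports the median so that a single noisy run does not decide the result. It is no substitute for a proper benchmarking tool.

```luau
-- Hypothetical helper: times 'fn' over 'iterations' calls, several times,
-- and returns the median duration in seconds.
local function measure(label, iterations, fn)
    local RUNS = 5 -- arbitrary number of repeats
    local samples = {}
    for run = 1, RUNS do
        local start = os.clock()
        for _ = 1, iterations do
            fn()
        end
        samples[run] = os.clock() - start
    end
    table.sort(samples)
    local median = samples[math.ceil(RUNS / 2)]
    print(string.format("%s: %.6f s (median of %d runs)", label, median, RUNS))
    return median
end

-- Example usage with a trivial constructor.
local median = measure("plain table creation", 10000, function()
    return { balance = 0 }
end)
```

Run each candidate through the same helper, in the same script and under the same optimisation settings, before comparing the numbers.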

In the end, this is on the level of micro-benchmarking.

Cheerio to whoever reads this, this post was written hastily. I sincerely believe it is more of an ‘information dump’ than a proper explanation documenting everything that is truly going on :< However, I’m sure that with this you, developer, can make proper usage of this information for maybe your next project or to develop even more insane ways to save on performance, if only we all looked at the interpreter and other parts of it to try and exploit every single bit of performance

And remember…

If you would be a real seeker after truth, it is necessary that at least once in your life you doubt, as far as possible, all things.

René Descartes (1596-1650)

HexadecimalLiker · May 4, 2025, 6:06pm

This is very cool, make more stuff like this!

Yarik_superpro · May 4, 2025, 6:17pm

“Second OOP” method you showed will lose to metatable OOP badly because you used it improperly
Second method is dependent on strict types
I made a post explaining them: Strictly typed <<Object Oriented Programming>> (Better than metatable OOP)
You should always use them in a strict mode
Constructor can be independent of class tho like in the example you showed, and you should probably reference it directly instead of “lib” or other shenanigans

HexadecimalLiker · May 4, 2025, 6:18pm

What do you mean Second OOP method?

roalex2008 · May 4, 2025, 6:24pm

Define ‘used improperly’, please. I’d be glad to understand where I have messed up so I can edit and fix the post.

Yarik_superpro · May 4, 2025, 6:24pm

Sure!
You should make everything properly typechecked.
You need to create a separate type for your class.
You can use typeof to get type of a method
Example:

 type Cat = {
--Methods
	new:typeof(Constructor);
	Meow:typeof(Meow);
	Say:typeof(Say);
	Info:typeof(Info);
-- Properties
	Name:string;
	Age:number;
}

HexadecimalLiker · May 4, 2025, 6:32pm

Well you’re just spreading misinformation in public hence why I am addressing them in public – your comment about DMs is pretty hypocritical considering you are defaming me in public.

You haven’t elaborated on what the second OOP method was – I’d appreciate if you could tell me what you’re referring to.

roalex2008 · May 4, 2025, 6:33pm

Even if it were typed, the compiler will very likely not emit more efficient instructions. Some specific cases DO cause different instructions to be emitted according to type information; however, I cannot confirm whether these implementations would benefit from it.

The method described in your post likely forces the compiler to emit GETGLOBAL and SETGLOBAL in the initial creation of the factory method. Beyond that, it is almost identical to my method, except that I don’t clone self and use an upvalue instead.

After a bit of thought, they’re likely to run at the same timings, as we do virtually the same thing. However, yours still looks slightly inferior to my version, because, as I already explained, keeping a ‘cached’, already constructed object is beneficial: the key-value pairs needn’t be reallocated, as explained at the bottom of the post. With your method, because you clone a self that doesn’t have these already allocated, you could be incurring a reallocation of the table’s node part, which will take precious time and memory.

I’m still confused about where it would ‘lose’. As I already described, we can sort the methods into two big groups, memory and runtime performance: metatables allow you to save memory, closures save runtime.

I do not understand what you mean, quite honestly. I would appreciate it if you could rephrase it and develop your idea further; perhaps we can reach the truth in this matter and later compare the disassembly of both your method and mine. Overall, our methods should take roughly equal time for closure OOP, within or near the margin of error, with yours potentially taking slightly longer on object construction due to it incurring a table node reallocation.

I hope this can be understood; if it wasn’t, drop another post and I’ll rephrase it further. Cheers!

Yarik_superpro · May 4, 2025, 6:38pm

Where do i spread misinformation buddy?

You literally edited own post

Well isnt it hypocritical of you to changing comments entirely to look innocent?
And now you trying to make your replies look “P R O F E S S I O N A L”

???I replied accordingly when it literally was “It is his post through” (or something 99% similar to that in a passive aggressive way) and now you changed the comment to make it ask about “Second OOP” bro this is literally stupid.
Also im pretty sure you just made up a stupid response and now have to deal with it since its obvious that by second OOP i meant “closures”

roalex2008 · May 4, 2025, 6:39pm

Ok, I’m not one to complain much; however, I’d appreciate it if you could stay on topic. I don’t want to report or call moderation myself, so I ask that you please stay on topic so we can have the discussion we want here, which is understanding and answering what you said in your response to the thread. Cheers

Yarik_superpro · May 4, 2025, 6:40pm

From my benchmarking, not typechecking it in a mega strict way was resulting in a heavy loss of performance.
Ig it depends a lot, but better to be safe with strict code than guessing, in my opinion.

HexadecimalLiker · May 4, 2025, 6:45pm

You could’ve just told me from the beginning that you were talking about the Closure OOP method – then we could’ve wrapped this up quicker. Also no, as @roalex2008 said, the performance will be nearly identical.

I prefer being formal when speaking with people I don’t know – it keeps the discussion respectful.

roalex2008 · May 4, 2025, 6:46pm

Have you tested it using --!native? Type information is indeed used when native code generation is performed. When comparing the native output, I’m certain we will find a difference in the generated native code, but we’re unlikely to find one in the Luau instructions.

I will run the relevant benchmarks, and if that indeed appears to be the case I will try to find a reasonable explanation, just as I have tried to find one for why table.clone would be faster than DUPTABLE (which I have somewhat gathered could be due to upvalues being resolved with one less pointer dereference internally, resulting in potentially better run-time performance). However, I cannot confirm it fully myself.

Yarik_superpro · May 4, 2025, 6:47pm

i was unable to do so physically because your original NON EDITED comment was never asking that.
PLEASE STOP; It is just beyond stupid to keep arguing over that already

Yarik_superpro · May 4, 2025, 6:48pm

i did both
And result was similar in both
It has to be very heavily typed to benefit from it.
Result in native without strict types was horrifying tho
Non native code does benefit from strict types too

roalex2008 · May 4, 2025, 6:50pm

Yes, the latter is true, because without types the code generator likely needs many more checks to be safe. Sadly, I cannot recall whether native code is fully rolled out on the client, which would make it really worth considering. However, the types should be A-OK once you explicitly define self as a valid type with the valid fields.

I will attempt tests on this when I have time, however for now, if you could compare both methods and obtain Benchmarker metrics I’d be grateful!

roalex2008 · May 4, 2025, 6:56pm

After looking further, you don’t appear to be testing using Benchmarker, so you may be running into specific edge cases due to not averaging properly and other small things that could affect the result of the test. If you do not have Benchmarker, I will run the tests myself when I have free time to compare my method with yours. I will make sure to use your sample so that the comparison is as accurate as possible, and then create a follow-up post to this one.

Yarik_superpro · May 4, 2025, 7:02pm

Huh
Looks weird
Yeah it probably really depends
Time: 0.0672294000396505
Code:

--!optimize 2
--!strict

local function say(self:class,say:string):()
	
end

local class:class = {
	Name="";
	Say=say
}

local function construct(Name:string):class
	local nelf = table.clone(class)
	nelf.Name=Name
	return nelf
end

type class = {
	Name:string;
	Say:typeof(say)
	
}

local t = os.clock()
local Name = "HII"
for i=1,999999 do
	construct(Name):Say(Name)
end
print(os.clock()-t)

Time:0.0675491999136284

--!optimize 2
local function say(self,say)
	
end

local class = {
	Name="";
	Say=say
}

local function construct(Name)
	local nelf = table.clone(class)
	nelf.Name=Name
	return nelf
end


local t = os.clock()
local Name = "HII"
for i=1,999999 do
	construct(Name):Say(Name)
end
print(os.clock()-t)

roalex2008 · May 4, 2025, 8:03pm

Having run the benchmarks, here is a simple breakdown:

No change when running using the Interpreter.

In fact, for some awkward reason, it was quicker to have no types…? Likely just margin of error, however.

Strict mode test (types defined):

Strict mode not specified, types undefined test:

It should be said that the test only measures object creation; I never measured dispatch, since it will likely have the same speed as my already outlined method.

The tests were run over 100000 iterations on the construct method.


return {
	ParameterGenerator = function()
		return
	end,

	Functions = {
        ["[Native] Constructor"] = function(Profiler)
            for i = 1, 100000 do
                construct("hi") -- paste in your original code above.
            end    
        end,
	},
}

Regardless, the results vary between runs, around 0.2ms~ anyway.

And, just for fun, I did it without table.clone, relying on DUPTABLE and the results are hilariously bad:

The code for this test is the following:

--!optimize 2

local function say(self: class,say: string):()

end

local class: class = {
    Name="";
    Say=say
}

local function construct(Name: string): class
    local nelf = {Name = Name, Say = say }
    return nelf
end

local function construct2(Name: string): class
    local nelf = table.clone(class)
    nelf.Name = Name
    return nelf
end


type class = {
    Name: string;
    Say: typeof(say)
}

return {
	ParameterGenerator = function()
		return
	end,

	Functions = {
        ["Constructor (DUPTABLE)"] = function(Profiler)
            for i = 1, 100000 do
                construct("hi")
            end    
        end,
        ["Constructor (table.clone)"] = function(Profiler)
            for i = 1, 100000 do
                construct2("hi")
            end    
        end,
	},
}

And just for good measure, I did the same without any type definitions.

The code for this test is simply the following:

--!optimize 2

local function say(self, say)

end

local class = {
    Name="";
    Say=say
}

local function construct(Name)
    local nelf = { Name = Name, Say = say }
    return nelf
end

local function construct2(Name)
    local nelf = table.clone(class)
    nelf.Name = Name
    return nelf
end


return {
	ParameterGenerator = function()
		return
	end,

	Functions = {
        ["[Types Removed] Constructor (DUPTABLE)"] = function(Profiler)
            for i = 1, 100000 do
                construct("hi")
            end    
        end,
        ["[Types Removed] Constructor (table.clone)"] = function(Profiler)
            for i = 1, 100000 do
                construct2("hi")
            end    
        end,
	},
}

In the end, @Yarik_superpro, as you can see, types hold little to no importance for the interpreter to perform its task; they could in the future help with the op. code emission the compiler does. Our methods are virtually on the same level of performance.