I believe it may be self-explanatory, but I still wish to know what these lines are and how they work, and whether there are other kinds of “magical lines” like this. From what I’ve researched, they are related to some sort of “native Luau” feature, but I don’t know many details: when they should be applied, their limitations, what they improve, and so on. For how “magical” they are, there must be more they are doing behind the scenes.
Long post warning.
Lua does not compile to an instruction set that any existing hardware can run, the way classical languages do. Instead, there is a virtual machine written in such a language (Lua’s official implementation is in C) which executes Lua’s instruction set, or bytecode. !native supposedly bypasses this and generates x86/Arm/etc. code to run directly. This has performance benefits, but not nearly as many as you’d think. Here’s why people tend to say “magical” things.
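For reference, these “magical lines” are directives (sometimes called hot comments): plain comments that must appear at the very top of a script, before any code, telling the compiler how to process that script. A minimal sketch (the function and its name are just illustrative):

```lua
--!native
--!optimize 2

-- The directives above only change how this script is compiled;
-- everything below is ordinary Luau whose meaning they do not alter.
local function sum(n: number): number
    local total = 0
    for i = 1, n do
        total += i
    end
    return total
end

print(sum(1000)) --> 500500
```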
Roblox’s implementation of Luau is not public. This is important to note when people make any claims about Luau performance: the only way they can know whether something is faster is if they’ve done a benchmark comparison; otherwise they are not to be trusted. People also confuse the implementation with the language itself. How you design the language (the grammar and semantics) has no effect on the implementation; they are kept separate on purpose, for design reasons. There even exist languages that have no implementation at all, and so cannot run.
There are two places you could optimize Lua/Luau. If it’s possible to generate more than one correct set of bytecode for some given source code, and one is more efficient, then finding that version is an optimization (refinement). You are not allowed to break the rules of the language.
The other way to optimize is to improve the implementation of the Lua virtual machine, which Roblox has claimed in many ways to have done. I believe these claims when they come with hard details; otherwise I assume they aren’t real. For example, Roblox optimizes access to certain builtin functions which can never be replaced: the math library, Vector3, and others.
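As a rough illustration of what that builtin optimization means in practice (a sketch; names are illustrative and the exact conditions are implementation details): calling a builtin directly by name lets the compiler emit a fast path, while routing the call through a runtime lookup forces generic calls:

```lua
local values = table.create(100000, 2.5)

-- Direct call by name: the compiler can emit a fastcall for math.floor,
-- as long as the environment is provably untouched.
local function direct(): number
    local total = 0
    for _, v in values do
        total += math.floor(v)
    end
    return total
end

-- Call through a value looked up at runtime: every call is a generic
-- function call, with no fastcall path available.
local ops: { [string]: (number) -> number } = { floor = math.floor }
local function indirect(name: string): number
    local f = ops[name]
    local total = 0
    for _, v in values do
        total += f(v)
    end
    return total
end

print(direct(), indirect("floor")) --> 200000 200000
```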
Again, in both of these situations, you are not allowed to break the rules of Lua. I.e., you cannot make a change that causes different behavior for the same code versus the original implementation without optimizations. Finding allowable optimizations is difficult because of Lua’s design:
- Types are dynamic: it is not possible to assign a variable a type that is valid for its entire existence, so types always need to be checked.
- It is garbage collected: this centralizes a lot of execution state in a way that makes multithreading very annoying to implement.
- Operators can be overloaded, so any variable that’s a table needs to have its operators looked up each time, since they could change (see the sketch after this list).
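A small sketch of that last point: the meaning of + for a table value can change at any moment, so the VM must re-check the metatable instead of compiling a plain addition:

```lua
local mt = {}
local vec = setmetatable({ x = 1 }, mt)

-- First meaning of `+` for this value:
mt.__add = function(a, b)
    return a.x + b.x
end
print(vec + vec) --> 2

-- The same operator on the same value can be redefined at any time,
-- so the VM must look up the metamethod on every use:
mt.__add = function(a, b)
    return a.x * b.x
end
print(vec + vec) --> 1
```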
There are optimizations you can do to Lua specifically that make it generate fewer instructions, which tends to result in faster code, but fewer instructions do not translate directly into faster code, for computer-architecture reasons. The reasons native code is typically faster are better data locality, less indirection, smaller code size, and less allocation.
Having an !optimize option makes no sense. I think they just added it because it’s something compilers tend to have. If it’s possible to generate more optimized code, you should always do it; for that reason I refuse to believe it does anything. Optimization options for compilers exist to cope with the fact that different machines react differently to optimizations and might need some manual tweaking, done hand-in-hand with profiling. Roblox controls the only implementation of Luau in Roblox, so what could the user possibly be tweaking towards?
The optimize option does actually do something: it instructs the compiler “to use more aggressive optimizations by enabling optimization level 2”.
I believe live experiences already use optimization level 2, but in Studio it’s typically level 1, presumably to make debugging easier.
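For what it’s worth, the open-source Luau compiler describes level 2 as enabling additional optimizations such as inlining of local functions and unrolling of loops with constant bounds; whether Roblox’s deployed build matches this exactly is an assumption. A sketch of code shaped to benefit (names are illustrative):

```lua
--!optimize 2

local function square(x: number): number
    return x * x
end

local function sumOfSquares(): number
    local total = 0
    -- With constant bounds and a small body, level 2 may unroll this
    -- loop and inline `square`, removing the call overhead entirely.
    for i = 1, 4 do
        total += square(i)
    end
    return total
end

print(sumOfSquares()) --> 30
```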
OK, show me some code that runs faster with --!optimize. I’ll even write the test harness myself; just provide any code that shows a performance difference with or without it.
> I believe live experiences already use optimization level 2, but in Studio it’s typically level 1, presumably to make debugging easier.
What is it you believe is made harder to debug at higher levels?
```lua
--!optimize 2

local timings = {}

local function timeFunction<T>(label: string, func: (...T) -> (), ...: T?)
    local start = os.clock()
    func(...)
    timings[label] = os.clock() - start
end

local t = table.create(10000, 0)

timeFunction("pairs, level 2", function()
    for _, v in pairs(t) do
    end
end)

for label, timing in timings do
    print(`[{label}]: {timing * 1000}ms`)
end
```
Maybe the compiler performs more aggressive caching, or something else that would prevent debugging functions from working properly; things like getfenv cause deoptimizations.
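A sketch of that last point, assuming the documented “safe environment” behavior: once a script calls getfenv, the VM can no longer prove that globals still refer to the builtins, so fast paths are disabled for it:

```lua
-- While the environment is untouched, math.sqrt can use a fast path.
print(math.sqrt(16)) --> 4

-- Calling getfenv marks the environment as mutable: from here on the VM
-- must assume any global, including math, may have been replaced.
local env = getfenv()
env.math = { sqrt = function(x) return "replaced" end }

print(math.sqrt(16)) --> replaced
```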
That’s a very significant difference, but the for loop has no effect and t is dead, so the optimizer is free to remove that work entirely.
This isn’t 100% true. It’s true that Luau always compiles to its own bytecode, but !native instructs it (where possible) to compile the bytecode to native assembly instructions.
This is known as just-in-time (JIT) compilation, and is also used in languages like Java or C#.
That’s better and easier than what I was envisioning.

I made it add 1 to every value, and as you can see, optimizations have an effect!
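The revised code wasn’t posted, but the loop body presumably changed to something like this sketch (reusing timeFunction and t from the snippet above), so the loop has an observable effect and can’t be discarded as dead code:

```lua
timeFunction("pairs with writes, level 2", function()
    for i, v in pairs(t) do
        -- Writing back into the table makes the work observable.
        t[i] = v + 1
    end
end)
```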
Test it with generic iteration, that is, this syntax:
```lua
local t = {}
for i, v in t do
end
```
I was demonstrating fastcall-optimized table iteration.
But it still has a slight effect.
> Luau implements a fully generic iteration protocol; however, for iteration through tables, in addition to generalized iteration (`for .. in t`) it recognizes three common idioms (`for .. in ipairs(t)`, `for .. in pairs(t)` and `for .. in next, t`) and emits specialized bytecode that is carefully optimized using custom internal iterators.
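Concretely, those are these four forms; the first three are the pattern-matched idioms, and the last is generalized iteration:

```lua
local t = { 10, 20, 30 }

for i, v in ipairs(t) do end -- specialized: array part, stops at the first nil
for k, v in pairs(t) do end  -- specialized: full traversal of the table
for k, v in next, t do end   -- specialized: equivalent to pairs
for k, v in t do end         -- generalized iteration, also optimized for tables
```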
So it does run faster. I ran my own benchmark and got similar ratios. Now I’m more confused as to why it isn’t on by default, and why you can control it per-script.
Optimization level 2 is used in live experiences (I believe), but as I’ve stated, I believe it can cause some issues with debugging.
I used a tool to compare the bytecode output of a for … in pairs() loop at each optimization level.
Comparing !optimize 0 against !optimize 2, the only relevant difference is that level 2 uses GPREP_NEXT instead of just GPREP, so I wonder if that’s the optimization. (A function out of scope creates a char table for chars 0–255.)
I suspect a lot of the actual difference is in the JIT?
My test: Benchmark Button - Roblox
This thread is scarily similar to what I made just before this post.
I was wondering why my benchmark had such a big performance gap between each run; then I remembered --!optimize 2. I didn’t know what it did, so I tried it, and now the benchmark is more accurate.
I apologize if I’ve misinterpreted your response, and I truly thank you for the useful information. So what native does is basically generate fewer instructions in the Luau bytecode? And if so, does it affect any coding elements (in this case, the things you mentioned, like types, garbage collection, etc.)? Either way, I guess I have to learn some of the deeper technical specifics of programming.
This is actually one of the reasons I made this thread: I see !native everywhere when checking scripts, and now I see !optimize too, and I couldn’t really understand the limitations or the actual benefits beyond better performance.

