Native has been enabled on the client unannounced for some reason
Oh, damn. Finally! I hope it stays like this and they don’t just randomly remove it.
This is the first time a feature request has been implemented in ages.
As far as I know, what they meant in the announcement was that native codegen gets better speeds via types. Types don’t increase speed by themselves: Lua is a dynamically typed language, while Luau provides a typechecker on top of it to ease development.
From my tests, it actually does affect the bytecode to some extent already. At least in --!strict mode
Huh? What do you mean change the bytecode?
I thought types were discarded during runtime, could I see your benchmark code?
Based on my own (doing Vector3 & scalar math without NCG), they seem 1:1.
Maybe I’m doing something wrong
Code
--!strict
-- strict typechecking speed benchmark, compared to non-typed.
local ITERATION_COUNT = 4096
local typedVectorMath: {number} = {}
local typedScalarMath: {number} = {}
local nonTypedVectorMath: {number} = {}
local nonTypedScalarMath: {number} = {}
-- Returns the median of a number array (note: sorts the array in place)
local function calculateMedian(arr: {number}): number
    table.sort(arr)
    local n = #arr
    return n % 2 == 0 and (arr[n//2] + arr[n//2 + 1])/2 or arr[math.ceil(n/2)]
end
-- Formats a duration in seconds using the most readable unit
local function formatTime(seconds: number): string
    if seconds >= 1e-3 then -- Millisecond range (>= 0.001 seconds)
        return string.format("%.3f ms", seconds * 1000)
    elseif seconds >= 1e-6 then -- Microsecond range (>= 0.000001 seconds)
        return string.format("%.3f μs", seconds * 1e6)
    else -- Default to nanoseconds
        return string.format("%.3f ns", seconds * 1e9)
    end
end
-- ===== BENCHMARKS ===== --
-- Typed vector math
for _ = 1, ITERATION_COUNT do
    local begin = os.clock()
    local vec: Vector3 = Vector3.new(1,1,1)
    local vec2: Vector3 = Vector3.new(2,2,2)
    local temp: Vector3 = vec * vec2
    table.insert(typedVectorMath, os.clock() - begin)
end
-- Typed scalar math
for _ = 1, ITERATION_COUNT do
    local begin = os.clock()
    local a: number = 1.5
    local b: number = -4.2
    local c: number = 2.8
    local d: number = 3.14
    -- ax³ + bx²y + cxy² + dy³
    local x: number = 2.5
    local y: number = 1.8
    local term1: number = a * (x^3)
    local term2: number = b * (x^2) * y
    local term3: number = c * x * (y^2)
    local term4: number = d * (y^3)
    local result: number = ((term1 + term2) * (term3 - term4)) / (x^2 + y^2)
    table.insert(typedScalarMath, os.clock() - begin)
end
task.wait()
-- Non-typed vector math
for _ = 1, ITERATION_COUNT do
    local begin = os.clock()
    local vec = Vector3.new(1,1,1)
    local vec2 = Vector3.new(2,2,2)
    local temp = vec * vec2
    table.insert(nonTypedVectorMath, os.clock() - begin)
end
-- Non-typed scalar math
for _ = 1, ITERATION_COUNT do
    local begin = os.clock()
    local a = 1.5
    local b = -4.2
    local c = 2.8
    local d = 3.14
    local x = 2.5
    local y = 1.8
    local term1 = a * (x^3)
    local term2 = b * (x^2) * y
    local term3 = c * x * (y^2)
    local term4 = d * (y^3)
    local result = ((term1 + term2) * (term3 - term4)) / (x^2 + y^2)
    table.insert(nonTypedScalarMath, os.clock() - begin)
end
-- ===== RESULTS ===== --
print("Typed Vector Math (Median %ile):", formatTime(calculateMedian(typedVectorMath)))
warn("Non-typed Vector Math (Median %ile):", formatTime(calculateMedian(nonTypedVectorMath)))
print("Typed Scalar Math (Median %ile):", formatTime(calculateMedian(typedScalarMath)))
warn("Non-typed Scalar Math (Median %ile):", formatTime(calculateMedian(nonTypedScalarMath)))
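One caveat with the benchmark above: each sample times only a handful of operations between two os.clock() calls, so the timer's own overhead can dominate the measurement. Every operand is also a constant and the results are never used, so the compiler may fold or discard the math in both the typed and untyped loops. A minimal sketch of a batched harness follows, assuming that amortizing the timer over many calls and accumulating results into a "sink" is enough to keep the work alive. The names `median`, `benchmark`, and `cubic` are hypothetical and not from the original post.

```luau
-- Hypothetical helpers for illustration; not part of the original benchmark.

-- Returns the median of a list of numbers without mutating the input.
local function median(samples)
    local sorted = {}
    for i = 1, #samples do
        sorted[i] = samples[i]
    end
    table.sort(sorted)
    local n = #sorted
    if n == 0 then
        return 0
    end
    local mid = math.floor(n / 2)
    if n % 2 == 0 then
        return (sorted[mid] + sorted[mid + 1]) / 2
    end
    return sorted[mid + 1]
end

-- Times `fn` in batches so one pair of os.clock() calls is amortized over
-- `batchSize` invocations. The loop index is passed in so the inputs are not
-- compile-time constants, and results are summed into `sink` so they are used.
local function benchmark(fn, sampleCount, batchSize)
    local samples = {}
    local sink = 0
    for s = 1, sampleCount do
        local start = os.clock()
        for i = 1, batchSize do
            sink = sink + fn(i)
        end
        samples[s] = (os.clock() - start) / batchSize
    end
    return median(samples), sink
end

-- Same cubic shape as the scalar test: ax³ + bx²y + cxy² + dy³
local function cubic(x)
    local y = x * 0.5
    return 1.5 * x^3 - 4.2 * x^2 * y + 2.8 * x * y^2 + 3.14 * y^3
end

local perCall, sink = benchmark(cubic, 64, 10000)
print(string.format("median per call: %.3f ns (sink = %g)", perCall * 1e9, sink))
```

To compare typed against untyped code, the same harness could be run on two versions of `cubic` that differ only in their annotations, ideally in a live server or client rather than Studio.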
This initially got my hopes up… until I realized you tried this in Studio.
Just to clarify, NCG has always been enabled by default in Studio, working with both LocalScripts and ServerScripts.
However, after testing this on live servers, I can confirm it’s still not enabled there.
I’m guessing it was a bug while lowering the IL, and iirc calls to the buffer library’s methods are optimized away with the necessary instructions, so the lowerer probably messed something up with the stack pointer (and not Lua’s stack, which isn’t a stack in the first place, which is what I think @SyntaxMenace was talking about)
They affect things internally; things like type and typeof remain unchanged.
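To illustrate that last point, here is a minimal sketch, assuming a script that opts into native codegen with the `--!native` directive. The function names are hypothetical. It only shows that annotations do not change observable behavior or what type() reports. Whether the annotated version actually runs faster depends on the Luau version and on native codegen being active where the script runs.

```luau
--!native
-- Sketch only: `sumScaledTyped` and `sumScaledUntyped` are hypothetical names.

-- Annotated version: parameter and return types are visible to the compiler.
local function sumScaledTyped(values: {number}, factor: number): number
    local total = 0
    for i = 1, #values do
        total += values[i] * factor
    end
    return total
end

-- Unannotated version: identical logic with no type information.
local function sumScaledUntyped(values, factor)
    local total = 0
    for i = 1, #values do
        total += values[i] * factor
    end
    return total
end

local data = {1, 2, 3}
local typedResult = sumScaledTyped(data, 2)
local untypedResult = sumScaledUntyped(data, 2)

-- Both calls return the same value, and type() reports the same runtime type.
print(typedResult == untypedResult)
print(type(typedResult), type(untypedResult))
```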
not realistic
Roblox supports a lot of platforms, doing this would require building for each of these platforms:
- ARMv7 Android
- ARMv8 Android
- x86_64 Android
- ARMv8 iOS
- x86_64 Mac
- ARMv8 Mac
- x86_64 Windows
- ARMv8 Windows
- x86_64 Xbox One and Series
- x86_64 PS4 and 5
Not to mention that each one of those x86 and ARM subversions would need different builds to take advantage of every possible feature set that they can have (AVX2, AVX, no AVX, etc.), and the consoles would each need their own console-specific build.
Roblox uses some arch-dependent libraries, so this still wouldn’t be a problem
The operating system here doesn’t affect compatibility iirc, as the codegen will only emit calls to Lua C functions (e.g. luaH_getn) or optimize them away entirely. All you need to support here are the different CPU architectures (x86_64, ARMv7, ARMv8).
LLVM exists which can do all of that
LLVM has been around for what, 25 years or so? I used it and gcc for writing kernel-level code. It wouldn’t be hard to build an LLVM front end for Lua that parses it into an AST (Abstract Syntax Tree) and then passes that on to the code generator for translation into an intermediate language (LLVM IR). After the AST step, everything else is the same.
So what Roblox could do is use an expanded LLVM or gcc compiler and build the object code for multiple targets when the code is published. Once built, there is no further processing cost for the code other than to figure out which version to send to the client and send it.
The most problematic platforms are the mobile ones, with three major operating systems available depending on the manufacturer:
- iOS (Apple Only)
- Android (Everyone else…even the Amazon Fire tablets are Android, I have one)
- Windows (Microsoft Surface, and a few others but not any major market share)
With iOS, there is only ONE manufacturer, so that’s pretty much a no-brainer. Windows implements Bill Gates’s dream of abstracting the hardware behind higher levels of software, which gives applications a uniform platform to run on, hence the enhanced compatibility. Android, for those that don’t know, is based on Linux. There are multiple manufacturers and versions, which presents a challenge. I think the current Android version is 15. My phone is using 8.1. There’s quite a bit of variance in between.
However, by compiling the Lua code directly to assembly and then to machine language and linking it to Roblox-provided libraries, Roblox can still maintain control of the execution environment as well as maintain security if the compiled Lua code uses the same function entry points that native Lua bytecode uses.
Considering what they have already, it wouldn’t be much to add that guarantee.
EDIT:
There is one issue that could present a problem that I just thought about. Operating systems now enforce read/write and no-execute permission bits in the page table entries for each memory page (4096 bytes, or 2MB, depending on the OS), and that could be a hurdle. The code would have to be downloaded into a memory region that is marked as both R/W and executable, which many operating systems will not allow because of exploits. You have to be at Ring 0 to touch that stuff.
That could be a very real technical reason as to why they haven’t turned it on.
Adding an LLVM backend should be trivial considering most of native code generation is OS agnostic anyways. The way native works now is that bytecode is compiled directly to architecture-specific machine code. An LLVM front end would simply replace that step.
IIRC the main reason native will never come to client is Apple operating systems not allowing for dynamic executable code like you said