Luau Recap: November 2019

There were no recent changes targeted at reducing memory consumption. Shortly before the Luau release there was some work to optimize memory consumption for certain objects: mostly, userdata objects got much smaller (e.g. instance references got 1.5x smaller) and table objects may have gotten a tiny bit smaller. That shipped in June though.

There is definitely some opportunity for further memory consumption improvements but not quite as large as there is for performance.

2 Likes

I have another performance based question. Out of habit I tend to “precalculate” table length like so:

local length = #tbl

for i = 1, length do
	-- loop body that uses length
end

Will table length be dynamically updated or will it be recalculated somehow? I don’t have enough knowledge of this at a low level but my guess is that it is dynamically updated.

Is this any slower than the above?

-- In loop
local length = #tbl

5 Likes

Will we ever get something similar to the ffi of LuaJIT?

I want it specifically for statically typed long arrays for both better performance and memory management.

Something like:

local hash = ffi.new("uint8_t[]", 256*256*256) -- 16 MB as a uint8_t array; around 256 MB as a plain Luau table

for creating a hashmap for a Minecraft-like game with voxels, where each point could be one of 256 different voxel types.

Memory Benchmarks
  • LuaJIT ffi:
local size = 256*256
local ramt = collectgarbage("count")*1024 + 16 --16 byte overhead

local ffi = require"ffi"
local hash = ffi.new("uint8_t[?]", size)

print((collectgarbage("count")*1024 - ramt) / size) --1
  • LuaJIT:
local size = 256*256
local ramt = collectgarbage("count")*1024 + 64 --64 byte overhead

local hash = {}
for i = 1,size do
	hash[i]=0
end

print((collectgarbage("count")*1024 - ramt) / size) --8 times more space than required
  • Luau:
local size = 256*256
local ramt = collectgarbage("count")*1024 + 64 --64 byte overhead

local hash = {}
for i = 1,size do
	hash[i]=0
end

print((collectgarbage("count")*1024 - ramt) / size) --16 times more space than required

With Luau, I use 16 times more memory than I would by using LuaJIT with ffi.

That’s only the memory side of it. Accessing that array is also much faster, since LuaJIT doesn’t need to do a type check for each access and can optimize the operations much more aggressively.

Speed Benchmarks
  • LuaJIT ffi:
local size = 256*256*256
local ffi = require"ffi"
local tick do --Code for a LuaJIT tick() function
	ffi.cdef [[typedef unsigned long DWORD, *PDWORD, *LPDWORD;
		typedef struct _FILETIME {
		  DWORD dwLowDateTime;
		  DWORD dwHighDateTime;
		} FILETIME, *PFILETIME;

		void GetSystemTimeAsFileTime ( FILETIME* );]]
	local ft = ffi.new ("FILETIME[1]")
	tick = function()
		ffi.C.GetSystemTimeAsFileTime(ft)
		return tonumber(ft[0].dwLowDateTime)/1e7 + tonumber(ft[0].dwHighDateTime)*(2^32/1e7) - 11644473600
	end
end

local hash = ffi.new("uint8_t[?]", size)
for i = 0,size-1 do --Because C arrays start from 0
	hash[i]=i%127
end

local t = tick()

for i = 0,size-1 do
	hash[i] = hash[i]*2
end

print(tick()-t) --0.01 seconds
  • LuaJIT:
local size = 256*256*256
local ffi = require"ffi"
local tick do --Code for a LuaJIT tick() function
	ffi.cdef [[typedef unsigned long DWORD, *PDWORD, *LPDWORD;
		typedef struct _FILETIME {
		  DWORD dwLowDateTime;
		  DWORD dwHighDateTime;
		} FILETIME, *PFILETIME;

		void GetSystemTimeAsFileTime ( FILETIME* );]]
	local ft = ffi.new ("FILETIME[1]")
	tick = function()
		ffi.C.GetSystemTimeAsFileTime(ft)
		return tonumber(ft[0].dwLowDateTime)/1e7 + tonumber(ft[0].dwHighDateTime)*(2^32/1e7) - 11644473600
	end
end

local hash = {}
for i = 1,size do
	hash[i]=i%127
end

local t = tick()

for i = 1,size do
	hash[i] = hash[i]*2
end

print(tick()-t) --0.02 seconds (twice as slow as with ffi)
  • Luau:
local size = 256*256*256

local hash = {}
for i = 1,size do
	hash[i]=i%127
end

local t = tick()

for i = 1,size do
	hash[i] = hash[i]*2
end

print(tick()-t) --0.36 seconds

In this benchmark, on my i5 3570, Luau is 36 times slower than LuaJIT with ffi and 18 times slower than LuaJIT without ffi. The potential for optimization in Luau is still great, and a library like ffi would be a great addition to it. And it’s not just that: custom structs with custom metamethods stored in long arrays would be even more efficient.

9 Likes

FFI isn’t memory safe. We will never ship any features that aren’t memory safe.
This doesn’t necessarily preclude us from implementing native memory-efficient arrays, but that won’t have much to do with FFI.

The performance comparisons are also somewhat unfair here: how fast is LuaJIT with the JIT disabled on this benchmark?

2 Likes

I missed this, sorry. If the table doesn’t reallocate then it’s indeed faster to “cache” the table length as you do right now. Computing the length is not free - it’s very cheap for “proper” arrays with exact size but gets expensive when arrays get reallocated. I’ll spare the details but it can take time, e.g. in one of our benchmarks that tested array assignment performance, we had to change a loop like this

for i=1,N do t[#t+1] = i end

to a loop like this:

for i=1,N do t[i] = i end

to make sure the extra overhead of # didn’t “pollute” the benchmark numbers.

I have a dream of fixing this, but it involves a deep semantic change to how Lua tables/arrays behave with respect to length. It would have compatibility issues, and I’m not sure we’ll be able to deploy it safely (this is because it would change the semantics of the # operator on tables with holes).
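
To see why holes are a problem, here’s a small illustration of the current behavior (plain Lua/Luau semantics; the exact number you get can vary with the internal table layout):

-- '#' returns a "border": an index n where t[n] ~= nil and t[n + 1] == nil.
-- A table with holes can have more than one border, so the language doesn't
-- pin down which one you get.
local t = {}
t[1] = "a"
t[2] = "b"
t[4] = "d" -- hole at index 3

print(#t) -- both 2 and 4 are valid answers here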

5 Likes

How about static objects and static arrays for less overhead? Metamethods could be created for them, and they could be checked for memory safety before runtime. Calling these methods could be faster, as they could be optimized without memory checking at runtime.

jit.off() pretty much kills the ffi casting. The benchmark with the ffi array becomes 1.8 seconds (5 times slower than Luau), but the one without ffi and without the JIT is still 3.6 times faster than Luau. jit.off() makes my point about using ffi moot, since (for LuaJIT) it depends on the JIT component.
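
For reference, disabling the JIT in LuaJIT is just this (a minimal sketch; the benchmark bodies stay the same as above):

-- Turn off LuaJIT's trace compiler so only its interpreter runs, which is
-- closer to an apples-to-apples comparison with Luau's interpreter.
jit.off()

-- ...then run the same array benchmarks as above.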

The reason we can’t have runtime compilation is that Apple doesn’t allow it, which in turn means a JIT can’t be used (Apple Developer Program License Agreement, APIs and Functionality, 3.3.2).

But why not JIT-compile Luau for servers? Is it not worth the time to implement? Personally I’d like a speed boost most for clients, but a much faster server could open up more possibilities.

The interpreter could check the boundaries of the arguments and error before running the code, disallow uninitialized/null pointers, and disallow dynamic access to the non-memory-safe parts of an ffi library.

You’ve most likely entertained all these ideas before, but we’d like to hear the reasons why, for example, Roblox isn’t developing JIT compilation for servers.

I’ve added a backlog item to look at the performance of the simple array fill loop; I don’t think this has come up in our benchmarks (table.create et al. is usually a better fit), but we can make a few optimizations to make that faster in the interpreter.
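
For anyone unfamiliar with the pattern, here is a minimal sketch of the table.create approach (sizes borrowed from the benchmark above):

-- table.create(count, value) preallocates the array portion of the table up
-- front, so the fill loop doesn't pay for repeated reallocations while growing.
local size = 256 * 256 * 256
local hash = table.create(size, 0)

for i = 1, size do
	hash[i] = i % 127
end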

As for your other questions, all good things in due time. You may be interested in

10 Likes

Looking at that video, should we expect the syntax for the “Typed Lua” on the roadmap to be similar to the syntax in the video? Or was this merely for the Lua JIT compiler?


3 Likes

Also very curious about this point. It would be a shame if typed Lua didn’t have rich interface types similar to TypeScript; I just wonder how types will be exported between modules, since right now ModuleScripts can only return a single value.

2 Likes

Something that I would love to see in Lua is support for the simple math operations: by that I mean ++, +=, --, -=, *=, /=, etc. These exist in many programming languages and they are very easy to use. They can help you shorten your code, so you don’t need to write a variable twice just to add one to it (Variable = Variable + 1). I would love to see these in Lua, as they would make simple math operations very easy and short.

1 Like

It would be interesting to see them added, but you’d likely need to bring that up with Lua themselves rather than Roblox.

1 Like

If you’re interested in those being added, you should go make a feature request for it – or support an existing one, since I’m sure these have been requested before. I believe they’re called assignment operators.

I don’t think the odds are high of ++ and -- being added, at the very least; -- happens to be the syntax for comments, so that would be a bit awkward.
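
To illustrate the clash, since -- already starts a comment (a minimal sketch, not a proposal):

local count = 10

count = count - 1 -- the idiomatic way to decrement today
-- count--        -- here "--" would be read as the start of a comment, not a postfix decrement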

2 Likes

This is the syntax for typed Lua, which should go in beta this week. We automatically support exporting types declared in a module right now. All of this is subject to change - our plan is to evolve type support based on the feedback we get once it goes beta.
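
As a rough sketch of what declaring and exporting a type from a module can look like (the Point type here is just a made-up example, and since this is in beta the exact syntax and export mechanics may evolve):

-- ModuleScript: declares a table type and uses it in its annotations.
-- 'export type' makes the type visible to scripts that require this module.
export type Point = {
	x: number,
	y: number,
}

local Points = {}

function Points.new(x: number, y: number): Point
	return { x = x, y = y }
end

return Points

A consumer that requires this module can then refer to the type through the module alias, e.g. local p: Points.Point = Points.new(1, 2).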

5 Likes

Just a note that while we will certainly look at new features that get added to mainline Lua, we will only port them to Luau if the feature doesn’t have notable downsides and/or is very valuable. We have no intention right now to take some of the newer Lua 5.4 features like toclose or const for example.

5 Likes

Increment operators are just a quality-of-life feature. For loops are their main use case in other languages, and Lua’s for loop operates on a range instead of an increment expression, so even if they were added, the increment operators would not see as much use. And since -- cannot be implemented anyway because it conflicts with comment syntax, it would be logical not to add them at all.
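
For comparison, the usual C-style loop header versus Lua’s range-based numeric for (a minimal sketch):

local n = 10

-- C-style:  for (int i = 0; i < n; i++) { ... }  -- leans on ++ in the loop header
-- Lua's numeric for takes a start, a limit, and an optional step instead:
for i = 1, n do
	-- body runs with i = 1, 2, ..., n
end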

This topic was automatically closed 120 days after the last reply. New replies are no longer allowed.