Luau Recap: November 2019

Is there any news you can share on multi-threaded Lua execution? Would it allow us to use 100% of the CPU power through Lua?

18 Likes

Alrighty. Well in that case, is there a chance we could get these fastcall optimizations for typeof too? :stuck_out_tongue: I prefer to use it in Roblox for basically the same reason you’d use type (type checking arguments, primarily) so having it be faster would be a plus.

I would also like that sort of optimization on string.byte because it ends up in stuff like hashing or base conversion a bunch. There may not be common enough use to justify it though.
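For anyone wondering what that kind of string.byte-heavy code looks like, here is a minimal sketch of a base conversion loop. The helper name `toHex` is made up for illustration; the point is only that string.byte gets called once per character, so its call overhead adds up on long inputs.

```lua
-- Hypothetical helper: hex-encode a string.
-- string.byte is called once per character of the input.
local function toHex(s)
    local out = {}
    for i = 1, #s do
        out[i] = string.format("%02x", string.byte(s, i))
    end
    return table.concat(out)
end

print(toHex("Lua")) --> 4c7561
```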

1 Like

Thank you for clearing that up! That does make a lot of sense. The above reply also made me think… Are string functions already using this optimization? Some of my code does tons of string processing (gsub, sub, gmatch, etc). Depending on the size of the content I’m processing I could be making hundreds to hundreds of thousands of string calls.

I wrote a basic compression algorithm that does this, and I have fed 10-15 KB strings through it, which means 10-15k iterations and many more string calls.

2 Likes

We had a version of string.byte using this that accelerates our Base64 benchmark, but string.byte is slightly tricky so we haven’t completed the implementation to be production-ready yet.

Please share the code for this with us if you can (PM works). We can then add it to our benchmark suite so that we can focus on updates that clearly improve it. It’s hard to tell without testing whether a specific optimization is impactful, and we want to expand the builtins carefully: for them to be fast, they necessarily have to replicate the original function’s logic, so we need to keep the behavior exactly identical.

3 Likes

Out of curiosity, do you use typeof as a blanket replacement for type and use it for primitive types a lot, or do you mostly use type for primitive types and typeof for Roblox types? Code examples where you often use typeof would be appreciated - I think our typeof coverage in our benchmark suite is lacking. type specifically improved our Roact benchmark by like 5% because it had a number of type assertions.

3 Likes

Not ready to talk about this yet, sorry - we have not started the implementation. The goal for this project is indeed to allow Lua to use ~100% of CPU power, but there are going to be caveats and this is going to ship late 2020.

12 Likes

We don’t have plans to implement table.map. Our general policy for new Lua functions is:

  • A function gets implemented if it’s impossible to implement yourself in Lua or the implementation is really involved
  • A function gets implemented if it’s very often used and everybody ends up reimplementing it
  • A function gets implemented if it gives non-trivial performance benefits over a reimplementation in Lua

Generally two of these should be true for a function to be added.

The reason why we implemented table.create is a combination of 1 and 3. In some cases you cannot implement an efficient replacement for this function, and in others you can but the implementation is crazy (I think @Tomarty has one?)

The reason why we implemented table.find is a combination of 2 and 3. It’s often used so it makes sense to have it in the library, and it’s 2-3 times faster than a Lua implementation.
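For reference, the kind of helper that everybody ends up reimplementing looks roughly like the sketch below. The name `find` and the optional `init` start index are assumptions made for this example; it is a plain linear search, not the builtin's actual implementation.

```lua
-- Pure-Lua linear search over the array part of a table.
-- Returns the index of the first match at or after `init`, or nil.
local function find(haystack, needle, init)
    for i = init or 1, #haystack do
        if haystack[i] == needle then
            return i
        end
    end
    return nil
end

print(find({"a", "b", "c"}, "b")) --> 2
```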

table.map doesn’t really fit these right now:

  • It’s easy to implement in Lua
  • It’s not a commonly used mechanism in typical Roblox code (I recognize that some people are familiar with functional programming constructs, but the scope is just different vs table.find et al)
  • It’s not going to be faster if we implement it in C
  • Moreover, it is likely to be slower because every function call will go through C->Lua boundary
  • Every time you call this function, you’d have to allocate a closure for the transform function. So we would be inviting inefficient code.

Coincidentally we plan to implement some closure allocation optimizations that may make the last point a non-issue in the future, but the design hasn’t been finalized - as usual, there are some odd interactions with getfenv/setfenv (aka my worst enemy).

So yeah, please use for loops for now.
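To illustrate the "easy to implement in Lua" point, here is a minimal sketch of both options. The `map` helper and the sample data are hypothetical; the inline loop does the same work without allocating a closure for the transform.

```lua
-- Hypothetical map helper written in plain Lua.
local function map(list, transform)
    local result = {}
    for i = 1, #list do
        result[i] = transform(list[i])
    end
    return result
end

local prices = {10, 20, 30}

-- Option 1: the helper; the anonymous function is the transform closure.
local viaMap = map(prices, function(p) return p * 2 end)

-- Option 2: an inline for loop, no closure involved.
local doubled = {}
for i = 1, #prices do
    doubled[i] = prices[i] * 2
end

print(viaMap[3], doubled[3]) --> 60	60
```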

5 Likes

As an aside to anyone who is interested, map is included in the rodash library, along with loads of other utilities to support functional programming constructs.

https://codekingdomsteam.github.io/rodash/api/Tables/#map

It’s not blazing fast, and shouldn’t be used in performance sensitive code, but it helps develop clean code which in most cases is what you’ll want to be writing.

1 Like

table.foreach is pretty much what you’re asking for. I believe it’s deprecated due to the stuff mentioned above.

@zeuxcg I’ll make sure to get my code to you sometime soon. It’s also a bit messy since it’s kind of a prototype of my algorithm so I may comment it for you if it helps out with testing.

1 Like

There are so many game-changing things in this post - thanks for investing the time into this.

“On Windows and Xbox, we’ve tuned our interpreter to be ~5-6% faster on Lua-intensive code”

Are there any changes to the performance of mobile devices? Particularly the lower-end devices.

1 Like

Most of the updates improve performance across the board; this one was highlighted in particular because it was specific to the compiler we use on Windows/Xbox builds. We didn’t do any additional mobile-specific tuning yet, although of course the Luau release itself improved the script execution performance noticeably on mobile devices as well (more so on Android than on iOS due to the hardware differences). I don’t think we have up-to-date numbers on this (performance got better), but here are numbers from April this year from an early unreleased version of Luau:

iOS, iPhone6 (Lua time / Luau time, 04/12)
  TerrainGenerator  1.61
  N-Body            1.74
  Life              1.74

Android, Pixel1 (Lua time / Luau time, 04/12)
  TerrainGenerator  1.64
  N-Body            2.68
  Life              3.12
  Factorial         2.86

4 Likes

Excited to see these changes in action, especially with some of those table functions! Great job :slight_smile:

2 Likes

Excellent, thanks for clarifying

1 Like

I try to use typeof over type in all cases on Roblox, yes. The only time I wouldn’t is if I were trying to filter out userdata specifically because it’s easier to write type(x) ~= "userdata" than it is to write it using typeof.

I don’t have any great examples for using typeof a bunch since in cases where I’d have to, I cache the result to avoid calling it a bunch. That being said, I do have some code that with some slight modification would result in typeof getting called a bunch. It’s a function to convert data types to a string containing a constructor for them that I used in an old plugin. If I removed the caching for the result of typeof it could reasonably be used as a benchmark I think.

I’ll clean it up and send it to you later today if you’re interested.
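In the meantime, a rough sketch of the caching pattern described above might look like this. It is not the plugin code: the function name `toConstructorString` and the set of handled types are made up for illustration, and the `typeof or type` fallback is only there so the snippet also runs outside Roblox.

```lua
-- Assumption: `typeof` is the Roblox global; fall back to `type` elsewhere.
local typeof = typeof or type

-- Hypothetical converter: turns a value into source text that recreates it.
local function toConstructorString(value)
    local kind = typeof(value) -- called once and cached in a local
    if kind == "string" then
        return string.format("%q", value)
    elseif kind == "number" or kind == "boolean" or kind == "nil" then
        return tostring(value)
    elseif kind == "Vector3" then
        -- Only reachable on Roblox, where typeof reports "Vector3".
        return string.format("Vector3.new(%s, %s, %s)", value.X, value.Y, value.Z)
    end
    error("unsupported type: " .. kind)
end

print(toConstructorString("hi")) --> "hi"
print(toConstructorString(42))   --> 42
```

Dropping the `kind` local and calling typeof(value) in every branch would turn this into a typeof-heavy benchmark.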

1 Like

If anyone’s curious about this:

7 Likes

To my understanding, using locals to reference variables in a default environment gives direct access to said variables. If I were to localize the variables I need and then modify the Lua environment, would the previously localized variables still be optimized, or would they be impacted by deoptimization?

1 Like

Touching on table.create(), is it recommended to switch all table creation functions and whatnot to this new format for performance, or is the change not that big (i.e. not a huge game-changer for lag, etc.)?

1 Like

It depends. Generally, if you’re not sure why or whether table.create would help you, it won’t; a lot of games and developers won’t have a use for it.
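As a rough sketch of the case where it can matter: filling a large array whose size is known up front. The cell count is a made-up number, and the fallback function exists only so the snippet runs where table.create is unavailable; it does not preallocate anything.

```lua
-- Assumption: table.create(count, value) exists (Luau); otherwise use a
-- plain-Lua stand-in that fills the array one element at a time.
local create = table.create or function(count, value)
    local t = {}
    for i = 1, count do
        t[i] = value
    end
    return t
end

local CELL_COUNT = 1000 -- hypothetical size known ahead of time

-- One call produces an array of CELL_COUNT zeros.
local grid = create(CELL_COUNT, 0)

print(#grid, grid[1], grid[CELL_COUNT]) --> 1000	0	0
```

For small tables or tables built up incrementally, an ordinary `{}` is fine.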

2 Likes

Alright, thanks for the clarification.
I did not know that those table functions were run in C, interesting to know :).

1 Like

I just encountered a correctness error:

for i = 0.05, 0, -0.05 do print("A", i) end
for i = 0, 1 do print("B", i) end

print("========")

for i = 0.05, 0, -0.05 do print("A", i) end for i = 0, 1 do print("B", i) end

Observe that the two sets of loops do the same thing, with the only difference being a newline. When run within Play Solo or Run, the following output is produced:

A 0.05
A 0
B 0
B 1
========
A 0.05
A 0
B nil

6 Likes