Is there any news you can share on multi-threaded Lua execution? Would it allow us to use 100% of the CPU power through Lua?
Alrighty. Well in that case, is there a chance we could get these fastcall optimizations for typeof too? I prefer to use it in Roblox for basically the same reason you’d use type (type checking arguments, primarily) so having it be faster would be a plus.
I would also like that sort of optimization on string.byte because it ends up in stuff like hashing or base conversion a bunch. There may not be common enough use to justify it though.
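To illustrate the kind of base-conversion loop mentioned above, here is a minimal sketch that hex-encodes a string with one string.byte call per character. The helper name toHex is made up for this example; it is not from the poster's code.

```lua
-- Hypothetical helper: hex-encode a string one byte at a time.
-- Each iteration makes one string.byte call, so a 10 KB input makes ~10k calls.
local function toHex(s)
    local out = {}
    for i = 1, #s do
        out[i] = string.format("%02x", string.byte(s, i))
    end
    return table.concat(out)
end

print(toHex("Lua")) --> 4c7561
```

Loops shaped like this are where a faster string.byte would show up, since the call overhead is paid once per input byte.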
Thank you for clearing that up! That does make a lot of sense. The above reply also made me think… Are string functions already using this optimization? Some of my code does tons of string processing (gsub, sub, gmatch, etc). Depending on the size of the content I’m processing I could be making hundreds to hundreds of thousands of string calls.
I wrote a basic compression algorithm which does this and I have had 10-15kb strings that get fed through it which is 10-15k iterations and many more calls.
We had a version of string.byte using this that accelerates our Base64 benchmark, but string.byte is slightly tricky so we haven’t completed the implementation to be production-ready yet.
Please share the code for this with us if you can (PM works) - we can then add this to our benchmark suite, so that we can focus on updates that clearly improve it. It’s hard to tell without testing whether a specific optimization is impactful or not, and we want to carefully expand the builtins because for them to be fast, they by necessity have to replicate the original function’s logic so we need to be careful to keep the behavior exactly identical.
Out of curiosity, do you use typeof as a blanket replacement for type and use it for primitive types a lot, or do you mostly use type for primitive types and typeof for Roblox types? Code examples where you often use typeof would be appreciated - I think our typeof coverage in our benchmark suite is lacking. type specifically improved our Roact benchmark by like 5% because it had a number of type assertions.
Not ready to talk about this yet, sorry - we have not started the implementation. The goal for this project is indeed to allow Lua to use ~100% of CPU power, but there’s going to be caveats and this is going to ship late 2020.
We don’t have plans to implement table.map. Our general policy for new Lua functions is:
- A function gets implemented if it’s impossible to implement yourself in Lua or the implementation is really involved
- A function gets implemented if it’s very often used and everybody ends up reimplementing it
- A function gets implemented if it gives non-trivial performance benefits over a reimplementation in Lua
Generally two of these should be true for a function to be added.
The reason why we implemented table.create is a combination of 1 and 3. You cannot implement an efficient replacement for this function in some cases, and in some cases you can but the implementation is crazy (I think @Tomarty has one?)
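As a rough sketch of where table.create fits: it builds an array of a known size up front, filled with an initial value, so a following loop does not have to grow the table step by step. The fallback function and the squares helper below are assumptions added only so the snippet also runs outside Roblox; the fallback does not preallocate anything.

```lua
-- table.create(count, value) exists in Luau. The fallback is only here so the
-- sketch runs in stock Lua; it fills the table but does not preallocate it.
local create = table.create or function(count, value)
    local t = {}
    for i = 1, count do
        t[i] = value
    end
    return t
end

-- Hypothetical helper: fill a presized array with squares.
local function squares(n)
    local t = create(n, 0)
    for i = 1, n do
        t[i] = i * i
    end
    return t
end

local s = squares(5)
print(#s, s[5])
```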
The reason why we implemented table.find is a combination of 2 and 3. It’s often used so it makes sense to have it in the library, and it’s 2-3 times faster than a Lua implementation.
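For reference, this is roughly the linear search that people tend to reimplement, written here as a fallback so the snippet also runs in stock Lua. It is a sketch of the behavior (return the index of the first match, or nil), not the engine's implementation.

```lua
-- Use the builtin when it exists (Luau), otherwise a plain Lua linear search.
local find = table.find or function(haystack, needle, init)
    for i = init or 1, #haystack do
        if haystack[i] == needle then
            return i
        end
    end
    return nil
end

local fruits = {"apple", "banana", "cherry"}
print(find(fruits, "banana")) --> 2
print(find(fruits, "durian")) --> nil
```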
table.map doesn’t really fit these right now:
- It’s easy to implement in Lua
- It’s not a commonly used mechanism in typical Roblox code (I recognize that some people are familiar with functional programming constructs, but the scope is just different vs table.find et al)
- It’s not going to be faster if we implement it in C
- Moreover, it is likely to be slower because every function call will go through C->Lua boundary
- Every time you call this function, you’d have to allocate a closure for the transform function. So we would be inviting inefficient code.
Coincidentally we plan to implement some closure allocation optimizations that may make the last point a non-issue in the future, but the design hasn’t been finalized - as usual, there’s some odd interactions with getfenv/setfenv (aka my worst enemy).
So yeah, please use for loops for now.
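A small sketch of the difference being described: the plain loop does the transform inline, while a map-style helper needs a function value (often a freshly allocated closure) and one extra call per element. The map helper below is hypothetical and written in Lua purely for comparison.

```lua
local prices = {10, 20, 30}

-- Plain for loop: no closure, no per-element function call.
local doubled = {}
for i = 1, #prices do
    doubled[i] = prices[i] * 2
end

-- Hypothetical map helper for comparison: same result, but it takes a
-- transform function and calls it once per element.
local function map(t, fn)
    local out = {}
    for i = 1, #t do
        out[i] = fn(t[i])
    end
    return out
end

local doubledViaMap = map(prices, function(v)
    return v * 2
end)

print(doubled[3], doubledViaMap[3])
```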
As an aside to anyone who is interested, map is included in the rodash library, along with loads of other utilities to support functional programming constructs.
https://codekingdomsteam.github.io/rodash/api/Tables/#map
It’s not blazing fast, and shouldn’t be used in performance sensitive code, but it helps develop clean code which in most cases is what you’ll want to be writing.
table.foreach is pretty much what you’re asking for. I believe it’s deprecated due to the stuff mentioned above.
@zeuxcg I’ll make sure to get my code to you sometime soon. It’s also a bit messy since it’s kind of a prototype of my algorithm so I may comment it for you if it helps out with testing.
There are so many game-changing things in this post - thanks for investing the time into this.
“On Windows and Xbox, we’ve tuned our interpreter to be ~5-6% faster on Lua-intensive code”
Are there any changes to the performance of mobile devices? Particularly the lower-end devices.
Most of the updates improve performance across the board; this one was highlighted in particular because it was specific to the compiler we use on Windows/Xbox builds. We didn’t do any additional mobile-specific tuning yet, although of course the Luau release itself improved the script execution performance noticeably on mobile devices as well (more so on Android than on iOS due to the hardware differences). I don’t think we have up-to-date numbers on this (performance got better), but here are numbers from April this year from an early unreleased version of Luau:
iOS, iPhone6:
Benchmark | Lua time / Luau time (04/12) |
---|---|
TerrainGenerator | 1.61 |
N-Body | 1.74 |
Life | 1.74 |

Android, Pixel1:
Benchmark | Lua time / Luau time (04/12) |
---|---|
TerrainGenerator | 1.64 |
N-Body | 2.68 |
Life | 3.12 |
Factorial | 2.86 |
Excited to see these changes in action, especially with some of those table functions! Great job
Excellent, thanks for clarifying
I try to use typeof over type in all cases on Roblox, yes. The only time I wouldn’t is if I were trying to filter out userdata specifically because it’s easier to write type(x) ~= "userdata" than it is to write it using typeof.
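For anyone following along, here is a minimal sketch of the argument-checking pattern being discussed. On Roblox, typeof reports names like "Vector3" for engine types where type only says "userdata". The expect and setHealth functions are hypothetical, and the type fallback is an assumption added only so the snippet runs outside Roblox.

```lua
-- typeof is a Roblox global; fall back to type so the sketch runs in stock Lua.
local typeof = typeof or type

-- Hypothetical assertion helper: one typeof call per checked argument.
local function expect(value, expectedType, argName)
    local actual = typeof(value)
    if actual ~= expectedType then
        error(string.format("%s: expected %s, got %s", argName, expectedType, actual), 2)
    end
    return value
end

-- Hypothetical function that validates its argument before using it.
local function setHealth(amount)
    expect(amount, "number", "amount")
    return math.max(0, amount)
end

print(setHealth(50))
print(pcall(setHealth, "full"))
```

Functions that validate every argument like this call typeof on each invocation, which is why a faster typeof would add up in assertion-heavy code.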
I don’t have any great examples for using typeof a bunch since in cases where I’d have to, I cache the result to avoid calling it a bunch. That being said, I do have some code that with some slight modification would result in typeof getting called a bunch. It’s a function to convert data types to a string containing a constructor for them that I used in an old plugin. If I removed the caching for the result of typeof it could reasonably be used as a benchmark I think.
I’ll clean it up and send it to you later today if you’re interested.
If anyone’s curious about this:
To my understanding, using locals to reference variables in a default environment gives direct access to said variables. If I was to localize needed variables, then modify the Lua environment, would previously localized variables still be optimized or impacted by deoptimization?
Touching on table.create(), is it recommended to switch all table creation functions and whatnot to this new format for performance, or is the change not that big (i.e. not a huge game-changer for lag, etc.)?
It depends. Generally, though, if you’re not sure why or whether table.create would help you, it won’t; a lot of games and developers won’t have a use for it.
Alright, thanks for the clarification.
I did not know that those table functions were run in C, interesting to know :).
I just encountered a correctness error:
for i = 0.05, 0, -0.05 do print("A", i) end
for i = 0, 1 do print("B", i) end
print("========")
for i = 0.05, 0, -0.05 do print("A", i) end for i = 0, 1 do print("B", i) end
Observe that the two sets of loops do the same thing, with the only difference being a newline. When run within Play Solo or Run, the following output is produced:
A 0.05
A 0
B 0
B 1
========
A 0.05
A 0
B nil