Faster Lua VM: Studio beta

zeuxcg · June 25, 2019, 11:27pm

We’re excited to share an early preview of our “Faster Lua” project with you today!

What is it?

As the platform grows, we see more and more use of Lua, and as a result it’s more and more important to make sure Lua scripts run efficiently. We’ve spent some time over the last few years optimizing our reflection layer - the system that allows Lua scripts to communicate with the engine - but we’ve never done anything about Lua itself.

With the goal of having Lua run faster on all our platforms, we’ve evaluated a set of approaches and decided to build a new Lua implementation because we’re crazy. The goal is complete feature parity - scripts that run in the old VM should run in the new VM - and better performance.

We have written a brand new compiler and a brand new interpreter, and we’ve made a few changes to the reflection layer and garbage collector, but we haven’t changed the standard library, so depending on the performance characteristics of your code you will see varied gains. Reflection access got a bit faster in both old VM and new VM as a result of this work as well.

In usual scripts we see a mix of reflection access and other work - depending on what the scripts do you can get all the way between “it’s not faster at all” and “it’s several times faster” as a result. Our builtin terrain generator is about 2x faster with new VM, and some scripts we’ve looked at are up to 3x faster but it’s more common to see more moderate gains.

New VM also produces bytecode that’s a bit smaller than old VM does; in particular we no longer send local variable names to the client, which will make decompilers slightly less useful.

How do I get it?

Assuming you are enrolled in our beta program (if you aren’t, you’ll want to read this), you can access the list of beta features in the File menu:

And then enable this beta feature (don’t forget to restart Studio!):

After this, new VM will be active in Studio for all script execution, including game scripts, core scripts, plugins and command bar.

Debugger doesn’t work!

We currently only support a limited amount of interaction with the debugger - break on error and inspection of the variables on the call stack should work, but breakpoints and stepping don’t work yet.

We’re actively working on implementing debugger support in the new VM; we decided to share this early preview even though it’s not ready.

Note: in old VM, enabling debugger would adversely affect performance of game scripts in Studio; new VM will have a zero-overhead debugger (no overhead when debugger is enabled but you aren’t actively interacting with it), which will mean that performance in Studio test modes accurately reflects desktop performance.

My plugin / game breaks when I enable this!

Our intention is to support all existing Lua code that runs on the Roblox platform without changes. We don’t know if it’s going to be possible but that’s the hope. If you have any scripts that misbehave in the new VM, please report them on this thread so that we can investigate and, hopefully, fix the problem.

My game works! How do I get this performance benefit on actual client/server?

We plan to enable the new VM fully for all existing games on client/server eventually, but we want to be careful about doing this.

If your game is popular, you have tested your game in Studio and you would like us to enable the new VM for your game, please let us know (note that we won’t be able to do this immediately but we’ll try to do this in the following weeks, and we’ll personally notify devs who requested this once it’s live).

Once enough popular games have the new VM running well on production and we’re pretty sure we don’t have any outstanding bugs, we’ll try to enable the new VM for all existing games; once we confirm that there are no issues with the new VM anywhere, we will deprecate and remove the old VM.

How do I know what Lua constructs are fast in the new VM and what are slow?

We have a host of different optimizations in the new VM, and some of them render some performance advice obsolete, such as the need to cache certain table lookups like math.sqrt.
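As a rough illustration of the caching advice mentioned above, here is a minimal sketch of the two styles side by side. The function names `sumRootsCached` and `sumRootsPlain` are made up for this example, and it makes no claim about actual timings in either VM; it only shows that both forms compute the same result, so the plain form can be used wherever caching is no longer needed.

```lua
-- Older advice: cache the library function in a local before a hot loop.
local sqrt = math.sqrt

-- Hypothetical helper: sums square roots using the cached local.
local function sumRootsCached(n)
    local total = 0
    for i = 1, n do
        total = total + sqrt(i)
    end
    return total
end

-- Hypothetical helper: same loop, looking up math.sqrt on every iteration.
local function sumRootsPlain(n)
    local total = 0
    for i = 1, n do
        total = total + math.sqrt(i)
    end
    return total
end

-- Both forms perform the same additions in the same order.
print(sumRootsCached(100) == sumRootsPlain(100)) -- true
```

If you want to know whether the cached form still matters for your own code, measuring both variants in Studio with the beta enabled is the safest way to find out.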

It’s also worth noting that in the new VM, you pay a more substantial penalty for use of getfenv/setfenv functions. We strongly recommend not using either of these in any scripts if at all possible, and to stop using module systems that inject globals using setfenv in favor of more traditional use of named require imports e.g. local module = require(Path.To.Module).
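The named-import style recommended above might look roughly like the sketch below. The `Inventory` table and its `countItems` function are hypothetical stand-ins for whatever a ModuleScript returns; in a real game the table would come from `require(Path.To.Module)` rather than being defined inline.

```lua
-- Hypothetical module table. In a real game this would be the value
-- returned by a ModuleScript and obtained with require(Path.To.Module).
local Inventory = {}

function Inventory.countItems(items)
    local count = 0
    for _ in pairs(items) do
        count = count + 1
    end
    return count
end

-- The consumer binds the module to a local name and calls through it,
-- instead of having its functions injected as globals via setfenv.
local inventory = Inventory

print(inventory.countItems({ sword = 1, shield = 1 })) -- 2
```

Because every dependency is an ordinary local, nothing here touches the script environment, so the script stays eligible for the optimizations that getfenv/setfenv would give up.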

We are going to publish a script performance guide in the coming months that describes the performance characteristics of common Lua constructs in detail; I will also give a talk about this at RDC in August, which you’re welcome to attend or tune in to if we’re streaming it.

I want my scripts to run even faster!

We have more performance tuning we’d like to do on the new VM this year, so if you have code that’s particularly performance heavy feel free to share it with us and we’ll see what we can do! We are also starting to look into what it would take to enable multi-threading support for Lua but this will not happen this year.

Behavior / compatibility changes in new VM

game("GetService", "RunService") is no longer a valid way to invoke the GetService method. This way of calling methods was an accidental side effect of how we implemented namecall; for new VM we have revised the namecall implementation to be more efficient which means that this no longer works. Note that this was never a supported extension to Roblox Lua and we have warned about this breaking in the future without notice.
There’s now a limit on the nesting depth of complex expressions; previously 0+0+...+0 could compile for an arbitrary amount of intermediate additions, we now limit the depth of expression trees to a reasonable value (the current limit is 1000).
Local variable names and upvalue names are no longer mentioned in the error messages because VM no longer tracks them (except in the debugger)
~~0/0 and some other constant expressions of this kind produce 0 instead of NaN on 64-bit Windows Studio.~~ This has been fixed in Studio 392.
~~Scripts with a lot of different integer numbers take a lot of time to compile in 64-bit Studio.~~ This has been fixed in Studio 392.
~~Indexing unassigned globals (print(a.a)) results in nil.~~ This has been fixed in Studio 394.
~~When compiling very long chains of method call expressions, such as obj:Method1(1,2):Method2(1,2):....., compiler may run out of registers.~~ This has been fixed in Studio 396.
~~When compiling multi-line string literals with embedded Windows-style line endings, the resulting literal contains \r characters.~~ This has been fixed in Studio 396.

Wsly · June 26, 2019, 8:35pm

This is plain awesome, and every performance increase will allow complex games to push forward. However, as games grow more complex, managing object references and using them from multiple scripts can quickly become a chore. In the topic regarding setfenv and getfenv, a lot of developers shared their techniques for using these methods to handle references and write cleaner code that doesn’t require stacking many WaitForChild() calls for each reference.

With the new VM taking a performance hit when getfenv and setfenv are used, implementing this strategy differently for future projects would be the way to go. However, as all games will eventually switch to the new VM, it raises the question whether the ‘New VM but performance hit’ situation is still faster than or equal to the current situation, or potentially worse. Knowing this would help developers decide whether it’s worth optimizing existing games already using these systems.

I’m also wondering whether the resulting performance hit when using setfenv/getfenv exclusively affects the specific scripts/threads, or the game in general. Are there VM/Game-wide differences when these methods are used?

lunoeh · June 26, 2019, 8:36pm

print(0/0*0) is 0 in new but -nan(ind) in old

zeuxcg · June 26, 2019, 8:38pm

Code that uses getfenv/setfenv doesn’t run slower in the new VM compared to the old VM based on our analysis, so you don’t have to rewrite existing code. You will be missing out on some performance improvements, which is why we advise against it.

Using getfenv/setfenv will penalize access to builtin globals (math, game, etc.) in any threads that were run from the same script, and (with a subsequent set of changes that hasn’t happened yet) will disable optimizations on some built-in functions like math.max etc.

Also worth noting is that getfenv/setfenv will not be compatible with Typed Lua - in typed scripts, using getfenv/setfenv will violate type invariants, so code that seems type-safe won’t be. And I guess script analyzer has been warning about this for a while now. The current preview doesn’t have type support yet, so that’s coming later this year, but it’s worth keeping in mind - getfenv/setfenv is just not a sound mechanism going forward.

superhudhayfa · June 26, 2019, 8:40pm

Might be helpful to say what VM stands for for us skids (script kiddies) out there who aren’t as used to these terms.

EDIT: In case people reply to this, a search makes me come to the conclusion it stands for “Virtual Machine”, which I really should have already known, but yeah - if that’s wrong, then feel free to correct.

ThatTimothy · June 26, 2019, 8:48pm

This. Is. Awesome.

With the new VM, scripts run significantly faster! The thing I find the most cool about this is Roblox is reaching out to check with devs before implementing a cross-platform change.

Hope to see more development to this, have a good day!

colbert2677 · June 26, 2019, 9:05pm

Well, this is a welcome holiday gift (exams just finished yesterday). I’m hoping that, combined with good practice, code will see significantly better performance in production servers and whatnot. Of course, there’s still the need to improve our practices to ensure we aren’t tanking our own performance even with the new VM.

NobleReign · June 26, 2019, 9:05pm

time to never attempt to optimize code again O_O
in all seriousness, this is awesome, especially for the people who can’t afford supercomputers and want to try to play some intensive game. maybe i can finally play Jailbreak on my laptop

also, where do we sign up to test this in a real game?

Jrelvas1 · June 26, 2019, 9:05pm

YES! Now AI-Centric games can go WILD, thanks for this change!

Maximum_ADHD · June 26, 2019, 9:12pm

If getfenv/setfenv are not advised for future use, could we have an include function that loads variables into the environment from a required ModuleScript that returns a dictionary of variables to load?

I think a lot of the primary use cases for environment manipulation revolve around avoiding redundant variable declarations across a project, such as having to locate a specific ModuleScript in the DataModel using WaitForChild.

grilme99 · June 26, 2019, 9:15pm

Will this update increase the speed of obfuscated code? I sell a service, and some parts of my product are pretty obfuscated, which can slow down the script quite a bit. This is particularly noticeable when doing loops. If I tween a frame’s transparency with a for loop, it will be very slow and choppy.

This update looks pretty awesome, though!

Jrelvas1 · June 26, 2019, 9:16pm

I think this is a general increase in speed, so maybe it can get a little boost

Partixel · June 26, 2019, 9:18pm

Not sure if you’d want to fix this or not as it’s technically not a feature but I did find one difference that has caused me an error:
You can no longer call methods of an instance by calling the instance with the method’s name as the first argument (e.g. workspace("FindPartOnRayWithIgnoreList", ray, IgnoreList, false, true))

TheNexusAvenger · June 26, 2019, 9:32pm

Are there any new deviations from Lua 5.1 that the new VM allows for, like more than 60 upvalues in a given scope?

AxisAngle · June 26, 2019, 9:39pm

I genuinely appreciate the effort in making nans (I think) impossible to get from Lua instructions (you can still get the nans with Vector3s)

print(Vector3.new().unit.x)
print(-Vector3.new().unit.x)
>  -nan(ind)
>  nan

but there’s something a little strange about (-1)^(1/2) giving you 0, especially considering math.sqrt(-1) gives you -nan(ind).

All of the math functions can still give you nans, as far as I can tell, like math.sin(math.huge) and math.asin(2).

A recommendation might be to have math.nan = 0/0 and math.qnan = -(0/0) or whatever the default is. For example, I use these shortcuts in my serializer code to see if I need to store a nan or not.
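For the serializer use case described above, one common way to detect a NaN without a named constant is to rely on NaN being the only number that is not equal to itself. A minimal sketch follows; `isNaN` is a hypothetical helper, not an existing library function, and the NaN is produced with `math.huge - math.huge` as an assumption to avoid depending on how constant expressions like 0/0 are evaluated.

```lua
-- Hypothetical helper: NaN is the only number that is not equal to itself.
local function isNaN(x)
    return x ~= x
end

-- Infinity minus infinity is NaN under IEEE 754 arithmetic.
local nan = math.huge - math.huge

print(isNaN(nan))       -- true
print(isNaN(0))         -- false
print(isNaN(math.huge)) -- false
```

This check does not distinguish between the differently signed NaNs shown in the output above; a serializer that needs to preserve the sign would need additional handling.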

buildthomas · June 26, 2019, 9:49pm

I’m confused about the math changes some people point out above and how far reaching that is – is this intended functionality? Is there more reading material or another post that explains this?

Nil_ScripterAlt · June 26, 2019, 10:10pm

Faster Lua! I can’t believe that Lua could be even faster! Amazing work! I’ve been trying it out and it has been going really well (however, I am concerned about the getfenv and setfenv changes)

zeuxcg · June 26, 2019, 10:13pm

The NaN handling is a bug, you should be able to get NaN using (-1)^(1/2) - we’ll fix this!

zeuxcg · June 26, 2019, 10:15pm

This is one thing we will not fix. Hopefully you weren’t relying on it! It was an artifact of how the namecall extension was implemented in the old VM; the new VM uses a faster implementation that doesn’t happen to have the same behavior in this case.

zeuxcg · June 26, 2019, 10:16pm

Thanks! We missed this; this will be fixed in the next release.