Luau Recap: March 2020

zeuxcg · April 2, 2020, 4:54pm

(yeah, I know, it’s technically April)

As a reminder, Luau (lowercase u, “l-wow”) is an umbrella initiative to improve our language stack - the syntax, compiler, virtual machine, builtin Lua libraries, type checker, linter (known as Script Analysis in Studio), and more related components. We continuously develop the language and runtime to improve performance, robustness and quality of life. Here we will talk about all things that happened since last update, which was just a month ago!

A lot of people work on these improvements; thanks @Apakovtac, @EthicalRobot, @fun_enthusiast, @zeuxcg! (If you aren’t going to pat yourself on the back, who else will?)

If you have missed previous large announcements, here they are!

Let’s dive in.

New pcall/xpcall implementation

When the new VM was developed last year, we spent a lot of time profiling various scripts. In one of them, which was a benchmark of some simple tree manipulation in Roact, we were surprised to see pcall taking a pretty significant amount of time - replacing it with xpcall yielded large improvements in the benchmark throughput. The overhead comes from pcall being able to handle yields, which was grafted on top of the existing functionality the VM provides.

xpcall is fine and all, but many people don’t use it, and it would be awkward to recommend using xpcall only when you need more performance and you’re sure the code inside it doesn’t yield. Additionally, because xpcall doesn’t support yielding, you cannot debug the code running inside it.

To solve these problems, we rewrote the part of the VM that deals with coroutine resumption to support yielding across (some) C calls. This is supported by Lua 5.2; our implementation is somewhat different, and currently more constrained - for now we only support yields in pcall/xpcall - but more performant.

As a result:

pcall is now much faster - up to 30x for simple functions! The performance now matches that of xpcall
Inside pcall, calling debug.traceback will return the full stack including the callers; similarly, when stepping into pcall you’re going to see a full call stack in the debugger
If an error is generated inside pcall after the thread yields, we no longer print it to the output - this was a long standing issue that is fixed as a byproduct of this change.
xpcall now supports yielding (error function can’t yield but the main function can)
xpcall can now be debugged (step into & breakpoints work)

Please note that this change is not fully live on all client devices - it will take a few weeks for this change to propagate to mobile including older versions. It’s however live on desktop client, Studio and on the servers.
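As a rough illustration of the behavior listed above (a sketch, not code from the announcement): `worker` is a hypothetical function, and a plain coroutine stands in for whatever yielding call a real script would make. This assumes a runtime that allows yields inside pcall, such as the new Luau VM or Lua 5.2 and later; stock Lua 5.1 rejects the yield instead.

```lua
-- Hypothetical worker: yields once, then raises an error after being resumed.
local function worker()
    local resumedWith = coroutine.yield("waiting")
    error("failed after resume: " .. tostring(resumedWith))
end

local co = coroutine.create(function()
    -- pcall wraps a function that yields; the error raised after the
    -- yield is still caught by this pcall.
    return pcall(worker)
end)

-- First resume runs until the yield inside pcall.
local _, firstYield = coroutine.resume(co)
-- Second resume continues inside pcall, which then catches the error.
local _, ok, err = coroutine.resume(co, "data")

print(firstYield, ok, err)

-- xpcall with an error function that attaches a traceback.
-- The error function itself must not yield.
local ok2, trace = xpcall(function()
    error("boom")
end, function(msg)
    return debug.traceback(msg)
end)

print(ok2, trace)
```

Here `ok` is false and `err` carries the message raised after the resume, while `trace` is the error message followed by a stack traceback.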

New debugger backend

The original debugger backend was written many years ago, and it relied on a VM mechanism called “hooks”. Briefly, when the debugger was enabled in Studio settings, every time the VM executed a line of code it called a C hook that had to check if the line had breakpoints set on it, or if it needed to step through the code.

This made an already not-very-fast VM much slower, and meant that the performance measurements you do while running scripts in Studio aren’t really representative. The old backend also had to rely on somewhat involved logic to filter out various debugger steps, and these complex interactions weren’t tested very well either.

We didn’t want to accept this state of affairs for the new VM and as such wrote a new debugger backend. This doesn’t impact the debugger UI - there’s a separate team working on improving that and the overall debugging experience! - but this does impact the low level debugging engine.

The new backend is more robust and doesn’t slow down script execution unless you’re actively stepping through the code. It works with the new VM (and only with the new VM), supports new pcall/xpcall and is thoroughly unit tested. We don’t expect any behavior regressions - there are a few slight differences around stepping, and some corner cases that the new backend handles better, but that’s about it.

Next week we’re going to ship a small improvement to the backend that will allow you to step over non-yieldable code (which the old debugger couldn’t do either). For example, in this code:

local Class = {}
Class.__index = function(t, k) return rawget(Class, k) end
function Class:method()
    print('method')
end
local obj = setmetatable({val = 42}, Class)
obj:method()

Stepping into obj:method() breaks the script right now, but will work next week, bypassing the __index call and jumping straight into the method body.

New VM is 100% live

Because the new debugger wasn’t fully functional (it took us time and a few tries to get it right), we had to maintain two VMs - one for Studio test sessions, and one for everything else. Up until this week, you still used the vanilla Lua VM in Studio Play Solo for this reason. Well, that’s not the case anymore!

With the new debugger backend active, we enabled the new Luau VM in every single context in Studio where it previously wasn’t running.

This means that every user on the platform is now running the Lua code with consistent performance, and has consistent access to all features like continue or yieldable xpcall. It unlocks some further internal optimizations that were just too painful to do in a dual-VM world, and in general makes further progress on language features and performance easier to make.

Old VM has served us well for 15 years, but it’s time to say goodbye.

Type annotation syntax - upcoming changes

We’re getting closer to finalizing the syntax of type annotations. We’ve looked at the remaining issues and external/internal feedback and decided to make a change to the syntax as follows:

For function definitions, instead of using a “fat arrow” (=>), we now use a colon (:) to delimit the return type:

function foo(a: number, b: number): number
    return a + b
end

For function types, instead of using a “fat arrow” (=>), we now use a “thin arrow” (->) to delimit the return type:

type FooFunction = (number, number) -> number

This makes our syntax more consistent with “Typed Lua”, a research project done by the university that develops Lua, and with some modern languages. It also makes it possible, in theory, for us to introduce a clean shorthand lambda syntax later (not saying we will do this, but we wanted to have this option). Additionally, thin arrows are easier to read in a type context like the one above, since = cleanly separates the type alias from the type definition.

This change will happen next week; we are going to support the old syntax for a bit, but it will be removed in a month or so. After we introduce the new syntax, we will be ready to promise syntax compatibility - meaning, it would be safe to upload code with type annotations to production and have it work in the future. NOTE this has not happened yet! Because of the syntax change, existing code with fat arrows will not be supported long term.

Type checking improvements

The type checker is still in beta and it’s seeing continuous improvements. We’re looking at various code bases in both strict and non-strict mode and resolving issues that come up.

As part of this, the type checker is now handling recursive function calls and complex data flow much better than it used to, which should eliminate most cases where in non-strict mode the type of a function in the same script can’t be inferred correctly.

Additionally setmetatable didn’t correctly infer types in some cases and that was fixed as well.

Types can now work across require statements

Type aliases declared in modules are automatically exported and available, namespaced under the name of the local used to require:

-- ModuleScript named Bar (a child of the requiring script)
local M = {}

type Sandwich = { slices: number }

function M.MakeSandwich()
    return { slices = 5 }
end

return M

-- Script that requires the module; the type is reachable as Foo.Sandwich
local Foo = require(script.Bar)
local test: Foo.Sandwich = Foo.MakeSandwich()

Please note that we have some bugs and limitations around require paths right now, especially around paths that start from game - bear with us as we improve this over the coming weeks!

mistrustfully · April 2, 2020, 9:07pm

Amazing! Glad to see Luau is completely live now.

node_modules1 · April 2, 2020, 9:10pm

Lovely updates to the Language’s future! Am sure many can create epic creations with this right away!

sjr04 · April 2, 2020, 9:18pm

LuWOW.

I’m really excited for this update!! Though I have a few questions.

Why do function types use -> but not :?
Also, will variadic functions be possible with types (e.g. ...number, where the function would accept a variable number of numbers)?

This has me excited the most. Is there any rough date on when it will be out of beta to use?

anon66957764 · April 2, 2020, 9:24pm

Will Luau types support complex metatable type stuff, even if I have to specify it explicitly?

For example, my class library lets me do this:

local Super = class('Super')
local Sub = class('Sub', Super)
Super.x = 5
print(Sub.x) -- 5

Is there, or will there ever be, a way to describe this with Luau types?

zeuxcg · April 2, 2020, 9:25pm

This syntax isn’t consistent with the industry practice (all other languages we looked at that use colon type delimiters use arrows in the function types, I think). This is actually important for readability. Consider something like a map:

function map<T, R>(data: array<T>, transform: (T): R): array<R>

vs

function map<T, R>(data: array<T>, transform: (T) -> R): array<R>

Even in this basic example, colons everywhere make visual parsing of the code really challenging.

We discussed this before internally but don’t have concrete plans around this yet…

So there’s going to be two separate things that happen:

We commit to supporting the syntax long term, at which point scripts with type annotations will become safe to publish. I am hoping this can happen next week. At this point the type system will still be in beta! But we will promise that your scripts will continue to parse & run long term.
The type system goes out of beta. This would mean that we’re pretty confident that existing code doesn’t generate false positives within reason, and that if you added some type annotations their meaning in Studio will be preserved, so you won’t suddenly open Studio and get more errors. We aren’t there yet! This will take a bit more time.

We’ll go over this distinction once we update the syntax - we’ll update the typechecking beta thread with more information.

zeuxcg · April 2, 2020, 9:26pm

This wasn’t intentional, and shouldn’t happen now that we’ve enabled this change.

Halalaluyafail3 · April 2, 2020, 9:38pm

Would this include back porting some features to the already existing functions?
%g for patterns (isgraph)
\0 in patterns
Separator for string.rep
init argument for string.gmatch
%p format

zeuxcg · April 2, 2020, 9:42pm

It could! We looked at some of those, or rather at all except for the first one. It didn’t seem like there’s a strong use case for any of these, really, so we haven’t bothered - but if they are important we can certainly look into this. %g wasn’t in any release notes in Lua’s history if I recall, or maybe it was just missed.

Here’s our internal spreadsheet we use for tracking this: https://gist.github.com/zeux/bb646a63c02ff2828117092036d2d174/raw/6135734902965fff440cdd7749fa65869208c62c/luau_features.pdf

posatta · April 2, 2020, 9:46pm

Is there an approximate month/quarter we can expect typed Lua to leave Beta by? It’d be really helpful to be able to convert some of my code to use this rather than a massive chain of asserts, but if it’s not releasing in the next few months I’d rather wait. I’d check the roadmap, but it doesn’t seem to have 2020 on it.

Anaminus · April 2, 2020, 11:16pm

Now that type namespaces are a thing, I think class types should be moved to a separate namespace. For example, you would use Instance.BasePart to define a type with the BasePart class, or Instance.Instance for any instance. Using “Instance” as the namespace creates a natural association with Instance.new.

In general, the following scenario can occur:

A module defines type X.
Subsequently, Roblox predefines type X (e.g. a new class was added).
The module cannot begin using predefined type X because it is shadowed by the self-defined type X.
The module cannot rename the self-defined type X without a major version change because other modules may depend on the exported type name.

New classes are added relatively often, so putting them in their own namespace eliminates this problem, at least for classes. Moreover, it reduces pollution of the top-level namespace.

Halalaluyafail3 · April 2, 2020, 11:30pm

It wasn’t, although it was added in 5.2
https://www.lua.org/manual/5.2/manual.html#6.4.1
https://www.lua.org/manual/5.1/manual.html#5.4.1
Some reasons for the features:
\0 in patterns seems better than doing %z, would benefit utf8.charpattern as it uses %z and range [%z\x01-\x7F\xC2-\xF4][\x80-\xBF]* which if \0 were allowed in patterns could be simplified to [\x00-\x7F\xC2-\xF4][\x80-\xBF]*.
%p format for whitelists/blacklists: currently, to check if an instance is in a table, a linear search must be done. With %p the table could be sorted once and then a binary search could be done to check if the instance is in the table. This can be done currently with tables, assuming __tostring isn’t used, but all instances override __tostring, so this isn’t possible.

local function udatasort(a,b)
	return string.format("%p",a) < string.format("%p",b)
end
local function udatafind(tbl,finding)
	-- binary search
	local str = string.format("%p",finding)
	local L = 1
	local R = #tbl
	while L <= R do
		local M = math.floor((L+R)/2)
		local strm = string.format("%p",tbl[M])
		if strm < str then
			L = M+1
		elseif strm > str then
			R = M-1
		else
			return M
		end
	end
	return nil
end
local n = {}
local tbl = {}
tbl[1] = {}
tbl[2] = n
tbl[3] = {}
table.sort(tbl,udatasort)
print(tbl[udatafind(tbl,n)]==n)

A separator would be useful when you want something on every line, but no extra new line

string.rep("abc\n",3)
--abc
--abc
--abc
--
string.rep("abc",3,"\n")
--abc
--abc
--abc
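Until a separator argument is available, the same result can be approximated with table.concat. A minimal sketch, where `repWithSep` is a hypothetical helper name and not an existing library function:

```lua
-- Hypothetical helper emulating string.rep(s, n, sep) from Lua 5.2.
local function repWithSep(s, n, sep)
    if n <= 0 then
        return ""
    end
    local parts = {}
    for i = 1, n do
        parts[i] = s
    end
    -- table.concat only places the separator between elements,
    -- so there is no trailing separator.
    return table.concat(parts, sep)
end

print(repWithSep("abc", 3, "\n"))
--abc
--abc
--abc
```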

string.gmatch with init argument would be useful for applying a pattern after a prefix

local function processCmd(str)
    if string.sub(str,1,1) == ":" then
        for l in string.gmatch(str,".",2) do -- example
            print(l)
        end
    end
end
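Without an init argument, one workaround is to strip the prefix with string.sub before matching, at the cost of an extra string allocation. A sketch of the same example under that assumption (collecting the matches into a table is added here only so the result can be inspected):

```lua
local function processCmd(str)
    local letters = {}
    if string.sub(str, 1, 1) == ":" then
        -- Drop the ":" prefix, then apply the pattern to the remainder.
        for l in string.gmatch(string.sub(str, 2), ".") do
            letters[#letters + 1] = l
        end
    end
    return letters
end

print(table.concat(processCmd(":abc"), ","))
```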

zeuxcg · April 3, 2020, 1:31am

Re: udatafind - you should be much better off with either table.find or filling a table with instances as keys and checking if the key is in that table. We’ll take a look at the other bits, %g/string.rep/string.gmatch seem straightforward at least.
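The second suggestion (instances as keys) could look roughly like this. It is only a sketch: `makeSet` is a hypothetical helper, and plain tables stand in for instances so the snippet runs outside of Roblox.

```lua
-- Hypothetical helper: build a set whose keys are the list's items.
local function makeSet(list)
    local set = {}
    for _, item in ipairs(list) do
        set[item] = true
    end
    return set
end

-- Plain tables stand in for instances here.
local a, b, c = {}, {}, {}
local whitelist = makeSet({ a, b })

-- Membership is a single hash lookup; no sorting or searching is needed.
print(whitelist[a] == true)
print(whitelist[c] == true)
```

Keys are compared by identity, so this works for any value that can be used as a table key, regardless of __tostring.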

Halalaluyafail3 · April 3, 2020, 1:48am

table.find is O(n), while a binary search is O(log n). I think I will go with the route of filling the table and checking if the key is present, although wouldn’t this benefit from a hash length argument to table.create? Because it would fill in #tbl hash entries in the table, creating the table with a hash length of #tbl should avoid reallocating the table.

metryy · April 3, 2020, 4:51am

I think I found a bug related to the new VM: Underscores are not parsed/ignored after a number:

zeuxcg · April 3, 2020, 5:31am

This is not a bug; we extended Lua number syntax with underscores, as it makes long numbers easier to read, for example 1_000_000. However, we don’t restrict the placement of underscores in any way - this is the standard approach taken by programming languages that have similar number literal support - so 1_0 is the same as 10 which is the same as 10__. Numbers can’t start with an underscore though - _10 is an identifier.

metryy · April 3, 2020, 7:24am

Thanks for the clarification! I’ve never seen this before so I wasn’t sure if this was intentional or a bug.

trzistan · April 3, 2020, 10:53pm

From what I’ve been aware of this, I look at the script analysis and see this popup most of the time: