Luau Recap: August 2021

I have a concern about the practicality of the current type system: the fact that the implementation and definition of table types happen in different places makes types feel flimsy to interact with, especially in projects written outside of Studio (we rely on third-party type engines for that, which do not accurately replicate Luau’s linting). Needing to remember to update the type definition every time you change the type implementation has, in practice, made type definitions unreliable because they’re frequently not up to date with the implementation. As a result, we’ve ended up rarely defining table types and instead using annotations as a way of looking up constructors/implementations. Are there plans for the type engine to use constructor functions as a basis for generating locked-table-type definitions?

For example:

function Something.new(): MyType
    local t: MyType = {
        a: number|Instance = 0,            --Not valid syntax
        b: string? = "Hi"
    }
    return t
end

And there can be much cleaner syntax:

type MyType = {
    a: number|Instance,
    b: string?
}

function Something.new(): MyType
    local t = MyType           --Allocates a table of the defined type
--Alternatives:
    --local t = table.create(MyType)
    --local t = new MyType
    --local t = new MyType(a = 0, b = "Hi")
    return t
end

And something that feels missing is array-like objects with specified layouts. For example:

type ItemID = number
type StackSize = number
type InventorySlot = {
	[1]: ItemID,             --This syntax is invalid
	[2]: StackSize
}

type Inventory = {
	InventorySlot
}

This is a common design pattern for making objects easily serializable or easily addressable in communication between modules or client<->server, however it’s just not possible to tell the type system that an array is expected to have a specific type in a specific slot if it is not in all of the slots.

I think really what you’re saying is that writing type-safe code without a type-checker is difficult. This is true and I don’t think this can ever stop being true. We have long-term plans for fixing this, but they are in flux (and simultaneously in progress) so I don’t want to promise anything in particular until we can announce something with confidence.

So I don’t think it’s a syntax problem at all, that feels secondary.

First-class support for different types for indexed elements of tables was discussed; it’s currently not being worked on but we will consider that for the future, as that does come up occasionally.

The problem with your constructor idea is that there’s no defaults that are sensible for many types, so it’s not clear what exactly a dummy constructor like that would do. That said, we have plans to introduce a different way to define and work with objects in the future - we’re really focused as much as possible right now on working with existing data structures developers are used to, but at some point there’s going to be something that’s more ergonomic and performant in type-safe code. We don’t have detailed designs here yet so again nothing concrete to share - yet.

FYI, if you’re using types like that for performance reasons, that’s a bad idea and actually counter-productive in some scenarios. Field access is significantly more optimized in the Luau VM than it is in the vanilla Lua VM, so much so that making your run-time data structure an array of fixed indices rather than a struct with string field names can actually make your code slower rather than faster.

Even for data storage, unless you really need to squeeze every bit of space you can out of a data store key, using named fields is generally a better idea because it’s much easier to debug and migrate between data store structures as your game evolves.
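One way to follow this advice while keeping a compact stored format is to use named fields at run time and convert to the positional layout only at the serialization boundary. The following is a minimal Luau sketch, not an official pattern; `packSlot`, `unpackSlot`, and the field names `itemId` and `stackSize` are hypothetical names chosen for illustration.

```lua
-- Run-time representation: named fields, which the type checker can describe.
type InventorySlot = {
	itemId: number,
	stackSize: number,
}

-- Hypothetical helper: convert a slot to a compact positional array for storage.
local function packSlot(slot: InventorySlot): { number }
	return { slot.itemId, slot.stackSize }
end

-- Hypothetical helper: rebuild the named-field form from the positional array.
local function unpackSlot(packed: { number }): InventorySlot
	return { itemId = packed[1], stackSize = packed[2] }
end

local slot: InventorySlot = { itemId = 42, stackSize = 16 }
local restored = unpackSlot(packSlot(slot))
print(restored.itemId, restored.stackSize)
```

Because both positions hold numbers here, `{ number }` is enough to describe the packed form; a layout with mixed types would need a union element type, which loses the per-position information discussed above.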

local function f()
	return function()end
end
print(f()==f())

I just noticed that this now outputs true. Will this change be listed on the incompatibilities page on the Luau website?

I wouldn’t say that this is an incompatibility of note because Lua 5.3, if memory serves (or maybe 5.2?), had the same behavior. We will note this optimization on the performance page once it ships outside of Studio. (It’s part of the release notes for today’s release, which is why it didn’t make it into the recap.)

From reading the 5.1 manual:
2.5.2 states: “Objects (tables, userdata, threads, and functions) are compared by reference: two objects are considered equal only if they are the same object. Every time you create a new object (a table, userdata, thread, or function), this new object is different from any previously existing object.”

2.5.9 states: “A function definition is an executable expression, whose value has type function. When Lua pre-compiles a chunk, all its function bodies are pre-compiled too. Then, whenever Lua executes the function definition, the function is instantiated (or closed). This function instance (or closure) is the final value of the expression. Different instances of the same function can refer to different external local variables and can have different environment tables.”

These make it clear that every time a function expression is encountered, a new function instance is created and it compares unequal to other instances of the same function.

I also noticed that the incompatibilities page mentions the order of table assignment in a table constructor, when there is no guarantee of it happening in a specific order.

Sure, that’s fair - we can list it there as well. For us it’s conditional on fenv overrides so we can’t promise ref equality - it’s going to be equal “sometimes”.

A question from a beginner scripting learner: is this a replacement for LUA, or something entirely different, like an addition to it?

Edit: after reading over the linked page, basically this is just an auto complete for lua? I’m still not 100% sure. If someone can clarify that would be great.

Since I understand some of the basics of roblox lua etc, I just want to make sure I can learn and apply it to this in case it actually is some new replacement in the future.

However, if it’s just auto completion, that’s super helpful for someone like me.

Not sure what you mean by LUA, but any Lua 5.1 code written before Luau is still valid, as the latter is intentionally backwards compatible with the former, so you could still use Luau even without using the new syntax.

This might not be as clean a change as you’re expecting.

For instance, even this recent post of mine contains code that relies on the fact that function() end ~= function() end (in the “RoboxSignal” implementation) to generate unique tokens that can be passed through a BindableEvent without being serialized: Lua Signal Class Comparison & Optimal `GoodSignal` Class.

I know I’ve written code in the past that also uses that approach, though I don’t think any of it is still important / in libraries of mine people might be using.
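For reference, the token pattern in question, and one possible way to keep tokens distinct, can be sketched as follows. This is an illustration rather than code from the linked post; `newBareToken` and `newToken` are hypothetical names, and the assumption is that a closure capturing a fresh local created on each call is not a candidate for reuse, since each instance refers to a different variable.

```lua
-- Pattern at risk: the closure captures nothing, so a runtime that reuses
-- identical closures may hand back the same function object every time.
local function newBareToken()
	return function() end
end

-- Possible alternative (assumption, see above): capture a fresh local so that
-- each closure refers to its own upvalue.
local function newToken()
	local unique = {}
	return function()
		return unique
	end
end

local a, b = newToken(), newToken()
print(a ~= b)
print(newBareToken() ~= newBareToken()) -- result depends on the runtime
```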

Sure, but I maintain that having definitions and implementations be linked only by a linter and the good will of developers is flimsy. Marrying the definition and implementation seemed like a good way to link them. Conceptually, the definition is allowed to be wrong and neglected while the implementation can’t because if a developer neglects it… they can’t be working on that type’s implementation. That is what I consider to be the issue, and in practice it did prove to be enough of an issue that we don’t write code based on type definitions.

For what it’s worth, we already don’t have many problems managing (relative) type safety with just the annotations. It’s very rare for us to have random type related bugs. But I’m glad to hear there are alternatives coming down the road!

Another issue arises if setfenv is brought into the scenario (backport _ENV please :frowning_face:)

local t = {}

for i = 1, 2 do table.insert(t, function() print("Hello, World!") end) end

setfenv(t[1], {})

t[2]()

This should be mentioned, as the latest update notes say “unless setfenv is used”, but that doesn’t seem to be the case here. Are the release notes wrong or is this a different issue?

What the release notes don’t say is that using setfenv after the functions are created is too late. Try calling setfenv(1, ...) before creating each function and you will see unique instances with unique environment tables.
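A minimal sketch of that suggestion, assuming a runtime that still provides getfenv and setfenv (Lua 5.1 or Luau): the enclosing function’s environment is replaced before each closure is created, so each closure is created under a different environment table. `build`, `baseEnv`, and `closures` are illustrative names, and the fresh environments fall back to the original one through `__index` so that globals such as `print` still resolve.

```lua
local baseEnv = getfenv(1)
local closures = {}

local function build()
	for _ = 1, 2 do
		-- Level 1 is the function calling setfenv, i.e. build itself.
		-- Closures created afterwards inherit this new environment.
		setfenv(1, setmetatable({}, { __index = baseEnv }))
		table.insert(closures, function() print("Hello, World!") end)
	end
end

build()

closures[2]()
print(getfenv(closures[1]) ~= getfenv(closures[2]))
```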

In our internal tests this was sufficiently powerful to not break interesting examples of sandboxing so that’s the compromise for now - the only way to know if it’s safe is to try.

We also have a more heavy-handed deoptimization for this that we aren’t enabling yet - we’ll see if we observe issues in practice with setfenv interaction.

We’re very aware that this is a complex optimization, especially due to interactions with setfenv, which is why we will run it in Studio only for a while. Like any optimization that elides allocations it’s important enough to try.

Any change results in observable behavior under Hyrum’s law; some are fine in practice, some aren’t. We’ll see!

This probably isn’t as big of an issue, as it’s only present in this edge case, which I’m pretty confident isn’t commonly used (I’ve only noticed it in obfuscation). A bigger issue that was mentioned by someone else is storing similar functions as keys in tables and accessing them later. Is there something that can be added to disable this optimization for specific functions without deoptimizing the entire script?

If there’s widespread use of functions as tokens (it’s funny that @stravant’s example literally says “Abuse the fact that function refs can be passed through BindableEvents intact” :slight_smile: ), I think we’ll need to simply disable the optimization. It will be unfortunate. I don’t think there’s a great reason to rely on this behavior, especially if you include interactions with our reflection system, but if it breaks code that’s in widespread use there’s not going to be a clean way for us to unbreak that code without just giving up on the idea.

Is there not some alternative to getfenv/setfenv that can be created that makes an entirely new environment for the function? Sort of like setfenv(f, getfenv()), but instead of a call to getfenv that triggers deoptimization it sets it as a cloned default environment of some sort? I don’t know if I’m explaining it well enough but hopefully you can partially understand what I’m asking for.

What I mean is that I’m actually expecting that the setfenv-based code will work just fine with the new optimizations, because while you can construct examples that break, they are going to be very artificial. The problematic examples are likely to exclude setfenv entirely, like the code @stravant linked. That code isn’t relying on the presence of function environments, and indeed it would have been broken in Lua 5.3 (and fixed in Lua 5.4) despite the fact that Lua 5.2 removed setfenv.

We wouldn’t add a whole new primitive for such a small corner case - if we don’t break code in practice, then it’s already okay; if we do, we already need to disable this optimization, there’s not a whole lot of in-between.

FWIW I think that the change is worth attempting, I was just raising a potential pattern that could be problematic with it.

I understand now. I honestly don’t think it should be too big of an issue, and if it really makes a reasonable impact on optimizations, maybe it’s better off being made intended functionality. Though it should definitely be listed as an incompatibility in its current state.