Luau Recap: May 2022

(post marked for deletion for privacy reasons)

Would it be possible to add support for the __len metamethod for tables now? I've been using newproxy() for __len support, and it would be nice to have that for tables so I can use both __iter and __len.
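
For readers unfamiliar with the workaround mentioned here, a minimal sketch might look like the following. It assumes that newproxy(true) returns a userdata with an empty, writable metatable and that both # and generalized iteration consult that metatable; the makeList helper and its contents are hypothetical.

```luau
-- Hypothetical helper: wraps an array in a userdata proxy so that `#` and
-- `for ... in` can both be customized through metamethods.
local function makeList(items)
	local proxy = newproxy(true) -- userdata with an empty, writable metatable
	local mt = getmetatable(proxy)

	-- `#proxy` reports the wrapped array's length.
	mt.__len = function()
		return #items
	end

	-- `proxy[i]` reads through to the wrapped array.
	mt.__index = function(_, key)
		return items[key]
	end

	-- `for i, v in proxy do` walks the wrapped array in order.
	mt.__iter = function()
		return ipairs(items)
	end

	return proxy
end

local list = makeList({ "a", "b", "c" })
local seen = {}
for i, v in list do
	seen[i] = v
end
```

With __len honored on plain tables, the same three metamethods could sit on an ordinary table instead of a userdata proxy.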

Love the improvement to garbage collection! In the past I’ve struggled a lot with maintaining smooth performance, since GC would randomly decide it wanted to do a bunch of work on one frame and cause a nasty stutter.

Is there any documentation for this? I’d like to read up on it.

Also, I see there are improvements to garbage collection. Any chance we could see an implementation of the __gc metamethod? :eyes::eyes::eyes:

Edit: nvm :frowning:

Nice update :+1:

Are you adding a way to narrow types by returning from a type check?

local function t(player: Player)
	local character = player.Character -- Model?
	
	if character == nil then return end
	
	-- Still thinks character could be nil
	local humanoid = character:WaitForChild("Humanoid")
end

Yes, we are going to support that.

As a workaround for now, you can add assert(character) right after the if-statement.
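
A minimal sketch of that workaround, written so it runs outside Roblox: the Player and Character types below are hypothetical stand-ins for the real classes, and characterName is an illustrative name only.

```luau
-- Hypothetical stand-ins for the Roblox types, so the snippet is self-contained.
type Character = { Name: string }
type Player = { Character: Character? }

local function characterName(player: Player): string?
	local character = player.Character -- Character?

	if character == nil then
		return nil
	end

	-- Workaround: assert refines `Character?` to `Character` for the lines below.
	assert(character)

	return character.Name
end
```

The assert never fires at runtime here, because the early return has already handled nil; it is there only to help the type checker.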

Hopefully the example in the recap is sufficient (see also the Syntax page of the Luau documentation); if you want a very thorough description, you can read the full RFC (https://github.com/Roblox/luau/blob/master/rfcs/generalized-iteration.md)
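
For anyone landing here without the recap at hand, a short sketch of generalized iteration follows. The Range type is a hypothetical example, not something from the recap or the RFC.

```luau
-- Plain tables can be iterated directly, without pairs/ipairs.
local total = 0
for _, value in { 10, 20, 30 } do
	total += value
end

-- A table with `__iter` decides for itself how it is iterated.
-- `Range` is a hypothetical example type.
local Range = {}
Range.__index = Range

function Range.new(first, last)
	return setmetatable({ first = first, last = last }, Range)
end

function Range:__iter()
	-- Return generator, state, and initial index; the generator receives
	-- (state, index) and returns the next index, or nil to stop.
	return function(range, current)
		local nextValue = current + 1
		if nextValue <= range.last then
			return nextValue
		end
		return nil
	end, self, self.first - 1
end

local collected = {}
for n in Range.new(3, 6) do
	table.insert(collected, n)
end
```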

Yeah, we’ve discussed this as a potential next step: together with __iter, this is the only metamethod you’d need to fully implement an indexed container. We’d need to be careful wrt potential performance impact, but we plan to look into this at least.

:thinking: yeah not a bad idea, although a bit ugly

I recall something about constant-type type-checking (e.g. strings, numbers, booleans, etc):

type constantStringABC = "ABC" -- Type matching only the string "ABC"

Is this still a planned feature? I find a lot of times I want to use this, particularly for enum-like types, like tables containing different parameters based on a particular key’s value which could be a set of constants.

(Type enums would actually be cool but that’s a different topic)

I really don’t like “generalized iteration”. I’m going to continue using pairs and ipairs explicitly. Please don’t ever deprecate them.

I think that’s called singleton types and IIRC that’s already live?
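
A small sketch of how singleton types cover the enum-like case described above. The Circle, Square, and Shape types and the area function are hypothetical names chosen for illustration.

```luau
-- Each variant is tagged with a singleton string type in its `kind` field.
type Circle = { kind: "circle", radius: number }
type Square = { kind: "square", side: number }
type Shape = Circle | Square

local function area(shape: Shape): number
	-- Comparing against the singleton lets the checker pick the variant.
	if shape.kind == "circle" then
		return math.pi * shape.radius * shape.radius
	else
		return shape.side * shape.side
	end
end
```

Passing a table whose kind is any other string should be reported by the type checker rather than discovered at runtime.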

Indeed it is, I’m surprised I missed this.

I like this quite a lot, but I agree with you otherwise.

It is proper to use pairs when you expect/desire a table (and ipairs doesn’t get replaced by this at all, since it is fundamentally different from pairs). If your use case is actually just iteration, on the other hand (not usage of a table), this feature might be what you want, and there have been a few times I really wish I could’ve used this.

While it’s up to you whether to continue using pairs/ipairs, there’s not really a benefit to using pairs or ipairs explicitly. I don’t think the word “proper” is, well, proper in this case :slight_smile: We specifically made sure that they don’t need to be used when iterating over tables or array-like tables.

That said, obviously we don’t have plans to deprecate pairs/ipairs, so if people prefer them stylistically for some reason, it’s fine to continue to use them.
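
To make the comparison concrete, here is a small sketch that counts what each style of loop visits on a mixed table. The table contents are made up for illustration.

```luau
-- Three array elements plus one string key.
local t = { 1, 2, 3, x = "y" }

local viaIpairs, viaPairs, viaPlain = 0, 0, 0

for _ in ipairs(t) do
	viaIpairs += 1 -- array part only: 1, 2, 3
end

for _ in pairs(t) do
	viaPairs += 1 -- every key, including "x"
end

for _ in t do
	viaPlain += 1 -- same set of keys as pairs
end
```

On this table the plain loop and pairs both visit four entries, while ipairs visits three, so ipairs remains the explicit choice when only the array part is wanted.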

Yeah, I suppose I worded that poorly; I’m glad you pointed that out. I meant to emphasize that it’s a little more descriptive about intent, e.g. the input is expected to be a table (which will have key-value pairs) vs. any old iterable, which can definitely matter.

There’s definitely no objectively correct choice for tables because functionally they’re the same, and which one you use probably won’t really change how easy it is to scale up code or anything, but it might sometimes be preferable to use pairs because of its explicitness.

I am definitely happy all around with the feature so far :smile:

I have a problem with the __iter metamethod’s implementation.

I have a sandboxing tool here which wraps around objects with metamethods. In order to do this though, I need to be able to essentially generate a copy of the original metamethod without touching that metamethod (I use this for values going in AND out of the sandbox, so some values I need to wrap are going to be unmanaged!)

With the __iter metamethod, there is no function that returns the generator, state, or index it produces. pairs throws because the input is not a table, as I feel it should.

The RFC notes that in t is to in pairs(t) as t[index] is to rawget(t, index). This is true; however, t[index] can be invoked as an expression, whereas getmetatable(t).__iter(t) cannot be safely invoked as an expression. I can’t access the generator, state, and index, so these values go unsandboxed, which gives users a way to define code in completely unmanaged space that can access unmanaged values, even from my own managed tables.

Additionally, I am even unable to do something smart like the following, wrapping a real iterator inside of a coroutine, and returning a function which advances the state by resuming it repeatedly. The generator can return a variable number of results but I can only capture a finite number of results.

This example, which mimics the structure I require, runs as expected, but only if the iterator produces two or fewer values. A vararg is not valid syntax there. Very, very thankfully, iterators cannot yield; if they could, this example would be invalid, because the two iterators that end up processing would lose their synchronization.

local metatable = {}
local proxy = {}

-- Pseudo-code
local function getUnmanaged(value)
	return {
		abc = 123,
		cde = 234
	}
end
local function sandboxAllTheResults(...)
	return ...
end

metatable.__iter = function(sandboxed)
	local real = getUnmanaged(sandboxed)
	return coroutine.wrap(function(x)
		-- Instead of index, value, if a vararg (...) were placed here it would cause a syntax error
		for index, value in x do
			coroutine.yield(sandboxAllTheResults(index, value))
		end
	end), real
end

setmetatable(proxy, metatable)

-- cde 234
-- abc 123
for index, value in proxy do
	print(index, value)
end

So, there are two solutions that solve this:

  1. Allow varargs in for loops
  2. Provide a way to access the results of the __iter metamethod directly (Preferable to me since it allows me to manage the generator itself)

c.c. @zeuxcg

Why not? I’m a little confused at the description above, but short of tables with locked metatables (which you can’t introspect reliably, but neither can you introspect any other metamethod so I don’t see how you can wrap an object with a locked metatable in general), you should be able to return a proxy that forwards __iter. For example:

local function proxy(v)
	local function proxyiter()
		print("proxyiter")
		assert(type(v) == "userdata" or type(v) == "table")
		local mt = getmetatable(v)
		if mt and mt.__iter then
			return mt.__iter(v)
		else
			assert(type(v) == "table")
			return next, v
		end
	end
	return setmetatable({}, { __iter = proxyiter })
end

for k,v in proxy({1,2,3}) do
	print(k,v)
end

local mt = {}
function mt:__iter()
	local index = 0
	return function()
		if index >= self.count then
			return
		end
		index += 1
		return index
	end
end

for i in proxy(setmetatable({count = 3}, mt)) do
	print(i)
end

P.S. Maybe the confusion is that you aren’t sure how many results __iter can return, but it can return at most three, so if you want to wrap/proxy functions somehow you can instead do:

local gen, state, index = mt.__iter(v)
-- do some work on gen/state/index
return gen, state, index

The Lua iteration protocol, which __iter follows, only uses three values - generator, state, and index (which is fed into the generator repeatedly on every iteration and becomes the first loop variable). __iter doesn’t change that.
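
Written out by hand, the protocol looks roughly like the sketch below. forEach is a hypothetical helper, and it only forwards two loop variables to keep things short.

```luau
-- Roughly what `for a, b in gen, state, index do body(a, b) end` does.
-- `forEach` is a hypothetical helper name.
local function forEach(gen, state, index, body)
	while true do
		local a, b = gen(state, index)
		if a == nil then
			break -- a nil first result ends the loop
		end
		index = a -- the first result is fed back in on the next call
		body(a, b)
	end
end

-- `next, t, nil` is the triple that pairs(t) produces for a plain table.
local copy = {}
forEach(next, { x = 1, y = 2 }, nil, function(key, value)
	copy[key] = value
end)
```

Since only the generator, state, and index ever cross the boundary, a wrapper needs to handle exactly those three values.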

Thank you, this is exactly what it is, I never knew this somehow haha. I guess I’ve never actually tried to use more than three values from an iterator. :person_shrugging:

Basically, I just need to define some code that will invoke what every metamethod would normally do, and I need to cover every case, no matter the inputs/outputs. I don’t need any access to the metamethod at all, or even what it returns, as long as sandboxed code can’t access what it returns either.

For example, the functionality of each metamethod can be fully described like so:
__index - return target[index]
__call - return target(...)
__len - return #target
__add - return target + subject
__newindex - target[index] = value (no return value)
etc, and apparently, since there are only three results, in this case,
__iter - for a, b, c in target do (with some coroutine magic)

What is nice about this is that even for something that isn’t a metamethod or behaves completely differently like __metatable or __mode, it generalizes:
__metatable - getmetatable(target)
__mode - nothing, you can’t access the value of __mode unless you have a reference to the metatable (just like any other metamethod)

All I need to do is guarantee I can invoke the above, and insert my own code before and after. This allows me to capture and modify the values entering the sandbox and the values exiting it, which essentially means I have complete control over everything the code running inside may or may not do.
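
As a rough illustration of that idea (not the actual tool), the sketch below forwards three metamethods through a proxy and runs a hook on every value crossing the boundary. makeProxy, toSandbox, and toExternal are hypothetical names; in a real sandbox the hooks would wrap and unwrap values rather than pass them through.

```luau
-- Hypothetical hooks: `toSandbox` handles values entering the sandbox,
-- `toExternal` handles values leaving it. Here they just pass values through.
local function toSandbox(...)
	return ...
end

local function toExternal(...)
	return ...
end

-- Hypothetical helper: each metamethod performs the operation it describes
-- on the real target, with the hooks applied before and after.
local function makeProxy(target)
	local mt = {}

	mt.__index = function(_, key)
		return toSandbox(target[toExternal(key)])
	end

	mt.__newindex = function(_, key, value)
		target[toExternal(key)] = toExternal(value)
	end

	mt.__call = function(_, ...)
		return toSandbox(target(toExternal(...)))
	end

	return setmetatable({}, mt)
end

local target = setmetatable({ greeting = "hi" }, {
	__call = function(_, a, b)
		return a + b
	end,
})
local proxy = makeProxy(target)
proxy.name = "x"
```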

My usage of “safely” is very misleading and I didn’t really think it through haha. The thought process was return getmetatable(target).__iter(target) would describe the metamethod except when target had __metatable set on its metatable, and that is “unsafe” because I am not describing the metamethod in a way that I can manage the inputs and outputs. (So, I guess, it literally is an “unsafe” way to represent it in my sandbox, but that makes zero sense without any context whatsoever)

Thank you for the reply, I apologize for my confusion.


P.S.

Here is my current solution as implemented in my code, which I believe covers every case correctly now, if you’re curious about what I am actually even doing.

The rawequal check covers the fact that the methods being called (:Import()/:GetClean()) are capable of returning nil, and the result is what the value should be functionally equivalent to.

(Except for something functionally equivalent to pairs where the value becomes nil, but there wouldn’t really be a way I could solve this no matter what I do)

-- External -> Sandbox
self:CheckTermination()
self:TrackThread()

local real = self:GetClean(object)

return self:Import(coroutine.wrap(function(object)
	local real = self:GetClean(object)
	for index, value, extra in real do
		index = self:Import(index)
		if not rawequal(index, nil) then
			coroutine.yield(index, self:Import(value), self:Import(extra))
		end
	end
end)), self:Import(real)

-- Sandbox -> External
self:CheckTermination()
self:TrackThread()

local real = self:GetClean(object)

return coroutine.wrap(function(object)
	local real = self:GetClean(object)
	for index, value, extra in real do
		index = self:GetClean(index)
		if not rawequal(index, nil) then
			coroutine.yield(index, self:GetClean(value), self:GetClean(extra))
		end
	end
end), real

To cover cases that use getmetatable, rawset, rawget, etc, I just wrap functions too. I don’t even need to re-define them, it just works!
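
A minimal sketch of what wrapping a function can look like when the number of arguments and results is unknown. wrapFunction, passThrough, and upperStrings are hypothetical names, and the upper-casing hook merely stands in for whatever processing a sandbox would apply to results.

```luau
-- Hypothetical helper: the returned function processes its arguments, calls
-- the real function, then processes every result, however many there are.
local function wrapFunction(fn, processArgs, processResults)
	return function(...)
		return processResults(fn(processArgs(...)))
	end
end

local function passThrough(...)
	return ...
end

-- Stand-in for a sandboxing hook: upper-cases string results and leaves
-- everything else (including nil) untouched.
local function upperStrings(...)
	local results = table.pack(...)
	for i = 1, results.n do
		if type(results[i]) == "string" then
			results[i] = string.upper(results[i])
		end
	end
	return table.unpack(results, 1, results.n)
end

-- The same wrapper works on an iterator's generator function.
local gen, state, index = ipairs({ "a", "b" })
local wrappedGen = wrapFunction(gen, passThrough, upperStrings)

local out = {}
for i, v in wrappedGen, state, index do
	out[i] = v
end
```

The hook must leave a nil first result as nil, since that is what ends the loop.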

…except for code clarity. The reader does not have the entire type system in their head. They won’t always know the structure of the table, or the exact method of iteration that’s intended. Additionally, I don’t trust Luau to pick the right solution for every table. Omitting an explicit iterator function seems like it could have undesired effects at runtime. I don’t trust it.

So using pairs and ipairs explicitly has to do with intent and predictable behavior.
