Lua Signal Class Comparison & Optimal `GoodSignal` Class

That is correct, but you never cited any of your sources for saying that Luau stores that information inside a VM instruction (which seems redundant).

Yes, in vanilla Lua using an array is basically always faster than a struct with actual fields, but as you can see from the benchmark above that’s not true for the Roblox Luau VM because of the things I mentioned.

Yup, the citation was in terms of vanilla Lua. Luau does some voodoo magic for table field lookups, as well as inline caching.

Sorry for the lack of citations, that’s me being a bit lazy. The info is buried deep inside many Luau Recap threads / release notes, so it’s pretty hard to dig up.

I thought it would be useful info regardless, even without a citation, because many people are still mucking up their code under the misconception that it will be faster if they use arrays where they should be using proper structs.


I see! so, to clarify - if I for instance have something of the like:

for _, Data in next, SomeTable do
     local Math = Data[1] * Time + Data[2] / Data[3] -- etc
end

would it be better to convert the Data into a dictionary, then? I am iterating through ~20,000 elements and indexing at least 2-3 properties for each one, so would it be more worthwhile for me to make Data a dictionary instead here?

Depends what your needs are.

If what you care about is perf, then for a 3-5 element item, fields should beat an array (IIRC for a 2 element item an array still wins for some reason; somehow the bucket logic must not work out in that case).

If what you care about above all else is memory and you have specifically 3 or fewer numeric fields, then you should actually put the three values into a Vector3: since Vector3 is now a native type, you will avoid allocating a table to group them, and the numeric data will be stored directly in the main table. This also works for compressing an array of plain numbers by a factor of 3, at the expense of slower access.
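To make the three layouts being compared concrete, here is a quick sketch (the field names are hypothetical, and the Vector3 option is Roblox-specific):

```lua
-- Array-style item: fields accessed by numeric index
local dataArray = {1.5, 2.5, 3.5}

-- Struct-style item: named fields, which Luau's field lookup
-- optimizations and inline caching make fast
local dataStruct = {scale = 1.5, offset = 2.5, rate = 3.5}

-- Vector3 packing: the three numbers are stored inline in the
-- containing table, avoiding a separate table allocation per item
-- (at the cost of slower access)
local dataPacked = Vector3.new(1.5, 2.5, 3.5)
print(dataPacked.X, dataPacked.Y, dataPacked.Z)
```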


Doesn’t coroutine.wrap do the same? Where does the performance gain come from?

Yes… but there are other reasons not to use coroutine.resume (if you want to re-use a coroutine you have to call coroutine.resume on it directly; coroutine.wrap doesn’t let you): you can’t fully correctly implement event:Wait() using coroutine.resume, and it also doesn’t integrate as well with the output window and debugger when the code in the coroutine encounters an error.


I am a bit confused on why you said we “can’t fully correctly implement event:Wait()” with coroutines. Could you elaborate on why this is the case?

Using coroutine.resume makes errors disappear.
It also has some other hard-to-explain issues which I haven’t gotten around to understanding yet, and I have no reason to: task.spawn is now so much better for everything that coroutine.resume is only needed in really specific scenarios.

He didn’t say you can’t implement it with coroutines; the task library is one of the main ways to work with coroutines, it’s just different and has fewer problems.
Just because it isn’t inside the coroutine library doesn’t mean it’s not coroutine-related.

task.spawn(thread, ...) resumes a coroutine (better named a thread), but has none of the issues that coroutine.resume has.
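As a minimal sketch of the difference (assuming a Roblox environment):

```lua
-- A thread created once, then resumed with task.spawn:
local thread = coroutine.create(function(message)
	print(message)
	error("something went wrong")
end)

-- task.spawn resumes the thread like coroutine.resume would, but the
-- error above gets reported to the output window instead of being
-- silently swallowed.
task.spawn(thread, "hello")
```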

I have been using your GoodSignal class for a while now and it’s been a breeze to use. However, it seems to keep a strong reference to values passed as arguments to Fire, but only while the signal has a connection. Here’s some code that demonstrates this behavior.

local ref = setmetatable({},{__mode="v"})

local Signal = require(script.Parent.GoodSignal)

local Object = Instance.new("Part")
table.insert(ref,Object)

local Event = Signal.new()
Event:Connect(function()end)

Event:Fire(Object)
Object = nil

print(ref[1])
wait(2)
print(ref[1])

Output: (screenshot: the Part is printed both times, i.e. it was not garbage collected)


To compare, I modified the code slightly to use SimpleSignal instead, and the Part was GC’ed as expected.

local ref = setmetatable({},{__mode="v"})

local Signal = require(script.Parent.SimpleSignal)

local Object = Instance.new("Part")
table.insert(ref,Object)

local Event = Signal.new()
Event:Connect(function()end)

Event:Fire(Object)
Object = nil

print(ref[1])
wait(2)
print(ref[1])

Output: (screenshot: the Part was garbage collected after the wait)


Is this behavior related to how the coroutines that run the handlers are recycled? Is this intended?

Oh boy, that is one subtle bug, nice catch! It is indeed unintentional, and here’s why it happens:

-- Coroutine runner that we create coroutines of. The coroutine can be 
-- repeatedly resumed with functions to run followed by the argument to run
-- them with.
local function runEventHandlerInFreeThread(...)
    acquireRunnerThreadAndCallEventHandler(...)
    while true do
        acquireRunnerThreadAndCallEventHandler(coroutine.yield())
    end
end

Can you spot the bug? Hint: It’s in the ....

If none of your handlers ever actually yields, it will keep reusing the same runEventHandlerInFreeThread thread. The ... argument of that thread remains on the stack through all subsequent handler calls, so anything in the ... can’t be garbage collected until one of the handlers yields, which forces the creation of a new runEventHandlerInFreeThread thread and allows the old thread (and its arguments) to eventually be collected.

So basically, the first set of arguments you pass to a :Fire call in your game after a handler yields will be held onto until the next time any handler yields.

I’m not sure what to do about this. It’s a pretty obscure edge case that I don’t think will ever come up in practice, and even when it does it would only leak a small fixed amount of memory. Since solving it incurs a slight performance penalty, I’m hesitant to do it when there’s so little chance of it mattering. If you do want to solve it, you can change runEventHandlerInFreeThread to this:

local function runEventHandlerInFreeThread()
	while true do
		acquireRunnerThreadAndCallEventHandler(coroutine.yield())
	end
end

And the creation of the freeRunnerThread to this:

freeRunnerThread = coroutine.create(runEventHandlerInFreeThread)
coroutine.resume(freeRunnerThread) --> Add this

To basically “clear” the initial step of the coroutine and only invoke handlers via the yield.

Kinda unfortunate that we can’t load LVM bytecode… if we could I would be able to solve this via LVM assembly at no perf cost.


I’ve actually fixed this in my Lua implementation, and I think it’s worth changing this behaviour in GoodSignal.

The cost is really, really small, and it’s even more insignificant given that it is only paid when creating a new thread, which only happens when a handler errors or yields; and if you’re yielding, you probably don’t care about the performance cost anyhow.

This little aspect might cause confusion for users: even though it’s a rare edge case, they could lose a lot of time trying to debug such a hidden issue.

Edit: Made a pull request to make your life easier :)


Also, the GoodSignal links here are not pointing to the repo and are still pointing to the gist. Would you mind fixing that?


So I found an issue when making my own GoodSignal-style signal class, and I think it might affect GoodSignal as well, considering it uses the same method: when using task.spawn to resume a :Wait(), and following the wait by disconnecting all connections, there’s a chance you will encounter an error. Replacing task.spawn with task.defer would fix this.

What’s the specific scenario?

Changing to task.defer would be a change in behavior. The intent is that firing the signal synchronously invokes the handlers, which would not happen with defer.

If you need defer on a specific handler you can always add a defer in the handler.
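For example, a sketch of deferring a single handler (where `signal` and `onEvent` are placeholders for your own signal and handler function):

```lua
signal:Connect(function(...)
	-- Capture the arguments and defer the actual work to the next
	-- resumption point, leaving Fire itself synchronous for all the
	-- other handlers.
	local args = table.pack(...)
	task.defer(function()
		onEvent(table.unpack(args, 1, args.n))
	end)
end)
```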

Hi, I recently made a pull request on the GitHub repo based on a new feature coming soon:
(screenshot of the announcement for the upcoming Once API)

Would love it if you could check it out: Add Signal:Once by Baileyeatspizza · Pull Request #5 · stravant/goodsignal (github.com)

Updated the post, gists, model, library, github repo etc with a couple of updates:

  • Added a Once call matching the new API added to RBXScriptSignal in v531.

  • Removed the error when calling Disconnect multiple times. The approach of throwing an error in that case is incompatible with the Once API, so I have to admit defeat on that one. Most people didn’t want the error anyway, and this shouldn’t create backwards compatibility problems.

  • Added the fix avoiding leaking the first set of arguments passed to Fire in a given session. This potentially comes with a small performance penalty but only under very specific circumstances (If you both have a large number of event handlers connected and they also frequently yield).
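A quick usage sketch of the new Once call, which behaves like Connect except the connection disconnects itself before the handler runs (mirroring RBXScriptSignal:Once):

```lua
local signal = Signal.new()

signal:Once(function(value)
	print("fired with", value) -- runs for the first Fire only
end)

signal:Fire(1) -- handler runs
signal:Fire(2) -- handler is no longer connected, nothing runs
```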


Hey there! I had a look through your module and it is absolutely wonderful! Though I have some questions regarding how the script actually works, as I’m extremely confused.
Specifically in these blocks of code:

-- The currently idle thread to run the next handler on
local freeRunnerThread = nil

-- Function which acquires the currently idle handler runner thread, runs the
-- function fn on it, and then releases the thread, returning it to being the
-- currently idle one.
-- If there was a currently idle runner thread already, that's okay, that old
-- one will just get thrown and eventually GCed.
local function acquireRunnerThreadAndCallEventHandler(fn, ...)
	local acquiredRunnerThread = freeRunnerThread
	freeRunnerThread = nil
	fn(...)
	-- The handler finished running, this runner thread is free again.

	freeRunnerThread = acquiredRunnerThread
end

-- Coroutine runner that we create coroutines of. The coroutine can be 
-- repeatedly resumed with functions to run followed by the argument to run
-- them with.
local function runEventHandlerInFreeThread()
	-- Note: We cannot use the initial set of arguments passed to
	-- runEventHandlerInFreeThread for a call to the handler, because those
	-- arguments would stay on the stack for the duration of the thread's
	-- existence, temporarily leaking references. Without access to raw bytecode
	-- there's no way for us to clear the "..." references from the stack.
	while true do
		acquireRunnerThreadAndCallEventHandler(coroutine.yield())
	end
end

-- in the :Fire() method

       if connection.isConnected then
			if not freeRunnerThread then
				freeRunnerThread = coroutine.create(runEventHandlerInFreeThread)
				-- Get the freeRunnerThread to the first yield
				coroutine.resume(freeRunnerThread)
			end
			
			task.spawn(freeRunnerThread, connection.callback, ...)
		end

I have read your detailed comments meticulously quite a few times now and I still struggle to comprehend this; I guess I’m just way out of my depth. Anyway, when I removed that code the performance got significantly worse, and I’m curious as to why it reduces the “lag” so much. I’m eager to hear your explanation!

Appreciate your effort profusely!

A picture is worth a thousand words:

(timeline diagram: handlers running across threads, with a yielding handler in blue and a second :Fire() in green)

Basically, we need at least one thread for each handler being run at the same time. Normally only one handler is run at a time, but if a handler yields (the blue wait), there’s now a chance for another :Fire() call to happen (green), and the handlers for that will have to run on a new thread, because the original one is occupied waiting on the yielding (blue) handler.

The magic happens at the end of acquireRunnerThreadAndCallEventHandler, where we set freeRunnerThread back to the currently running thread; that’s what lets the same thread be re-used once it’s no longer needed by a yielding handler.

And the performance comes from that same thread being re-used, because creating new threads is expensive.


@stravant would you be adding ConnectParallel?

Unfortunately I can’t really support correct operation of ConnectParallel; all I could do is pretty much just directly wrap the existing call, and I wouldn’t even be able to pass tables by reference, because you can’t share a table across parallel boundaries.

You might be able to support that through task.desynchronize and task.synchronize?