Coroutines vs. spawn()... Which one should I use?

What are you using for signals? How listener threads are resumed depends on how the signal is fired. For example, BindableEvent:Fire() is instant because Fire creates and resumes each listener thread itself. It’s the same for most signals that fire from property changes.
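To make the "Fire resumes each listener thread itself" point concrete, here is a toy model in plain Lua. It is only a sketch: the `Signal` table and its methods are hypothetical stand-ins, not Roblox's actual BindableEvent implementation, and the newest-first order is an assumption borrowed from the connection-order observation later in this post.

```lua
-- Toy model of an immediate-mode signal (hypothetical, not Roblox's code).
local Signal = {}
Signal.__index = Signal

function Signal.new()
	return setmetatable({ listeners = {} }, Signal)
end

function Signal:Connect(listener)
	table.insert(self.listeners, listener)
end

-- Fire creates one coroutine per listener and resumes it right away,
-- so every listener has run (or yielded) before Fire returns.
-- Listeners are visited newest-first (an assumption, see the lead-in).
function Signal:Fire(...)
	for i = #self.listeners, 1, -1 do
		coroutine.resume(coroutine.create(self.listeners[i]), ...)
	end
end

local log = {}
local signal = Signal.new()

signal:Connect(function(value)
	table.insert(log, "first listener got " .. value)
end)
signal:Connect(function(value)
	table.insert(log, "second listener got " .. value)
end)

table.insert(log, "before Fire")
signal:Fire("hello")
table.insert(log, "after Fire")

for _, line in ipairs(log) do
	print(line)
end
```

Both listener lines land between "before Fire" and "after Fire", which is what "instant" means here: nothing is handed to a scheduler queue.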

Signals like Stepped, RenderStepped, and Heartbeat are fired at particular moments in a frame. The microprofiler can be used to tell exactly when. Here’s another graph:

One: connections added later are resumed first, which is by design. Two: none of them have budgeting, also by design. Three: RenderStepped runs later in the frame than the other two, by design.

Interestingly, WaitForChild is budgeted. Here it is compared to wait() & friends:

WFC appearing lower on the graph indicates that it does not suffer from a minimum wait time as the others do. Taken as an average, the slopes are generally the same, with the minor deviations caused by sub-optimal or slightly different implementations:

Benchmark (LocalScript)
game.ReplicatedFirst:RemoveDefaultLoadingScreen()

local RunService = game:GetService("RunService")
local Step = RunService.RenderStepped
-- Used to avoid interference with budget.
local function Sleep(d)
	local t = tick()
	while tick()-t <= d do
		Step:Wait()
	end
end

Sleep(3)

local N = 10
local I = 1000
local a = table.create(I, 0)

local function mark(i, t)
	a[i] = a[i] + (tick()-t)
	for n = 1, 100 do -- Do some work.
		local v = math.sqrt(i)
	end
end

local RenderStepped = RunService.RenderStepped
local function doRenderStepped(i, t)
	coroutine.resume(coroutine.create(function()
		RenderStepped:Wait()
		mark(i, t)
	end))
end

local Stepped = RunService.Stepped
local function doStepped(i, t)
	coroutine.resume(coroutine.create(function()
		Stepped:Wait()
		mark(i, t)
	end))
end

local Heartbeat = RunService.Heartbeat
local function doHeartbeat(i, t)
	coroutine.resume(coroutine.create(function()
		Heartbeat:Wait()
		mark(i, t)
	end))
end

local function doSpawn(i, t)
	spawn(function()
		mark(i, t)
	end)
end

local function doDelay(i, t)
	delay(0, function()
		mark(i, t)
	end)
end

local function doWait(i, t)
	coroutine.resume(coroutine.create(function()
		wait()
		mark(i, t)
	end))
end

local wfca 
local wfcb
local function doWFC(i, t)
	coroutine.resume(coroutine.create(function()
		wfca[i]:WaitForChild("Value")
		mark(i, t)
	end))
	wfcb[i].Parent = wfca[i]
end

for n = 1, N do
	wfca = table.create(I)
	wfcb = table.create(I)
	for i = 1, I do
		wfca[i] = Instance.new("BoolValue")
		wfcb[i] = Instance.new("BoolValue")
	end
	local t = tick()
	for i = 1, I do
		doWFC(i, t)
	end
	Sleep(1)
end
for i = 1, I do
	print(i, a[i]/N)
end

I implemented WaitForChild 4 years ago as an intern, and inserted it into the same task scheduler as wait() and spawn(). Obviously changes could have been made to this system in the meantime, but this is how I originally did it.

Note that no budget is consumed if the instance exists immediately as requested, which is the vast majority of the time in my experience.

This is an interesting observation. I’d like to see more posts like this in the future, and your experiment looks like something I could replicate. My questions: are RenderStepped, Stepped, and Heartbeat each fired at the same point in every frame, or can the timing vary between frames? (I understand if you are speaking in terms of averages.) Also, can you elaborate on what you mean when you say that WaitForChild is budgeted? Does this mean it isn’t subject to a minimum yield time before it resumes? And does that mean WaitForChild doesn’t use the wait() function internally?

EDIT: Sorry I’m still on mobile so it’s hard for me to carry out any of the experiments myself, so bear with me. Also, can you elaborate on how exactly you were able to put these in a graph and look at them from a statistical POV?

What folks are missing is the rather obvious elephant in the room (though I have mentioned it several times on this thread).

If you want a thread scheduling mechanism because you are dynamically creating many lightweight threads that need to run at a later time (not necessarily the next frame or even the next second), and you want to prioritize safety against FPS drops, you either

a) have to write/use a custom thread scheduler

or b) use spawn.

Yes, writing a custom thread scheduler to replace the out-of-the-box one is desirable (and I have done this). But it is most definitely not easy to do.

edit to give a quick example (and not contrived, if you think about it)… how would you do this with coroutines with both fairness and safety:

local function add_lifeform()
	spawn(function()
		local cp = workspace.Part:Clone()
		local r = Random.new()
		-- Start at a random spot on the XZ plane.
		cp.Position = Vector3.new(r:NextInteger(-50, 50), 5, r:NextInteger(-50, 50))
		local die = false
		cp.Touched:Connect(function()
			die = true
		end)
		cp.Parent = workspace
		-- Wander until something touches the part.
		while not die do
			cp.Position = cp.Position
				+ Vector3.new(r:NextInteger(-2, 2), 0, r:NextInteger(-2, 2))
			wait(math.random(1, 2))
		end
		print("Died!")
		cp:Destroy()
		add_lifeform() -- Replace the dead lifeform with a new one.
	end)
end

for i = 1, 500 do
	add_lifeform()
end

As indicated by @Quenty above, a thread yielded by WaitForChild is put into the same, budgeted queue as threads yielded by wait, spawn, and delay. The only difference is that the WaitForChild thread is not scheduled with a delay. After being added to the queue, it will run as soon as WaitingScriptsJob gets around to resuming it.

Imagine an AddThreadToScheduler function that receives a thread along with a number indicating the duration to wait before the thread should be resumed. Within wait, spawn, and delay, the call might look like this:

DefaultWaitTime = 0.03
if duration < DefaultWaitTime then
	duration = DefaultWaitTime
end
AddThreadToScheduler(thread, duration)

Whereas WaitForChild would look like this:

AddThreadToScheduler(thread, 0)
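The difference between the two calls can be simulated in plain Lua. This is a rough sketch under loud assumptions: the clock is simulated, the budget is a made-up "threads per frame" count rather than the time-based budget the real scheduler presumably uses, and every name below is hypothetical.

```lua
-- Toy model of a budgeted scheduler with a simulated clock.
-- All names and numbers are hypothetical; this is not engine code.
local DEFAULT_WAIT_TIME = 0.03 -- minimum delay used by wait/spawn/delay
local FRAME_TIME = 1 / 60      -- simulated frame length
local BUDGET_PER_FRAME = 2     -- made-up budget: threads resumed per frame

local now = 0
local queue = {}

local function addThreadToScheduler(thread, duration)
	table.insert(queue, { thread = thread, wakeTime = now + duration })
end

-- wait/spawn/delay style: the duration is clamped to the minimum.
local function scheduleWait(thread, duration)
	if duration < DEFAULT_WAIT_TIME then
		duration = DEFAULT_WAIT_TIME
	end
	addThreadToScheduler(thread, duration)
end

-- WaitForChild style: same queue, but no delay.
local function scheduleImmediate(thread)
	addThreadToScheduler(thread, 0)
end

-- Resume due threads until the per-frame budget runs out, then advance time.
local function runFrame()
	local resumed = 0
	local i = 1
	while i <= #queue and resumed < BUDGET_PER_FRAME do
		local entry = queue[i]
		if entry.wakeTime <= now then
			table.remove(queue, i)
			coroutine.resume(entry.thread)
			resumed = resumed + 1
		else
			i = i + 1
		end
	end
	now = now + FRAME_TIME
end

local frame = 0
local resumedAt = {}
local function makeThread(name)
	return coroutine.create(function()
		resumedAt[name] = frame
	end)
end

scheduleWait(makeThread("wait"), 0)
scheduleImmediate(makeThread("wfc1"))
scheduleImmediate(makeThread("wfc2"))
scheduleImmediate(makeThread("wfc3"))

while #queue > 0 do
	frame = frame + 1
	runFrame()
end

print("wfc1", resumedAt.wfc1, "wfc2", resumedAt.wfc2)
print("wfc3", resumedAt.wfc3, "wait", resumedAt.wait)
```

In this model the first two zero-delay threads resume in frame 1, the third is pushed to frame 2 by the budget, and the wait-style thread only becomes due in frame 3 because of the 0.03 second clamp.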

The MicroProfiler will let us see how this works. Let’s use the following LocalScript, put under ReplicatedFirst, to be run with Play Solo:

game.ReplicatedFirst:RemoveDefaultLoadingScreen()
wait(5) -- Give the game some time to load and settle down.
-- May also disable Players.CharacterAutoLoads and Chat.LoadDefaultChat to
-- reduce clutter.

local function DoSomeWork(ms)
	local t = tick()
	repeat until tick()-t >= ms/1000
end

local RunService = game:GetService("RunService")

RunService.Stepped:Connect(function()
	debug.profilebegin("STEPPED")
	DoSomeWork(2)
	debug.profileend()
end)

RunService:BindToRenderStep("BIND", 0, function()
	debug.profilebegin("BIND")
	DoSomeWork(2)
	debug.profileend()
end)

RunService.RenderStepped:Connect(function()
	debug.profilebegin("RENDER")
	DoSomeWork(2)
	debug.profileend()
end)

RunService.Heartbeat:Connect(function()
	debug.profilebegin("HEARTBEAT")
	DoSomeWork(2)
	debug.profileend()
end)

while true do
	debug.profilebegin("WAIT")
	DoSomeWork(2)
	debug.profileend()
	wait()
end

Ctrl+F6 will open the profiler. The script will produce a profile that looks similar to this:

Look for RENDER, BIND, STEPPED, HEARTBEAT, and WAIT, as defined in the script. These are the LocalScript doing work in various locations.

From what can be seen, BIND and RENDER always run in the render step on the Main thread. BIND, which allows a priority to be set, runs first. The bit of activity after BIND is the default camera script doing some work, which runs after BIND because it has a later priority. RenderStepped is designated as having the latest priority, so RENDER runs after all bound render functions.

WAIT, STEPPED, and HEARTBEAT run in one of the several worker threads each frame. Once rendering has finished, WaitingScriptsJob starts. It is not visible in the first frame because it is doing almost no work. Remember that wait() has a minimum delay of 0.03 seconds, so at 60 frames per second it resumes at most every other frame. It can be seen in the second frame because WAIT is running.

Following that is simulation. The Stepped event is dependent on the simulation being active, so it runs here. The bit of activity following STEPPED is physics simulation. Finally, HEARTBEAT starts running. The remainder of the time is spent idling to sync to the next frame. Another physics step may also occur here.

The DevHub has more information about the MicroProfiler:

My previous post has the benchmark script I used to produce the data. This data was pasted into LibreOffice Calc and rendered as a chart. Your preferred spreadsheet program should be able to do something similar.

Ah, you may want to profile that baby. It’s called a spin wait, and it’s really brutal on the CPU. Even adding a Heartbeat:Wait() is awful because of the minimum yield times.

For any reasonable period of wait time, you need to yield via some mechanism (spawn, or something you build yourself) and be rescheduled at the appropriate time for efficient use of CPU.

Thus the value of spawn/wait.

So if I can’t use wait and delay what do I replace it with?
A Custom Heartbeat Wait?

If you need really reliable timing, yes, use a Heartbeat-based wait.
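A Heartbeat-based wait could look roughly like the sketch below. The function name `heartbeatWait` and the idea of passing the signal in as a parameter are my own choices for illustration. The only thing assumed about the signal is that its `:Wait()` yields until the next frame and returns that frame's delta time, which is how `RunService.Heartbeat:Wait()` behaves. The fake signal and driver loop exist only so the sketch runs outside Roblox.

```lua
-- Yields the calling coroutine until at least `duration` seconds of frame
-- time have accumulated, then returns the time actually waited.
-- `heartbeat` is any object whose :Wait() returns the frame's delta time.
local function heartbeatWait(duration, heartbeat)
	local elapsed = 0
	repeat
		elapsed = elapsed + heartbeat:Wait()
	until elapsed >= duration
	return elapsed
end

-- In a Roblox script the call would be something like:
--   heartbeatWait(0.5, game:GetService("RunService").Heartbeat)

-- Stand-in signal for plain Lua: Wait yields, and the driver loop below
-- resumes the coroutine with a simulated delta time.
local fakeHeartbeat = {}
function fakeHeartbeat:Wait()
	return coroutine.yield()
end

local waited
local co = coroutine.create(function()
	waited = heartbeatWait(0.04, fakeHeartbeat)
end)
coroutine.resume(co) -- run up to the first Wait

local frames = 0
while coroutine.status(co) ~= "dead" do
	frames = frames + 1
	coroutine.resume(co, 1 / 60) -- one simulated frame
end

print(frames, waited)
```

Note that this still wakes the thread every frame until the time is up, so it trades CPU for timing accuracy.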

For spawning threads reliably see Crazyman32’s comments above.

After reviewing the forums, I realize now where some of the general set of misunderstanding and confusion is coming from.

People are actually using the pattern

while tick() - startTick < waitTime do
	heartbeat:wait()
end

This doesn’t work efficiently at all (and I couldn’t find it in the documentation, either). It eats up the CPU terribly for any waitTime longer than a few frames.

You can do a quick experiment with this: take my example above, but use heartbeat connect / heartbeat:wait instead.

The underlying problem is that thread scheduling with FPS protection is a tricky problem and requires some assumptions about budgeting and fairness which are not easy to explain to new engineers not familiar with threading.

There is some thought, I suspect, that multi-threading will help with this. It absolutely will for a small number of threads, but not for a large number, as that generally scales very poorly due to the need for locking and issues with fairness.

That said, what might be interesting is the ability to have a non-preemptive thread scheduler inside of a new preemptive thread. E.g., spawn/wait, but all grouped in a new thread or threads: say, 10 preemptive threads * 1000 non-preemptive coroutine threads. That way you leverage multiple cores, but can still take advantage of lightweight non-preemptive Lua coroutines.

There is a really good book on threads if you’re interested, though it’s Java-based - https://www.amazon.ca/dp/0123973376?slotNum=0&linkCode=g12&imprToken=rWCg5y4.f5QjNtp8cr-6WQ&creativeASIN=0123973376&tag=javarevisit0c-20 Java is pretty mature when it comes to real threading though, so it’s not a bad language to use for this domain.

The beauty of the Lua threading model is the ability to yield a large number of ‘threads’ with only a very small performance penalty, though of course there is a problem in leveraging multiple cores, as everything executes on a single core.

And maybe we can start using scare quotes when we say coroutine-based ‘threads’; that’s less confusing, as they aren’t really threads. And to add to the confusion, it looks like Roblox is actually using real threads. E.g., tweening/GPU work is on a separate real worker thread, something important to leverage.

That is literally the point of DoSomeWork: so that there’s something to see on the profiler.

None of the scripts I’ve posted so far are meant to be practical. Their only purpose is to benchmark the scheduler and see how it works.

Fair point, I just saw the same pattern as what was suggested above and jumped on it a little too fast. We might want to start a new thread for benchmarking as this is very very useful stuff and I’d hate to see it get lost in the debate of whether there are any scenarios to use spawn.

What do you mean never use wait()? What if you want to pause your script for a certain amount of time?

You measure time via Stepped, RenderStepped, or Heartbeat.

As @sparker22 said.

There are many ways to make your own, better “wait”.

Uh, no. Coroutines can be used more than once.
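Assuming "used more than once" means resuming the same coroutine repeatedly, a small plain-Lua sketch shows the distinction: a coroutine that yields can be resumed again and again, while one whose function has returned is dead and cannot be restarted. The names here are just for illustration.

```lua
-- A coroutine that yields keeps its state and can be resumed repeatedly.
local counter = coroutine.create(function()
	local n = 0
	while true do
		n = n + 1
		coroutine.yield(n)
	end
end)

local _, first = coroutine.resume(counter)
local _, second = coroutine.resume(counter)
local _, third = coroutine.resume(counter)
print(first, second, third)
print(coroutine.status(counter))

-- A coroutine whose function has returned is dead; resuming it fails.
local oneShot = coroutine.create(function()
	return "done"
end)
coroutine.resume(oneShot)
local ok = coroutine.resume(oneShot)
print(coroutine.status(oneShot), ok)
```

So reuse depends on the body yielding rather than returning; to run a finished function again you create a new coroutine from it.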

Typically whatever function you input as an argument in your thread is not supposed to yield. I do agree with you on spawn not being bad tho.

To everyone trying to prove spawn() is bad: that is a false claim. Instead of trying to make spawn look bad by spamming 100 - 1,000,000 spawns, it is better to use more practical proof. Threads are not meant to be spammed in the first place, even if it’s just a “benchmark”, just like @MisterHumbled stated. In general, threads are expensive.

spawn() is delayed since it uses wait(), but it only yields for 0.03 seconds. How long a spawned thread actually yields depends on how much you spam spawn, because yielding tasks are managed by the task scheduler and it has internal work to do for each of them.

Referenced from Task Scheduler docs:

"The task scheduler coordinates tasks done each frame as the game runs, even when the game is paused. These tasks include detecting player input, animating characters, updating the physics simulation, and resuming scripts in a wait() state."

A custom spawn uses bindables, which avoids both the cons of coroutines and the delay of spawn, which is good. But I’d rather not create a bindable plus a connection each time I want to create a thread, even in moderation.

To be honest, from experience (I posted this topic like a year ago) use Quenty’s fast spawn for spawning, and coroutines for… coroutines.

spawn() is absolutely fine when used in its intended environment, that being the classic 2012 (or earlier) style of Roblox game. You will unavoidably reach a point where you have too many spawn functions. This is not something you can really plan for; it will simply become an issue if you intend to make a complex game.

Coroutines obviously have the issue of losing the stack trace during errors; however, that’s not that big of a deal in the grand scheme of things. Regardless of how bad the error is, you should eventually be able to determine what the issue is without a direct traceback to the error, and you can always test your coroutine function outside of a coroutine to double-check.
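For what it’s worth, the trace is not necessarily gone for good. In standard Lua, `coroutine.resume` returns `false` plus the error, and `debug.traceback` accepts the failed coroutine as its first argument. The sketch below assumes you create the coroutine with `coroutine.create` (so you hold the thread object) rather than `coroutine.wrap`; the function names are made up.

```lua
local function risky()
	error("something broke")
end

local co = coroutine.create(function()
	risky()
end)

-- resume does not throw; it reports failure through its return values.
local ok, err = coroutine.resume(co)

local trace
if not ok then
	-- The dead coroutine's stack is still available for inspection.
	trace = debug.traceback(co, tostring(err))
	print(trace)
end
```

The exact trace format differs between Lua versions and Luau, but the message and the call chain inside the coroutine are both in the returned string.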

Naturally, there are a couple of workarounds to either of these, namely bindable events, though you may not want to make a new bindable event every time you want to run a function, or really use one at all in that capacity.

Personally, I’d just stick to coroutines and figure out issues when they come up. People should definitely be taught how to use coroutines so they don’t keep using spawn(), as it’s grossly out of date for the environment Roblox is making for games.

A better question is why Roblox doesn’t update all of their old globals to the new task scheduler so we can avoid these problems. Their current solution is usually to implement new alternatives, which doesn’t make much sense, since it just bloats the engine.

What if you were to call the function directly instead of firing a bindable? Say you pass a function and after your custom delay is up you decide to call said function.

Yeah, that’s how I’d imagine someone would do it. I personally do this with coroutines:

function module:cowrap(func, ...)
    -- coroutine.wrap only takes the function; the arguments go to the call.
    coroutine.wrap(func)(...)
end

You could do a similar thing with bindable events in a premade function, so you don’t have to fire it manually every time; just require the module and run the code. I believe that’s what the fastspawn module does, though there’s more logic to it than that, I hope.

Is task.spawn() also broken?