I wrote a post answering an old topic, and it got long enough that I think it may make a good community tutorial, but I wanted some other eyes to verify, challenge, or validate what I said.
The context was someone asking about Actor documentation, and I ended up doing a somewhat deep dive into how Actors behave with fibers (coroutines) and serial vs. parallel execution.
I’ll just re-paste it here but you can go to the link as well.
Actors and Fibers (coroutines, or the thread datatype)
Each Actor is an isolated environment, aka an execution context. Among other things, Actors are containers that hold “fibers”, aka coroutines or pseudothreads. If you are unsure what a fiber is, it’s basically a unit of execution that has its own call stack and execution context. Scripts are fibers (at runtime, since each one is unique and has other data attached to aid the task scheduler). Coroutines are fibers. You also know these as the thread datatype. The name is misleading because it’s not actually a “thread” in the conventional sense, like an OS thread.
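To make the thread datatype point concrete, here’s a minimal sketch in plain Lua. It only uses the standard coroutine library, so it should behave the same in Luau. The worker function and the variable names are just illustrative, not part of any Roblox API.

```lua
-- A fiber is a coroutine: it keeps its own call stack between resumes.
local function worker(a)
    -- Pause here and hand a value back to whoever resumed us.
    local b = coroutine.yield(a + 1)
    -- When resumed again, execution continues right where it stopped.
    return a + b
end

local fiber = coroutine.create(worker)
local kind = type(fiber) -- "thread", even though no OS thread was created

local ok1, first = coroutine.resume(fiber, 1) -- runs until the yield
local statusWhileSuspended = coroutine.status(fiber)

local ok2, second = coroutine.resume(fiber, 10) -- continues after the yield
local statusWhenDone = coroutine.status(fiber)

print(kind, first, statusWhileSuspended, second, statusWhenDone)
-- prints: thread  2  suspended  11  dead
```

Nothing here ran in parallel. The fiber only made progress when something resumed it, which is the job the task scheduler does for you in Roblox.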
So, with these fibers in hand, Roblox uses multi-threaded scheduling, which Actors carry the information (metadata) for, to execute the instructions within them in parallel across what are called ‘worker threads’, aka those actual OS threads. You can see these ‘worker threads’ if you open the MicroProfiler.
Note: I’ve seen some information about Roblox using “16 VMs or a VM per worker thread”, which I think is a misunderstanding of environments versus VMs. What it’s making is called an environment: information for the VM on how to run the code and what resources and data the code has access to.
How Worker Threads are Chosen by Actors
Guess what? Actors do not choose that worker thread, the task scheduler does. But, VERY IMPORTANTLY, they do limit it!
When using one Actor, only one worker thread can be utilized at a time for all the fibers within that Actor, because they’re run serially in their environment context. Think of an Actor like one of those rally car drivers who has a co-driver (the task scheduler) and a bunch of friends (fibers) in the back seat. He can’t drive in two lanes (worker threads) at once because he’s gonna hit someone, so the task scheduler keeps him in one at a time. For more specifics as to why, see Upvalues below.
As for the logic behind how worker threads are chosen: it is not currently exposed to the public, both for security and proprietary reasons and because we don’t really need it. We can’t and should not be doing anything that Roblox is doing with C++ under the hood because we’ll crash the game. Documentation would be cool, but it’s not really useful for us outside of satisfying curiosity.
How task.desynchronize/task.synchronize work
Moving on to more technical specifics about selection and how desync/sync work: what we can infer is that task.desynchronize() within a fiber causes the parallel scheduler to find the next available worker thread by doing a comparative check on workload amongst all worker threads. This comparison is most likely based on just workload and anything queued in task.delay(). From there, it yields the fiber and then directly schedules it “to happen” in parallel on the selected worker thread. Yes, this causes a small delay. However, don’t confuse this with migration. Fibers are not migrated across threads once desynchronized. Instead, they are assigned an execution context (main or parallel), and they stay with that scheduler until they switch again with task.synchronize() or task.desynchronize().
Think of a fiber like a tool. It has markings on the handle to tell which scheduler can use it. task.desynchronize() and task.synchronize() write the markings, and the Actor is the toolbox with some instructions that ensure the people using it (the two schedulers) are using it within the proper boundaries and maintain the same overall environment. Actors do this, and thus exist, for thread safety. If we had no Actors, most APIs just wouldn’t work, modules would have race conditions and create duplicate fibers, you couldn’t mutate or change any Instance, and you couldn’t have thread-safe access to specific physical assets like you can when you parent them under Actors.
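Here’s a rough sketch of the usual desync/sync shape: do the heavy math in the parallel phase, then switch back to serial before touching anything shared. It assumes the script is a descendant of an Actor, which task.desynchronize() requires. The sumOfSquares and computeThenPublish helpers are hypothetical names I made up for illustration, and the no-op fallback for task is an assumption added only so the snippet can also be run in a plain Lua interpreter outside Roblox.

```lua
-- In Roblox, `task` is a built-in global. The fallback below is NOT Roblox
-- behavior; it is a stand-in so this sketch also runs in plain Lua.
local task = task or {
    desynchronize = function() end,
    synchronize = function() end,
}

-- Hypothetical pure computation: touches no Instances, so it is safe in parallel.
local function sumOfSquares(n)
    local total = 0
    for i = 1, n do
        total = total + i * i
    end
    return total
end

-- Hypothetical helper showing the desync -> compute -> sync -> write pattern.
local function computeThenPublish(n, publish)
    task.desynchronize() -- mark this fiber for the parallel phase
    local result = sumOfSquares(n)
    task.synchronize() -- mark it for the serial phase again
    publish(result) -- serial phase: writing to Instances is allowed here
    return result
end

local published
computeThenPublish(3, function(value)
    published = value
end)
print(published) -- prints: 14
```

In a real place, the publish callback is where you’d set a property or attribute, since those writes belong in the serial phase.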
Upvalues in Lua/Parallel Lua, and How They’re Handled
The definition of an upvalue in Lua is:
A variable that’s captured by a nested function but not defined inside that function.
Observe the code (apologies for the formatting, I’m bad at markdown) as an example of what happens:
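    task.spawn(function()
        local x = 12 -- local to the outer fiber; captured below as an upvalue
        task.spawn(function()
            task.desynchronize() -- yields, then resumes in the parallel phase
            x += 1 -- writes to the same x that the outer fiber reads
        end)
        while true do
            print("hi", x)
            task.wait() -- yield each iteration so the inner fiber gets a chance to run
        end
    end)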
In order:
- The outer task.spawn creates a fiber in the Actor’s execution context. It defines a local variable x = 12, which is in the stack frame of said fiber.
- A second fiber is created by the inner task.spawn, which defines a new function. This function captures x as an upvalue, because it exists in the scope where the function was defined.
- Because the variable x is now shared between two closures, Lua “boxes” the variable, meaning it allocates it on the heap, and both fibers share a pointer to the same boxed value. That’s why, even if the outer fiber later modifies x, the change is visible to the inner one. The Actor ensures that the boxed variable is only accessible in its environment.
- The inner fiber calls task.desynchronize(), which yields it and schedules it onto another worker thread, where it is resumed.
- The increment to x affects the original variable in the Lua heap, since the fiber is pointing to it. The while loop prints out the change.
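The sharing part of those steps can be checked without Roblox at all. Below is a minimal sketch in plain Lua where coroutine.yield() stands in for the yield that task.desynchronize() performs (that substitution is my assumption, there is no parallelism here). The makeFibers name is hypothetical, and I wrote x = x + 1 instead of x += 1 because compound assignment is Luau-only.

```lua
-- Two closures capture the same local, so both see one shared variable.
local function makeFibers()
    local x = 12

    local incrementer = coroutine.create(function()
        coroutine.yield() -- stand-in for the yield inside task.desynchronize()
        x = x + 1 -- writes through the shared upvalue
    end)

    local function read()
        return x -- reads the same upvalue the incrementer writes
    end

    -- makeFibers returns here, so x can no longer live in this stack frame;
    -- it survives only because the closures keep it alive.
    return incrementer, read
end

local incrementer, read = makeFibers()

coroutine.resume(incrementer) -- runs up to the yield
local before = read()

coroutine.resume(incrementer) -- performs the increment
local after = read()

print(before, after) -- prints: 12  13
```

The reader never touched the incrementer, yet it observes the new value, which is the pointer-to-the-same-box behavior described above.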
But how is this allowed in Parallel versus Serial execution without data racing? As said earlier: Actors. Remember, they’re just info for the schedulers and have their own environment. They basically ensure that, even if a fiber is run in parallel, the Actor’s own fibers still execute serially relative to one another.
Thus, with this in mind and the safety it brings when modifying upvalues, Actors can only utilize one worker thread at a time. Roblox most likely did this because it’s just easier to program against; if this wasn’t the case, Actors would be pointless. You’d get corrupted values in your upvalues. It’s pretty remarkable; just look at this:
Example of Serial nature
This code is from a test I personally did with one Actor and one script.
NOTE: You can add task.wait() to these loops, same result.
    task.spawn(function()
        local x = 0
        local z = 10
        task.spawn(function()
            task.desynchronize()
            x = 0
            task.spawn(function()
                while z do
                    --task.wait()
                    local num = tonumber(string.rep("9", 1e6)) + 1
                    x -= 1
                end
            end)
        end)
        task.spawn(function()
            task.desynchronize()
            x = 0
            task.spawn(function()
                while z do
                    --task.wait()
                    local num = tonumber(string.rep("9", 1e6)) + 1
                    x += 1
                end
            end)
        end)
        repeat
            z -= 1
            print(x) -- x is always safe, will always report 0, never -1 or +1.
        until z <= 0
        z = nil
        print("ended")
    end)
You can see here in the MicroProfiler what I’m talking about.
And if we copy the script 3 times…
Okay, but what if we run two Actors, each having 4 scripts?
See? They’re all switching lanes, but keeping themselves in serial for each execution. There’s never more than 2 chunks vertically aligned.
The Order of Execution (runParallel and Heartbeat)
In the Roblox Lua server VM, the task scheduler executes things during the Heartbeat step of the game (approximately 60 times per second), which first executes code serially and then in parallel via runParallel(). In the image above, that’s the vertical stacking you’re seeing.
On the client’s VM, it’s a bit different, with code instead running during the PreRender phase or other related phases, since you can tie code to rendering. The server doesn’t render, so there it’s pointless.
Note that Parallel CANNOT run alongside Heartbeat on another worker thread; it only extends the Heartbeat phase and causes hitching. This is why Roblox says to bind Actors to logic instead of processes. The goal is to get as many fibers as you can running in each runParallel() phase without extending Heartbeat.
Furthermore, and back to Actors: if you have a fiber that’s serial in an Actor and another fiber that’s running in parallel in that same Actor, the parallel phase can be executed on another worker thread. The structure of serial coming before parallel already ensures a single Actor cannot have its own serial phase occur at the same time as its runParallel phase.
For yielding, the fibers are stopped to maintain a healthy Heartbeat frame time and are resumed during the phases delayedThreads, deferredThreads, and resumeVMThreads. These also all occur during Heartbeat of the next frame (if using task.wait()) or whenever they’re scheduled to resume.
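If the yield-then-resume-next-frame idea is hard to picture, here’s a toy model in plain Lua. To be clear, this is NOT how Roblox’s scheduler is implemented; the schedule and step functions, the waiting queue, and the frame numbers are all hypothetical names made up for the sketch. It only shows the general shape: a fiber yields, sits in a queue, and gets resumed on a later step.

```lua
local waiting = {} -- fibers parked until the next step
local log = {}

-- Toy stand-in for handing a fiber to a scheduler (unlike task.spawn,
-- it does not run the function immediately).
local function schedule(fn)
    table.insert(waiting, coroutine.create(fn))
end

-- Toy stand-in for one frame: resume everything that was parked.
local function step(frame)
    local toResume = waiting
    waiting = {}
    for _, fiber in ipairs(toResume) do
        local ok, err = coroutine.resume(fiber, frame)
        assert(ok, err)
        -- A fiber that yielded again goes back in the queue for the next step.
        if coroutine.status(fiber) == "suspended" then
            table.insert(waiting, fiber)
        end
    end
end

schedule(function(frame)
    table.insert(log, "A started on frame " .. frame)
    frame = coroutine.yield() -- stand-in for task.wait()
    table.insert(log, "A resumed on frame " .. frame)
end)

step(1)
step(2)
step(3) -- nothing left to resume

for _, line in ipairs(log) do
    print(line)
end
-- prints:
-- A started on frame 1
-- A resumed on frame 2
```

The fiber never blocks the frame it yielded in; it just picks up again on the following one.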
Okay, that’s all I got. I’ve been writing this for a bit now and I’m tired. Sorry if there are typos, if I got anything wrong, or if this wasn’t helpful! Have a nice day!
-mewow aka InfiniteYield