Parallel Scheduler - Parallel Luau made easy and performant

Is it okay to yield in a RunService event? I always avoid doing that, under the assumption that it would be bad.

You can have multiple Schedulers loaded, each returning a ModuleTable, running work at the same time, correct? And the Scheduler will divide up the work between Actors for the various modules accordingly?

Also, is calling the ModuleTable:Work() method while it's already running something that you think can happen easily and needs to be checked for, or not?


The yield happens for a fraction of a frame (if the function from the module doesn't yield itself).
I don't think yielding in a RunService event does anything? The connected function runs whenever RunService.Heartbeat is fired.

Yes. However, every loaded module has its own set of actors and will not share them with other loaded modules. If you call ModuleTable:Work() at the same time (for example, from Heartbeat) for different ModuleTables, the tasks should run in the same parallel phase.

ModuleTable:Work() yields, so as long as you don't bypass the yield, that won't happen. And if the function from the module doesn't yield, the work will be completed by the next serial phase, so it's very unlikely that you run into such an issue.
It could happen if your code errors, but at that point you won't ever receive the results, so... yeah. If it does happen, there will be a warning in the output.
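
For reference, a minimal sketch of that Heartbeat usage (the paths, module names and the work itself are placeholder assumptions; the LoadModule/ScheduleWork/Work calls follow the API described in this thread):

local RunService = game:GetService("RunService")
local ParallelScheduler = require(game.ReplicatedStorage.ParallelScheduler) -- path is an assumption

-- Two independently loaded modules; each gets its own set of actors
local NoiseTable = ParallelScheduler:LoadModule(script.NoiseModule) -- placeholder module
local PathTable = ParallelScheduler:LoadModule(script.PathModule) -- placeholder module

RunService.Heartbeat:Connect(function()
	for i = 1, 100 do
		NoiseTable:ScheduleWork(i)
		PathTable:ScheduleWork(i)
	end

	-- :Work() yields, so call each one from its own thread; both batches
	-- should then end up in the same parallel phase
	task.spawn(function()
		local noiseResults = NoiseTable:Work()
		print(#noiseResults, "noise results")
	end)
	task.spawn(function()
		local pathResults = PathTable:Work()
		print(#pathResults, "path results")
	end)
end)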


I really appreciate how responsive you’re being. Thank you.


How many actors per module? A total of 24 or something? Two different modules running work would run in the same parallel phase, but on different threads, since they have their own actors... right?


Yeah, the default is 24. In the module there's a DEFAULT_MAX_WORKERS variable at the beginning, currently set to 24 for the client and 48 for the server.
[screenshot]

They do run on their own threads, but Roblox then distributes those threads onto your CPU threads (so, say you have 24 actors doing work on a 4-thread CPU: Roblox will distribute the work of those 24 actors across the 4 CPU threads). This is why lowering that value can be beneficial if you have multiple modules doing work in parallel: it reduces overhead.

I don't know exactly how Roblox's serial/parallel system works, but there can be multiple parallel and serial phases during a frame. I would guess that after a serial phase, it starts a parallel phase with whatever was told to run in parallel, then goes back to serial, then back to parallel if needed...
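
For illustration, a minimal sketch of how a script parented to an Actor can hop between phases with the standard task.desynchronize/task.synchronize API (this is plain Roblox parallel Luau, not something specific to this module):

-- Script parented to an Actor
local RunService = game:GetService("RunService")

RunService.Heartbeat:ConnectParallel(function()
	-- Parallel phase: reading is generally safe, writing to instances is not
	local total = 0
	for i = 1, 100000 do
		total += math.noise(i / 100)
	end

	task.synchronize()
	-- Serial phase: writing to the DataModel is allowed again
	workspace:SetAttribute("NoiseTotal", total)

	task.desynchronize()
	-- Back in a parallel phase for any further parallel-safe work
end)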


I am 90% sure this is due to this update: Deferred Engine Events: Rollout Update. Now, if you schedule a bunch of work using :ScheduleWork() and then immediately use :Work(), this module breaks. I think this is because the actors are created on the fly, but they don't get their :ConnectParallel() connections until the frame after, so the first :ScheduleWork() breaks since the event connection is just nonexistent. Every :ScheduleWork() after the first broken one works, however, since it reuses the actors from the previous attempt, which have existed for longer than a frame and do have their :ConnectParallel() set. Adding a task.wait() also fixes it.

I edited the module script myself to make it generate all the actors as soon as it is require()d (if you require and instantly ScheduleWork it might still break, but in my use case that never happens, so I don't care).

This is one of the weirdest bugs I've ever seen. Every fourth time you use this function, it yields, no matter whether you unroll the loop or not. Really strange. It has something to do with ResultEvent getting its :Wait() interrupted one frame late, but why, I have no idea.

Edit: if you yield the script yourself using something like task.wait(), the script no longer yields, suggesting that there is a limit to the number of times you can wait on the bindable (in a single frame) before it forces the script to yield. Very strange indeed. Maybe someone could get around this by using multiple bindables, but for me this doesn't matter because I'm not going to be using this more than once a frame; it just so happens that when I was testing performance, I was calling the parallel scheduler multiple times per frame, causing this chain of events (pun unintended).
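
Roughly, the call pattern that triggered it looked like this (the module path, task counts and loop sizes are placeholders):

local ParallelScheduler = require(game.ReplicatedStorage.ParallelScheduler) -- path is an assumption
local ModuleTable = ParallelScheduler:LoadModule(script.BenchmarkModule) -- placeholder module

-- Several ScheduleWork/Work round-trips within a single frame
for attempt = 1, 8 do
	for i = 1, 50 do
		ModuleTable:ScheduleWork(i)
	end
	local results = ModuleTable:Work() -- roughly every fourth call, this yielded until the next frame
end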

That is definitely odd; I'll have to look into it.

I GOT A FIX :D

I really did not expect the fix to be this easy. The scripts not initializing before :Work() was one issue, but then, for some reason, the shared table with the results was empty (something I didn't get to investigate before I found the easy fix).

It was indeed due to Deferred mode... I think deferred mode also makes the initialization of scripts deferred, meaning the scripts don't initialize in time, but only after the thread (the one that called :ScheduleWork()) finishes.

As you said. However, it breaks because the Work event is fired before the scripts connect to it with :ConnectParallel(). They connect well before the next frame and actually start working right after, so using task.defer(coroutine.running()) fixes it; there's no need to wait a full frame (though then the shared table for the results is empty...).

So, the fix:

Deferring WorkEvent:Fire(), like this:


(This is the equivalent of task.defer(function() WorkEvent:Fire() end))
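
In other words, something along these lines inside the module (the exact arguments passed to :Fire() may differ from what the screenshot shows):

-- Before: fired immediately, before the deferred actor scripts had connected
-- WorkEvent:Fire(...)

-- After: defer the fire so it runs after the scripts' :ConnectParallel() connections exist
task.defer(WorkEvent.Fire, WorkEvent)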

This seems to fix everything, and the results are in the shared table. (I thought maybe the issue there was that I put task.defer(coroutine.running()) before the place where the shared table is cleared, but putting it after doesn't work either...)

Thank you for pointing this out. I'm somewhat surprised that I didn't notice this before; it seems like none of the games I've used this in are in deferred mode.

The test place and the model have been updated


I'm too lazy to test that, but can you tell me if you are still running into that weird bug you were having (the yield every fourth time)?
It might have to do with task.synchronize and task.desynchronize only being usable 4 times per frame, I don't know. I thought there was no limit on that.


Hello, great module you've made!

I'm curious if you could give an example of using it on the server, as opposed to just the client.

I'm not quite proficient enough with Lua to interpret this as easily as I'd like to, and I'm currently trying to implement parallel Luau into my game, which has a pretty heavy physics load. I'd like to transfer this load to multiple threads/workers for better performance, but I'm at a roadblock in the implementation.

Thanks in advance!

It works the same way on the server.
What is the "physics load" in question? If you are talking about the Roblox physics engine, you cannot make that parallel; Roblox would have to do it themselves by adding something like a :SetThreadOwnership(), kind of like :SetNetworkOwner().

Anyway, here is an example where I use it on the server, to deserialize a table that can get pretty big:

local GameDatabase = Games[Task.TaskIndex]

if not GameDatabase then 
	local Data = DatastoreFunctions:GetGamesData(Task.TaskIndex) or tostring(GamesSerializer:GetLastestVersion())
	Data = string.split(Data,"_")

	local _version = tonumber(table.remove(Data,1))
	local DataLenght = GamesSerializer:GetDataLenghtFromVersion(_version)

	local GamesAmount = table.maxn(Data)/DataLenght
	local Index = 0

	-- Split the flat data into one chunk per server thread (actor),
	-- so each actor deserializes roughly an equal share of the games
	for i = 0, Settings.ServerThreads - 1 do
		local Tasks = math.floor(GamesAmount/(Settings.ServerThreads-i))
		GamesAmount -= Tasks

		local a = Index + 1
		local b = Index + Tasks*DataLenght

		Index += Tasks*DataLenght

		SerDeserModules.Games.Deser:ScheduleWork(table.concat(Data,"_",a,b),_version)
	end

	local FinalData = {}

	Data = SerDeserModules.Games.Deser:Work() -- yields until every scheduled chunk is deserialized
	for i, v in ipairs(Data) do
		-- Append each actor's result array onto the end of FinalData
		table.move(v,1,table.maxn(v),table.maxn(FinalData)+1,FinalData)
	end
	Data = nil -- TODO -- what the hell is happening here, why is this needed

	local DataSize = table.maxn(FinalData)

	Games[Task.TaskIndex] = {
		Data = FinalData,
		DataSize = DataSize,
		Index = math.random(1,math.max(DataSize,1)),
	} 
	GameDatabase = Games[Task.TaskIndex]
end

Here is the Deser module

local Settings = require(game.ReplicatedStorage.Settings)
local TasksPerThreads = Settings.SponsorsPerDatastore/Settings.ServerThreads
if TasksPerThreads ~= math.round(TasksPerThreads) then error("Invalid settings, cannot spread tasks evenly") end

local Serializer = require(game.ServerScriptService.ServerTasks.TaskScript.Sponsors.SponsorsSerializer)

return function(String, _version, TaskIndex : number)
	return Serializer:DecompressArray(String, _version)
end

The DecompressArray method decompresses multiple elements at once. I wrote my code to specifically schedule one piece of work per available thread (i.e. the number of actors, which is set in the module's settings), instead of doing ScheduleWork for every element separately and having the ParallelScheduler merge them. It's better performance-wise to do it like this, though it does complicate things a bit.


It is much simpler when not merging tasks together, though if you are getting into the territory of maybe 300-500 smallish tasks or more, you should merge them
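
For comparison, the simpler non-merged pattern looks roughly like this (DeserTable, chunks and version are placeholders; one ScheduleWork per element and a single Work call for the whole batch):

for _, chunk in ipairs(chunks) do
	DeserTable:ScheduleWork(chunk, version) -- one task per element
end

-- Yields until every task is done; the scheduler spreads them over its actors
local results = DeserTable:Work()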

Much appreciated!

Unfortunately, looking deeper into my MicroProfiler, most of my lag is from default Roblox physics, and only about 4-6 ms is from my computations.
I've confirmed the MicroProfiler physics report looks exactly the same in other physics-based games.

Any idea why Roblox is such an unperformant platform?
Looking at other platforms, doing something as simple as what I'm doing would never cause any lag, but on Roblox everything screams at the mere sight of physics and parts needing to interact with the world around them.


It seems like most of it is coming from aerodynamics; try disabling that to see if things improve. I don't know why Roblox is so unperformant; even reading, and especially modifying, properties of parts is quite slow. It can't be the C/Lua boundary for physics, as physics is fully written in C++, I'm pretty sure.


When I use LoadModule, it requires two parameters, self and then the ModuleScript? What does this mean? FYI, I'm calling the scheduler inside a ModuleScript to call another script that uses RunService.Heartbeat. I'm inexperienced with parallel Luau, sorry! (PS: this is in ServerScriptService)

self is syntactic sugar in Lua when using the : notation:

local Table = {}

function Table:Method()
	print(self) --> The contents of Table will be printed
end

Table:Method()

self is a hidden first argument in this case. We can see that by using the . notation instead:

local Table = {}

function Table.Method(self)
	print(self)
end

Table:Method() -- Table is passed implicitly when using : notation, it's the same as doing Table.Method(Table)
Table.Method(Table)

To use LoadModule, all you have to do is this (if the module script is a child of the script). If you use the : notation, you don't have to pass self. You can rely on the autocomplete to figure out what you have to pass to each function. Every function uses the : notation.


To make a function run with the Parallel Scheduler, you need to have a module (the one you pass to LoadModule) return a function, as shown in this figure (at the bottom, where it says Module Script).
The script calling the ParallelScheduler can be a LocalScript or a ServerScript; it doesn't matter.
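A minimal sketch of that setup (the ReplicatedStorage path and the doubling work are placeholders):

-- ModuleScript, a child of the script below: it just returns the function to run in parallel
return function(value: number)
	return value * 2 -- placeholder work
end

-- LocalScript or Script:
local ParallelScheduler = require(game.ReplicatedStorage.ParallelScheduler) -- path is an assumption
local ModuleTable = ParallelScheduler:LoadModule(script.ModuleScript) -- : notation, so no explicit self

ModuleTable:ScheduleWork(1)
ModuleTable:ScheduleWork(2)

print(ModuleTable:Work()) -- yields until the parallel work is done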

Great Module! Running into one issue with it, though. I’m getting this error:
[screenshot]

Which is linked to this line right here:
[screenshot]

Based on some debugging I attempted, it seems that after RemainingTasks hits 2 (red circle), something isn't cleared and the script still assumes there are 2 tasks and tries to assign them (blue circle). The WorkParameters for that WorkerId doesn't exist though, so it just errors.

This error will continue to pop up for subsequent :Work() requests.

Everything still works as intended though, so I can kind of just ignore it.


Well that’s dumb lol. I never considered the case where there are more actors than tasks to run XD

This happens when you schedule and run more tasks (for example 4), then fewer tasks (still lower than DEFAULT_MAX_WORKERS, e.g. 2): 4 actors were created previously, but only 2 have tasks to run, causing the error.

[screenshot]
The fix is simply to return if there are no params (i.e. no tasks) for the actor. This is effectively what was (implicitly) happening when you encountered the error, which explains why the module kept working as expected.
There might be another, more performant fix that prevents the excess actors from running in the first place; I might look into it some day.
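
Conceptually, the guard looks something like this inside the actor's handler (WorkerId and WorkParameters follow the names mentioned above; the module's actual internals may differ):

local Params = WorkParameters[WorkerId]
if Params == nil then
	return -- more actors than scheduled tasks: nothing for this worker to do
end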

This is also why I didn't bother to test this fix very thoroughly (I'm lazy), so let me know if there are other errors.

The Roblox model and the place have been updated with this fix.

Alrighty, thanks for the fix! I’ll keep you updated
Awesome work man :smile:


Responding to your question in the other thread: the reason this module did not work for me is the yielding. I'm likely using the module wrong, but when I'm trying to calculate CFrame data for hundreds of voxels at a time and I'm constantly scheduling work through a loop, it ends up looking like this:

And for reference, this is what it looks like without the Parallel Scheduler:


Can you show the code that uses Parallel Scheduler?

Make sure you are scheduling all the tasks before running :Work(). Though it seems like it is yielding to the next frame (and not freezing), which is... odd.
(Could it be that there is a maximum number of parallel phases in a frame?)

Send the code that uses :ScheduleWork() and :Work(), as well as the function inside the module that you pass to :LoadModule().
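
For reference, the intended pattern is to schedule the whole batch first and call :Work() once per frame (voxels and ModuleTable are placeholders), rather than doing a ScheduleWork/Work round-trip per element, which yields on every iteration:

for _, voxel in ipairs(voxels) do
	ModuleTable:ScheduleWork(voxel.Position)
end
local results = ModuleTable:Work() -- one yield for the entire batch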