Parallel Scheduler - Parallel lua made easy and performant

It seems like most of it is coming from aerodynamics; try disabling that to see if things improve. I don’t know why Roblox is so unperformant: even reading, and especially modifying, properties of parts is quite slow. It can’t be the C/Lua boundary for physics, as physics is fully written in C++, I’m pretty sure.


When I use LoadModule, it requires two parameters, self and then the ModuleScript? What does this mean? FYI, I’m calling the scheduler inside a ModuleScript to call another script that uses RunService.Heartbeat. I’m inexperienced with Parallel Luau, sorry! (PS: this is in ServerScriptService.)

self is syntactic sugar in lua when using the : notation

local Table = {}

function Table:Method()
	print(self) --> The contents of Table will be printed
end

Table:Method()

self is a hidden first argument in this case. We can see that by using . notation instead

local Table = {}

function Table.Method(self)
	print(self)
end

Table:Method() -- Table is passed implicitly when using : notation, it's the same as doing Table.Method(Table)
Table.Method(Table)

To use LoadModule, all you have to do is call it with the : notation and pass the ModuleScript (for example script.ModuleScript, if the module script is a child of script). If you use the : notation, you don’t have to pass self. You can rely on the autocomplete to figure out what you have to pass to the functions. Every function uses the : notation.


To make a function run with the Parallel Scheduler, you need to have a module (the one you pass to LoadModule) return a function, like shown in this figure (at the bottom, where it says Module Script)
The script calling the ParallelScheduler can be a LocalScript, or a ServerScript, doesn’t matter
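
A minimal sketch of that setup, using only calls that appear elsewhere in this thread (LoadModule, ScheduleWork, Work). The location of ParallelScheduler, the child named Adder, and the a + b task are made up for illustration, and the trailing TaskIndex argument is assumed from the module functions posted further down.

```lua
-- Script (a LocalScript or a server Script). Assumes ParallelScheduler and a
-- ModuleScript named "Adder" (hypothetical) are both children of this script.
local ParallelScheduler = require(script.ParallelScheduler)
local Adder = ParallelScheduler:LoadModule(script.Adder) -- ":" passes self for you

-- Schedule every task first...
Adder:ScheduleWork(1, 2)
Adder:ScheduleWork(3, 4)

-- ...then run them all at once and collect the results in scheduling order.
local results = Adder:Work()
print(results[1], results[2])

-- ModuleScript "Adder": it must return the function to run in parallel.
-- return function(a, b, TaskIndex)
-- 	return a + b
-- end
```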

Great Module! Running into one issue with it, though. I’m getting this error:
image

Which is linked to this line right here
image

Based off some debugging I attempted, it seems that after the RemainingTasks hits 2 (red circle), something isn’t cleared and the script still assumes there are 2 tasks and tries to assign them (blue circle). The WorkParmeters for that WorkerId doesn’t exist though so it just errors.

This error will continue to pop up for subsequent :Work() requests.

Everything still will work as intended though so I can kinda just ignore it.


Well that’s dumb lol. I never considered the case where there are more actors than tasks to run XD

This happens when you schedule and run some tasks (for example 4), and then schedule fewer tasks (fewer than DEFAULT_MAX_WORKERS, e.g. 2). 4 actors were created previously, but only 2 have tasks to run, which causes the error.

image
The fix is simply to return if there are no params (aka no tasks) for the actor. This is what was (implicitly) happening when you encountered the error, which explains why the module kept working as expected
There might be another, more performant, fix to prevent the excess actors from running in the first place, I might look into it some day
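
The idea behind the fix can be sketched in plain Lua. This is not the module's actual code (its internals aren't shown in this thread); runWorker, workParameters and taskFunction are hypothetical stand-ins that only illustrate the early return.

```lua
local unpack = table.unpack or unpack -- works in Lua 5.1+, and in Luau

-- Hypothetical stand-in for the per-actor handler.
local function runWorker(workerId, workParameters, taskFunction)
	local params = workParameters[workerId]
	if params == nil then
		return nil -- this actor has no task this round, so do nothing
	end
	return taskFunction(unpack(params))
end

-- 4 actors exist from an earlier run, but only 2 tasks are scheduled now.
local workParameters = { { 1, 2 }, { 3, 4 } }
local function add(a, b)
	return a + b
end

local results = {}
for workerId = 1, 4 do
	results[workerId] = runWorker(workerId, workParameters, add)
end
```

Actors 3 and 4 simply return without touching any parameters, instead of erroring on a missing entry.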

This is also why I didn’t bother to really test this fix thoroughly cuz I’m lazy, so let me know if there are other errors

The roblox Model and the Place have been updated with this fix

Alrighty, thanks for the fix! I’ll keep you updated
Awesome work man :smile:


Responding to your question in the other thread, the reason why this module did not work for me is the yielding. I’m likely using the module wrong, but when I’m trying to calculate cframe data for hundreds of voxels at a time, and I’m constantly scheduling work to be done through a loop, it ends up looking like this:

And for reference this is what it looks like without parallel scheduler:


Can you show the code that uses Parallel Scheduler?

Make sure you are scheduling all the tasks before running Work(), though it seems like it is yielding to the next frame (and not freezing), which is, odd…
(Could be that there is a maximum amount of parallel phases in a frame?)

Send the code that uses :ScheduleTask() and :Work(), as well as the function inside the module passed to :LoadModule()

my division algorithm script:

local OctreeDivision = {}


local Types = require(script.Parent:WaitForChild("Types"))
local Settings = require(script.Parent:WaitForChild("Settings"))
local Constructor = require(script.Parent:WaitForChild("PartConstructor"))
local Scheduler = require(script.ParallelScheduler)
local ModTable = Scheduler:LoadModule(script.ModuleScript)

local function partCanSubdivide(part : Part) --Checks if part is rectangular.

	local Threshold = 1.5  -- How much of a difference there can be between the largest axis and the smallest axis 

	local largest = math.max(part.Size.X, part.Size.Y, part.Size.Z) --Largest Axis
	local smallest = math.min(part.Size.X,part.Size.Y, part.Size.Z) -- Smallest Axis

	if smallest == part.Size.X then 
		smallest = math.min(part.Size.Y, part.Size.Z)
	elseif smallest == part.Size.Y then
		smallest = math.min(part.Size.X, part.Size.Z)
	elseif smallest == part.Size.Z then
		smallest = math.min(part.Size.X, part.Size.Y)
	end

	return largest >= Threshold * smallest 
	--Returns true if part is rectangular. 
	--Part is rectangular if the largest axis is at least 1.5x bigger than the smallest axis
end



function IsBoxWithinPart(data:BasePart, part:Types.VoxelInfo)
	local partPosition = part.CFrame.Position
	local partSize = part.Size
	local halfPartSize = partSize / 2

	local dataMin = data.CFrame.Position - (data.Size / 2)
	local dataMax = data.CFrame.Position + (data.Size / 2)

	local partMin = partPosition - halfPartSize
	local partMax = partPosition + halfPartSize

	return dataMin.X <= partMax.X and dataMax.X >= partMin.X and
		dataMin.Y <= partMax.Y and dataMax.Y >= partMin.Y and
		dataMin.Z <= partMax.Z and dataMax.Z >= partMin.Z
end




local function CheckForNewVoxelsInHitbox(Voxels:Types.VoxelInfoTable, hitbox:BasePart) -- Vector3,Instance
	local partsInHitbox:Types.VoxelInfoTable = {}
	for i,v in Voxels do
		if IsBoxWithinPart(hitbox,v) then
			table.insert(partsInHitbox,v)
		end
	end
	return partsInHitbox
end
local function CheckForNewVoxelsNotInHitbox(Voxels:Types.VoxelInfoTable, hitbox:BasePart) -- Vector3,Instance
	local partsInHitbox:Types.VoxelInfoTable = {}
	for i,v in Voxels do
		if not IsBoxWithinPart(hitbox,v) then
			table.insert(partsInHitbox,v)
		end
	end
	return partsInHitbox
end

local function getLargestAxis(part : Part)  --Returns Largest Axis of Part size
	return math.max(part.Size.X, part.Size.Y, part.Size.Z)
end

local function CutPartinHalf(block : Types.VoxelInfo, TimeToReset : number) --Cuts part into two evenly shaped pieces.
	local partTable:Types.VoxelInfoTable = {} --Table of parts to be returned
	local bipolarVectorSet = {} --Offset on where to place halves
		
	
	
	local X = block.Size.X
	local Y = block.Size.Y
	local Z = block.Size.Z

	if getLargestAxis(block) == X then --Changes offset vectors depending on what the largest axis is.
		X /= 2

		bipolarVectorSet = {
			Vector3.new(1,0,0),
			Vector3.new(-1,0,0),
		}

	elseif getLargestAxis(block) == Y then 
		Y/=2

		bipolarVectorSet = {
			Vector3.new(0,1,0),
			Vector3.new(0,-1,0),
		}

	elseif getLargestAxis(block) == Z then
		Z/=2

		bipolarVectorSet = {
			Vector3.new(0,0,1),
			Vector3.new(0,0,-1),
		}

	end




	local halfSize = Vector3.new(X,Y,Z)

	for _, offsetVector in pairs(bipolarVectorSet) do
		ModTable:ScheduleWork(block.CFrame,halfSize,offsetVector)
		local info:Types.VoxelInfo = {
			Size = halfSize,
			CFrame = ModTable:Work()[1],
			Parent = block.Parent,
			CanCollide = block.CanCollide,
			Transparency = block.Transparency,
			Material = block.Material,
			Anchored = block.Anchored,
			Color = block.Color,
			AlreadyExistsInWorkspace = false,
			AlreadyDivided = false,
			ResetTime = TimeToReset,
			OriginalPart = block.OriginalPart
		}
		print(info)
		table.insert(partTable,info)
	end

	return partTable -- Returns a table containing the two halves
end

local function DivideBlock(voxInfo : Types.VoxelInfoTable, MinimumVoxelSize : number, TimeToReset : number, Hitbox:BasePart) --Divides part into evenly shaped cubes.
	--MinimumvVoxelSize Parameter is used to describe the minimum possible size that the parts can be divided. To avoid confusion, this is not the size that the parts will be divided into, but rather the minimum allowed
	--You CANNOT change the size of the resulting parts. They are dependent on the size of the original part.

	--if Hitbox == nil then
	--	return voxInfo	
	--end
	


	
	local partTable:Types.VoxelInfoTable = {} -- Table of parts to be returned
	local minimum = MinimumVoxelSize or Settings.DefaultMinimumVoxelSize
	
	local inHitbox
	local NotInHitbox
	
	if Hitbox then
		inHitbox = CheckForNewVoxelsInHitbox(voxInfo,Hitbox)
		NotInHitbox = CheckForNewVoxelsNotInHitbox(voxInfo,Hitbox)
	else
		inHitbox = voxInfo
		--NotInHitbox = voxInfo
	end
	
	 

	
	for i,block in inHitbox do
		if (block.Size.X > minimum or block.Size.Y > minimum or block.Size.Z > minimum) then
			if partCanSubdivide(block) then --If part is rectangular then it is cut in half, otherwise it is divided into cubes.
				partTable = CutPartinHalf(block,TimeToReset)
			else



				local Threshold = 1.5  -- How much of a difference there can be between the largest axis and the smallest axis 

				local largest = math.max(block.Size.X, block.Size.Y, block.Size.Z) --Largest Axis
				local smallest = math.min(block.Size.X,block.Size.Y, block.Size.Z) -- Smallest Axis




				if smallest == block.Size.Y and smallest * Threshold <= largest then







					local bipolarVectorSet = {}

					local X = block.Size.X
					local Y = block.Size.Y
					local Z = block.Size.Z

					X /= 2
					Z /= 2
					bipolarVectorSet = { --Offset Vectors
						Vector3.new(-1,0,1),
						Vector3.new(1,0,-1),
						Vector3.new(1,0,1),
						Vector3.new(-1,0,-1),

					}


					local halfSize = Vector3.new(X,Y,Z)


					for _, offsetVector in pairs(bipolarVectorSet) do
						ModTable:ScheduleWork(block.CFrame,halfSize,offsetVector)
						local info:Types.VoxelInfo = {
							Size = halfSize,
							CFrame = ModTable:Work()[1],
							Parent = block.Parent,
							CanCollide = block.CanCollide,
							Transparency = block.Transparency,
							Material = block.Material,
							Anchored = block.Anchored,
							Color = block.Color,
							AlreadyExistsInWorkspace = false,
							AlreadyDivided = false,
							ResetTime = TimeToReset,
							OriginalPart = block.OriginalPart
						}
						print(info)
						table.insert(partTable,info)
					end



				elseif smallest == block.Size.X and smallest * Threshold <= largest then







					local bipolarVectorSet = {}

					local X = block.Size.X
					local Y = block.Size.Y
					local Z = block.Size.Z

					Y /= 2
					Z /= 2
					bipolarVectorSet = { --Offset Vectors
						Vector3.new(0,-1,1),
						Vector3.new(0,1,1),
						Vector3.new(0,-1,-1),
						Vector3.new(0,1,-1),

					}


					local halfSize = Vector3.new(X,Y,Z)


					for _, offsetVector in pairs(bipolarVectorSet) do
						ModTable:ScheduleWork(block.CFrame,halfSize,offsetVector)
						local info:Types.VoxelInfo = {
							Size = halfSize,
							CFrame = ModTable:Work()[1],
							Parent = block.Parent,
							CanCollide = block.CanCollide,
							Transparency = block.Transparency,
							Material = block.Material,
							Anchored = block.Anchored,
							Color = block.Color,
							AlreadyExistsInWorkspace = false,
							AlreadyDivided = false,
							ResetTime = TimeToReset,
							OriginalPart = block.OriginalPart
						}
						print(info)
						table.insert(partTable,info)
					end
				elseif smallest == block.Size.Z and smallest * Threshold <= largest then








					local bipolarVectorSet = {}

					local X = block.Size.X
					local Y = block.Size.Y
					local Z = block.Size.Z

					X /= 2
					Y /= 2
					bipolarVectorSet = { --Offset Vectors
						Vector3.new(1,-1,0),
						Vector3.new(1,1,0),
						Vector3.new(-1,-1,0),
						Vector3.new(-1,1,0),

					}


					local halfSize = Vector3.new(X,Y,Z)



					for _, offsetVector in pairs(bipolarVectorSet) do
						ModTable:ScheduleWork(block.CFrame,halfSize,offsetVector)
						local info:Types.VoxelInfo = {
							Size = halfSize,
							CFrame = ModTable:Work()[1],
							Parent = block.Parent,
							CanCollide = block.CanCollide,
							Transparency = block.Transparency,
							Material = block.Material,
							Anchored = block.Anchored,
							Color = block.Color,
							AlreadyExistsInWorkspace = false,
							AlreadyDivided = false,
							ResetTime = TimeToReset,
							OriginalPart = block.OriginalPart
						}
						print(info)
						table.insert(partTable,info)
					end




				else




					local bipolarVectorSet = { --Offset Vectors
						Vector3.new(1,1,1),
						Vector3.new(1,1,-1),
						Vector3.new(1,-1,1),
						Vector3.new(1,-1,-1),
						Vector3.new(-1,1,1),
						Vector3.new(-1,1,-1),
						Vector3.new(-1,-1,1),
						Vector3.new(-1,-1,-1),
					}

					local halfSize = block.Size / 2.0

					for _, offsetVector in pairs(bipolarVectorSet) do
						ModTable:ScheduleWork(block.CFrame,halfSize,offsetVector)
						local info:Types.VoxelInfo = {
							Size = halfSize,
							CFrame = ModTable:Work()[1],
							Parent = block.Parent,
							CanCollide = block.CanCollide,
							Transparency = block.Transparency,
							Material = block.Material,
							Anchored = block.Anchored,
							Color = block.Color,
							AlreadyExistsInWorkspace = false,
							AlreadyDivided = false,
							ResetTime = TimeToReset,
							OriginalPart = block.OriginalPart
						}
						print(info)
						table.insert(partTable,info)
					end



				end
			end
		end
	end


	local check = false
	for i, v in pairs(partTable) do
		if math.floor(v.Size.X) > minimum or math.floor(v.Size.Y) > minimum or math.floor(v.Size.Z) > minimum then
			check = true
		end
	end	
	
	if NotInHitbox then
		for i,v in NotInHitbox do
			table.insert(partTable,v)
		end
	end
	
	if check == true then
		for i,partInfo in pairs(partTable) do
			if partInfo.AlreadyDivided then
				table.remove(partTable,i)
			end
		end
		return DivideBlock(partTable,MinimumVoxelSize,TimeToReset,Hitbox)
	else	
		return partTable --Returns resulting parts
	end
end



function OctreeDivision.DivideBlock(Parts:Types.PartTable,Minimum:number|string,timeToReset:number,Hitbox:BasePart):Types.VoxelInfoTable
	local voxelTable:Types.VoxelInfoTable = {}
	
	local minimum = Minimum or Settings.DefaultMinimumVoxelSize

	for i,v in Parts do
		if v:HasTag(Settings.TagName) then
			v:RemoveTag(Settings.TagName)
			--if Settings.UseCache then
			--	Constructor:ReturnPart(v)
			--else
			--	v:Destroy()
			--end
		end
		
		local temp:Types.VoxelInfo = {
			Parent = v.Parent,
			Size = v.Size,
			CFrame = v.CFrame,
			Material = v.Material,
			CanCollide = v.CanCollide,
			Transparency = v.Transparency,
			Color = v.Color,
			Anchored = v.Anchored,
			AlreadyDivided = true,
			AlreadyExistsInWorkspace = true,
			ResetTime = timeToReset,
			OriginalPart = v,
		}
		table.insert(voxelTable,temp)
	end
	


	return DivideBlock(voxelTable,minimum,timeToReset,Hitbox)
end

return OctreeDivision

And the function passed through LoadModule:

return function(blockCF,halfSize,offsetVector)
	local cf = blockCF + blockCF:VectorToWorldSpace((halfSize / 2.0) * offsetVector)
	print(cf)
	return cf
end

I know it’s kind of hard to read lol

Not too bad when you paste it into studio :P

Here is the issue
image

You are calling :Work() right after Scheduling a task, meaning there is only a single task scheduled to run, and it is thus unable to assign tasks to multiple threads. You need to Schedule multiple tasks, before calling :Work()

I will also point out that the work you are doing (the CFrame calculation) is not very computationally intensive alone, and it’ll only become computationally intensive from the high number of voxels.
However, when doing a single voxel per task, this increases the number of tasks substantially, along with the overhead (sending arguments to different parallel threads is especially costly).
I would suggest coding the function in the module to take a list of voxels and return the calculated CFrames for those voxels (bonus points if you can reduce the amount of data being sent), and having the main script schedule a set number of tasks (maybe 24), dividing all the voxels evenly (or as close to evenly as possible) between those 24 tasks.
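
A rough sketch of the "divide evenly between N tasks" part, in plain Lua. splitIntoBatches is a hypothetical helper, not part of the module, and the numbers 1..50 stand in for whatever per-voxel data you would send.

```lua
-- Splits `items` into at most `batchCount` contiguous batches of near-equal size.
local function splitIntoBatches(items, batchCount)
	local batches = {}
	local total = #items
	local count = math.min(batchCount, total)
	local baseSize = count > 0 and math.floor(total / count) or 0
	local remainder = count > 0 and total % count or 0
	local index = 1
	for b = 1, count do
		-- The first `remainder` batches take one extra item each.
		local size = baseSize + (b <= remainder and 1 or 0)
		local batch = {}
		for _ = 1, size do
			table.insert(batch, items[index])
			index = index + 1
		end
		batches[b] = batch
	end
	return batches
end

local voxels = {}
for i = 1, 50 do
	voxels[i] = i
end
local batches = splitIntoBatches(voxels, 24)
```

Each batch would then go into a single ScheduleWork call, with the module function looping over the list it receives.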

However, it is important to measure the performance (either through the MicroProfiler, or maybe with os.clock(); the Script Performance tab is inaccurate) to figure out whether you actually gain performance by doing this. Parallel Lua on Roblox still has very limited use cases, and often using it takes more time because of the overhead. It is also very possible that other segments of the code (such as moving parts, resizing them, etc.) take up a significant amount of time while the CFrame calculations aren’t very significant, and those cannot be run in parallel or optimized.
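
For the os.clock() route, averaging over many runs is more trustworthy than a single reading. A minimal sketch, where averageTime and sqrtWork are hypothetical names and the square-root loop is just placeholder work:

```lua
-- Runs `fn` `iterations` times and returns the average seconds per call,
-- as reported by os.clock().
local function averageTime(fn, iterations)
	local start = os.clock()
	for _ = 1, iterations do
		fn()
	end
	return (os.clock() - start) / iterations
end

-- Placeholder workload; swap in the code you actually want to measure.
local function sqrtWork()
	local sum = 0
	for i = 1, 10000 do
		sum = sum + math.sqrt(i)
	end
	return sum
end

local average = averageTime(sqrtWork, 100)
print(string.format("average: %.6f s", average))
```

The same helper could wrap both the serial version and the ScheduleWork/Work version so the two numbers are directly comparable.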

Anyway, here is an example on how you could implement it:

	local halfSize = Vector3.new(X,Y,Z)
	
	for _, offsetVector in pairs(bipolarVectorSet) do 
		ModTable:ScheduleWork(block.CFrame,halfSize,offsetVector)
	end
	
	local ResultTable = ModTable:Work()

	for i, offsetVector in pairs(bipolarVectorSet) do
		
		local info:Types.VoxelInfo = {
			Size = halfSize,
			CFrame = ResultTable[i],
			Parent = block.Parent,
			CanCollide = block.CanCollide,
			Transparency = block.Transparency,
			Material = block.Material,
			Anchored = block.Anchored,
			Color = block.Color,
			AlreadyExistsInWorkspace = false,
			AlreadyDivided = false,
			ResetTime = TimeToReset,
			OriginalPart = block.OriginalPart
		}
		print(info)
		table.insert(partTable,info)
	end

Here, the loop runs twice: once to schedule all the tasks, and once to set up the tables when the work is done. The results are returned in the same order they were scheduled, so you can simply get them using the index of the for loop.

(I also noticed that the loop only runs 2-8 times throughout your code, which is not a lot, and for each, only somewhat basic CFrame calculations are done. It is highly likely that the work is not computationally intensive enough to benefit from parallel lua)

Something else to note is that if you batch tasks together, you can send block.CFrame and halfSize only once for the 2-8 tasks, reducing the amount of data being sent. My example doesn’t include this though

Under the “Performance Tips” drop down, you can read more about performance when using this module or parallel lua in general

Hope this helps


I don’t know if I’m using the module incorrectly or what, but it doesn’t work for me.

I’m using your module for my ‘Global Illumination’ module, so it runs faster.

Code for main module:

-- Service(s)
local Lighting = game:GetService("Lighting")
local HTTPService = game:GetService("HttpService")

-- Variable(s)
local Classes = script.Classes
local Methods = script.Methods
local External = script.External
local ParallelFunctions = script.ParallelFunctions

local ParallelScheduler = require(External.ParallelScheduler)

-- Function(s)
local function InstanceChacker(instName, className, parent)
	if parent:FindFirstChild(instName) then
		return parent:FindFirstChild(instName)
	else
		local inst = Instance.new(className)
		inst.Parent = parent
		inst.Name = instName

		return inst
	end
end

-- Main
local globalIllum = {totalGrid = 0}
local PreviousStorage = nil

function globalIllum:GloballyIlluminate(baseBrightness: NumberSequence?, divisions: number?, color: ColorSequence?, castShadow: boolean?)
	baseBrightness = baseBrightness or NumberSequence.new(1)
	divisions = divisions or 60
	castShadow = castShadow or false
	color = color or ColorSequence.new(require(Methods.Color).Mix(Color3.new(1,1,1), Lighting.Ambient))

	local GlobalLights = InstanceChacker("GlobalLights", "Folder", workspace)
	local Storage = InstanceChacker(HTTPService:GenerateGUID(), "Folder", GlobalLights)

	PreviousStorage = Storage

	local workspaceSize = require(Methods.WorkspaceSize).Get()
	local yOffset = workspaceSize.Y
	local yOffsetRaycast = workspaceSize.Y / 8
	local divisionSizeX = workspaceSize.X / divisions
	local divisionSizeZ = workspaceSize.Z / divisions

	local lightDirection = require(Methods.LightDirection).Get()
	local Brightness = Lighting.Brightness
	local totalParts = divisions^2 

	local totalTime = divisions - 1

	require(Classes.GridPart).CastShadow = castShadow

	local gridSize = math.max(divisionSizeX, divisionSizeZ) / 2
	
	local ModuleScript = ParallelScheduler:LoadModule(ParallelFunctions.IllluminationLoop)

	--coroutine.wrap(function()
	--	for i = 0, divisions - 1 do
	--		coroutine.wrap(function()
	--			for j = 0, divisions - 1 do
	--				local interpFactor = (i + j) / (2 * totalTime)

	--				local brightness = require(Methods.Number).Interpolate(baseBrightness, interpFactor, totalParts)
	--				local gridColor = require(Methods.Color).Interpolate(color, interpFactor)

	--				require(Classes.GridPart).Brightness = brightness
	--				require(Classes.GridPart).BaseColor = gridColor

	--				require(Classes.GridPart).new(i, j, gridColor, castShadow, brightness, workspaceSize, Vector3.new(divisionSizeX, 0, divisionSizeZ), yOffset, gridSize, lightDirection, yOffsetRaycast, Storage)
	--			end
	--		end)()
	--	end
	--end)()
	
	ModuleScript:ScheduleWork(divisions, totalTime, totalParts, baseBrightness, color, castShadow, workspaceSize, divisionSizeX, divisionSizeZ, yOffset, gridSize, lightDirection, yOffsetRaycast, PreviousStorage)
	ModuleScript:Work()

	globalIllum.totalGrid = totalParts
end

function globalIllum:ManipulateGrid(gridX: number, gridY: number, color: Color3?, brightness: number?, castShadow: boolean?)
	local Grid = PreviousStorage:FindFirstChild(tostring("X_" .. gridX .. "-Y_" .. gridY))

	if Grid then
		if color or brightness or castShadow then
			Grid.PointLight.Color = color or Grid.PointLight.Color
			Grid.PointLight.Brightness = brightness or Grid.PointLight.Brightness
			Grid.PointLight.Shadows = castShadow or Grid.PointLight.Shadows
		end
	else
		warn("Invalid grid position.")
	end
end

return globalIllum

Loop module:

return function(divisions, totalTime, totalParts, baseBrightness, color, castShadow, workspaceSize, divisionSizeX, divisionSizeZ, yOffset, gridSize, lightDirection, yOffsetRaycast, Storage, TaskIndex)
	local Methods = script.Parent.Parent.Methods
	local Classes = script.Parent.Parent.Classes
	
	for i = 0, divisions - 1 do
		for j = 0, divisions - 1 do
			local interpFactor = (i + j) / (2 * totalTime)

			local brightness = require(Methods.Number).Interpolate(baseBrightness, interpFactor, totalParts)
			local gridColor = require(Methods.Color).Interpolate(color, interpFactor)

			require(Classes.GridPart).Brightness = brightness
			require(Classes.GridPart).BaseColor = gridColor

			require(Classes.GridPart).new(i, j, gridColor, castShadow, brightness, workspaceSize, Vector3.new(divisionSizeX, 0, divisionSizeZ), yOffset, gridSize, lightDirection, yOffsetRaycast, Storage)
		end
	end
	
	return
end

Error in question:
image

There are a couple of problems in your code

First of all, you should only use LoadModule once, so instead of using it inside of your function, put it at the top, perhaps right under the line where you require ParallelScheduler

At the bottom of your GloballyIlluminate function, you use ModuleScript:ScheduleWork() and ModuleScript:Work() right after one another, meaning only 1 thread will be working (for the 1 task scheduled), completely undermining the benefits of running it in parallel. ModuleScript:ScheduleWork() should be called multiple times before ModuleScript:Work() is called.

These issues aren’t what is causing the error, though. The error is simply caused by the fact that SharedTables (which are used to send the arguments to the different Lua VMs) don’t accept instances.

https://create.roblox.com/docs/en-us/reference/engine/datatypes/SharedTable

Shared tables (and other means of sending data across Lua VMs) are slow. You are currently sending a lot of arguments to your loop, so I would recommend reducing the number of arguments as much as possible.
I also see in your Loop Module that you seem to change the Brightness and BaseColor properties of instances? That function will be run in parallel, which comes with many restrictions, such as not being able to modify the properties of instances. You can either use task.synchronize() at the end of the Loop Module to exit the parallel phase, or send values back to the main script and apply the changes there.
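
As a rough illustration of the "send back values" option, the parallel function could take only plain numbers and return plain numbers, one task per grid row. rowTask is a hypothetical name, and this sketch assumes TaskIndex starts at 1 and follows scheduling order; the serial loop at the bottom only stands in for scheduling the tasks and calling Work().

```lua
-- Hypothetical parallel task: one task per grid row, plain numbers in and out.
local function rowTask(divisions, totalTime, TaskIndex)
	local i = TaskIndex - 1 -- rows are 0-based in the original loop
	local factors = {}
	for j = 0, divisions - 1 do
		factors[j + 1] = (i + j) / (2 * totalTime)
	end
	return factors
end

-- Serial stand-in for scheduling `divisions` tasks and collecting the results.
local divisions = 3
local totalTime = divisions - 1
local rows = {}
for taskIndex = 1, divisions do
	rows[taskIndex] = rowTask(divisions, totalTime, taskIndex)
end
```

The main script would then loop over the returned factors and create or update the GridPart instances itself, outside the parallel phase.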

Another note is that you probably can require Classes.GridPart, Methods.Number and Methods.Color at the top of the Loop Module, outside the returned function, allowing you to require them only once

Also make sure that using parallel scheduler actually improves performance. Parallel lua has a lot of overhead, and it can make code slower instead of faster if the work performed in parallel wasn’t all that slow. If you have a high number of small (fast) tasks, group them into bigger (slower) tasks

May I know what you found with your bindable testing? I just made a parallelism module that uses bindables to send packets of data and put them together into one big table, and it performed better than ComputeLua, which is what I usually use for Parallel Luau. I’m sure you’re much more experienced in this field than I am, but it seems like sending data through bindables performs faster? I haven’t tested your module against mine yet, but is there a reliable way to stress test my module? Currently I just do square roots.

Roughly benchmarked your module, turns out yours was ~50ms faster; I was just doing plain os.clock() benchmarks instead of averaging attempts out, but now my average of 100 iterations is 20ms while yours is a solid 26ms (doing sqrt)


It’s been a while since I’ve tested different alternatives and I didn’t note down the results. It’s interesting that you’ve found bindables to be faster; I’ll have to look into it and probably do the testing again, more thoroughly this time. Don’t really have time right now though :/

My testing wasn’t very thorough so it is possible I missed something. Could also be that performance changed over time

I don’t remember why I used shared tables instead of bindables. It would be easy to strictly use bindables to pass the arguments, and I’m pretty sure this is what I was doing initially, but I assume performance made me switch to shared tables


The reason why I opted out of using sharedtables is because just fetching it consumed 6% of the total time, and when I tried the bindables - it’s just around 0.7%


I’m trying to use this for Quasiduck’s lightning module, and I’m still new to Parallel Luau so I don’t really know much about it, but I’m trying to make it use this module and it keeps erroring with:
"ReplicatedStorage.ParallelScheduler:155: ModuleTable:Work() was called before the previous tasks were completed. This can cause errors or wrong results. If this was caused because of an error in the ModuleScript, you can ignore it"

The ModuleScript doesn’t error; it’s the :Work() not finishing in time. Here’s my code (it’s in a Heartbeat loop):

local p0Data = {
				p0 = a0.WorldPosition,
				p1 = a0.WorldPosition + a0.WorldAxis * CurveSize0,
				p2 = a1.WorldPosition - a1.WorldAxis * CurveSize1,
				p3 = a1.WorldPosition
			}


			--Initialise iterative scheme for generating points along space curve
			local init = SpaceCurveFunction(0, p0, p1, p2, p3)
			local PrevPoint, bezier0 = init, init
			
			if TimePassed + 0.1 < Lifetime then
				for i = 1, PartsN do
				local PercentAlongBolt = i / PartsN

				Actor:ScheduleWork(PercentAlongBolt, i, PartsN, spd, TimePassed, freq, RanNum, MinRadius, MaxRadius, MinThick, MaxThick, p0Data, bezier0);
				end
			end
			
			local ResultTbl = Actor:Work();

			--Update
			if TimePassed + 0.1 < Lifetime then
				for i = 1, PartsN do
					local BPart = Parts[i]
					local PercentAlongBolt = i / PartsN

					local Results = ResultTbl[i];

					local input, input2, noise0, noise1, thicknessNoise, bezier1, NextPoint = unpack(Results);

					ThisBranch:_UpdateGeometry(BPart, PercentAlongBolt, TimePassed, thicknessNoise, PrevPoint, NextPoint)

					ThisBranch:_UpdateColor(BPart, PercentAlongBolt, TimePassed)

					PrevPoint, bezier0 = NextPoint, bezier1
				end
			else
				ThisBranch:Destroy()
			end

ModuleScript:

local function CubicBezier(PercentAlongBolt, p0, p1, p2, p3)
	return p0 * (1 - PercentAlongBolt) ^ 3
		+ p1 * 3 * PercentAlongBolt * (1 - PercentAlongBolt) ^ 2
		+ p2 * 3 * (1 - PercentAlongBolt) * PercentAlongBolt ^ 2
		+ p3 * PercentAlongBolt ^ 3
end

local function DiscretePulse(PercentAlongBolt, TimePassed, s, k, f, min, max) --See https://www.desmos.com/calculator/hg5h4fpfim for demonstration
	return math.clamp(k / (2 * f) - math.abs((PercentAlongBolt - TimePassed * s + 0.5 * k) / f), min, max)
end

local function ExtrudeCenter(PercentAlongBolt)
	return math.exp(-5000 * (PercentAlongBolt - 0.5) ^ 10)
end

local function NoiseBetween(x, y, z, min, max)
	return min + (max - min) * (math.noise(x, y, z) + 0.5)
end

local offsetAngle = math.cos(math.rad(90))

return function(PercentAlongBolt, i, PartsN, spd, TimePassed, freq, RanNum, MinRadius, MaxRadius, MinThick, MaxThick, p0Data, bezier0, TaskIndex)
	
	TaskIndex -= 1;
	local p0, p1, p2, p3 = p0Data.p0, p0Data.p1, p0Data.p2, p0Data.p3
	local input, input2 = (spd * -TimePassed) + freq * 10 * PercentAlongBolt - 0.2 + RanNum * 4, 5 * ((spd * 0.01 * -TimePassed) / 10 + freq * PercentAlongBolt) + RanNum * 4
	local noise0 = NoiseBetween(5 * input, 1.5, 5 * 0.2 * input2, 0, 0.1 * 2 * math.pi) + NoiseBetween(0.5 * input, 1.5, 0.5 * 0.2 * input2, 0, 0.9 * 2 * math.pi)
	local noise1 = NoiseBetween(3.4, input2, input, MinRadius, MaxRadius) * ExtrudeCenter(PercentAlongBolt)
	local thicknessNoise = NoiseBetween(2.3, input2, input, MinThick, MaxThick)

	--Find next point along space curve
	local bezier1 = CubicBezier(PercentAlongBolt, p0, p1, p2, p3)

	--Find next point along bolt
	local NextPoint = i ~= PartsN
		and (CFrame.new(bezier0, bezier1) * CFrame.Angles(0, 0, noise0) * CFrame.Angles(
			math.acos(math.clamp(NoiseBetween(input2, input, 2.7, offsetAngle, 1), -1, 1)),
			0,
			0
			) * CFrame.new(0, 0, -noise1)).Position
		or bezier1
	
	return {input, input2, noise0, noise1, thicknessNoise, bezier1, NextPoint}
end

I am pretty sure what is happening is here,

if TimePassed + 0.1 < Lifetime then
	for i = 1, PartsN do
		local PercentAlongBolt = i / PartsN

		Actor:ScheduleWork(PercentAlongBolt, i, PartsN, spd, TimePassed, freq, RanNum, MinRadius, MaxRadius, MinThick, MaxThick, p0Data, bezier0);
	end
end

local ResultTbl = Actor:Work();

If the if statement above is false, no tasks are scheduled. This isn’t something I considered: my module fires the actors to make them work, but since no tasks were scheduled, no worker becomes active and none of them ever tells the main module that the tasks are done. Actor:Work() therefore yields forever, and on the next frame you get the warning.

Putting local ResultTbl = Actor:Work(); inside the if statement (and combining the two if statements) would probably fix the issue. I should also fix my module for this edge case, perhaps a descriptive warning for this specific case
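
A sketch of that restructuring, runnable outside Roblox. FakeActor is a made-up stand-in for the object returned by LoadModule (it just doubles its first argument), and updateBolt is a hypothetical wrapper; only the shape of the if statement matters here.

```lua
-- Stand-in scheduler so the sketch runs anywhere; not the real module.
local FakeActor = { pending = {} }

function FakeActor:ScheduleWork(...)
	table.insert(self.pending, { ... })
end

function FakeActor:Work()
	local results = {}
	for i, args in ipairs(self.pending) do
		results[i] = args[1] * 2
	end
	self.pending = {}
	return results
end

local function updateBolt(actor, partsN, timePassed, lifetime)
	if timePassed + 0.1 >= lifetime then
		return nil -- nothing gets scheduled, so Work() is never called
	end
	for i = 1, partsN do
		actor:ScheduleWork(i / partsN)
	end
	-- Work() is only reached when at least one task was scheduled.
	return actor:Work()
end

local alive = updateBolt(FakeActor, 4, 0, 1)
local expired = updateBolt(FakeActor, 4, 1, 1)
```

In the real code, the nil branch is where ThisBranch:Destroy() would go.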

You should also note that sending a lot of data to actors is quite slow. If you want the best performance, try to move some of those parameters to within the module, if possible. Also measure performance, it is always possible that using parallel lua makes it slower rather than faster, because of the overhead

You could also improve performance by not sending i and using TaskIndex instead (you are doing TaskIndex -= 1 in your code, but that does nothing… TaskIndex should be equal to i). You can also manually put PartsN, spd, TimePassed, freq, RanNum, MinRadius, MaxRadius, MinThick, MaxThick, p0Data, bezier0 into a SharedTable and retrieve it inside the ModuleScript.

This is something I would like to add to this module, a field for values that will be the same for every actor. That would reduce the amount of stuff stored in my shared tables by a lot, reducing the overhead
(I would probably make the function inside the ModuleScript receive 2 tables, and TaskIndex, table 1 would be for Global Parameters, parameters that are identical for every actor, and would be passed through Actor:Work(table), and the second table would be for Local Parameters, those sent by Actor:ScheduleWork(table). Sending parameters as tuples would not be allowed)
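
To make the idea concrete, here is what a task function might look like under that proposed shape. None of this exists in the module as posted; boltTask, the offset field, and the arithmetic are purely illustrative, and the loop is a serial stand-in for the scheduler.

```lua
-- Hypothetical task function under the proposed (not yet implemented) design.
local function boltTask(globalParams, localParams, TaskIndex)
	-- TaskIndex replaces the `i` argument that was being sent explicitly.
	local percentAlongBolt = TaskIndex / globalParams.PartsN
	return percentAlongBolt * globalParams.freq + localParams.offset
end

local globalParams = { PartsN = 4, freq = 2 } -- identical for every task
local results = {}
for taskIndex = 1, globalParams.PartsN do
	-- Only the per-task table changes between calls.
	results[taskIndex] = boltTask(globalParams, { offset = taskIndex }, taskIndex)
end
```

The point of the split is that the global table would only need to be shared once, while each task carries just its own small local table.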


Thank you for the response! I’d appreciate it if you could do that, because your module is really helpful and I want to optimize this lightning module further, as it’s pretty old and laggy compared to what it could be. Of course, I don’t know the most efficient way to optimize it, but I’m learning, so I believe once you do fix it I’ll probably know how to accomplish this. Sadly there aren’t many articles on Parallel Luau, but I’m learning from existing code. Thanks!


Yo, any update on this?

I’d love to see a version of the module which is faster.

Also. Multiple times while trying to use the module I was unable to pass raycast/shapecast params to raycasts running in parallel.
Is this a me problem or is the module limiting me from using raycast params in parallel?


In the meantime, I would recommend looking into other modules to see if they are faster than mine.

I have started working on an updated version, that has local params and global params, that could potentially improve performance by reducing the amount of data that has to be shared between actors for data that is identical for each actor. However, I will have to do benchmarks and I also want to try some other optimizations, and that will take a significant amount of time. I would have to slow down the world around me by 2x to do everything I want to do…

Even with those changes, I don’t expect large performance improvements across the board. I’m hoping for a good performance increase for people scheduling hundreds of tasks, but for lower numbers of scheduled tasks, there will probably only be minimal improvements.
Most of the performance penalty comes from overhead of parallel lua in the roblox engine, and sending data between actors


As for raycast params, the way I’ve done it is to create the RaycastParams object directly inside the ModuleScript that runs in parallel, and pass the actors the information necessary for updating it.

Sending a raycast param to actors directly might not be possible if they cannot go into shared tables (which is what my module uses to share data between actors). If you have an error message, please share it so I can give a much more in depth answer

Some properties/methods of RaycastParams cannot be modified in parallel, if that is what you are referring to (though I just saw that :AddToFilter() is write parallel, which is new I think).
