The first thing you should do, if you haven't already, is open the MicroProfiler and check whether the tasks are actually being distributed among different threads. It should look something like this, with multiple layers of processes:
If you don't see something similar to that, continue reading. If you do, then the problem is probably with the parallelized code itself, and I'll still help you with that.
This is a red flag. Parallelization works on a per-actor basis, and invoking it from inside the module itself could be the problem. I've used Parallel Luau before, and the way I did it was by putting all of the to-be-parallelized code inside the actor script itself, not in a module. That also means the thread is desynchronized directly in the script, not from a module function.
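To illustrate the structure I mean, here's a minimal sketch (the topic name `"DoWork"`, `heavyComputation`, and the `Results` value are made-up placeholders, not your actual code):

```lua
-- This is the Script parented under an Actor, NOT a ModuleScript:
local actor = script:GetActor()

actor:BindToMessageParallel("DoWork", function(data)
	-- This callback body runs desynchronized, directly in the actor script
	local result = heavyComputation(data) -- hypothetical parallel-safe work
	task.synchronize() -- return to serial before touching shared state
	workspace.Results.Value = result -- hypothetical serial-only write
end)
```

The point is that the parallel entry point lives in the actor's own script; modules are only required for shared pure utilities.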
For comparison, here’s my own multithreaded terrain generator that has a similar framework to yours:
And at the bottom is a snippet of the actor scripts in my project:
Summary
...
instructEvent.Event:ConnectParallel(function(instruction: taskInstruction)
	print(`Processor {id} received a new task`)
	local encodeds: {string} = {}
	for k, corner: Vector3 in instruction.corners do
		local voxelDim: number = wCFG.lodVoxelDims[k]
		local chunkDim: number = wCFG.lodChunkDims[k]
		local scalarField: Tensor<boolean> = Tensor.new() --these are random names lol
		local surfaceMap: Tensor<boolean> = Tensor.new()
		local function getVoxel(x: number, y: number, z: number): boolean
			local filled: boolean? = scalarField:get(x, y, z)
			if filled == nil then --if the voxel doesn't exist, make one and store it
				filled = perlin.noiseBinary(corner.X+x*voxelDim, corner.Y+y*voxelDim, corner.Z+z*voxelDim)
				scalarField:set(x, y, z, filled::boolean)
			end
			return filled::boolean
		end
		local function isSurfaceVoxel(x: number, y: number, z: number): boolean --just check to see if the voxel is next to air (nothing)
			if not getVoxel(x+1, y, z) then return true end
			if not getVoxel(x-1, y, z) then return true end
			if not getVoxel(x, y+1, z) then return true end
			if not getVoxel(x, y-1, z) then return true end
			if not getVoxel(x, y, z+1) then return true end
			if not getVoxel(x, y, z-1) then return true end
			return false
		end
		for x = 1, chunkDim do
			for y = 1, chunkDim do
				for z = 1, chunkDim do
					if getVoxel(x, y, z) and isSurfaceVoxel(x, y, z) then
						surfaceMap:set(x, y, z, true)
					end
				end
			end
		end
		encodeds[k] = tensorEncoder.encodeBinaryTensor(surfaceMap)
	end
	local _r: taskResult = {
		id = instruction.id,
		encodedTensors = encodeds
	}
	returnEvent:Fire(_r)
end)
...
As you can see, everything is in the actor script, aside from the bare-essential modules like the Perlin noise generator and the Tensor data structure I'm using to store voxels.
Next point:
This can also be a problem if it wasn't already. You're not supposed to use `task.wait` in desynchronized threads, because it will involuntarily put the thread back into synchronized mode; that's just how the task scheduler works. If your parallelized code depends on `task.wait` to function, then I'm afraid you'll have to rewrite it so that it doesn't.
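If the waiting is really unavoidable (e.g. to pace work across frames), one way to restructure is to keep all the waiting in serial. A rough sketch, where `doPartA` and `doPartB` are hypothetical stand-ins for your own work:

```lua
-- Bad: task.wait inside desynchronized code silently drops you back to serial
-- task.desynchronize()
-- doPartA()
-- task.wait() -- forces resynchronization
-- doPartB()

-- Better: be explicit about where the thread is serial vs parallel
task.desynchronize()
doPartA() -- hypothetical parallel-safe computation
task.synchronize()
task.wait() -- waiting here is fine; we're intentionally serial
task.desynchronize()
doPartB() -- hypothetical parallel-safe computation
```

This doesn't remove the cost of switching modes, but at least the switches are deliberate and visible instead of hidden inside a wait.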
@ClientCooldown also has a good point; you should minimize changing the synchronization state of the threads, as that can be a big bottleneck in your code. Perform all the parallel tasks together, temporarily store their results, and then resynchronize the thread once and do whatever needs to happen in serial.
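That batching pattern looks something like this (a sketch; `compute` and `applyToWorld` are made-up names for your pure computation and your serial-only side effects):

```lua
someEvent.Event:ConnectParallel(function(tasks)
	-- 1) Do ALL the parallel work first, buffering results locally.
	--    No Instance writes, no waiting, no synchronization changes here.
	local results = {}
	for i, t in tasks do
		results[i] = compute(t) -- hypothetical parallel-safe computation
	end

	-- 2) Resynchronize exactly once, then flush everything in serial.
	task.synchronize()
	for _, r in results do
		applyToWorld(r) -- hypothetical serial-only work (Instance writes, etc.)
	end
end)
```

One synchronize per batch instead of one per task keeps the scheduler overhead out of your hot loop.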