Parallelizer - A Lightweight Packet-Based Parallelism Solution

marketplace download

Documentation | API Reference


Brief Overview

Parallelizer is a lightweight, performant module for parallelizing heavy tasks that would otherwise put a lot of strain on performance. It aims to make Parallel Luau more accessible and performant without sacrificing ease of use or readability.

What is Parallel Luau, you may ask? It is a way to spread the computation load across multiple cores with the help of Actors, which reduces the strain and lag caused by heavy computations.

:information_source: For more information, check out Parallel Luau

Possible Use Cases
  • Bullet Raycast Computations
  • Enemy AI Computations
  • Terrain Computations
  • Raytraced Renderer
  • Parallelized Cryptography

Why use this over other options?

Parallelizer uses a bindable event to send packets of data from parallel to serial instead of a SharedTable, since SharedTables are frankly quite slow to deal with.

Parallelizer also has a minimal footprint thanks to its unobtrusive and lightweight nature :dash:

:information_source: You can check out the benchmarks here


Example Usage

Here is an example that calculates the nth root (n = 2) of every integer from 1 to 4096, run in parallel.

:exclamation: Note: Ensure the module is located where both the job scripts and the master script can access it; for example, in ReplicatedStorage

Main Script:
local Parallelizer = require(path.to.module)
local Scheduler = Parallelizer.CreateJobScheduler(script.Job, 256, script) -- 256 workers/actors, actors stored below the script

local RootInstruction = Parallelizer.CreateInstructionData({2}) -- {2} is the data to send to the actors, preferably for constants

-- Calculate the root of 1 to 4096 with each actor equally doing the same number of root operations
Scheduler:DispatchEquallyAsync('CalculateRoot', 4096, function(result)
	print(result) -- The result in an array
	Scheduler:Destroy() -- Destroy when no longer used further (only for memory cleanup purposes)
end, RootInstruction)
-- The Dispatch function is asynchronous, meaning it won't yield - so the succeeding code will run unobstructed

Actor/Job Script (located under the main script):

local Actor = script:GetActor()

-- Do nothing if this script is not running under an Actor
if not Actor then
	return
end

local Parallelizer = require(path.to.module)

-- id is the index of the thread, which is in the range [1, threadCount]
Parallelizer.CreateThread(Actor, 'CalculateRoot', function(id, instruction)
	return id ^ (1/instruction[1])
end)

:exclamation: Important: The CreateThread callback function must return a value.
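
As a sanity check, the same computation can be done serially. This is a minimal sketch in plain Luau and is not part of the module; it assumes the result array passed to the dispatch callback is indexed by thread id, which the example implies but does not state outright. `serialRoots` is an illustrative name.

```lua
-- Serial reference for the parallel example above (not part of Parallelizer).
-- Assumption: result[id] holds the value returned by the thread with that id.
local function serialRoots(count, n)
	local result = {}
	for id = 1, count do
		result[id] = id ^ (1 / n) -- same expression as the job script
	end
	return result
end

local expected = serialRoots(4096, 2)
print(expected[4], expected[4096])
```

Comparing this table against the array printed by the dispatch callback is a quick way to confirm the job script is wired up correctly.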


Examples

Benchmarks & Comparisons

Benchmark settings:

  • 256 actors
  • 8192 threads
  • 32 tasks assigned per actor
  • 100 iterations
  • Non-native environment

Hardware (the only stuff I have :sob:):

  • Intel(R) Core™ i7-4770K CPU @ 3.50GHz
Benchmark Source

Parallelizer:
benchmark_parallelizer.rbxm (3.4 KB)
ComputeLua:
benchmark_computelua.rbxm (31.0 KB)
Parallel Scheduler:
benchmark_parallel_sched.rbxm (6.8 KB)

| Task | Parallelizer | ComputeLua | Parallel Scheduler |
| --- | --- | --- | --- |
| nth root (n = 2) | 22ms | 40ms | 25ms |

Credits & Footnotes

:warning: Warning: Pretty much all parallel code is prone to crashes, so save a backup or publish the place to avoid losing your progress.

Could anyone suggest a stress test method I could use? I’m trying to test the capabilities of this module before I redo my shader.

If you have any suggestions in mind, feel free to share them here! Or if you’ve encountered issues with the module, don’t hesitate to ask for help! I will assist you through the process as soon as possible.

Thanks to ComputeLua once again for inspiring me and giving me hope to make the shader, and this module. Most of the API is similar to ComputeLua’s (and also a bit of Unity’s)

This module is under the MIT license

Version History

0.1.6: Parallelizer.lua (4.1 KB)
0.1.5b: Parallelizer.lua (3.9 KB)
0.1.5a: Parallelizer.lua (3.4 KB)
0.1.5: Parallelizer.lua (2.7 KB)
0.1.4: Parallelizer.lua (2.4 KB)
0.1.3: Parallelizer.lua (2.4 KB)
0.1.2: Parallelizer.lua (2.2 KB)
0.1.1: Parallelizer.lua (2.1 KB)
0.1.0: Parallelizer.lua (2.4 KB)
Release: Parallelizer.lua (2.2 KB)

What do you think? (your votes are anonymous :shushing_face:)
  • What the sigma
  • Ok cool
  • Couldn’t think of a use for Parallel Luau

Should I make a Github Repository?
  • Yes
  • no why


#resources:community-resources
PS: Change the topic to this ^^^

I was about to say I wasn’t really going to maintain this project as often; but I guess it doesn’t matter

Added a :DispatchWithBatches helper function to dispatch and calculate the BatchSize automatically for you (divides the work into equal parts)

Made it so you can pass in arguments directly into both the dispatch functions since I realized it would get tedious to set the Instruction table repeatedly

Oh yeah and it’ll not break when your thread count is not divisible by your actor count

Also added benchmarks and stuff

Fixed a silly if statement oversight, previously a false return value would be flagged as a missing return value
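
This pitfall is a common one in Luau: `not value` is true for both `nil` and `false`. A hypothetical sketch of the distinction follows; the function names are illustrative and not taken from the module's source.

```lua
-- Buggy check: treats a legitimate `false` result as if nothing was returned.
local function isMissingBuggy(value)
	return not value
end

-- Fixed check: only `nil` counts as a missing return value.
local function isMissingFixed(value)
	return value == nil
end

print(isMissingBuggy(false), isMissingFixed(false)) -- true  false
print(isMissingBuggy(nil), isMissingFixed(nil))     -- true  true
```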

I’m pretty sure I did the benchmarks wrong; they should use intensive, repeated tasks. I’m gonna go fix the benchmark section now

Updated the benchmark section, it now compares the average instead of just plain attempts

Fixed an issue where batchSize can be 0, resulting in the actor loop not looping - caused by thread count being smaller than the actor count
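
To illustrate the failure mode: if the batch size comes from a floored division (an assumption about the old behaviour), 100 threads over 256 actors gives a batch size of 0 and the per-actor loop never runs. The sketch below is a hypothetical reconstruction, not the module's actual code; `splitIntoBatches` and its return shape are made up for illustration. It uses a ceiling division clamped to at least 1, which also covers thread counts that are not divisible by the actor count.

```lua
-- Hypothetical helper: split thread ids 1..threadCount into contiguous
-- batches, at most one batch per actor. Not taken from Parallelizer's source.
local function splitIntoBatches(threadCount, actorCount)
	-- Ceiling division, clamped so the batch size is never 0.
	local batchSize = math.max(1, math.ceil(threadCount / actorCount))
	local batches = {}
	local first = 1
	while first <= threadCount do
		local last = math.min(first + batchSize - 1, threadCount)
		batches[#batches + 1] = { first = first, last = last }
		first = last + 1
	end
	return batches, batchSize
end

local batches, batchSize = splitIntoBatches(100, 256)
print(#batches, batchSize)
```

With fewer threads than actors, some actors simply receive no batch; with a non-divisible count, the final batch is shorter than the rest.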

v0.1.5 - QoL Changes

What you need to know:

  • Added instruction data to store values to be sent to actors
  • Changed constructor methods (those that start with Create) to use . instead of :
  • Now supports tuple (or varargs? if that’s what it’s called) return types
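
Multiple return values are usually called a tuple or varargs in Luau. One common way to carry them around is to pack them into a table together with their count, so that `nil` values in the tuple survive. This is a generic sketch, not the module's implementation; `packResults`, `unpackResults` and `job` are illustrative names.

```lua
-- Works in both Luau and stock Lua: pick whichever unpack exists.
local unpack = table.unpack or unpack

-- Pack every return value plus the count (so nils in the tuple are preserved).
local function packResults(...)
	return { n = select("#", ...), ... }
end

local function unpackResults(packed)
	return unpack(packed, 1, packed.n)
end

-- A callback that returns a tuple instead of a single value.
local function job(id)
	return id, id * 2, nil
end

local packed = packResults(job(21))
print(packed.n, unpackResults(packed))
```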

I have also made an uncopylocked simple raycaster place using this with the help of CanvasDraw:

The benchmark section should be updated tomorrow afternoon

this looks neat, i’ll check it out, the experience is private btw

It’s not supposed to be played within the Roblox Player; you should be able to edit the place since it’s uncopylocked

oh alright then (charcharchar)

v0.1.5a - Minor Internal Changes

What you need to know:

  • Added type annotations
  • Fixed a bug where instData isn’t optional

Updated the benchmark section

v0.1.5b - Minor Internal & API Changes

What you need to know:

  • Renamed :DispatchWithBatches to :DispatchEqually
  • Renamed .CreateNewJobScheduler to .CreateJobScheduler
  • Replaced the numeric for loop in :Dispatch with a for-in loop (I believe it’s faster since it doesn’t have to index the table on every iteration)

I have tried to optimize some parts of the module, but most of the attempts led to worse performance or little to no difference. One of the optimizations I tried was using table.move instead of a for loop; it turned out to be slower by 5ms, so I kept the for loop
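
For reference, the two approaches being compared look roughly like this. It is a generic sketch of appending one array to another, not the module's code, and `appendWithLoop` / `appendWithMove` are hypothetical names. Which one is faster depends on the data and the runtime, so measuring, as done above, is the right call.

```lua
-- Append src to the end of dst with a plain loop.
local function appendWithLoop(dst, src)
	local offset = #dst
	for i, value in ipairs(src) do
		dst[offset + i] = value
	end
	return dst
end

-- Same result using table.move (available in Luau and Lua 5.3+).
local function appendWithMove(dst, src)
	return table.move(src, 1, #src, #dst + 1, dst)
end

local a = appendWithLoop({ 1, 2 }, { 3, 4, 5 })
local b = appendWithMove({ 1, 2 }, { 3, 4, 5 })
print(#a, #b) -- 5  5
```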

Planning to use this in the near future as it seems very useful. There aren’t many resources out there on understanding or implementing Parallel Luau, not for many use cases at least.

Will reply with results once I get to doing that.

Thanks for the useful resource :+1:

Glad you found it useful :smile:

Yeah, I was using ComputeLua for my parallel needs before I made this, since that’s the most promising (pun intended) amongst others

Good luck!

:ok_hand:

v0.1.6 - Minor Internal Additions & Changes

What you need to know:

  • Added AllowedInstructionValues and InstructionTableData type
  • Made Dispatch functions defer instead of instantly dispatching

Just updated the post to make it look nicer.

v0.1.6a - Minor API Changes

What you need to know:

  • Renamed asynchronous functions (Dispatch and DispatchEqually) to have -Async suffix (DispatchAsync and DispatchEquallyAsync)

Updated the raycaster place to have the current version