@zeuxcg Studio simulation isn’t running. Is this a memory leak? Lol.
Same place you posted for raytracer except I added some lighting effects.
I also found an unrelated crash.
Will there eventually be a way to separate code entirely from the main thread, so as to not slow single threaded workloads being performed there?
Also, is pathfinding going to be enabled for multi-threading?
Yes, finally. I’ve been waiting for multithreading for a long time. Hope the workflow won’t be too different.
I’m a bit confused about this. Does this mean that if I create 64 actors on a 4-core system, 16 actors get assigned to each core?
He’s implying you should not target specific core counts or hardware, just create however many parallel tasks you need logically. The engine takes care of spreading it over cores (or not) depending on internal heuristics. Hardware that only has room to run single-threaded Lua will run your 64 actors sequentially and better hardware might run 1 actor per core, or anything in between.
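To illustrate the point above, here is a minimal sketch in Python rather than Luau. It is an analogy only and does not use or represent the Roblox API: a thread pool stands in for the engine's scheduler, and `update_actor` and `run_actors` are hypothetical names. The caller creates 64 units of work and never mentions a core count; the pool decides how many workers actually run them.

```python
from concurrent.futures import ThreadPoolExecutor


def update_actor(actor_id: int) -> int:
    # Hypothetical stand-in for the work one actor does each step.
    return actor_id * actor_id


def run_actors(actor_count: int) -> list[int]:
    # No worker count is given: the pool picks one based on the machine,
    # the same way the caller here never targets a specific core count.
    with ThreadPoolExecutor() as pool:
        # map() returns results in submission order, whatever order
        # the workers happened to finish in.
        return list(pool.map(update_actor, range(actor_count)))


results = run_actors(64)
```

The same call works unchanged on a machine with one worker or many; only the wall-clock time differs.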
This is great news and will raise the bar significantly. My only worry is that this seems very complex. Will games have to move to this new system with Actors, or is that only if they want multi-threading?
The general rule is that a new feature is optional unless explicitly specified otherwise. This will be optional.
Ah yes, time to run into many race conditions whilst implementing this into my game, because I have absolutely no experience with parallel execution XD. However, I am totally ready to tackle this feature and get used to it.
A few questions I have:
Since actor instances are required per process logical needs, does that mean single script architectures such as AeroGameFramework will not be able to utilize the full functionality?
My other question is not related to technical aspects but I am wondering why the function is named “task.desynchronize” instead of “task.desync”? Although auto complete exists, it still feels like a long text to type it out.
I’m glad the Raycast method is usable! A number of people have been politely asking me to implement parallel Luau in FastCast so that more projectiles can be simulated at once, and I plan on doing just that. I anticipate great performance gains from being able to spread ongoing raycasts across different cores, so that only a fraction of the work is done on any single core.
Does this apply to very very large amounts of tasks?
I am trying to determine how I will work with this system.
Let’s say I want to simulate over 1,000 NPCs and constantly update them. Also let’s ignore other methods to optimize them such as only doing a certain amount each second/frame. In this case we’d basically be running 1,000 tasks in parallel at the same time.
Does this mean it would be better to create 1,000 actors or tasks, instead of using say 10 actors each doing 100 NPCs?
It seems the only potential benefit is for someone with a high number of cores.
If that is the case then I think I’d rather add some abstraction and go with the former because I could also be potentially running other intensive tasks that I’d like to prioritize.
Super excited by this one, nice job!
I appreciate you’re still actively working on this but do you have a timeline with regard to release? Have quite a few cool ideas I’d love to throw into some projects.
Also, assuming the internal tech spec doesn’t include sensitive / trade info, any chance that’ll be released to the community? I would love to read it to get a deeper understanding of RBLX.
In regards to module scripts, I hope that I will be able to run some functions in parallel from inside the module script while keeping the rest of the module script global. (In this hypothetical case, the script requiring the module script would not be inside an Actor or otherwise used in parallel.) Though, perhaps there could be an issue with module variables which are accessed from the parallel code and accessed from the serial code (or maybe not, I’m thinking with the general C mutex mindset where you need to lock things down).
I am super excited by this update, can’t wait to start using it.
Alright - finally been able to play around with this thing. I’m hyped!
I managed to get a simple Lua-based API working for a job-based paradigm:
The idea is pretty simple: you can pass in pure functions with a set of arguments, and they’ll be run in parallel as fast as possible. The system should be easily expandable to let you return values from those functions too, if you’re looking to use them to run computations.
I’ll most likely end up using a refined version of this in Blox to run terrain generation and greedy meshing calculations
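A rough sketch of the job-based idea described above, written in Python rather than Luau. This is not the poster's code and not a Roblox API; `JobRunner`, `run_all`, and `column_height` are hypothetical names, and a thread pool stands in for whatever runs the jobs in parallel. It shows pure functions submitted with their arguments, with return values collected in submission order.

```python
from concurrent.futures import Future, ThreadPoolExecutor


class JobRunner:
    """Hypothetical job queue: submit pure functions plus their arguments."""

    def __init__(self) -> None:
        self._pool = ThreadPoolExecutor()

    def submit(self, fn, *args) -> Future:
        # Each job is a function and the arguments it should be called with.
        return self._pool.submit(fn, *args)

    def run_all(self, jobs):
        # Queue every job first, then wait for the results in order.
        futures = [self.submit(fn, *args) for fn, args in jobs]
        return [future.result() for future in futures]

    def close(self) -> None:
        self._pool.shutdown(wait=True)


def column_height(x: int, z: int) -> int:
    # Pure function: the result depends only on its inputs, so it is
    # safe to run alongside other jobs without any locking.
    return (x * 31 + z * 17) % 64


runner = JobRunner()
heights = runner.run_all([(column_height, (x, 0)) for x in range(4)])
runner.close()
```

Keeping the jobs pure is what makes this pattern simple: nothing is shared between jobs, so the only synchronization point is collecting the results.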
Thanks for the report! As mentioned previously, any engine crash is a bug even if you’re using multithreading. You were writing to a property that isn’t supposed to be writeable but our mechanism for preventing this had a typo in the check that guards against incorrect usage of API. This will be fixed in the next build later this week.
1000 actors would make sense here. In general we see two possible patterns for use of this system:
Today both could work. In the future we’re going to start using Actor hierarchy to selectively enable writes - for example, you’d be able to move the parts of the NPC from the parallel section but only if the script and parts are both part of the same Actor. So when you already have a notion of an entity we’d recommend creating an Actor per entity.
If multithreading allows the client to use more of their CPU and the thread count depends on the CPU, what about the server? How many threads will the server be able to effectively use?
I know we shouldn’t aim to use a certain number of cores but it would be good to know the performance difference server-side.
This is somewhat separate from this release, but we plan to have official documentation on the core count for the server, which will grow based on the number of players the server supports and the historical CPU usage. This will affect both internal engine systems (a lot of them became multithreaded over the last year or so) and the parallel script execution. Expect more announcements on this subject next year; for now this is all the information we have.
Awesome! Looking forward to this; parallel processing would be exceptional for neural network training.
Thanks for the feedback everyone! We’ve uploaded a new build (links in the post updated) with fixes based on the initial reports and internal testing.
Changes: