ClusterUpdate and adornees are badly optimized

Hello!

(Compiled bug report below)

So I’ve recently been making a tattoo game that requires a lot of parts, which created lag issues. At first we were using unions, but I soon realized that the problem was caused by the broadphase of the popper cam’s raycasts, where it had to check whether each part had CanQuery. To fix this, I started putting the parts inside a WorldModel inside the Workspace, which fixed the issue. Though, I still couldn’t get past 20K parts without lag, so I looked at the micro profiler and found that UpdateDynamicParts and UpdateInstancedClusters were eating the CPU (only 2 clusters). I then tested adornees, as they don’t have clusters (which might be the issue :roll_eyes:), but they were even worse (15K before lag).

COMPILATION


Briefing : Cluster updates are very slow with a lot of parts
Information : The cluster updates are very slow and eat the CPU. In this case it was tested on 20K parts and was taking 10 out of 15 ms of the frame (normal usage ~6ms).

Observed Cause : Roblox regenerates all of the clusters every frame, as suggested by this post on point 4. This is more or less proven by the fact that unioning parts in batches of 500 lets you get to more than 150K parts (in grouped union form) without even having the schedule render above 14%.
Reproduction : Download the benchmark below, set the benchmark to part and let it go up to ~20K. Then open the micro-profiler and check the update prepare field (Tested on Intel-i5 3rd gen).


Briefing : Adornees are very slow even slower than normal parts
Information : For some reason, adornees, which have nothing to do with physics, are 1-sided, and aren’t shaded (ImageAdornees), are laggier than parts (15K adornees compared to 20K parts).

Observed Cause : Maybe they are worse because they don’t use clusters, or the culling is badly optimized? (No clue here)
Reproduction : Download the benchmark below, set the benchmark to adornees (scroll down) and let it go up to ~15K (The adornees are facing away from the spawn orientation). Then open the micro-profiler and check the standard adornee field.

Reproduction File:
Rendering-Benchmark - New.rbxl (178.6 KB)


We’ve filed a ticket into our internal database for this issue, and will come back as soon as we have updates!

Thanks for the report!


Hello,

I have done a preliminary investigation. Very nice test case for benchmarking btw!

So Adorns are really slow; they are not meant to be used in the way they are used in the test case. As for the mesh parts, I expected them to be a little costly while parts are being added, due to the way that works internally. But I was surprised to see that the cost continues after stopping the benchmark. I debugged this a bit and will have to continue debugging it tomorrow or so. It seems to think the parts are dynamic when they are not, and this is what triggers the update.

I think what’s more important at this point is trying to understand your use case. There is no specialized solution for rendering many many tiny parts. This is also largely true for other game engines. So we should better understand your use case, so we can provide some guidance on how to make this work well within Roblox.

Hello!

Thanks for looking into it!

Currently I am using this for a drawing game where you draw 3D tattoos on players that wrap around their character (using mesh data). The reason I use parts/adornees is that it has to be 3D. I could use surface GUIs, but they would still require invisible parts (1 per drawn surface, which because of the bug still produces a lot of lag), and even if it didn’t lag, I would need to clip the UI objects to the surface, which I wouldn’t expect to be very performant using the available methods.

One thing I did note is that if the parts are unioned (in my testing, in groups of 500) you can reach upwards of 200K+ parts (~2M triangles) without even starting to have frame drops. This was actually our solution, but it had major cons. Mostly, unioning everything takes time, which made it tedious to edit unioned chunks (we allow 50K parts per player, which can be reached pretty fast), to see other people’s tattoos while they are being drawn, and to load a tattoo. From our tests, loading a tattoo with 50K parts could take 5 minutes per player, and that is before accounting for the fact that only 1 union operation can be performed at a time.
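
To make the batching concrete, here is a minimal, engine-agnostic sketch in Python of splitting a player's parts into groups of 500 before each group is handed to a union step. The `chunk` helper and the integer part IDs are hypothetical stand-ins; in an actual place this would be Luau operating on the real part instances, and the union call itself is not shown.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def chunk(items: Sequence[T], batch_size: int = 500) -> List[List[T]]:
    """Split items into consecutive batches of at most batch_size elements."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    return [list(items[i:i + batch_size]) for i in range(0, len(items), batch_size)]


# Hypothetical example: 50K part IDs, the per-player cap mentioned above.
part_ids = list(range(50_000))
batches = chunk(part_ids)
print(len(batches))  # 100 batches of 500 parts each
```

With the 50K cap and batches of 500, that is 100 separate union operations per player, which is why the one-at-a-time limit on union operations hurts so much.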

So yeah, we use it to allow players to draw “dynamic” textures and apply them on 3D objects.

(I didn’t mention it in the report, but the same type of thing happens with textures and beams: having more than 300 or 3000 in the same place also causes a lot of lag. I didn’t do thorough testing, so I don’t really know what is happening, but I know it also lags.)

Hello again,

Yeah so the main problem is that your specific use case, drawing 3D tattoos, doesn’t have a good generic solution in Roblox (currently). So you have found very creative ways to make it work, but they mostly (except the union) involve lots of individual parts.

The main problem with lots of individual parts is that each part is a full object, it goes into WorldModel\Objects in your benchmark test bed. This means it hooks into the data model, has signals, goes through network replication data, etc. This is not great for performance, but also costs a lot of memory. (FWIW other game engines like Unity and Unreal don’t handle this particular scenario very well either)

I unfortunately don’t have a good suggestion on how to work around it with the current technology available to you. I think that the union approach is the thing that would work ‘best’, but that’s still not great. To alleviate some of the cost however, you should probably not try to make a union of everything, but rather build smaller unions of subsections and say have 100-500 total parts. And then incrementally union more things together as they are not being edited and have ‘stabilized’. But the truth is, your current use case does not have a good technical solution in Roblox currently.
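
The "incrementally union things once they have stabilized" idea can be sketched roughly as follows. This is only an illustration in Python of the bookkeeping, not Roblox code: the `IncrementalMerger` class, the region names, and the 5-second idle threshold are all assumptions made up for the example, and the actual union call (which in a place would be done in Luau) is left to the caller.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Chunk:
    part_ids: List[int] = field(default_factory=list)
    last_edit: float = 0.0
    merged: bool = False


class IncrementalMerger:
    """Tracks per-region edits and reports regions idle long enough to union."""

    def __init__(self, idle_seconds: float = 5.0):
        # idle_seconds is an assumed threshold for "stabilized", not a known value.
        self.idle_seconds = idle_seconds
        self.chunks: Dict[str, Chunk] = {}

    def record_edit(self, region: str, part_id: int, now: float) -> None:
        chunk = self.chunks.setdefault(region, Chunk())
        chunk.part_ids.append(part_id)
        chunk.last_edit = now
        chunk.merged = False  # a new edit invalidates any earlier merge

    def due_for_merge(self, now: float) -> List[str]:
        """Regions with unmerged parts that have not been edited recently."""
        return sorted(
            region
            for region, c in self.chunks.items()
            if not c.merged and c.part_ids and now - c.last_edit >= self.idle_seconds
        )

    def mark_merged(self, region: str) -> None:
        # Called by the caller after it has actually unioned the region's parts.
        self.chunks[region].merged = True


merger = IncrementalMerger(idle_seconds=5.0)
merger.record_edit("left_arm", 1, now=0.0)
merger.record_edit("back", 2, now=4.0)
print(merger.due_for_merge(now=6.0))  # ['left_arm']
```

The point of the design is that regions still being drawn stay as cheap-to-edit individual parts, while only regions nobody has touched for a while pay the union cost.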

FWIW I have forwarded this thread to some other people to see if they have good ideas.
