Occlusion Culling Now Live in Roblox Client

I have seen it run its own logic in the MicroProfiler myself. It does seem to be pretty fast, but I can’t just take that for granted, especially when my system is GPU bottlenecked in basically 99% of scenarios. Personally, I really want to know exactly how occlusion culling works on systems with BOTH low CPU and GPU processing power, like an old i3 with Intel HD Graphics. I want to know the exact trade-offs between the CPU and GPU when processing such a feature, since I don’t believe this tech is just “free” in that regard. I’m really curious how much CPU performance goes down relative to how much GPU performance goes up, all without causing an obvious CPU bottleneck.
Another test case would be how much occlusion culling hurts performance in places that are CPU bottlenecked. I have a couple of large places that suffer from CPU bottlenecks, and I’m really curious how much it hurts framerates (if at all) in those places. After all, a CPU-bottlenecked place won’t exactly improve its framerate no matter how much load you take off the GPU.

Also, sticking humanoids into random parts and models is sure to destroy performance, regardless of whether or not those humanoids are actually doing something. Don’t do that, it’s bad practice. I’ve seen quite a lot of cases in which humanoids magically ended up in random places, or even as a direct child of the workspace! And let me tell you, when that happens, framerates can drop into the single digits pretty darn quick.

This is the best Roblox update of all time. All games using large part-based maps with fancy builds now run 2x faster.

i3? That’s way too modern! I’m using a 2007 CPU (Intel Core2) :rofl:

Maybe not best practice, given the black-box nature of how their code works, but it was an example of a known way to make this happen. I haven’t had time to try every possible object property or type to get a definitive list yet. I’m sure someone else will beat me to it. :grinning:

To be fair, while you do have a Core 2 Quad from 2007, your system is still far more powerful than the average craptop with some mobile dual-core i3 and integrated graphics that barely even qualify as a display adapter. And that’s before considering that those systems usually come with 4-8 gigs of super slow single-channel RAM, which doesn’t exactly go far on modern versions of Windows (Win 10 and 11). Maybe you’re more prone to CPU bottlenecks (since your 780 Ti should be more than enough for Roblox), but your system is basically running without any other bottleneck or limitation, unlike the vast majority of actually old and low-end machines.

This isn’t to undermine your Core 2 Quad rig. I think it’s impressive what it can do, but it’s pretty clear that your rig is the exception to the rule here.

I recall asking this at RDC when speaking to a rendering engineer, but I’ll repost it here for visibility.

Are there any plans to use occlusion culling on layered clothing in the future, so that it hides hidden body geometry instead of HSR? HSR, while good for performance, places some pretty problematic restrictions on what types of avatars you can make. If layered clothing used occlusion culling instead, HSR wouldn’t be needed!

Occlusion culling monitors its performance and tunes itself dynamically, independent of graphics settings. This is important to work well across vastly different hardware and scenes. In our metrics, we haven’t seen occlusion culling be a CPU bottleneck. If you do see that, we’d like to know about it!

The reason avatars and the like aren’t culled (yet) has nothing to do with the occlusion culling itself, but with different code paths in the engine needing to be hooked up. Software is often like that; 10% of the time is doing what you think the feature is, and 90% of the time is the glue code to connect it to the rest of the program!

It does this internally and automatically. We have considered developer hinting, but we wanted to get what we have out there first. This lets everybody start getting the benefits earlier, and it also lets devs discover what controls they need in practice and not just in theory. Once we know what controls might help and why, we can prioritize adding those hints against all the other things we can do to improve Roblox. Because we have limited time, we can’t do all the helpful things we want to do!

Originally it didn’t cull anything with a highlight controller. We updated it to only skip culling if there is an active highlight that ignores depth, which lets more stuff get culled in some places.

Occlusion culling does not work at the granularity needed to become a replacement for hidden surface removal on layered clothing.

Thanks for answering, but I have to say it’s not quite the answer I was expecting. I wasn’t specifically asking if occlusion culling can cause a CPU bottleneck (I believe that’s pretty obvious); I was asking if occlusion culling could result in performance losses in CPU-bottlenecked places/experiences. In those scenarios, what exactly should we expect out of occlusion culling? As you’ve said, the system tunes itself dynamically to avoid causing performance issues, but how would it actually perform in the case of a real CPU bottleneck (caused by other parts of an experience)? Would it simply do nothing in that case? (Logically, I assume it would have to do nothing or very little in order to not add to the already high CPU load, no?) Or would it still try to do something? (Potentially adding to the CPU work of an already CPU-bottlenecked place, which COULD technically reduce framerates while not providing any benefits.)
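To make the concern concrete, here is a rough toy model in Python. It assumes the CPU and GPU work in parallel, so a frame takes as long as the slower of the two, and that culling adds a fixed CPU cost while saving some GPU time. All names and numbers are hypothetical, and it ignores any CPU time culling might save by submitting fewer draw calls, so treat it as a sketch of the reasoning and not a description of how the engine behaves.

```python
def frame_ms(cpu_ms: float, gpu_ms: float) -> float:
    """Toy assumption: CPU and GPU overlap, so the slower one sets the frame time."""
    return max(cpu_ms, gpu_ms)


def fps(ms: float) -> float:
    """Convert a frame time in milliseconds to frames per second."""
    return 1000.0 / ms


def with_culling(cpu_ms: float, gpu_ms: float,
                 cull_cpu_ms: float, gpu_saved_ms: float) -> float:
    """Frame time after adding a culling cost on the CPU and a saving on the GPU."""
    new_cpu = cpu_ms + cull_cpu_ms
    new_gpu = max(gpu_ms - gpu_saved_ms, 0.0)
    return frame_ms(new_cpu, new_gpu)


# Hypothetical GPU-bound place: 8 ms CPU, 20 ms GPU.
gpu_bound_before = frame_ms(8.0, 20.0)
gpu_bound_after = with_culling(8.0, 20.0, cull_cpu_ms=1.0, gpu_saved_ms=8.0)

# Hypothetical CPU-bound place: 20 ms CPU, 8 ms GPU.
cpu_bound_before = frame_ms(20.0, 8.0)
cpu_bound_after = with_culling(20.0, 8.0, cull_cpu_ms=1.0, gpu_saved_ms=8.0)

print(f"GPU-bound: {fps(gpu_bound_before):.1f} -> {fps(gpu_bound_after):.1f} fps")
print(f"CPU-bound: {fps(cpu_bound_before):.1f} -> {fps(cpu_bound_after):.1f} fps")
```

Under these assumptions, the GPU-bound place gains framerate while the CPU-bound place loses a little, because the GPU saving never reaches the bottleneck.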

And in the case of just low-end and weak hardware, how efficient is it actually? I have seen the couple of examples you provided of this running on mobile, but I have no clue what kind of phones were used for them, and I also question the actual performance scaling of the system shown in those examples. What I’m curious about is the actual trade-off between the CPU and GPU on those sorts of weak and/or old machines and phones in order to provide higher framerates. Since the CPUs in those devices are usually pretty weak overall, I expect occlusion culling to leave a much larger footprint in the performance graphs, so I’d like to know exactly how much the CPU loses compared to what the GPU gains. Would it be possible for occlusion culling to not provide any significant or noticeable gains simply because a weak CPU is holding it back (not exactly because of a CPU bottleneck caused by other systems within a place)? Would it be possible for occlusion culling to not be doing anything at all due to the already high load on the CPU?

I do apologize for these 50 million questions, but I believe these are questions most developers should know the answers to. I want to know exactly what I can do to let it run at 100% capacity while providing decent GPU performance gains, without risking losses on the CPU that COULD technically diminish the GPU gains by quite a large margin in specific scenarios. I really don’t want to take these sorts of things for granted and expect them to magically work all day, every day, under any and all possible conditions.
Plus, we can’t really measure the numbers ourselves, since there is no on/off toggle for occlusion culling that would let us run even an initial general test of its performance.

I’ve been trying to test the system for a bit now. I have found that it CAN actually slightly reduce overall performance in CPU-bottlenecked places. I can’t say the quick testing I have done so far is exactly amazing, but I’ll try to create a better test case soon for more consistent results. So far I have tested on an i7 14700K and an i7 4770 and found losses of around 5 to 10 fps. The highest time I have seen occlusion culling take is around 9.6 milliseconds on the 4770. I’m predicting that occlusion culling may cause some substantial performance losses in more demanding scenes, but who knows. I’ll see when I finally do a more proper test.
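For a sense of scale, the arithmetic behind those figures can be sketched in Python. The 9.6 ms value is the one reported above; the 60 fps baseline and the example framerate drop are assumptions picked only for illustration, and the sketch assumes the measured time lands entirely on the frame's critical path, which may not hold if the work runs on other threads.

```python
def budget_share(cost_ms: float, target_fps: float) -> float:
    """Fraction of one frame's time budget that a given cost would occupy."""
    return cost_ms / (1000.0 / target_fps)


def added_ms_for_drop(fps_before: float, fps_after: float) -> float:
    """Extra frame time (ms) implied by a drop from fps_before to fps_after."""
    return 1000.0 / fps_after - 1000.0 / fps_before


# 9.6 ms (reported above) against an assumed 60 fps target.
share = budget_share(9.6, 60.0)
print(f"9.6 ms is {share:.1%} of a 60 fps frame budget")

# An assumed drop from 60 to 50 fps, i.e. a 10 fps loss.
extra = added_ms_for_drop(60.0, 50.0)
print(f"60 -> 50 fps implies about {extra:.2f} ms of extra frame time")
```

Note that the same fps loss means different amounts of extra frame time depending on the baseline, so comparing milliseconds is usually more meaningful than comparing fps.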

Managed to enable and disable occlusion culling via its FFlags, btw.

This is a game changer. My PC isn’t the greatest, so this definitely helps. Hope this is added on mobile devices soon.

@ProgramDude are there any internal performance benchmarks you or the team would be willing to share based on your testing? I think many would be interested in the performance gains across different experiences on the platform, broken down by device, etc.

What’s the FFlag for it? I wanna run some tests.

FFlagEnableVisBugChecks27 and DFFlagUseVisBugChecks

Ever since occlusion culling was added, I’ve noticed shadow flickering and geometry flickering in various games. The flickers are fairly rare and somewhat inconsistent, but they happen.

This was an extremely quick release compared to what I expected. I’m impressed, this will most definitely be allowing Roblox experiences to be not only bigger but more detailed. The showcase I’m working on will likely be able to work on mobile entirely due to this update.

Here are most of the FFlags I’ve found related to occlusion culling. You’ll have to validate most of these yourself as this was posted before it was released to the client and may contain inaccurate information. (that and I’m too lazy to edit it)

https://devforum.roblox.com/t/are-you-using-occlusion-culling-and-what-do-you-think-of-it/3255027/10?u=timefrenzied

If you don’t know what you’re doing, just stick to DFFlagUseVisBugChecks. (FFlagEnableVisBugChecks27 no longer works, 28 seems to be the newest one, and is set to true by default from what I know)

Thank you, DFFlagUseVisBugChecks works beautifully on mobile (confirmed that it’s not enabled by default yet).

Seeing some perf improvements overall, even with the unfinished release.

Sorry, what? You can use FFlags on mobile???

Yes, you can either edit the APK or use root.