Occlusion Culling Now Live in Roblox Client

I recall asking this at RDC when speaking to a rendering engineer, but I’ll repost it here for visibility.

Are there any plans to use occlusion culling on layered clothing, so that it hides covered body geometry instead of HSR (hidden surface removal)? HSR, while good for performance, puts some pretty problematic restrictions on what types of avatars you can make. If layered clothing used occlusion culling instead, HSR wouldn’t be needed!

Occlusion culling monitors its performance and tunes itself dynamically, independent of graphics settings. This is important for it to work well across vastly different hardware and scenes. In our metrics, we haven’t seen occlusion culling be a CPU bottleneck. If you do see that, we’d like to know about it!

The reason avatars and the like aren’t culled (yet) has nothing to do with the occlusion culling itself, but with different code paths in the engine needing to be hooked up. Software is often like that; 10% of the time is doing what you think the feature is, and 90% of the time is the glue code to connect it to the rest of the program!

It does this internally and automatically. We have considered developer hinting, but we wanted to get what we have out there first. This lets everybody start getting the benefits earlier, and it also lets devs discover what controls they need in practice and not just in theory. Once we know what controls might help and why, we can prioritize adding those hints against all the other things we can do to improve Roblox. Because we have limited time, we can’t do all the helpful things we want to do!

Originally it didn’t cull anything with a highlight controller. We updated it to only skip culling if there is an active highlight that ignores depth, which lets more stuff get culled in some places.
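The updated rule can be sketched as a small predicate. This is only an illustration in Python, not engine code: the `Highlight` class and `may_be_culled` function are hypothetical names, and the sketch assumes that “ignores depth” corresponds to the AlwaysOnTop depth mode and that “active” means the highlight is enabled.

```python
from dataclasses import dataclass
from enum import Enum


class DepthMode(Enum):
    ALWAYS_ON_TOP = "AlwaysOnTop"  # highlight is drawn through other geometry
    OCCLUDED = "Occluded"          # highlight is hidden behind other geometry


@dataclass
class Highlight:
    # Hypothetical stand-in for a highlight attached to an object.
    enabled: bool
    depth_mode: DepthMode


def may_be_culled(highlights: list[Highlight]) -> bool:
    """Return True if the object is still a candidate for occlusion culling.

    Assumed rule: only an enabled highlight that ignores depth keeps the
    object out of culling; merely having a highlight does not.
    """
    return not any(
        h.enabled and h.depth_mode is DepthMode.ALWAYS_ON_TOP
        for h in highlights
    )
```

Under these assumptions, an object whose only highlight uses the Occluded depth mode, or whose highlight is disabled, stays eligible for culling.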

Occlusion culling does not work at the granularity needed to become a replacement for hidden surface removal on layered clothing.

Thanks for answering, but I have to say it’s not quite the answer I was expecting. I wasn’t specifically asking if occlusion culling can cause a CPU bottleneck (I believe this is pretty obvious); I was asking whether occlusion culling could result in performance losses in CPU-bottlenecked places/experiences. In those scenarios, what exactly should we expect out of occlusion culling? As you’ve said, the system tunes itself dynamically to avoid causing performance issues, but how would it actually perform in the case of a real CPU bottleneck (caused by other parts of an experience)? Would it simply do nothing in that case? (Logically I assume it would have to do nothing, or very little, in order to not add to the already high CPU load, no?) Or would it still try to do something? (Potentially adding to the CPU work of an already CPU-bottlenecked place, which COULD reduce framerates while not providing any benefits.)

And in the case of low-end, weak hardware, how efficient is it actually? I have seen the couple of examples of this running on mobile that you provided, but I have no clue what kind of phones were used for those examples, and I also question how the performance of the whole system scales in them. What I’m curious about is the actual trade-off between the CPU and GPU on those sorts of weak and/or old machines and phones in order to provide higher framerates. Since the CPUs in those devices are usually pretty weak overall, I expect occlusion culling to leave a much larger footprint in the performance graphs, so I’d like to know exactly how much the CPU loses compared to what the GPU gains. Would it be possible for occlusion culling to simply not provide any significant or noticeable gains because a weak CPU (not a CPU bottleneck caused by other systems within a place) is holding it back? Would it be possible for occlusion culling to not be doing anything at all due to the already high load on the CPU?

I do apologize for these 50 million questions, but I believe these are questions most developers should know the answers to. I want to know exactly what I can do to let it run at 100% capacity and provide decent GPU performance gains without risking losses on the CPU, which COULD diminish the GPU gains by quite a large margin in specific scenarios. I really don’t want to take these sorts of things for granted and expect them to magically work all day, every day, under any and all possible conditions.
Plus, we can’t really measure the numbers properly ourselves, since there is no on/off toggle for occlusion culling that would let us run even an initial general test of its performance.

I’ve been trying to test the system for a bit now. I have found that it CAN actually slightly reduce overall performance in CPU-bottlenecked places. The quick testing I have done so far isn’t exactly amazing, but I’ll try to create a better test case soon for more consistent results. So far I have tested on an i7-14700K and an i7-4770 and found losses of roughly 5 to 10 FPS. The highest time I have seen occlusion culling take is around 9.6 milliseconds on the 4770. I’m predicting that occlusion culling may cause some substantial performance losses in more demanding scenes, but who knows; I’ll see when I finally do a more proper test.

Managed to enable and disable occlusion culling via its fflags btw.

this is a game changer, my pc isn’t the greatest so this definitely helps, hope this is added on mobile devices soon

@ProgramDude are there any internal performance benchmarks you or the team would be willing to share based on your testing? I think many would be interested in the performance gains across different experiences on the platform, broken down by device, etc.

what’s the fflag for it? i wanna run some tests

FFlagEnableVisBugChecks27 and DFFlagUseVisBugChecks

Ever since occlusion culling was added, I’ve noticed shadow flickering and geometry flickering in various games. The flickers are fairly rare and somewhat inconsistent, but they happen.

This was an extremely quick release compared to what I expected. I’m impressed; this will most definitely allow Roblox experiences to be not only bigger but also more detailed. The showcase I’m working on will likely be able to work on mobile entirely due to this update.

Here are most of the FFlags I’ve found related to occlusion culling. You’ll have to validate most of these yourself as this was posted before it was released to the client and may contain inaccurate information. (that and I’m too lazy to edit it)

https://devforum.roblox.com/t/are-you-using-occlusion-culling-and-what-do-you-think-of-it/3255027/10?u=timefrenzied

If you don’t know what you’re doing, just stick to DFFlagUseVisBugChecks. (FFlagEnableVisBugChecks27 no longer works, 28 seems to be the newest one, and is set to true by default from what I know)

thank you, DFFlagUseVisBugChecks works beautifully on mobile (confirmed that it’s not enabled by default yet).

seeing some perf improvements overall even with the unfinished release.

sorry what, you can use FFlags on mobile???

yes, you can either edit the apk or use root.

The highlights I was using were the “Occluded” depth mode. Should they be excluded from culling or should they have disappeared along with the other parts?

Occlusion culling runs concurrently on a separate thread. The simple answer is that we do more work on culling if we’re finishing before the rendering code needs visibility results, or if we are heavily GPU bound. We do less work if the opposite is true. There’s a buffer zone and a gradual transition so that we don’t keep ping-ponging and we don’t get tricked by short spikes.

This is designed to avoid the very problem you’re concerned about, and to do it automatically for any scene on any device, and to do it by degrading gracefully as needed. It will also dynamically adjust to changes in thermal throttling on a device, since that is no different to it than the content suddenly becoming more or less expensive.
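The behaviour described above (more culling work when results arrive early or the frame is GPU bound, less otherwise, with a buffer zone and a gradual transition) can be sketched as a small feedback controller. This is a minimal illustration in Python, not the engine’s implementation: the class name, the `slack_ms` and `gpu_bound` inputs, and every constant are assumptions chosen for the example.

```python
from dataclasses import dataclass, field


@dataclass
class CullingEffortController:
    # All names and constants here are hypothetical.
    effort: float = 0.5        # fraction of full culling work, 0.0 to 1.0
    smoothing: float = 0.2     # low-pass factor so short spikes are damped
    dead_zone_ms: float = 0.5  # buffer zone in which effort is left alone
    step: float = 0.05         # small per-frame change for a gradual transition
    _slack_ms: float = field(default=0.0, init=False, repr=False)

    def update(self, slack_ms: float, gpu_bound: bool) -> float:
        """Adjust effort from one frame's measurements.

        slack_ms: time by which culling finished before the renderer
                  needed visibility results (negative means it was late).
        gpu_bound: whether the frame is heavily limited by the GPU.
        """
        # Smooth the measurement so a single spike cannot flip the decision.
        self._slack_ms += self.smoothing * (slack_ms - self._slack_ms)

        if gpu_bound or self._slack_ms > self.dead_zone_ms:
            # Spare CPU time, or the GPU is the bottleneck: cull more.
            self.effort = min(1.0, self.effort + self.step)
        elif self._slack_ms < -self.dead_zone_ms:
            # Culling is consistently late: back off gracefully.
            self.effort = max(0.0, self.effort - self.step)
        # Inside the buffer zone nothing changes, which avoids ping-ponging.
        return self.effort
```

In this sketch a single late frame leaves the effort untouched, while a sustained run of late frames walks it down step by step. Thermal throttling would simply show up as the slack shrinking over time, so it needs no special case.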

We empathize with experienced devs who want full control to fix problems with automatic solutions. My personal belief from my own professional experience (not a Roblox official position) is that hinting works best when devs know stuff the engine can’t, and automatic solutions work best when the engine knows stuff that devs can’t. In this case, devs can know more about what would be good occluders, but the engine knows more about where time is going and how much time we have to work with. So if we do find we need to provide performance hinting, it will probably be hinting about what occluders are good, while leaving the automatic effort adjustment in the engine.

If you want to help occlusion culling be more efficient, use big occluders with simple geometry. Block parts are ideal. Low-poly mesh parts also work well. You can split your meshes into a “low detail” mesh that is used for both rendering and occlusion, and a “high detail” mesh that adds all the cool details just for rendering. As an example, you can use a block part for a door and a mesh part for the doorknob and hinges, instead of doing the entire door as a single mesh part. The engine will automatically detect that the high-detail mesh is a bad occluder.
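To see why splitting the door helps, here is a toy scoring heuristic in Python. It is purely illustrative and says nothing about how the engine actually rates occluders: the `occluder_score` function, the sizes in studs, and the triangle counts are all made-up assumptions. It only captures the advice above, that large area with few triangles is good and small or dense geometry is bad.

```python
def occluder_score(size: tuple[float, float, float], triangle_count: int) -> float:
    """Toy heuristic: largest face area of the bounding box per triangle.

    Higher means more screen coverage for less occlusion-test work.
    This is a hypothetical measure, not the engine's.
    """
    x, y, z = size
    largest_face = max(x * y, y * z, x * z)
    return largest_face / triangle_count


# Hypothetical door built three ways (sizes in studs, triangle counts invented).
block_door = occluder_score((4.0, 7.0, 0.2), 12)        # plain block part
doorknob_mesh = occluder_score((0.3, 0.3, 0.3), 800)    # small, detailed mesh
single_mesh_door = occluder_score((4.0, 7.0, 0.5), 812) # door and details in one mesh
```

With these invented numbers the block part scores far higher than either mesh, which matches the suggestion to keep the large, simple shape separate from the fine detail.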

And yes, we recognize that this suggestion is an argument for devs to have occluder mesh hints. We agree that there are many good arguments for occluder hinting, and we don’t want to shut the door on hints coming someday. At the same time, it is important for there to be a good “no effort required” solution for existing places and novice devs. Once we see where the “no effort” solution falls short, we have a better idea of what hints are actually needed – or (less likely) that hints aren’t needed after all. But even if we know something would make Roblox better, that doesn’t mean we have time to do it! I’m sure you as devs are all aware that you have more ideas on how to improve your products than you have time to actually implement.

Here is a more extreme test I did on a very low end PC. (by today’s standards) :rofl:

Windows 7 PC Specs (with the amazing GeForce 8400GS !!):

No Occlusion Culling (1.6 FPS, oh yeah!!):

Occlusion Culling Helping (4.8 FPS, now we are gaming!! :smile_cat:):

FPS aside, the big help is reducing the tris count. Also, there’s a whole other floor underneath, so it basically helps to cut down the tris count by a loooooot…

I don’t know how it’s even possible to make a worse PC after seeing that FPS

i think you gained most of your fps by going from 21 graphics to 6 ngl

another comparison with the same graphics incoming?? :eyes:
