Optimize Smooth Terrain Better

To repeat what I said, I don’t think this is the best smooth terrain can offer. As has been mentioned before in the thread, the biggest omission has been occlusion culling IMO - this is something that has been easy to bake for a long time (which is what “any other engine from 2006” does - think portal visibility, BSP splits, etc…), but very very hard to make dynamic (it’s only in recent times that we’ve seen approaches like hierarchical Z-buffer or software raster become viable).
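For anyone unfamiliar with the hierarchical Z-buffer idea mentioned above, here is a minimal CPU-side sketch of it. This is purely illustrative and not how Roblox (or any particular engine) implements it; the function names `build_hiz` and `is_occluded`, the "larger depth means farther away" convention, and the tiny 4x4 depth buffer are all assumptions made for the example.

```python
def build_hiz(depth):
    """Build a max-depth mip chain from a square, power-of-two depth buffer.

    Each coarser texel stores the FARTHEST depth of the 2x2 texels below it,
    which keeps the occlusion test conservative.
    """
    levels = [depth]
    while len(levels[-1]) > 1:
        prev = levels[-1]
        half = len(prev) // 2
        levels.append([
            [max(prev[2 * y][2 * x], prev[2 * y][2 * x + 1],
                 prev[2 * y + 1][2 * x], prev[2 * y + 1][2 * x + 1])
             for x in range(half)]
            for y in range(half)
        ])
    return levels


def is_occluded(levels, x0, y0, x1, y1, nearest_depth):
    """Conservatively test a screen rect (inclusive pixel bounds).

    Returns True only if the object's nearest point is behind everything
    already drawn in the region it covers.
    """
    level = 0
    # Climb the mip chain until the rect spans at most 2x2 texels.
    while level < len(levels) - 1 and (
        (x1 >> level) - (x0 >> level) > 1 or (y1 >> level) - (y0 >> level) > 1
    ):
        level += 1
    mip = levels[level]
    farthest = max(
        mip[y][x]
        for y in range(y0 >> level, (y1 >> level) + 1)
        for x in range(x0 >> level, (x1 >> level) + 1)
    )
    return nearest_depth > farthest


# Hypothetical scene: a wall at depth 0.2 fills the left half, sky (1.0) the right.
depth_buffer = [[0.2, 0.2, 1.0, 1.0] for _ in range(4)]
levels = build_hiz(depth_buffer)

behind_wall = is_occluded(levels, 0, 0, 1, 1, 0.5)   # fully behind the wall
straddling = is_occluded(levels, 1, 0, 2, 1, 0.5)    # pokes out into the sky
```

The hard part in practice is not this test but producing a usable depth pyramid every frame for geometry that changes at runtime, which is why the dynamic case took so long to become viable.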

I’m not so confident. For cubes, it’s a pretty tractable problem because all the faces are planar and so merging them does not change the shape of the surface. This isn’t true for smooth terrain; you’d need to start considering angle and curvature, at which point you’re in dynamic decimation territory.
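To show why the cube case is tractable, here is a rough sketch of the simplest form of greedy meshing on one 2D slice of coplanar faces. It is a naive version for illustration (not the binary variant, and not anything Roblox uses); `greedy_mesh` and `face_mask` are hypothetical names. The merge is only valid because every cell in the slice lies on the same plane, which is exactly the property smooth terrain lacks.

```python
def greedy_mesh(mask):
    """Merge filled cells of a 2D boolean grid into axis-aligned rectangles.

    Returns a list of (row, col, height, width) tuples.
    """
    rows, cols = len(mask), len(mask[0])
    used = [[False] * cols for _ in range(rows)]
    quads = []
    for r in range(rows):
        for c in range(cols):
            if not mask[r][c] or used[r][c]:
                continue
            # Grow the rectangle to the right as far as possible.
            w = 1
            while c + w < cols and mask[r][c + w] and not used[r][c + w]:
                w += 1
            # Grow it downward while the whole next row segment is free.
            h = 1
            while r + h < rows and all(
                mask[r + h][c + i] and not used[r + h][c + i] for i in range(w)
            ):
                h += 1
            # Mark the covered cells so they are not emitted again.
            for dr in range(h):
                for dc in range(w):
                    used[r + dr][c + dc] = True
            quads.append((r, c, h, w))
    return quads


# Hypothetical slice: 10 exposed unit faces that all lie on the same plane.
face_mask = [
    [True, True, True, False],
    [True, True, True, False],
    [True, True, True, True],
]
quads = greedy_mesh(face_mask)
```

Here the ten unit faces collapse into two rectangles. For a smooth surface there is no equivalent "same plane" check, so you would have to bound the error introduced by each merge instead.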

Also - Minecraft does not use greedy meshing; their block model rendering approach is not compatible with it. And again - I have no reason to believe triangle count is the problem here; the topology is sane, the triangles have good area, and they are far from the size where the 2x2 pixel quads the GPU shades in would introduce steep overdraw costs due to derivative calculations.

As a further case study, I was actually contemplating getting rid of the greedy mesher I implemented in Daydream - my own voxel renderer - because it was eating up too much CPU time. That’s in Rust, one of the fastest languages around, and I was using one of the most advanced binary greedy meshing techniques around. None of these greedy meshing optimisations are a given - in most cases, they just make the wireframe view look pretty, not the profiler.


One thing I really should’ve talked about in my initial reply to this topic was draw calls. I omitted it because there are a lot of complex specifics here and I didn’t want to make a massive wall of text. The thing about draw calls is that many people assume they all cost the same on the CPU. This is absolutely not the case.

The main thing that makes a draw call slow on the CPU is state switching. I’ll try not to bore you to death with details; the general takeaway is that you want to avoid changes to render state wherever you can, particularly on mobile devices. Usually you want the draw call count to be as low as possible, but for a terrain system, it gets messy fast.
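As a toy illustration of why two draw lists of the same length can cost very different amounts, here is a small sketch that counts state changes for a submission order. The `(shader, texture)` pairs and the function name `count_state_changes` are made up for the example; real render state has many more components, and this says nothing about how Roblox actually orders its draws.

```python
def count_state_changes(draws):
    """Count how often the render state differs from the previous draw."""
    changes = 0
    current = None
    for state in draws:
        if state != current:
            changes += 1
            current = state
    return changes


# Hypothetical frame: six draws, each tagged with a (shader, texture) state.
submitted = [
    ("terrain", "atlas"),
    ("prop", "brick"),
    ("terrain", "atlas"),
    ("prop", "wood"),
    ("terrain", "atlas"),
    ("prop", "brick"),
]

interleaved_changes = count_state_changes(submitted)
# Grouping draws that share state is a common way to cut the switches down.
grouped_changes = count_state_changes(sorted(submitted))
```

Both orders issue six draws, but the interleaved one switches state on every call while the grouped one switches only three times. Draws that all share one state, like atlas-textured terrain chunks, sit at the cheap end of that range.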

Draw calls in Roblox terrain are handled in chunks. The closest chunks are ~64 cubic studs, but further away ones are larger due to LODs (so they take fewer draw calls, up to a point). Materials don’t matter here (except for water), since all terrain textures are stored on texture atlases. Doing things this way avoids state switching, so the cost of these calls is much lower, relatively speaking, than it would be for rendering a ton of unique models. This also happens to be the way that most games handle terrain. The cost of each of these draw calls is so small that it’s worth it when you consider the benefits to occlusion/frustum culling as well as the performance of mesh generation, so basically every voxel-based terrain system has a rendering architecture like this.

Can it be better? Well, occlusion culling helps a lot, but outside of that there’s really not much you can do. You can’t instance terrain drawing because the geometry of each chunk is different. With that in mind, a potential idea might be to generate chunks that aren’t a fixed size, but rather have dynamic sizes based on some heuristics. For flat terrain you could make chunks much larger, since they’re less likely to be occluded by the surrounding terrain, thus reducing draw calls. This would be a huge amount of work to implement and there are a lot of pitfalls with it, mainly mesh generation speed and LOD behavior. Even in a best case scenario, you’re still going to have a lot of draws.

Given that terrain draws really just don’t take much time, relatively speaking, and that Roblox is GPU limited on most devices rather than CPU limited, it would be more worthwhile to focus on occlusion culling + LOD improvements, both of which work as GPU and CPU optimizations. There are definitely some improvements to be made to both of those systems, but this post is already getting into essay territory, so I’ll leave it here for now.

I would be interested to see how you came to that conclusion specifically, because Smooth Terrain itself is not going to cause your FPS to drop from 120 to 20 unless you go from looking at literally nothing to looking at terrain on a low end device at max graphics, in which case there’s not really anything that can be done. Even on my machine, which is a bit outdated at this point, 600 terrain draws and 450k triangles take about 0.8ms on the CPU and 1.2ms on the GPU.
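To put those numbers in per-draw terms, here is a quick back-of-the-envelope calculation. It uses only the figures quoted above; the linear extrapolation to 1500 draws (the count that comes up later in the thread) is an assumption, since real cost depends on the hardware and on what each draw does.

```python
# Measured figures quoted in the post above.
draws = 600
triangles = 450_000
cpu_ms = 0.8
gpu_ms = 1.2

# Average cost per unit of work.
cpu_us_per_draw = cpu_ms * 1000 / draws
gpu_ns_per_triangle = gpu_ms * 1_000_000 / triangles

# ASSUMPTION: CPU cost scales linearly with draw count on the same machine.
hypothetical_draws = 1500
estimated_cpu_ms = hypothetical_draws * cpu_ms / draws

# Share of a 60 FPS frame budget that estimate would take.
frame_budget_ms = 1000 / 60
budget_fraction = estimated_cpu_ms / frame_budget_ms

print(f"{cpu_us_per_draw:.2f} us per draw on the CPU")
print(f"{gpu_ns_per_triangle:.2f} ns per triangle on the GPU")
print(f"~{estimated_cpu_ms:.1f} ms CPU for {hypothetical_draws} draws "
      f"({budget_fraction:.0%} of a 60 FPS frame)")
```

That works out to a bit over a microsecond per terrain draw, and under that linear assumption 1500 draws would land at around 2 ms of CPU time on the same machine.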

when i saw this my brain was like
it cant be that ba-
AH HELL NAH

1500 draw calls. Ouch.
