Creating an ultra-high performance open-source map framework: Step-by-step dev-log

Why post this here?

In this series of posts, I’m going to be explaining and sharing my process of 1. going through the test process with a practical real-world example, 2. setting up an open-source module, and finally 3. creating and finishing an open-source module.

Abstract goals

  • Provide proof that StreamingEnabled isn’t required for enormous maps
  • Create a framework that allows for huge maps to exist without StreamingEnabled.
  • Open-source this framework

The end-test

  • Test must not use more than 3 gigabytes of memory
  • Must run at 60fps minimum, 144fps optional
  • Must not murder visual quality
  • Must be at least 20K x 20K studs large
  • Must have a fully detailed map- interior and all

Identifying the issues

By stress-testing Roblox’s systems in the first place, we can figure out what areas have the most impact, and alleviate these issues. This test will also help me define what the module needs to do- we do this because we don’t want to be solving a problem that doesn’t exist, as that’s obviously counter-productive.

I grabbed a few free-model buildings off of the toolbox, optimized them, and then slapped an interior onto each. My end result was 2 skyscrapers and 2 homes, each decked out with a very basic interior. I made sure that CastShadow was off for anything not noticeable, and I disabled collisions on some parts.

Pictures



Great! Now we just need to script it. Just a crude, but simple test-script.

Test code
-- Template models and the path segment, all expected to sit in workspace
local Home = workspace:WaitForChild("Home")
local HomeLarge = workspace:WaitForChild("HomeLarge")
local Skyscraper = workspace:WaitForChild("Skyscraper")
local BiggerSkyscraper = workspace:WaitForChild("BiggerSkyscraper")
local Path = workspace:WaitForChild("Path")

-- 41 x 41 grid of plots, spaced 88 studs apart
for x=-20,20 do
	for y=-20,20 do
		-- Roughly half the plots get a large home; of the rest, one in four becomes a skyscraper
		local isLarge = (math.random(1,2) == 2)
		local isSkyscraper = (math.random(1,4) == 4 and not isLarge)

		local clone
		if isLarge then
			clone = HomeLarge:Clone()
		elseif isSkyscraper then
			if math.random(1,2) == 1 then
				clone = Skyscraper:Clone()
			else
				clone = BiggerSkyscraper:Clone()
			end
		else
			clone = Home:Clone()
		end

		local PathClone = Path:Clone()
		PathClone.CFrame = CFrame.new(x*88, 0, y*88)

		-- Small random height offset so the buildings don't all sit level
		clone:SetPrimaryPartCFrame(CFrame.new(x*88, math.random(-2,2)*2, y*88))

		clone.Parent = workspace
		PathClone.Parent = workspace
	end
	task.wait() -- yield once per row so the script doesn't time out
end
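
For a sense of scale, the grid above can be checked against the 20K x 20K stud target with some quick arithmetic. This is only a rough sketch, assuming each plot takes up the full 88-stud spacing used in the script:

```lua
-- Scale check: stress-test grid versus the end-test target.
local SPACING = 88          -- studs between plots, from the test script
local PLOTS_PER_SIDE = 41   -- x and y both run from -20 to 20 inclusive
local TARGET_SIDE = 20000   -- studs, from the end-test requirements

local testSide = SPACING * PLOTS_PER_SIDE          -- 3608 studs
local plotCount = PLOTS_PER_SIDE * PLOTS_PER_SIDE  -- 1681 buildings
local areaRatio = (TARGET_SIDE * TARGET_SIDE) / (testSide * testSide)

print(string.format("test map: %d x %d studs, %d plots", testSide, testSide, plotCount))
print(string.format("target area is about %.1fx the test area", areaRatio))
```

In other words, the end-test covers roughly 30 times the area of this stress test, so the numbers measured here are a lower bound on what the module has to handle.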

Now, what do we want to measure? That’s a critical piece of information- we need hard numbers we can crack down on before continuing. Just “frames per second” won’t cut it, and neither will “memory usage”. We need memory categories, profile labels, milliseconds taken. Something to say “this is why you’re having issues, this is how you solve it, and this is a bundled package which solves it for you”. Luckily, we have the microprofiler:

Microprofiler logs


image

The microprofiler showed some pretty obvious results. More parts, more materials, more time used up. We can solve this by loading parts in and out- we won’t be settling on implementation details because we’re not that far ahead in development yet. However, yet again, we got some hard targets to decrease. Granted, they might’ve been obvious, but it’s still better practice so we can measure and compare the effectiveness of our module. It’s super easy to imagine a 10% boost as a 30% boost- we want to remain objective.
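
To keep later comparisons objective, the work being measured can be wrapped in its own named microprofiler label. Here is a minimal sketch, assuming Roblox’s `debug.profilebegin` and `debug.profileend`; the `profiled` helper and the label name are made up for illustration, and the `os.clock` timing is only a rough cross-check next to the microprofiler itself:

```lua
-- Runs fn(...) inside a named microprofiler label and also times it.
-- Returns fn's first result and the elapsed time in milliseconds.
-- Note: if fn errors, the label is never closed; this sketch skips that case.
local function profiled(label, fn, ...)
	local hasProfiler = debug.profilebegin ~= nil -- only present inside Roblox
	if hasProfiler then
		debug.profilebegin(label)
	end
	local start = os.clock()
	local result = fn(...)
	local elapsedMs = (os.clock() - start) * 1000
	if hasProfiler then
		debug.profileend()
	end
	return result, elapsedMs
end

-- Usage: time a small piece of work under a hypothetical label.
local sum, ms = profiled("MapTest_Sum", function(a, b)
	return a + b
end, 2, 3)
print(sum, string.format("%.3f ms", ms))
```

Recording the same label before and after a change gives two numbers to compare directly, instead of eyeballing the frame rate.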

However, there was also the issue of Roblox using almost 3 gigs of RAM, so we should also investigate memory usage.
image

Memory usage




Looks like a lot of our memory usage is being eaten up by physics collisions. That’s a target. We can cut that down with various optimization methods later down the line- something to say “This module reduces memory usage by over 800 megabytes in this test”.

The other memory category (GraphicsParts) isn’t something our module can cut down on, unfortunately. I believe that it has to do with the actual parts and how the user creates them- not something we want to, or can, change.
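
These categories can also be read from a script, which makes before/after comparisons repeatable. The sketch below assumes the `Stats` service’s `GetMemoryUsageMbForTag` and the `PhysicsCollision` and `GraphicsParts` entries of `Enum.DeveloperMemoryTag`; the helper names `takeSnapshot` and `memorySaved` are mine, not part of any API:

```lua
-- Memory categories to track (names from Enum.DeveloperMemoryTag).
local TRACKED_TAGS = { "PhysicsCollision", "GraphicsParts" }

-- Roblox-only: reads the current usage in MB for each tracked category.
local function takeSnapshot()
	local Stats = game:GetService("Stats")
	local snapshot = {}
	for _, tagName in ipairs(TRACKED_TAGS) do
		snapshot[tagName] = Stats:GetMemoryUsageMbForTag(Enum.DeveloperMemoryTag[tagName])
	end
	return snapshot
end

-- Pure helper: MB saved per category (positive means less memory afterwards).
-- Categories missing from `after` are skipped.
local function memorySaved(before, after)
	local saved = {}
	for tagName, mbBefore in pairs(before) do
		if after[tagName] ~= nil then
			saved[tagName] = mbBefore - after[tagName]
		end
	end
	return saved
end

-- Usage inside Roblox:
--   local before = takeSnapshot()
--   -- ...apply an optimization...
--   local saved = memorySaved(before, takeSnapshot())
--   print(saved.PhysicsCollision, saved.GraphicsParts)
```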

As per the final screenshot- I think that can be targeted as well. Optimizing parts by turning off CastShadow based on size, for example. That’s something this hypothetical module can target, and optimize. Again, a hard number that we can optimize.
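
A size-based CastShadow pass could look something like this. It’s a sketch only: the `disableSmallShadows` name and the 4-stud threshold in the example are assumptions, and the right cut-off would have to come from testing against the visual-quality goal:

```lua
-- Turns off CastShadow on every BasePart under `root` whose largest
-- dimension is below `minExtent` studs. Returns how many parts changed.
local function disableSmallShadows(root, minExtent)
	local changed = 0
	for _, instance in ipairs(root:GetDescendants()) do
		if instance:IsA("BasePart") and instance.CastShadow then
			local size = instance.Size
			if math.max(size.X, size.Y, size.Z) < minExtent then
				instance.CastShadow = false
				changed = changed + 1
			end
		end
	end
	return changed
end

-- Example (inside Roblox): print(disableSmallShadows(workspace, 4))
```

The returned count doubles as one of those hard numbers: it says exactly how many parts the pass touched.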

Conclusion

We’ve defined goals for the module, and we’ve defined an end-test to do later. This can be changed to be more realistic if it proves too difficult or impossible- however that should not be done lightly, as changing your goals to be lesser when they prove difficult to reach is bad.

Aside from that, we’ve successfully run stress tests, and we have the recorded results from them- something to look back on later. We can reasonably conclude that there is an issue, that it is solvable by code we can create, and that we know where to hit: physics collisions and part count.

endnote: why is the roblox corescript freecam taking 2.7ms…
image

18 Likes

I think this would fit better in #help-and-feedback:creations-feedback (where did cool-creations go?). This isn’t really a tutorial right now; most of it looks more like a devlog of an optimization module. The title seems too far ahead of what’s actually in the post IMO.

2 Likes

This is an interesting topic that I would like to participate in.

I have thought of a few theoretical optimization methods (a better word may be algorithms?), so let’s begin!

Assets Referencing

“Assets Referencing” is just a term that I made up to describe a certain instance streaming method: having a set of “reference” parts in the workspace that each reference an asset ID (ClassicHouse for example). It would be used for static assets (or dynamic ones, although that requires more setup).

This technique offers a great networking performance boost because there is no “streaming” involved. In fact, all those asset IDs refer to a static instance that is stored in, for example, ReplicatedStorage, which in turn results in almost no networking overhead or latency when showing the instance.
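
To make the idea a bit more concrete, here is a rough client-side sketch. Everything about the convention is hypothetical: I’m assuming each reference part carries an `AssetId` string attribute, and that the templates live in some folder the client can already see; `resolveReference` is a made-up helper name:

```lua
-- Hypothetical convention: each reference part has an "AssetId" string
-- attribute naming a template stored under `assetFolder`.
-- Returns the placed clone, or nil if the asset can't be found.
local function resolveReference(referencePart, assetFolder)
	local assetId = referencePart:GetAttribute("AssetId")
	local template = assetId and assetFolder:FindFirstChild(assetId)
	if not template then
		return nil -- unknown or missing asset: leave the reference alone
	end
	local clone = template:Clone()
	clone:PivotTo(referencePart.CFrame) -- place the copy where the reference sits
	clone.Parent = referencePart.Parent
	return clone
end

-- Example (LocalScript, with a hypothetical "Assets" folder):
--   local assets = game:GetService("ReplicatedStorage"):WaitForChild("Assets")
--   resolveReference(workspace.References.House1, assets)
```

Only the lightweight reference parts need to replicate per building; the heavy model is cloned locally from a template the client already has.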

This of course results in a more complex workflow when used with dynamic assets. For example, if a player chops a “tree” using CSG, that change has to be sent to every player who is currently close to the reference part of said tree (with server validation, of course).

However, this also creates a problem: it is incredibly hard to make these assets interact with server-owned instances. How can instances that practically only exist on clients that are close to their reference parts communicate and interact with server-owned objects (NPCs, for example)?

Do we have to shape the reference part as the collision of the asset it is referring to? Or do some voodoo dark magic that somehow creates the assets on the server without them replicating to all clients?

Collision Disabling

This is another term that I just made up, for dynamic collision disabling (which was mentioned in your post). The technique disables the physics properties of parts that can’t be physically interacted with in the player’s current state.

This is useful, as it enables showing a lot of parts and models without them being physically there (which is in fact one of the goals that were mentioned in this post).

Unlike asset referencing, this is mostly fine with server-owned objects, as they don’t care about the physical status of parts that are being modified on the respective clients.
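
A very small sketch of the client-side toggle, using distance as the stand-in for “can’t be interacted with”. The `updateCollisions` name and the radius are assumptions, and it assumes every part in the list is supposed to be collidable when near. Whether this actually lowers the PhysicsCollision memory figure is something that would need to be measured:

```lua
-- Enables collisions only for parts within `radius` studs of `origin`.
-- `parts` is a list of BaseParts; `origin` is anything with X, Y, Z fields
-- (a Vector3 works). Returns the number of parts left collidable.
local function updateCollisions(parts, origin, radius)
	local radiusSquared = radius * radius
	local active = 0
	for _, part in ipairs(parts) do
		local position = part.Position
		local dx = position.X - origin.X
		local dy = position.Y - origin.Y
		local dz = position.Z - origin.Z
		local near = dx * dx + dy * dy + dz * dz <= radiusSquared
		if part.CanCollide ~= near then
			part.CanCollide = near -- only write when the value changes
		end
		if near then
			active = active + 1
		end
	end
	return active
end

-- Example (LocalScript): updateCollisions(mapParts, character.PrimaryPart.Position, 128)
```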

Replication Chunks

Another important technique, used in practically every instance streaming framework, is “Replication Chunks”, which simply means having various chunks that hold certain objects, so that it is easier to dynamically stream instances in and out.
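
The bookkeeping for chunks is mostly integer math on positions. Here’s a minimal sketch, assuming square chunks on the XZ plane; the 256-stud chunk size and all of the function names are arbitrary choices for illustration:

```lua
local CHUNK_SIZE = 256 -- studs per chunk side; arbitrary value for illustration

-- Maps a world position to integer chunk coordinates on the XZ plane.
local function chunkCoords(x, z)
	return math.floor(x / CHUNK_SIZE), math.floor(z / CHUNK_SIZE)
end

-- Builds a string key so chunks can be stored in a plain table.
local function chunkKey(cx, cz)
	return string.format("%d,%d", cx, cz)
end

-- Groups objects (anything with Position.X and Position.Z) by chunk key.
local function buildChunks(objects)
	local chunks = {}
	for _, object in ipairs(objects) do
		local key = chunkKey(chunkCoords(object.Position.X, object.Position.Z))
		chunks[key] = chunks[key] or {}
		table.insert(chunks[key], object)
	end
	return chunks
end

-- Lists the keys of every chunk within `range` chunks of a position.
local function chunksInRange(x, z, range)
	local cx, cz = chunkCoords(x, z)
	local keys = {}
	for ox = -range, range do
		for oz = -range, range do
			table.insert(keys, chunkKey(cx + ox, cz + oz))
		end
	end
	return keys
end
```

Streaming then comes down to comparing the keys from `chunksInRange` for the player’s current position against the set that is already loaded, and loading or unloading the difference.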

Greedy Meshing

The previous sections were solutions for networking and physics (which are important aspects of games), so it only makes sense for me to include a graphical optimization (one that also affects physics!).

Greedy meshing is the act of merging multiple parts into a single part, which, if implemented as a utility function, would improve rendering, networking, and physics.

Of course, such a mechanism should be heavily optimized so that it is practical to use at run-time. Besides that, the algorithm should only merge parts where the merge would look logical to a human eye.
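
To show the core of the idea, here is a simplified 2D version that works on a grid of filled cells rather than on real parts. It assumes the parts have already been snapped to a uniform grid and share the same height and material; turning the resulting rectangles back into parts is left out. This is one simple greedy strategy, not an optimal one:

```lua
-- Greedy meshing on a 2D occupancy grid: grid[row][col] is true where a
-- 1x1 cell is filled. Returns a list of rectangles {row, col, width, height}
-- that together cover exactly the filled cells.
local function greedyMesh(grid)
	local rows = #grid
	local cols = rows > 0 and #grid[1] or 0
	local used = {}
	for r = 1, rows do
		used[r] = {}
	end

	local function isFree(r, c)
		return grid[r][c] == true and not used[r][c]
	end

	local rects = {}
	for r = 1, rows do
		for c = 1, cols do
			if isFree(r, c) then
				-- Grow to the right as far as possible.
				local width = 1
				while c + width <= cols and isFree(r, c + width) do
					width = width + 1
				end
				-- Grow downward while the whole next row segment is free.
				local height = 1
				local canGrow = true
				while canGrow and r + height <= rows do
					for k = c, c + width - 1 do
						if not isFree(r + height, k) then
							canGrow = false
							break
						end
					end
					if canGrow then
						height = height + 1
					end
				end
				-- Mark the covered cells so they are not merged twice.
				for rr = r, r + height - 1 do
					for cc = c, c + width - 1 do
						used[rr][cc] = true
					end
				end
				table.insert(rects, { row = r, col = c, width = width, height = height })
			end
		end
	end
	return rects
end
```

A fully filled 2x2 grid collapses into a single rectangle, while an L-shape of three cells becomes two. Fewer, larger rectangles mean fewer parts to render, replicate, and collide against.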


I don’t normally write big essays, but this was an interesting topic that I couldn’t resist talking about.

7 Likes

I thought about this, and I think that the community tutorials category was a closer fit. Although cool-creations would’ve fit it better, I do agree.

2 Likes

Sorry for the late reply! I’ve been super busy today. I haven’t had much time to lurk on the devforum, sadly enough.

Those are some really interesting ideas! I think it’s reasonable to say that something like a map framework- which ties so heavily into modeling-related development, needs a plugin on top of it. As per your first idea, which is cloning in assets- this is something I planned to do, but didn’t mention it for the sake of following the process and rules.

Collision disabling is something that’s definitely going to help- it will pretty much be required. You can put server-sided objects in the server’s Camera object to have server-only parts!

Greedy meshing is probably something that should be done in, e.g. a plugin, a pre-processor for example.

2 Likes

“Greedy meshing is the act of merging multiple parts into a single part, which, if implemented as a utility function, would improve rendering, networking, and physics.”

We do have the option to download as OBJ and then upload it as a single mesh… feels like this would be low-hanging fruit for Roblox to do.

1 Like

Roblox already has CSG. However, because of how dynamic unions are, they use more memory to allow for undo requests and other operations that meshes aren’t typically subjected to.

1 Like