A comparison of client memory and CPU/GPU time with workspace and ViewportFrame

manddilu · July 31, 2022, 2:03am

TL;DR: scroll down to the CONCLUSION section at the bottom, where all of the results are summed up.

INTRODUCTION

Recently, I ran into a rendering issue with ViewportFrames while trying to view parts as pixels (1 part = 1 pixel). I started this topic and @SquarePapyrus12 let me know that ViewportFrames have a 1024x1024 render limit and that this might be the problem. Sure enough, it was.

I knew that workspace generally uses more client memory per part, which is why I used a ViewportFrame from the start. With this new tradeoff in mind, I set out to compare client memory usage and average CPU/GPU time when working with workspace versus a ViewportFrame. The results of that comparison follow.

A word of caution

These results are true for my particular setup - a very, very minimal setup. I make no guarantee that the results would be similar with complex parts or advanced Lighting settings. My intention is to give you a starting point when considering your options, and you should do additional measurements for your intended setup.

Additionally, it seems to me that there is a small memory leak somewhere in my code, so the results might be slightly off, but should still be fairly accurate.

THE COMPARISON

In both cases I let the memory settle at around 380Mb before adding any parts. The following results are the differences between total client memory usage and those 380Mb.

workspace (parts on client)

10k parts ~35Mb
20k parts ~63Mb
100k parts ~322Mb
200k parts ~650Mb

ViewportFrame

10k parts ~22Mb
20k parts ~43Mb
100k parts ~212Mb
200k parts ~422Mb

Conclusion

workspace: ~0.0033Mb/part at 200,000 parts
ViewportFrame: ~0.0021Mb/part consistently

The results were not consistent across the board with workspace. This is how the ViewportFrame performed compared to workspace, per part count:

10k parts ~37% less memory used per part in the ViewportFrame
20k parts ~32% less…
100k parts ~34% less…
200k parts ~35% less…
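
For anyone who wants to re-check the arithmetic or plug in their own measurements, here is a minimal Python sketch that recomputes the Mb/part averages and the percentage savings from the figures listed above. The variable and function names are my own and purely illustrative; the numbers are the memory deltas above the ~380Mb baseline.

```python
# Memory deltas in Mb (total client memory minus the ~380Mb baseline),
# copied from the two lists above.
workspace = {10_000: 35, 20_000: 63, 100_000: 322, 200_000: 650}
viewport = {10_000: 22, 20_000: 43, 100_000: 212, 200_000: 422}


def mb_per_part(deltas):
    """Average memory per part for each part count."""
    return {parts: mb / parts for parts, mb in deltas.items()}


def percent_less(reference, other):
    """How much less memory `other` used than `reference`, in percent."""
    return {parts: (1 - other[parts] / reference[parts]) * 100 for parts in reference}


ws_per_part = mb_per_part(workspace)
vf_per_part = mb_per_part(viewport)
savings = percent_less(workspace, viewport)

for parts in workspace:
    print(
        f"{parts:>7} parts: workspace {ws_per_part[parts]:.4f} Mb/part, "
        f"ViewportFrame {vf_per_part[parts]:.4f} Mb/part, "
        f"{savings[parts]:.0f}% less in the ViewportFrame"
    )
```

The workspace average drifts between roughly 0.0032 and 0.0035 Mb/part, while the ViewportFrame stays close to 0.0021-0.0022 Mb/part, which is what "not consistent across the board" refers to.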

CPU/GPU time was about the same up to this point. However, I was using a 2000x100 part grid, so at most 136k parts were in view at any given time, and I hadn’t checked how moving the camera around affects the time. It does make a difference, which is addressed below.

MORE PARTS

I compared the results with 518.4k parts (720x720 grid) which is just below my memory limit with Studio and Chrome running in the background, assuming 0.0033Mb/part.

These were the results:

workspace ~2077Mb
ViewportFrame ~1500Mb

The ViewportFrame used ~28% less memory, and the Mb/part averages were almost identical to the ones I got at 200k parts.
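
A note on reading these two figures: the ~28% and the "almost identical" Mb/part averages only line up if 2077Mb and 1500Mb are total readings that still include the ~380Mb baseline. That is an assumption on the reader's side, not something stated in the post. A small Python sketch of the check, with illustrative names:

```python
BASELINE_MB = 380   # idle client memory reported earlier in the post
PARTS = 720 * 720   # 518,400 parts

# Assumption: both readings are totals, i.e. the baseline is still included.
workspace_total_mb = 2077
viewport_total_mb = 1500


def per_part(total_mb, baseline_mb=BASELINE_MB, parts=PARTS):
    """Mb per part after removing the baseline from a total reading."""
    return (total_mb - baseline_mb) / parts


ws = per_part(workspace_total_mb)
vf = per_part(viewport_total_mb)

# Savings computed on the totals (matches the ~28% quoted above).
less_on_totals = (1 - viewport_total_mb / workspace_total_mb) * 100

# Savings computed on the deltas (comparable to the 10k-200k results).
less_on_deltas = (
    1 - (viewport_total_mb - BASELINE_MB) / (workspace_total_mb - BASELINE_MB)
) * 100

print(f"workspace:     {ws:.4f} Mb/part")
print(f"ViewportFrame: {vf:.4f} Mb/part")
print(f"less on totals: {less_on_totals:.0f}%, less on deltas: {less_on_deltas:.0f}%")
```

Under that assumption the per-part averages come out at about 0.0033 and 0.0022 Mb/part, in line with the 200k results, and the saving measured on the deltas is about 34% rather than 28%.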

However, the CPU/GPU time averages weren’t close anymore. Assuming the same viewport size and the same number of parts in view (in a still state), here’s how my PC performed:

In a still state, the workspace average time was ~23.2ms and the ViewportFrame average time was ~16.7ms, which is ~28% less than the workspace average time.
With fast non-stop camera movement, the workspace average time was roughly 17ms and the ViewportFrame average time was roughly 29ms, which is ~71% more than the workspace average time.

So, moving the camera around fast resulted in a ~27% lower average time with workspace, and a ~74% higher average time with the ViewportFrame. The difference was very noticeable: moving around in workspace looked much smoother, even at lower speeds.
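
The four percentages above can be reproduced from the raw averages with a short Python sketch. The names are illustrative, and the frames-per-second conversion at the end assumes the reported CPU/GPU time can be treated as a per-frame time, which the post does not confirm.

```python
# Average CPU/GPU times in milliseconds at 518.4k parts, from the post.
times_ms = {
    "still": {"workspace": 23.2, "viewport": 16.7},
    "moving": {"workspace": 17.0, "viewport": 29.0},
}


def percent_change(old, new):
    """Signed change from `old` to `new`, in percent."""
    return (new - old) / old * 100


# ViewportFrame relative to workspace, per camera state.
vf_vs_ws_still = percent_change(times_ms["still"]["workspace"], times_ms["still"]["viewport"])
vf_vs_ws_moving = percent_change(times_ms["moving"]["workspace"], times_ms["moving"]["viewport"])

# Moving camera relative to still camera, per container.
ws_moving_vs_still = percent_change(times_ms["still"]["workspace"], times_ms["moving"]["workspace"])
vf_moving_vs_still = percent_change(times_ms["still"]["viewport"], times_ms["moving"]["viewport"])

print(f"ViewportFrame vs workspace, still:  {vf_vs_ws_still:+.0f}%")
print(f"ViewportFrame vs workspace, moving: {vf_vs_ws_moving:+.0f}%")
print(f"workspace, moving vs still:         {ws_moving_vs_still:+.0f}%")
print(f"ViewportFrame, moving vs still:     {vf_moving_vs_still:+.0f}%")

# Assumption: treating each average as a frame time gives a rough FPS figure.
for state, pair in times_ms.items():
    for container, ms in pair.items():
        print(f"{state:>6} {container:>9}: ~{1000 / ms:.0f} frames per second")
```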

CONCLUSION

With my setup:

workspace used ~38-59% more client memory per part compared to the ViewportFrame, depending on the number of parts
That is, the ViewportFrame used ~28-37% less memory compared to workspace
With fewer parts in view, and the camera not moving around, the average CPU/GPU time was about the same
With more parts in view, and the camera not moving around, workspace had a ~39% higher average CPU/GPU time compared to the ViewportFrame
That is, the ViewportFrame had a ~28% lower average CPU/GPU time compared to workspace
With more parts in view, and the camera moving around quickly and non-stop, workspace had a ~41% lower average CPU/GPU time compared to the ViewportFrame
That is, the ViewportFrame had a ~71% higher average CPU/GPU time compared to workspace
Moving around in workspace looked much smoother than moving around in the ViewportFrame, even at lower speeds
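
Several of the lines above state the same result twice, once as "X% more" and once as "Y% less". The two numbers differ because they use different reference values. A minimal Python sketch of the conversion, using the raw measurements from this post (the helper name is my own):

```python
def more_and_less(larger, smaller):
    """Return (percent by which `larger` exceeds `smaller`,
    percent by which `smaller` is below `larger`)."""
    more = (larger / smaller - 1) * 100
    less = (1 - smaller / larger) * 100
    return more, less


# (larger value, smaller value) pairs taken from the measurements in the post.
pairs = {
    "memory at 10k parts (Mb)": (35, 22),
    "memory at 518.4k parts (Mb)": (2077, 1500),
    "time, camera still (ms)": (23.2, 16.7),
    "time, camera moving (ms)": (29.0, 17.0),
}

results = {label: more_and_less(*values) for label, values in pairs.items()}

for label, (more, less) in results.items():
    print(f"{label}: {more:.0f}% more / {less:.0f}% less")
```

For example, 35Mb is ~59% more than 22Mb, while 22Mb is ~37% less than 35Mb: same pair of measurements, two ways of expressing the gap.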

The differences are significant, and we know that these are not all of them. As mentioned in the beginning, a ViewportFrame has a 1024x1024 texture render limit, which matters when drawing precision is required. It also has many other limitations related to rendering. It has its advantages, two of which are obvious from this comparison - at least with my setup.
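
As a rough illustration of why the render limit matters for the "1 part = 1 pixel" use case, here is a hypothetical Python check. It assumes the limit applies independently to width and height and that every part maps to exactly one pixel; real behaviour also depends on the frame's on-screen size and the camera, so treat it as a first sanity check only.

```python
RENDER_LIMIT = 1024  # per-axis texture render limit quoted in the post


def fits_one_part_per_pixel(columns, rows, limit=RENDER_LIMIT):
    """True if a columns x rows grid can map 1 part to 1 pixel
    without exceeding the render limit on either axis (assumed model)."""
    return columns <= limit and rows <= limit


def max_parts(limit=RENDER_LIMIT):
    """Upper bound on parts that can each get their own pixel."""
    return limit * limit


# Grid sizes mentioned in this post, used here purely as examples.
grids = {"2000x100": (2000, 100), "720x720": (720, 720)}

for name, (columns, rows) in grids.items():
    verdict = "fits" if fits_one_part_per_pixel(columns, rows) else "exceeds the limit"
    print(f"{name}: {verdict}")

print(f"at most {max_parts():,} parts at one part per pixel")
```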

Those are some of the things to consider when creating something that could be done both inside the workspace and inside a ViewportFrame. If you’re in that situation I hope that you have benefited from my post, and if you’re not, I hope that it was an interesting read.

See you around!

Abcreator · July 31, 2022, 2:46am

Typo?

Anyway, I may as well ask a few questions about your tests:

Were there any physics objects in use? Would adding a WorldModel to the ViewportFrame even out some of the really fast performance?
Did you subtract the overhead of the workspace rendering 0 parts? The ViewportFrame may have been running more efficiently, but the overhead of rendering the player character (if applicable), any other parts, or just the Workspace camera itself existing could have skewed its reading in your results. Could you try with double the ViewportFrames and subtract your current scores to get the raw ViewportFrame performance data…

Now for a few little extra notes:

You really shouldn’t ever be using a ViewportFrame as a substitute for Workspace. It becomes even more inefficient when parts move, and ViewportFrames were never designed for large part counts.
You get a worse picture quality, no shadows, neon and glass render at graphics level 1, etc.

Read the bottom of this Developer Hub article if you wish to learn more on the performance implications of using ViewportFrames: ViewportFrame GUI

manddilu · July 31, 2022, 9:38pm

Typo?

Yes, thanks for that, I edited it.

Were there any physics objects in use? Would adding a WorldModel to the ViewportFrame even out some of the really fast performance?

There were not, and I doubt that adding a WorldModel would change anything as it only allows raycasting and animations. I could test it at some point.

Did you subtract the overhead of the workspace rendering 0 parts…

As stated in the post, I subtracted an overhead of 380Mb in both cases, as in both cases the memory would rest around that point after a few minutes. The only difference between the tests was the visibility of the ViewportFrame and the parent of the parts. The player character wasn’t loaded in either case.

Now for a few little extra notes…

I agree that ViewportFrames shouldn’t generally be used as a substitute for workspace. As I mentioned, the results are relevant to situations in which someone is “creating something that could be done both inside the workspace and inside a ViewportFrame”. And while that might be very rare, I disagree that it’s never the case.

Specifically, the results are relevant for someone who is creating something where antialiasing, shadows, physics, neon and glass materials as you mentioned, post-processing effects, occlusion and a few other things don’t matter. Moreover, in some cases some of these things might even be harmful if present, which is definitely true for what I’m creating.

manddilu · August 1, 2022, 2:39pm

I doubt that adding a WorldModel would change anything

I stand corrected, adding a WorldModel used ~54% more memory with my setup.

Abcreator · August 1, 2022, 4:38pm

Yeah, this makes sense since WorldModel creates a whole copy of the physics system.