Client App, temporary frame spikes for some reason, especially after window focus-in, MicroProfiler unknown sleep

Channel: zwhitespacecontrol715

Description

image

Not sure, maybe Roblox altered some Fast Flag, but these don’t make sense at all.

The full bar that you see in the image above happens with no associated context in the MicroProfiler, at least as far as I can tell. It can happen in any game, but I have almost never seen it in a pure empty Baseplate.

In games where it usually does not happen, there’s a chance that it does happen after alt-tabbing out and back in. This is temporary, meaning it will stop. But why does it happen?

It is a little bit similar to this issue, “Docked windows that render Roblox UI can cause frame spikes - Event_::RenderViewUpdate, Perform, Present”, because alt-tabbing influenced that issue as well.

That issue and this one are both new; neither used to happen before.

Note: Normally, when alt-tabbing Roblox will always spike because the window isn’t focused, until you’re focused back in. The images above DO NOT show the spikes that happen while alt-tabbed. It shows after focusing back in, into the window. The spikes that happen above are caused by something else.

 

Reproduction Steps

I have attached logs in the private message.

  • Windows
  • Most likely depends on hardware.
  • You alt-tab in any game, such as DOORS
  • Then you alt tab back into the game

Note that DOORS by default doesn’t cause spikes, so when the method above causes them to happen, it doesn’t make sense.

Expected Result

I expected the MicroProfiler to give more clues about why these happen. And these shouldn’t happen in the first place; this didn’t use to be an issue.

Actual Result

When you focus back in, everything seems to appear fine. No spikes. However, a few seconds afterwards these spikes happen. It seems to happen way more in games that already spike by a certain amount of time.

And then for a short temporary amount of time, Roblox will just frame spike and then stop. These spikes are in a constant full bar and last for a few seconds.

A private message is associated with this bug report

1 Like

Hi @HealthyKarl!

Very perceptive! What you’re observing is an experiment we’re running as part of Harmony Compute Management. The idea behind this experiment is that smooth frame rates (improved P05, P10, P25, … P50 frame time (FT) and improved FT standard deviation) generally provide a better experience than erratic frame rates, even if that results in a slightly lower average FPS.
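To make the smoothness metrics concrete, here is a minimal sketch of how frame time percentiles and standard deviation could be computed from a list of frame time samples. The sample traces and the nearest-rank percentile method are assumptions chosen for illustration; the post does not say how Harmony computes these values.

```python
import math
import statistics

def percentile(samples, p):
    """Nearest-rank percentile: the smallest sample with at least p% of samples at or below it."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(len(ordered) * p / 100))
    return ordered[rank - 1]

def summarize(frame_times_ms):
    """Return the mean, population standard deviation and a few percentiles of frame time."""
    return {
        "mean": statistics.mean(frame_times_ms),
        "stdev": statistics.pstdev(frame_times_ms),
        "p05": percentile(frame_times_ms, 5),
        "p25": percentile(frame_times_ms, 25),
        "p50": percentile(frame_times_ms, 50),
    }

# Hypothetical traces: a steady 20 ms trace and a faster-on-average trace with one long frame.
smooth = [20.0] * 10
erratic = [16.7] * 9 + [45.0]

smooth_stats = summarize(smooth)
erratic_stats = summarize(erratic)
```

In this made-up data the erratic trace has a slightly better mean frame time but a much larger standard deviation, which is the trade-off the experiment is described as targeting.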

What’s going on?

Here’s a brief overview of what’s going on. NOTE: TaskScheduler attempts to run at 60 FPS / 60 Hz / 16.67 ms (FT) by default unless explicitly overridden by the Desktop menu settings:

Assume Roblox is running on a device that can’t achieve 60 FPS (16.67 ms FT); for illustration let’s say it’s running at 40 FPS (25.0 ms FT)

'-' = TaskScheduler work
'.' = TaskScheduler no work
0 ms                 25.0 ms            
|-------------------|-------------------|-------------------|
                    ^
                    TaskScheduler sees we took 25 ms to present frame
                    and immediately starts work on the next frame 

On the underlying CPUs, this means there’s work constantly being scheduled. Queuing theory states that as our CPUs (or, in the generalized term, “servers”) increase in utilization, we see exponential wait times.

Since the arrival of traffic to the CPUs is semi-random, e.g. alt-tabbing a window, huge explosion in game, lots of physics while tracking a target in a FPS game, etc., if the device’s CPUs are operating at higher utilization %'s (say 70% - 80%), you get exponentially longer waits on the CPU; aka exponentially longer frame time spikes (or FPS drops).
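As a rough illustration of the queuing argument, the sketch below uses the textbook M/M/1 model (random arrivals, a single server). That model is an assumption made here, not something stated in the post. In it, the mean time a job waits in the queue is ρ/(1−ρ) times the mean service time, where ρ is utilization, so the wait rises very steeply as utilization approaches 100%.

```python
def mean_queue_wait_ms(utilization, service_time_ms):
    """Mean waiting time in queue for an M/M/1 system (illustrative model only)."""
    if not 0.0 <= utilization < 1.0:
        raise ValueError("utilization must be in [0, 1)")
    return utilization / (1.0 - utilization) * service_time_ms

# Hypothetical mean service time of 1 ms per unit of work.
waits = {rho: mean_queue_wait_ms(rho, 1.0) for rho in (0.5, 0.7, 0.8, 0.9, 0.95)}
for rho, wait in waits.items():
    print(f"utilization {rho:.0%}: mean wait ~{wait:.1f} ms")
```

Going from 50% to 80% utilization quadruples the mean wait in this model, and going from 80% to 95% roughly quintuples it again, which is why leaving some idle time per frame helps absorb bursts.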

Harmony has fast/slow moving heuristics that measure frame time smoothness. If it detects that a device can’t maintain Frame Time Target (FTT), it will step the TaskScheduler UP to the next slower FTT to ensure that a smooth FPS experience is maintained. Conversely, if Harmony detects that it’s possible to step DOWN a FTT and maintain a smooth FPS experience, it will do so.

What you’re observing is both: the FTT steps UP to 33.33 ms due to lots of extra CPU work caused by windows/processes being focused in/out, then steps back DOWN to 16.67 ms once that burst of traffic has passed and a steady frame time can be achieved at the 16.67 ms FTT.
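The stepping behavior described above could be sketched roughly as follows. Everything specific here is hypothetical: the `FttStepper` class, the 8.33 ms rung on the ladder, the window size, and both thresholds are invented for illustration, since the post does not describe Harmony's actual heuristics.

```python
from collections import deque

# Hypothetical ladder; the post only names the 16.67 ms and 33.33 ms targets.
FTT_LADDER_MS = [8.33, 16.67, 33.33]

class FttStepper:
    """Toy frame time target (FTT) stepper; not Roblox's implementation."""

    def __init__(self, start_index=1, window=30, miss_ratio_up=0.5, headroom_down=0.8):
        self.index = start_index            # position on the ladder
        self.window = window                # frames observed before each decision
        self.miss_ratio_up = miss_ratio_up  # share of missed frames that triggers a step UP
        self.headroom_down = headroom_down  # safety margin required before a step DOWN
        self.recent = deque(maxlen=window)

    @property
    def ftt_ms(self):
        return FTT_LADDER_MS[self.index]

    def observe(self, work_ms):
        """Record one frame's work time, step if warranted, and return the current FTT."""
        self.recent.append(work_ms)
        if len(self.recent) < self.window:
            return self.ftt_ms  # wait for a full window so brief spikes do not cause thrashing
        misses = sum(1 for w in self.recent if w > self.ftt_ms)
        can_step_up = self.index < len(FTT_LADDER_MS) - 1
        if can_step_up and misses / self.window >= self.miss_ratio_up:
            self.index += 1      # step UP to a slower target
            self.recent.clear()
        elif self.index > 0:
            faster = FTT_LADDER_MS[self.index - 1]
            if max(self.recent) <= faster * self.headroom_down:
                self.index -= 1  # step DOWN to a faster target
                self.recent.clear()
        return self.ftt_ms
```

Feeding this toy stepper a run of 25 ms frames moves it from 16.67 ms to 33.33 ms, and a later run of 10 ms frames moves it back, mirroring the step UP and step DOWN seen after refocusing the window.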

This model ensures that there’s always some TaskScheduler rest time:

'-' = TaskScheduler work
'.' = TaskScheduler no work
0 ms                    33.33 ms            
|-------------------....|-------------------....|-------------------....|

These “rests” in TaskScheduler mean less CPU utilization, which means Roblox can handle bursts of CPU work more robustly while maintaining a steady FPS. This system also means that, by default, if your device can handle >60 FPS at a steady rate, your FTT will automatically be stepped down for you (higher FPS)!
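The arithmetic behind the two diagrams can be worked through with a small sketch. The function names are made up for this example, and the 25 ms of work per frame is the illustrative figure used earlier in this post.

```python
def rest_ms(work_ms, ftt_ms):
    """Idle time left in a frame after the work is done (never negative)."""
    return max(0.0, ftt_ms - work_ms)

def frame_utilization(work_ms, ftt_ms):
    """Fraction of the frame budget spent working, capped at 1.0."""
    return min(1.0, work_ms / ftt_ms)

work = 25.0  # ms of TaskScheduler work per frame, from the 40 FPS example

for ftt in (16.67, 33.33):
    print(f"FTT {ftt} ms: rest {rest_ms(work, ftt):.2f} ms, "
          f"utilization {frame_utilization(work, ftt):.0%}")
```

With a 16.67 ms target there is no rest at all, so the frame is fully utilized. With a 33.33 ms target the same work leaves about 8.33 ms of rest per frame, or roughly 75% utilization.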

This is our initial test of the system. We expect to play with how fine/coarse the stepping of FTT is in the future.

As I previously mentioned, manually setting a desired FPS in the game menu will completely disable this system.

Let me know if you have any questions!

Edit: A fun experiment might be to override the FPS setting in the menus, then recreate this condition to see the differences in behavior.

5 Likes

image

Oh, wow. I immediately see the difference. The spike above is with it turned on, versus turning it off, in a game that would normally cause that spike without needing to alt-tab in and out.

So, should I do that, or should I not do that, does it influence the experiment negatively? Will this system ever become a thing if the frame rate is not set to “(Default)” ?

Is it and will it stay, the true solution?

Does it matter if I set the FPS to 60 or 120. For instance, the laptop doesn’t have 120 Hz refresh rate. Though I do wonder if it helps achieve more constant 60 FPS, when setting it to 120 FPS.
While wondering if setting the limit to “60 FPS” itself, would be less powerful, to achieve 60 FPS.

 

In my case, it rested too much.

Will these ever be exposed in the MicroProfiler, with more context or not?

 

Is this something noticeable even on high-end devices, just in general?

 

 

Is this, by any chance or anything similar, happening in Roblox Studio, even when not in a Play Test?

In regard to this: https://devforum.roblox.com/t/docked-windows-that-render-roblox-ui-can-cause-frame-spikes-eventrenderviewupdate-perform-present/4470460

Thanks for sharing the MP screenshot - it’s very cool to see =)

So, should I do that, or should I not do that, does it influence the experiment negatively? Will this system ever become a thing if the frame rate is not set to “(Default)” ?

The experiment’s sample size is sufficiently large, so choose the option that works best for you. Just remember that if you do change this setting, it’s sticky, so you’ll never see the system in action until you change it back to “Default”. We disable our system when a user selects an override, to respect the user’s decision. It’s essentially telling us: “I have a very specific need, so let me take control”. I.e., if your setting does NOT equal “Default”, our FTT system will be disabled.

Is it and will it stay, the true solution?

So far it’s the best solution we’ve come across and we’re closely monitoring the experiment results to validate that the system is a net good. We’ll be iterating on it as well.

Does it matter if I set the FPS to 60 or 120. For instance, the laptop doesn’t have 120 Hz refresh rate. Though I do wonder if it helps achieve more constant 60 FPS, when setting it to 120 FPS.
While wondering if setting the limit to “60 FPS” itself, would be less powerful, to achieve 60 FPS.

Ah yea, you’re getting into two separate topics, both are assuming you’re disabling our FTT system. Let’s address both:

Manually setting 60 FPS or 120 FPS

Scenario 1: Device can’t achieve 60 FPS anyway

  • Then there’s no difference between the 60/120 FPS setting, TaskScheduler will chug away as fast as it can

Scenario 2: Device CAN achieve 60+ FPS, but not 120 FPS

  • Manually setting to 60 FPS will limit TaskScheduler to 60 FPS, so you end up with something similar to Harmony’s system where you do work, rest, then do work again to maintain 60 FPS
  • Manually setting to 120 FPS; this is essentially the same as scenario 1

Display Refresh Rates

Let’s also assume your device is capable of running at 120 FPS.

Scenario 1: VSync Enabled + 60 Hz Display: Your frame would complete super fast, but then TaskScheduler would wait around the Present call because it’s waiting for the Display to say “I’m done presenting the frame” (V-Sync). So even if you set 120 FPS, you’d effectively see 60 FPS and TaskScheduler would be doing 60 FPS work

Scenario 2: VSync Enabled + 120 Hz Display: You would see 120 FPS and TaskScheduler would be working at ~8.3 ms FT.

TLDR: The FPS override settings only help depending on what your device + display are capable of. Choosing a setting that exceeds what your setup is capable of doesn’t necessarily improve things, and could actively hurt things (essentially what Harmony’s Compute Management is trying to help with).
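The scenarios above boil down to taking the lowest of the applicable limits. Here is a simplified sketch; the function name and parameters are hypothetical, and it ignores real-world details such as how VSync behaves when a frame misses a refresh interval.

```python
def effective_fps(device_max_fps, fps_cap, display_hz=None, vsync=True):
    """Simplified model: the frame rate is bounded by the slowest limit in play."""
    limits = [device_max_fps, fps_cap]
    if vsync and display_hz is not None:
        limits.append(display_hz)  # Present waits on the display when VSync is on
    return min(limits)

# Manual cap scenarios (display not considered)
print(effective_fps(40, 60), effective_fps(40, 120))   # device can't reach 60 FPS either way
print(effective_fps(90, 60), effective_fps(90, 120))   # device can do 60+ FPS, but not 120

# Display scenarios, device capable of 120 FPS, cap set to 120
print(effective_fps(120, 120, display_hz=60))
print(effective_fps(120, 120, display_hz=120))
```

Whenever the result equals the device's own maximum, TaskScheduler has no rest time in this model, which is the situation Harmony's Compute Management tries to avoid.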

In my case, it rested too much.

Yea, in the range provided by your original screenshot, the “spike” in CPU work has dissipated. This is by design. Our system should wait a little before making a decision to step FTT. If the system was TOO sensitive, you could get a thrashing effect that’s undesirable.

If you look at the frames leading up to the 33.33ms frame time, you’ll find that there should be very little “empty” time.

Will these ever be exposed in the MicroProfiler, with more context or not?

I don’t think it will. Ultimately, TaskScheduler is just waiting until its next FTT to do more work, and this can happen naturally (independent of our system). However, I’m always open to ideas/arguments for additional context.

Is this something noticeable even on high-end devices, just in general?

It will be less noticeable on high-end devices, but the act of switching active processes is really expensive for the kernel + CPUs, so it will always be measurable. Whether it’s noticeable or not will depend on how powerful that system is.

Is this, by any chance or anything similar, happening in Roblox Studio, even when not in a Play Test?

Nope! This test and system are unique to the Roblox Client right now.

3 Likes

Quick admin note: I’ll be marking this as “Won’t Fix”. But that doesn’t mean the discussion is over! Please continue to ask any questions you have :grinning_face:

3 Likes

so the Roblox Studio is something else undiscovered… :confused:

 

Does this Harmony only happen if it notices that the CPU usage is too high on the overall system? e.g. same values as in Task Manager

Or is there a way to see how much is in the queue?

 

Except that it will not do the Harmony system?

Would that mean the previous behavior would be, when it is set over 60 FPS, or was the old behavior for Default (60 FPS) before Harmony, just 60 FPS?

so the Roblox Studio is something else undiscovered… :confused:

It doesn’t look like that post has gotten much love so I pinged a few folks on the Studio team. Hopefully you’ll get a response soon.

Does this Harmony only happen if it notices that the CPU usage is too high on the overall system? e.g. same values as in Task Manager

We approximate the CPU usage by measuring how frame time responds to spikes in work. This ends up being a good approximation because Roblox is typically the most dominant (i.e. active) process running on the device. Task Scheduler has its own heuristics for measuring CPU utilization, so I wouldn’t expect our estimates and Task Scheduler’s estimates to be the same, but the general trends should agree.

Or is there a way to see how much is in the queue?

At the CPU level, this requires detailed profiling so there’s no reasonable way to surface this specific information. But queue/wait time can be approximated by how long individual frame times take.

Except that it will not do the Harmony system?

That’s correct - in those scenarios we’re assuming the FPS level was overridden in the menus which disables Harmony’s Compute Management.

Would that mean the previous behavior would be, when it is set over 60 FPS, or was the old behavior for Default (60 FPS) before Harmony, just 60 FPS?

In this context, I’m referring to “old behavior” without Harmony setting FTT and the user is explicitly setting their FPS levels.

3 Likes

The reason for the queue thing, was because some games would cause these spikes to happen way more than other games.

Performance-wise, it would be interesting to know why. But maybe these spikes really only happen on very specific low-end devices. And changing the FPS limit setting is an easy way to avoid the issue.

 

If I’d create a script on purpose, like a “while loop” that stops after 0.33 ms. Would that trick it?

Oh, I meant,

Setting before Harmony: Default (60 FPS)

Setting with Harmony: Default (60 FPS) with Harmony Compute Management

What I am wondering is, which of these settings would mimic the previous behavior for Default (60 FPS), assuming a 60 Hz monitor?

  • 60 FPS
  • 120 FPS
  • etc. higher

 

There were other Harmony things in place before this one. For instance, a few used to say that re-sizing the Roblox Window or moving the window around would have caused spikes until re-sizing/moving stopped.

There’s several Engines, I think there’s one where everything runs except rendering, while a window is being resized.

 

While the Roblox window is focused there are some spikes as well. I am wondering whether this is a reaction from Roblox or from the Operating System.

There are also some things I never tested, e.g. when alt-tabbing back in, or through Network Latency, a lot of things have to run back that didn’t ever execute in time, like Remote Events or similar. I am wondering if this is similar for when alt-tabbed or not, or in-general.

Refocusing on iOS would reload every material.

The reason for the queue thing, was because some games would cause these spikes to happen way more than other games.

It really depends on what the game is demanding of the device/system. You can imagine that games that are more visually intensive, or that have a lot going on in the world around them, have more work in a frame than something like a baseplate experience. The CPU is also just one aspect of the whole. Network access could cause frames to go long as well.

If I’d create a script on purpose, like a “while loop” that stops after 0.33 ms. Would that trick it?

I suppose it could. Scripts are ultimately run as a TaskScheduler job and if it’s running long, it would impact frame time.

What I am wondering is, which of these settings would mimic the previous behavior for Default (60 FPS), assuming a 60 Hz monitor?

Oh, thanks for the clarification. To mimic the behavior pre-Harmony Compute Management, you’d manually select the 60 FPS setting (NOT the “Default (60 FPS)” one).

There were other Harmony things in place before this one. For instance, a few used to say that re-sizing the Roblox Window or moving the window around would have caused spikes until re-sizing/moving stopped.

The specifics of Rendering, unfortunately, are outside my wheelhouse.

While the Roblox window is focused there are some spikes as well. I am wondering whether this is a reaction from Roblox or from the Operating System.

It’s hard to say for certain without using external profilers, but it’s always possible that OS related stuff can impact whatever the main running process is. E.g. a burst of network traffic, some background job getting scheduled suddenly, etc.

2 Likes