Task Scheduler - Control performance hits from intensive operations

EchoReaper · January 25, 2016, 1:31pm

Continuing the discussion from How to space apart repeated computations to avoid performance dips?:

The problem:

The problem I ran into in that help thread was that repeated performance-intensive operations can freeze ROBLOX, and I had no way of knowing how long to rest between operations to prevent it. My computer may be able to handle 100 operations without freezing, while a lower-spec machine may only manage 50. To keep the lower-spec machine from freezing, I'd have to yield every 50 operations instead of every 100, even though my computer could handle 100. I wanted a magic number that would let every machine complete operations as fast as possible without freezing, but I didn't know how to find it, so I posted that help thread.

I wanted to do performance-intensive operations (e.g. unioning, generating procedural terrain) in quick succession, so I had to find a solution. Luckily, in the Skype lounge, @SirSpence99 pointed out that I could count the number of times RenderStepped fires in a single second to determine the FPS of the client, and then I went to work. The result is a task scheduler which runs tasks at that magic number I was looking for. You tell the task scheduler the FPS you want the tasks to run at, and it runs them at a pace that doesn’t drop the FPS below the specified amount (with a margin of error of 1-2 FPS). If you want something to run in the background, you tell the task scheduler to run it at 60 FPS, and the user will never notice that anything’s happening behind the scenes. If something’s super important, you can tell it to run at 10 FPS and it’ll run tasks super fast, but not fast enough to permanently freeze Studio. The speed at which it runs is that magic number based on user hardware (a beefy machine will finish a long queue of tasks quicker than a low-end machine), but regardless of hardware, you can rest easy knowing it will always run at the FPS you specified.
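For anyone following along, here is a rough sketch of the frame-counting idea in plain Lua. It is not the module's source: `newFpsCounter` and `onFrame` are made-up names, and the timestamp is passed in so the logic can be tried outside ROBLOX.

```lua
-- Hypothetical helper (not taken from the module): estimates FPS by counting
-- how many frame callbacks arrive in each window of roughly one second.
local function newFpsCounter()
	local counter = { fps = 0 }
	local windowStart = nil
	local frames = 0

	-- Call once per frame with the current time in seconds.
	function counter.onFrame(now)
		if windowStart == nil then
			-- The first call only marks where the window starts.
			windowStart = now
			return
		end
		frames = frames + 1
		local elapsed = now - windowStart
		if elapsed >= 1 then
			counter.fps = frames / elapsed
			windowStart = now
			frames = 0
		end
	end

	return counter
end

-- Simulate three seconds of frames arriving every 1/30 of a second.
local counter = newFpsCounter()
for frame = 0, 90 do
	counter.onFrame(frame / 30)
end
print("Estimated FPS:", counter.fps)
```

In the scheduler described above, the per-frame call would presumably come from the RenderStepped event rather than a loop like this one.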

Things to be aware of:

The task scheduler sleeps when not in use by disconnecting the RenderStepped loop, so there is exactly 0 performance hit when there are no tasks in the queue.

If you tell the task scheduler to run something at 60 FPS but the user’s max FPS in Studio is 40 (really low-spec machine), the tasks in the queue will never be run. If you want to be able to run stuff in the background for those users, use the task scheduler’s source as a reference, find the user’s average FPS, and set the task scheduler to run at that.
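One way to pick that fallback target, sketched in plain Lua. Everything here is an assumption for illustration: `pickTargetFps` is a made-up helper, the samples are invented numbers, and the 2 FPS headroom is just borrowed from the margin of error mentioned above.

```lua
-- Hypothetical helper: choose a target FPS the machine can actually reach.
-- `samples` is a list of measured FPS values, `headroom` is how far below
-- the average to aim so that queued tasks still get a chance to run.
local function pickTargetFps(samples, desiredFps, headroom)
	assert(#samples > 0, "need at least one FPS sample")
	local total = 0
	for _, fps in ipairs(samples) do
		total = total + fps
	end
	local average = total / #samples
	-- Never ask for more than the machine averages, minus some headroom.
	return math.min(desiredFps, average - headroom)
end

-- A machine that tops out around 40 FPS gets a target below its average.
local lowSpecTarget = pickTargetFps({ 41, 39, 40, 40 }, 60, 2)
-- A machine with plenty of headroom keeps the requested 60.
local highSpecTarget = pickTargetFps({ 120, 118, 122 }, 60, 2)
print(lowSpecTarget, highSpecTarget)
```

The result would then be passed to TaskScheduler:CreateScheduler in place of a hard-coded 60.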

----------

Link to module

To create a scheduler:

local scheduler = TaskScheduler:CreateScheduler(targetFps)
Parameter targetFps: Task scheduler won’t run a task if it’d make the FPS drop below this amount
(WARNING) this only holds true if it is used properly. If you try to complete 10 union operations at once in a single task then of course your FPS is going to drop; queue the union operations up one at a time so the task scheduler can do its job.
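To make the warning concrete, here is a small sketch that contrasts one big task with many small ones. The stub below is a stand-in so the snippet runs anywhere: only the QueueTask name comes from this post, while `newStubScheduler`, `QueueSize` and `expensiveOperation` are hypothetical.

```lua
-- Stand-in for the real scheduler; it only records what was queued.
local function newStubScheduler()
	local queue = {}
	local stub = {}
	function stub:QueueTask(callback)
		queue[#queue + 1] = callback
	end
	function stub:QueueSize()
		return #queue
	end
	return stub
end

-- Placeholder for one union, one terrain chunk, and so on.
local function expensiveOperation(index)
	return index * index
end

-- Coarse: a single task does all ten operations, so a scheduler has no
-- chance to stop between them and the frame rate takes the full hit.
local coarse = newStubScheduler()
coarse:QueueTask(function()
	for index = 1, 10 do
		expensiveOperation(index)
	end
end)

-- Fine: ten tasks with one operation each, so a scheduler gets a decision
-- point after every operation.
local fine = newStubScheduler()
for index = 1, 10 do
	fine:QueueTask(function()
		expensiveOperation(index)
	end)
end

print(coarse:QueueSize(), fine:QueueSize())
```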

To use the scheduler:

Method Pause: Pauses the scheduler so it won’t run tasks. Tasks may still be added while the scheduler is paused; they just won’t be touched until it’s resumed. Performance efficient: disables the RenderStepped loop entirely until the scheduler is resumed.

Method Resume: Resumes the paused scheduler.

Method Destroy: Destroys the scheduler so it can’t be used anymore.

Method QueueTask: Queues a task for automatic execution.
Parameter callback: function (task) to be run.

(all of the above are documented in the module’s source for quick reference)

Example usage:

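local scheduler = TaskScheduler:CreateScheduler(60) -- don't drop below 60 FPS
local totalOperations = 0
local paused = false

-- Queue 100 union operations, one per task, so the scheduler can space them out
for i = 1, 100 do
	scheduler:QueueTask(function()
		local partA = Instance.new("Part", workspace)
		local partB = Instance.new("Part", workspace)
		plugin:Union({partA, partB}):Destroy()
		totalOperations = totalOperations + 1
		print("Times unioned:", totalOperations)
		-- Pause halfway through; the remaining 50 tasks stay in the queue
		if totalOperations == 50 then
			scheduler:Pause()
			paused = true
		end
	end)
end

-- Wait for the pause, hold for two seconds, then let the rest of the queue run
repeat wait() until paused
wait(2)
scheduler:Resume()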
SirSpence99 · January 27, 2016, 9:31pm

Wait, does this mean I’m smarter than Trey? He did say he had no idea…

Reinitialized · February 1, 2016, 1:19am

Cool API. I’m going to integrate it into my latest project.

Quenty · February 1, 2016, 2:11am

Nice Echo!

Be warned, this actually can drop frame rate. Using RenderStepped for calculation-heavy operations (read: UnionOperations) is a bad practice. This is sort of mitigated by the way you’re queuing things, but running the system in ROBLOX’s primary resume would be preferable.

Also, just a comment on several lines:

	local sleep--get syntax highlighter to shut up on line 68

You actually want to localize that. If you didn’t, running two schedulers at once would mean overriding one sleep function.

In your destructor:

for i in next,scheduler do
     scheduler[i] = nil
end

Note you can just set scheduler=nil and ROBLOX’s GC will handle it. Also, you can just call setmetatable(scheduler, nil) and you’ll probably be fine in the destructor, although that’s more of a personal preference.

Nice job! Looking forward to more of your modules.

EchoReaper · February 1, 2016, 2:29am

Yeah, that’s good to know. Shame it was posted after I published this. If I read that thread right, using coroutines looked equally bad though.

If you look below the sleep function definition, you’ll see “local function wake()” – that stuff was already intended to be local. The locality wasn’t what that comment was nagging about. It was having to declare sleep before the onEvent function so the syntax highlighter wouldn’t have a fit even though I wasn’t using sleep until after it had been defined below onEvent.
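For readers who have not hit this before, the pattern being described is a forward declaration of a local. A minimal plain-Lua sketch follows; the function bodies are made up and only the names `sleep` and `onEvent` come from the discussion.

```lua
local log = {}

-- Forward declaration: from here on, `sleep` refers to this local,
-- even though it has no value yet.
local sleep

local function onEvent()
	log[#log + 1] = "event"
	-- Resolves to the local declared above, not to a global named `sleep`.
	sleep()
end

-- Assigns to the local that was already declared; note there is no second
-- `local` keyword here, which would create a new, separate variable.
function sleep()
	log[#log + 1] = "sleep"
end

onEvent()
print(table.concat(log, ", "))
```

Without the `local sleep` line, the call inside onEvent would look up a global instead, and a `local function sleep` written further down would be a different variable that onEvent never sees.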

Your other local script (the one that created the scheduler) still has a reference to the table, so it wouldn’t gc with an active reference still open. All you’re doing is letting go of it in the module – the requirer still has access to and can manipulate the table because it accesses it through the memory address, and not the module’s variable.
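The point about references can be shown in a few lines of plain Lua. This is only an illustration with made-up names (`createScheduler`, `moduleReference`, `requirerReference`), not the module's code.

```lua
-- Hypothetical constructor returning a plain table.
local function createScheduler()
	return { queue = {} }
end

-- Two variables, one table: both hold a reference to the same object.
local moduleReference = createScheduler()
local requirerReference = moduleReference

-- The "module" lets go of its variable...
moduleReference = nil

-- ...but the table is still reachable through the other reference, so it
-- cannot be garbage collected and can still be used and modified.
requirerReference.queue[1] = "task"
print(#requirerReference.queue)
```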

setmetatable doesn’t clear out the indices of the table – all it does is clear out previously defined metamethods (which I have none of), so it wouldn’t affect the table at all.
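A quick plain-Lua check of that claim, using an invented table with one field and one metamethod (the names are illustrative only):

```lua
-- A table with a real field plus an __index fallback from its metatable.
local scheduler = setmetatable({ queueSize = 3 }, {
	__index = function()
		return "from metatable"
	end,
})

-- Before: missing keys are answered by the metamethod.
local missingBefore = scheduler.missing

-- Removing the metatable only removes the metamethods.
setmetatable(scheduler, nil)

-- After: the field stored in the table itself is untouched, and missing
-- keys simply come back as nil again.
local fieldAfter = scheduler.queueSize
local missingAfter = scheduler.missing
print(missingBefore, fieldAfter, missingAfter)
```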

sleitnick · February 1, 2016, 2:57am

Very cool! I’ll definitely be using this. I’m working on a game that generates worlds on start-up per map per client, and I’ve run into this very issue. I’ve kept it at a slow generation rate, but I might as well use this scheduler to speed it up when possible. I’ll edit this once I’ve implemented it and I’ll say what I think of it.

Edit:
[NOTE: The below concern has been fixed]
It works! My one criticism is this: I can run at most 1 task per frame update. That means that if I’m generating a bunch of trees (20,000), it will take over 5 minutes to generate them all. It would be cool if it could keep flushing the queue for as long as it can until it hits the FPS floor. I’m not immediately sure how you would do this, but that would be cool.

Or perhaps I’m doing something incorrectly. Here’s how I’m using the scheduler for tree generation:

local scheduler = main.TaskScheduler:CreateScheduler(10)
...
local base = game.Workspace:WaitForChild("BasePlate")
local treeModels = game.ReplicatedStorage:WaitForChild("Trees"):GetChildren()
local numTreeModels = #treeModels
local surface = base.CFrame * CFrame.new(0, base.Size.Y * 0.5, 0)
local numTrees = 20000
local numGenerated = 0
for i = 1,numTrees do
	scheduler:QueueTask(function()
		local t = treeModels[math.random(numTreeModels)]:Clone()
		t.Size = t.Size * (0.5 + (math.random() * 1))
		t.Parent = trees
		t.CFrame = surface * CFrame.new(math.random(-base.Size.X, base.Size.X) * 0.5, t.Size.Y * 0.5, math.random(-base.Size.Z, base.Size.Z) * 0.5) * CFrame.Angles(0, math.random() * 2 * math.pi, 0)
		--if ((i % 2000) == 0) then
		--	wait()
		--end
		numGenerated = (numGenerated + 1)
	end)
end
while (numGenerated < numTrees) do wait() end

Quenty · February 1, 2016, 4:03am

Coroutines aren’t bad. Just abusing the coroutine yield tick to get rates faster than 60 FPS is. As of now, if you really want a 60 FPS resume rate, you really don’t have any optimal choice.

Oh yeah. I see here. Yeah, you can just define sleep above the function; otherwise, what you did was fine. The syntax highlighter was complaining because the code would have failed without sleep being defined there (as you know, of course).

When a destructor is called, the class is almost always set to nil or dropped from scope. Most references like this will be dropped very quickly because a call to :Destroy() is almost always followed up by all references being cleaned up.

Usually, it’s a fair assumption that if a user calls :Destroy(), they don’t intend to use a method anymore.

Oh yeah, in looking back I realized you’re using closures to do OOP.

EchoReaper · February 1, 2016, 2:14pm

OH! I know why. I have it directly tied into the renderstepped loop, so that’s why it’s only running one task per frame. I think I can separate the task execution out of that into an infinite loop and then use the fps I got from renderstepped to pause it when necessary. In doing that, I can remove the renderstepped connection from the schedulers entirely and use one global one to calculate the FPS for all of them instead of running #schedulers renderstepped connections. If I do that, then I’d also be able to do something like TaskScheduler:GetAverageFPS() so you didn’t have to calculate that manually when you wanted to run background tasks on a low-spec machine. I’ll look into all of that when I get out of class.
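As a rough illustration of running more than one task per frame, here is a plain-Lua sketch. Note that it swaps in a simpler technique than the one described above: it spends a fixed time budget of 1 / targetFps seconds per call instead of watching a measured FPS. All names (`newQueue`, `push`, `pop`, `drainWithinBudget`) are hypothetical, and the clock is injected so the behaviour can be simulated.

```lua
-- Hypothetical queue with constant-time push and pop.
local function newQueue()
	return { head = 1, tail = 0, items = {} }
end

local function push(queue, callback)
	queue.tail = queue.tail + 1
	queue.items[queue.tail] = callback
end

local function pop(queue)
	if queue.head > queue.tail then
		return nil
	end
	local callback = queue.items[queue.head]
	queue.items[queue.head] = nil
	queue.head = queue.head + 1
	return callback
end

-- Runs callbacks until the queue is empty or the time budget is used up.
-- `clock` must return seconds; it defaults to os.clock.
local function drainWithinBudget(queue, targetFps, clock)
	clock = clock or os.clock
	local budget = 1 / targetFps
	local started = clock()
	local ran = 0
	while clock() - started < budget do
		local callback = pop(queue)
		if callback == nil then
			break
		end
		callback()
		ran = ran + 1
	end
	return ran
end

-- Simulation: a fake clock where every task "costs" 1/16 of a second.
local fakeTime = 0
local function fakeClock()
	return fakeTime
end

local queue = newQueue()
for _ = 1, 10 do
	push(queue, function()
		fakeTime = fakeTime + 1 / 16
	end)
end

-- With a target of 4 FPS the budget is 1/4 second per call.
local ranFirstFrame = drainWithinBudget(queue, 4, fakeClock)
local ranSecondFrame = drainWithinBudget(queue, 4, fakeClock)
local ranThirdFrame = drainWithinBudget(queue, 4, fakeClock)
print(ranFirstFrame, ranSecondFrame, ranThirdFrame)
```

A real version would need a smaller slice than the whole frame interval, since rendering and everything else in the frame also take time.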

Same situation on ROBLOX, but Destroy locks the parent instead of allowing you to pull it back into the workspace, right? Having it like this can help find unintended behavior in your code where you’re accidentally trying to add stuff to the queue after it’s destroyed – instead of silently doing nothing, it’ll error and you can find where you made the mistake.

Quenty · February 1, 2016, 3:20pm

Yes, calling Destroy on ROBLOX locks the parent, but that’s just a side effect of the real gain in using :Destroy(), which is the performance gain from freeing resources on the C side associated with the object (i.e. signals, et cetera), instead of letting Lua’s GC collect them. Locking the parent means players can’t misuse the object. So yes, you want it to error loudly.

Setting the metatable to nil (if you weren’t using closure-based OOP) would clear up these resources, while hopefully you’d also disconnect all events (as you do by calling pause). In this case, setting the metatable to nil if you nil out all values seems sufficient to keep the user from using the object by accident, but if you think it will continue to be a problem, I guess setting the metatable to another one is OK.

HOWEVER, since you’re using a closure to make your code work, this:

		setmetatable(scheduler, {
			__index = function()
				error("Attempt to use destroyed scheduler")
			end;
			__newindex = function()
				error("Attempt to use destroyed scheduler")
			end;
		})

means that every single variable (read: every function) is still stored in the scope until the scheduler is actually GCed by Lua. This means calling Destroy doesn’t actually free any resources; it just locks your object, removes all the tasks to be executed, and then creates additional memory usage in the form of extra tables and an index.
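To see the locking behaviour on its own, here is a minimal plain-Lua sketch. It reuses the error message quoted above, but `newLockedObject` and its single `value` field are made up, and the object is a plain table rather than the module's closure-based scheduler.

```lua
-- Hypothetical object whose Destroy clears its fields and then locks it.
local function newLockedObject()
	local object = { value = 1 }

	function object:Destroy()
		-- Remove every field, including this method.
		for key in pairs(self) do
			self[key] = nil
		end
		-- Any later read or write now raises an error instead of
		-- silently returning nil or creating a new field.
		setmetatable(self, {
			__index = function()
				error("Attempt to use destroyed scheduler")
			end,
			__newindex = function()
				error("Attempt to use destroyed scheduler")
			end,
		})
	end

	return object
end

local object = newLockedObject()
object:Destroy()

-- pcall captures the errors so the script keeps running.
local readOk, readError = pcall(function()
	return object.value
end)
local writeOk = pcall(function()
	object.value = 2
end)
print(readOk, writeOk, readError)
```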

Does it matter? No. Is it worth calling :Destroy()? Eh. Depends on the situation.

However, I think this is more interesting for understanding ROBLOX, so I’m posting this analysis here.

EchoReaper · February 2, 2016, 5:06pm

I just updated the module with a fix for that oversight – thanks for finding it for me! Let me know if you find any other issues. I also added in TaskScheduler:GetCurrentFPS() which you can use to quickly get the client’s FPS – it’s also used internally now to get the current FPS instead of running a RenderStepped loop for each individual task. It’s part of the TaskScheduler itself, and not the schedulers created with TaskScheduler:CreateScheduler().

sleitnick · February 2, 2016, 8:03pm

Yes, the changes worked! I was able to generate the 20k trees limited at 30 FPS and it seemed to work as expected.

EchoReaper · May 17, 2016, 10:02am

Updated to use Heartbeat instead of RenderStepped. This will allow you to use the scheduler on the server instead of just the client. I strongly suggest anyone using this in their projects update, because as @Quenty mentioned, using RenderStepped is bad practice for this kind of stuff. At the time I had no other option, which is why the original version used RenderStepped, but now a better alternative is available, and the scheduler has been updated to reflect that.

Edit: I also added a :GetQueueSize() method that @sparker22 suggested

sparker22 · May 17, 2016, 7:23pm

It’s about time, man. Thanks though.