How to handle "Script timeout: exhausted allowed execution time"

Shedletsky · April 20, 2020, 8:05am

I have a long running script that performs a minimax search of a game tree.

Testing locally, it looks like my script gets killed after ~20 seconds and I get the error “Script timeout: exhausted allowed execution time”. I assume this would happen on production also.

What is the best way to handle this? I don’t really understand how script execution works in Lua. I think there are actually two problems:

1. My long running script is probably blocking all other scripts from running. Is there a way to yield?
2. My long running script has a lot of state (tree search), and batching it to run in X-second chunks is non-trivial.

What I’d like to do is not have to think about this and just be able to define my long running script as a very low priority process and have it and all my other scripts run in parallel.

It looks like the only way to fix this is to serialize the state of my script and restart it every 59 seconds. How do people writing other long running scripts handle this? Terrain generation seems hard, for example. Is it just easier to batch?

Would coroutines help here? They’re not really threads, so I assume no?

Autterfly · April 20, 2020, 8:12am

Roblox currently doesn’t have a way to run scripts truly in parallel. The feature is planned on the roadmap, but it will probably be a while before we get it.

Yielding your scripts and then resuming them later is probably your best bet. Whenever you call a function like wait, coroutine.yield, or an API that yields, your script gives way to all other scripts that currently need to run, until they also yield or finish. Alternatively, you could serialize the state like you said, if your current code doesn’t have any good places to use short/temporary yields.
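As a rough sketch of why yielding can avoid serializing state: a recursive search running inside a coroutine can pause from deep in the recursion and keep its locals and call stack. The tree, the `NODES_PER_SLICE` batch size, and the plain-Lua driver loop below are all assumptions for illustration; in Roblox, the pause between slices would be a yield back to the engine rather than a bare loop.

```lua
-- Hypothetical game tree: leaves are numbers, inner nodes are arrays of children.
local tree = { { 3, 5 }, { 2, 9 }, { 0, 7 } }

local NODES_PER_SLICE = 2 -- assumed batch size; tune for real workloads
local visited = 0

local function minimax(node, maximizing)
	visited = visited + 1
	if visited % NODES_PER_SLICE == 0 then
		coroutine.yield() -- pause mid-search; locals and call stack are preserved
	end
	if type(node) == "number" then
		return node
	end
	local best = maximizing and -math.huge or math.huge
	for _, child in ipairs(node) do
		local score = minimax(child, not maximizing)
		if maximizing then
			best = math.max(best, score)
		else
			best = math.min(best, score)
		end
	end
	return best
end

-- Driver: resume the search one slice at a time until it finishes.
local search = coroutine.create(minimax)
local slices = 0
local ok, value = coroutine.resume(search, tree, true)
while coroutine.status(search) ~= "dead" do
	slices = slices + 1
	ok, value = coroutine.resume(search) -- other work could run between slices
end
local result = value -- the final resume returns the search result
```

Counting nodes instead of measuring time keeps the sketch deterministic; a time-based budget would follow the same shape.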

Shedletsky · April 20, 2020, 8:15am

Do you know if calling coroutine.yield resets the script timeout timer?

Autterfly · April 20, 2020, 8:16am

Any yielding function should. But with coroutine.yield specifically you’ll later need to resume it manually using coroutine.resume.
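A minimal plain-Lua sketch of that manual resume (the names `worker` and `steps` are made up for illustration): a thread paused with coroutine.yield stays suspended until some other code calls coroutine.resume on it.

```lua
-- Hypothetical worker: records three steps, yielding after each one.
local steps = {}
local worker = coroutine.create(function()
	for i = 1, 3 do
		table.insert(steps, i)
		coroutine.yield() -- pauses here until someone resumes this thread
	end
end)

coroutine.resume(worker)                -- runs the first step, then pauses
local paused = coroutine.status(worker) -- the thread is now suspended

-- Nothing resumes the thread automatically; the caller has to do it.
while coroutine.status(worker) ~= "dead" do
	coroutine.resume(worker)
end
```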

Shedletsky · April 20, 2020, 8:18am

I just confirmed this.

So all I have to do is keep track of when my script is about to get killed and call wait() before that happens. After some experimentation, I’ve discovered that even locally, there seems to be a fair amount of variation in when the timeout happens, so it is not easy to predict.

Wish there was a less hacky way to manage this.

It would be cool if script timeout was an event that my script could respond to with a wait()!

Anaminus · April 20, 2020, 1:58pm

You can disable the timeout in Studio with the following command (or just set the setting manually):

settings().Studio.ScriptTimeoutLength = -1

Long-running scripts should basically never happen on live servers (blocking important server stuff) and clients (player experiences lag), so you should find a way to batch if that’s where you’re going.

A dumb but effective way to “batch” is to yield conditionally based on execution time. For example:

local Budget = 1/60 -- seconds

local expireTime = 0

-- Call at start of process.
function ResetTimer()
	expireTime = tick() + Budget
end

-- Call where appropriate, such as at the top of loops.
function MaybeYield()
	if tick() >= expireTime then
		wait() -- insert preferred yielding method
		ResetTimer()
	end
end
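One possible variation, sketched under a few assumptions: the same idea packaged as a closure, so that each process gets its own timer. The names `newBudget` and `yieldFn` are illustrative, and os.clock() is swapped in for tick() so the snippet also runs outside Roblox. In a Roblox script you would pass wait (or your preferred yielding call) as `yieldFn`.

```lua
-- Returns a maybeYield function that calls yieldFn once the budget is spent.
local function newBudget(seconds, yieldFn)
	local expireTime = os.clock() + seconds
	return function()
		if os.clock() >= expireTime then
			yieldFn()
			expireTime = os.clock() + seconds -- start a fresh budget
		end
	end
end

-- Example: count how often a loop gives up control.
local yields = 0
local maybeYield = newBudget(0, function()
	yields = yields + 1 -- stand-in for a real yielding call
end)
for i = 1, 5 do
	maybeYield() -- a budget of 0 means every call yields
end

-- With an unlimited budget, the loop never yields.
local idleYields = 0
local neverYield = newBudget(math.huge, function()
	idleYields = idleYields + 1
end)
for i = 1, 5 do
	neverYield()
end
```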

Threads eventually need to yield back to the engine, or they will time out. Top-level code in a script, as well as event listeners, are resumed by the engine, so these are what eventually need to yield. The entry point is when the engine resumes a thread, and the exit point is when the thread yields back to the engine.

Here is an example, along with steps describing how it executes.

local i = 0
while true do
	i = i + 1
	print(i)
	coroutine.yield()
end

When the script runs, the thread of the script (let’s call it “Root”) is resumed by the engine.
Root calls coroutine.yield(), which yields back to the engine.

coroutine.yield() doesn’t do anything special to get the running thread to be resumed later (this used to be the case, but not anymore), so Root is effectively killed.

Consider the same example as a separate thread:

local Work = coroutine.create(function()
	local i = 0
	while true do
		i = i + 1
		print(i)
		coroutine.yield()
	end
end)
coroutine.resume(Work)

Engine resumes Root.
Root creates the “Work” thread.
Root resumes Work.
Work calls coroutine.yield(), yielding back to Root.
At this point, Root continues running. There’s no more code, so Root dies, yielding back to Engine.

Now consider this slightly altered example:

local Work = coroutine.create(function()
	local i = 0
	while true do
		i = i + 1
		print(i)
		coroutine.yield()
	end
end)
while true do
	coroutine.resume(Work)
end

Engine resumes Root.
Root creates the “Work” thread.
Root resumes Work.
Work calls coroutine.yield(), yielding back to Root.
Root resumes Work.
Work calls coroutine.yield(), yielding back to Root.
Root resumes Work.
Work calls coroutine.yield(), yielding back to Root.
…

Here, execution moves back and forth between Root and Work, but never actually yields back to the engine. This will eventually cause a timeout, and demonstrates that it is not enough just to yield a thread. You have to consider what you’re yielding back to.
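For contrast, here is a plain-Lua sketch in which Root does hand control back after every resume. There is no real engine outside Roblox, so the outermost loop is only a stand-in for it, and the `log` and `engineTurns` names are made up for the example.

```lua
local log = {}

local work = coroutine.create(function()
	for i = 1, 3 do
		table.insert(log, "work " .. i)
		coroutine.yield() -- yields back to Root
	end
end)

local root = coroutine.create(function()
	while coroutine.status(work) ~= "dead" do
		coroutine.resume(work)
		coroutine.yield() -- Root passes control up to the "engine"
	end
end)

-- Stand-in for the engine: it regains control between every step of Work.
local engineTurns = 0
while coroutine.status(root) ~= "dead" do
	coroutine.resume(root)
	engineTurns = engineTurns + 1
end
```

Because Root yields after each resume, the outer loop gets a turn between steps instead of being locked out until Work finishes.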

To drive the point home, let’s see what happens when wait() is used instead of coroutine.yield():

local Work = coroutine.create(function()
	local i = 0
	while true do
		i = i + 1
		print(i)
		wait()
	end
end)
while true do
	coroutine.resume(Work)
end

Engine resumes Root.
Root creates the “Work” thread.
Root resumes Work.
Work calls wait(), adding Work to the engine’s scheduler queue, then yielding back to Root.
Root resumes Work.
Work calls wait(), adding Work to the engine’s scheduler queue, then yielding back to Root.
Root resumes Work.
Work calls wait(), adding Work to the engine’s scheduler queue, then yielding back to Root.
…

Eventually, a timeout occurs, then the scheduler gets to work emptying its queue by resuming the Work thread over and over again. Because threads scheduled by wait() run on a budget, this is rolled out slowly over a lengthy amount of time.

The problem here is that, by calling wait(), the Work thread is being managed by both the scheduler and Root’s resume loop. The simple resolution to this is to let the engine do all the work managing threads:

local i = 0
while true do
	i = i + 1
	print(i)
	wait()
end

Engine resumes Root.
Root calls wait(), adding Root to the engine’s scheduler queue, then yielding back to Engine.

Shedletsky · April 20, 2020, 7:17pm

This is very helpful, thank you.

EmesOfficial · May 8, 2020, 8:13pm

How can I use it properly with the :PasteRegion function? 30% of the time it’s taking too long to execute so my main server script is getting timed out. I’d really appreciate any help :c

PerilousPanther · June 5, 2020, 12:46pm

Are there drawbacks to using wait() in loops? Is it just advised that you shouldn’t have long scripts?

Anaminus · June 5, 2020, 7:30pm

Whether you need to yield or not depends on how much work is being done. Printing 1 - 100 in a loop barely takes any work, so you can get away with not yielding. Generating terrain takes a lot more work, and usually involves significantly more iterations, so it’s necessary to yield somewhere.

There’s a balance to be found; you don’t want to do too much work at once, or you’ll time out, and you don’t want to do too little work at once, or it will take more time than necessary. MaybeYield finds this balance by measuring how much time it takes to do work, then yielding when this duration exceeds a given budget.

For wait() in particular, it depends. For simple cases, it will work fine. In general, it should be avoided because it’s somewhat broken. This post has more detail:

https://devforum.roblox.com/t/what-are-the-largest-performance-culprits-right-now-for-huge-servers-100-200-players/559595/11

As for the PasteRegion question: what you can do here is make smaller TerrainRegions and load them in sequence. Then you can use MaybeYield between each call to PasteRegion.

Consider this code that copies and pastes a single region:

local lower = Vector3int16.new(-100, -100, -100)
local upper = Vector3int16.new(100, 100, 100)

-- Copy region.
local r = Region3int16.new(lower, upper)
local region = workspace.Terrain:CopyRegion(r)
region.Parent = game.ServerStorage

-- Paste region.
local corner = Vector3int16.new(300, 0, 0)
workspace.Terrain:PasteRegion(region, corner, true)

Here’s a revision that divides the same region into a number of chunks of a specified size:

local lower = Vector3int16.new(-100, -100, -100)
local upper = Vector3int16.new(100, 100, 100)
local step = Vector3int16.new(64, 64, 64)

local function chunks(lower, upper, step)
	local i = lower
	return function()
		if i >= upper then
			return nil
		end
		local n = i
		i = i + step
		return n, i > upper and upper or i
	end
end

-- Copy region.
local regions = Instance.new("Folder")
regions.Name = "Regions"
for x0, x1 in chunks(lower.X, upper.X, step.X) do
	for y0, y1 in chunks(lower.Y, upper.Y, step.Y) do
		for z0, z1 in chunks(lower.Z, upper.Z, step.Z) do
			local l = Vector3int16.new(x0, y0, z0)
			local u = Vector3int16.new(x1, y1, z1)
			local r = Region3int16.new(l, u)
			local region = workspace.Terrain:CopyRegion(r)
			local corner = Instance.new("Vector3Value")
			corner.Name = "Corner"
			corner.Value = Vector3.new(l.X, l.Y, l.Z)
			corner.Parent = region
			region.Parent = regions
		end
	end
end
regions.Parent = game.ServerStorage

-- Paste region.
local corner = Vector3int16.new(300, 0, 0)
for _, region in ipairs(regions:GetChildren()) do
	local c = region.Corner.Value
	c = Vector3int16.new(c.X, c.Y, c.Z) + corner
	workspace.Terrain:PasteRegion(region, c, true)
	wait() -- or MaybeYield
end
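To see what the chunks iterator produces for one axis, here is a small plain-Lua check using the same bounds and step as above. The `ranges` and `summary` names exist only for this illustration.

```lua
-- Same iterator as in the revision above.
local function chunks(lower, upper, step)
	local i = lower
	return function()
		if i >= upper then
			return nil
		end
		local n = i
		i = i + step
		return n, i > upper and upper or i
	end
end

-- Collect the (start, end) pairs for one axis.
local ranges = {}
for lo, hi in chunks(-100, 100, 64) do
	table.insert(ranges, lo .. ".." .. hi)
end
local summary = table.concat(ranges, ", ")
-- summary is "-100..-36, -36..28, 28..92, 92..100"
```

That is four ranges per axis, with the last one clamped to the upper bound, so the three nested loops produce 4 × 4 × 4 = 64 regions.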

EmesOfficial · June 5, 2020, 11:15pm

Great answer! Thanks for the effort. I actually did the same thing: I made a script that splits the terrain into smaller regions and then reads them when necessary, with interruptions, so it won’t stress the server too much. However, I feel like Roblox should make these terrain functions more developer friendly, so you don’t end up publishing your game without knowing a script crash can happen and then having a hard time figuring out the solution. For instance, they could add a new parameter to PasteRegion() to tell the executor how many cells should be filled in one tick. Also, a warning to everyone who operates on smooth terrain with scripts: clear the terrain in segments, not with Terrain:Clear(), unless you want to possibly crash the script.

SolsticeDevs · August 21, 2020, 5:01pm

I ran this script on the command line and now my game crashes every time I play it, what happened?

Oficcer_F · September 26, 2020, 7:24pm

Hey, it might be because of the -1: the code may not understand how the timeout can be a negative number.

So try using math.huge instead of -1.

Just my two cents

Shedletsky · October 31, 2020, 6:54am

At first I was like, “MaybeYield should be a core language feature, it’s so useful!”

And then I was like, “Future generations will not look kindly on this.”

In the end I think the need for MaybeYield semantics means that something is wrong at a higher abstraction level with scheduling script execution in Roblox and how user scripts interact with that scheduler.

my 2c

Can anyone think of another language or execution environment that has MaybeYield? It’s just so WACKY! It’s kind of like a spinlock, only it sometimes does nothing and sometimes spins, instead of sometimes spinning and sometimes locking.

Autterfly · October 31, 2020, 11:22am

If I understand correctly, there are similar patterns in other languages. It’s not rare to see functions that either do some kind of work like an HTTP request or grab the data from a cache.

Though it’s probably a pattern more seen in lax cases where it’s not much of an improvement to do the request or yielding work earlier in the code and only expose a cache.

Kennykorn · January 11, 2021, 4:13pm

I just ran this line and pressed play without saving, and I lost an entire AI script.
Make sure you save regularly, and if you are getting this “Script timeout: exhausted allowed execution time” error, run the following line:
settings().Studio.ScriptTimeoutLength = math.huge