Roblox detects require loops improperly

Corecii · February 26, 2018, 8:10pm

Setup:

Script requires OuterModule
- OuterModule requires InnerModule
  - InnerModule creates a new coroutine/event called Wrap
Wrap requires OuterModule

Wrap yields forever. Wrap should wait for OuterModule to finish running.

This bug happens everytime. This bug happens on production/www.

Repro: RequireLoopWrapRepro.rbxl (13.2 KB)

Note that Required InnerModule(Wrap)->OuterModule is never printed.

More info:

This seems to happen because Roblox assumes that if the to-be-required module’s coroutine status is normal, then there must be a require loop and the coroutine should not be resumed. This assumption is incorrect in the above described scenario. In the above scenario, InnerModule still returns, so there is no require loop.

This assumption is usually correct because if InnerModule had required OuterModule directly, OuterModule’s status would be normal and they would get stuck in a require loop.

It’s not correct to only check the coroutine’s status. In the above scenario in Wrap, OuterModule’s status is normal, but Wrap is entirely capable of waiting for OuterModule to finish. In fact, if a wait is added to the top of the Wrap section, it does wait for OuterModule to finish. I think this is because the status of OuterModule is no longer normal and now suspended or dead. This is shown in the following file: WORKAROUND.rbxl (13.2 KB)

Wrap can be a coroutine.wrapped function, a function triggered through a BindableEvent, or anything else that creates a new coroutine that runs instantly.

einsteinK · February 27, 2018, 9:05am

Does roblox already check for require loops? Whenever I had anything complex, I just used a modulemanager that checks for loops.

actually, I even allow loops kind of

MM is like an advanced require (using __call), but also has other methods (__index)

-- Module A
local MM = getModuleManager()

local B = MM(B) -- immediately returns API
-- (even though B requires A, but because of SetFinished...)

return {}

-- Module B
local MM = getModuleManager()

local API = {}
MM:SetFinished(API) -- I forgot the exact API, but you get the idea

local A = MM(A)

return API

obviously doesn’t work if using the vanilla require

Corecii · February 27, 2018, 9:29am

It appears to.

If you try to require a module whose coroutine is in the normal state, then the running coroutine is suspended and never resumed.

If you try to require a module whose coroutine is in the suspended state, then the running coroutine is suspended and resumed when the module’s coroutine is dead.

If Roblox did not implement anything special then I would expect both the normal and suspended states to do the same thing. This points to an intentional optimization/fix for require loops, but the fix is too broad and applies to more cases than just require loops.

einsteinK · February 27, 2018, 12:58pm

I know that this happens:

-- Module A requires B, so yields until B finishes
-- Module B requires A, so yields until A finishes
-- now they're both yielding, waiting for the other to finish

That’s not any special checking, that’s just how it works. I meant, does that error or produce a warning?

Corecii · February 27, 2018, 2:46pm

There is no error or warning. Not in an actual require loop or this improperly-detected one.

I don’t think these coroutines are actually “waiting for the other to finish”. I’m pretty sure Roblox suspends them indefinitely as a sort of optimization. They will never be resumed, so they’re not really waiting.

If they were actually waiting, then why doesn’t the Wrap coroutine resume when OuterModule finishes? Roblox never resumes it. Roblox must be treating it differently, seemingly depending on the coroutine status of the to-be-required module.

I realize now that the example in the original post can actually be simplified into a single module:

coroutine.wrap(function()
	require(script)  -- status of Module is normal
	print("Required.")  -- never prints
end)()

return true

This should work. This does not work. Roblox is not tracking the requires backwards, though, as made evident by the following, working snippet:

coroutine.wrap(function()
	wait()
	require(script)  -- status of Module is dead
	print("Required.")
end)()

return true

Furthermore, this is not because the module has already finished by the time require is used. This works too:

coroutine.wrap(function()
	wait()
	require(script)  -- status of Module is suspended
	print("Required.")
end)()

wait(5)

return true

This bug only occurs if the status of the to-be-required module is normal. I’ve also checked the status of the requirer while requiring in all of these cases and it’s suspended. This means that Roblox suspends the requirer without ever resuming it if the to-be-required module’s status is normal.

If it was actually a require loop, then this would be a good optimization. Since the requirer could never resume naturally anyway, it makes sense to not even try to resume it. The behavior should never even be visible from the Lua side because “waiting forever” is effectively the same as “never resuming”.

But this isn’t actually a require loop, so this comes off as buggy behavior instead of an invisible optimization. The Wrap coroutine never resumes, but it’s supposed to!

I don’t know if that’s what Roblox does internally, but it sure looks like it. Regardless of what causes this internally, it’s still a bug.

This bug occurs if you start up a new coroutine in any way that results in the to-be-required module being normal. For example, using ChildAdded causes the same bug:

script.ChildAdded:connect(function()
	require(script)  -- status of Module is normal
	print("Required.")  -- never prints
end)
Instance.new("BoolValue", script)

return true

einsteinK · February 27, 2018, 4:42pm

Does a print after the coroutine.wrap()() print?

I remember something like this would also “break” the require process, e.g. the main thread would finish, but the modules never finishes loading:

-- I don't remember the exact thing, but it went something like
local co = coroutine.running()
spawn(function() coroutine.resume(co) end)
wait()
return true
-- maybe it was coroutine.resume(create( or wrap()() instead of spawn
-- (in which case the wait() shouldn't matter either)

Corecii · February 27, 2018, 4:59pm

I found that situation by testing! If you resume the module’s coroutine then it breaks the require process. The module will finish but all of the requires made to it will stay yielding.

That’s not what happens here.

My actual test code for the simplified version was something like:

print("Starting Module")
coroutine.wrap(function()
	print("\tStarting Wrap")
	print("\tRequiring Wrap->Module")
	require(script)  -- status of Module is normal
	print("\tRequired Wrap->Module")  -- never prints
	print("\tFinishing Wrap")
end)()

print("Finishing Module")
return true

and the Script has “Starting Script” and “Finishing Script” prints.

Everything prints except “Required Wrap->Module” and “Finishing Wrap”.

einsteinK · February 27, 2018, 5:17pm

Interesting.

I tried just doing your wrap()() in a regular script, but using a module with return (wait()) as Source. It didn’t yield. So it’s not the “isn’t a roblox thread, which might somehow matter for resuming when the module finishes”.

Maybe once a module yields in the main thread, it goes in a state of “loading”, on which other require() calls “subscribe” to, but when a module gets required before that happens, it breaks somehow.

Corecii · February 27, 2018, 6:22pm

That’s possible but sounds very odd.

It’s possible that this is legitimately an internal bug and not a wrongly-assumed optimization case.

Here’s a more detailed view of what I think is going on internally:

require(Module)
- If Module has already ran, then return the result.
- If Module is not already running then…
  - Set up an array of Lua coroutines to resume on finish
  - Yield the calling coroutine
  - Add the calling coroutine to the array
  - Start the Module
  - When the module finishes, resume all of the to-resume-on-finish coroutines
- Else, if it is already running
  - Yield the calling Lua coroutine
  - If the module’s coroutine’s state is not normal then… (optimization)
    - Add the calling Lua coroutine to the to-resume-on-finish array

I’ve already stated why Roblox would choose to make this optimization and why the assumptions behind it are incorrect.

One of the reasons I think Roblox is making an optimization is because they can garbage-collect coroutines if they’re in a require loop. Roblox can have a “require loop” return type for require() that just yields the caller, then they can remove all references to the coroutines and garbage-collect them. One part of “remove all references” would be to make sure that the coroutines don’t reference each other in their to-be-resumed arrays. This would be achieved by yielding the calling coroutine and not adding to that array in the first place – exactly matching the current behavior.

I’m guessing the way you see it looks something like this:

require(Module)
- If Module has already ran, then return the result
- If Module is running, then…
  - Yield the calling coroutine
  - If Module has a waiting list, then add the calling coroutine to it
- If Module has not ran and is not running then…
  - Start the module
  - If the module yields, then…
    - Yield the calling coroutine
    - Create the waiting list
    - Add the calling coroutine to the waiting list

This would result in not adding the calling coroutine to the waiting list since the module has not yielded yet.

I think it’s odd that Roblox would both yield the coroutine and check if the module has a waiting list. Why would Roblox write that if knowing that in some cases there is no waiting list when there should be? If this design is the case then I would expect that Roblox would write no if and we’d get an internal error instead of the never-resuming behavior.

On the other hand, this implements the “don’t add normal state coroutines to the to-be-resumed list” optimization by the nature of its design. It doesn’t even need a “require loop” return type – if the module is “running” and doesn’t have a waiting list, then it will exhibit the same behavior. This also only creates a waiting list if the module yields, which is a plus.

Neither of us can see the internals, but speculating as to what causes this bug is fun.

This is why I mentioned using BindableEvents and stuff like ChildAdded. I use coroutine.wrap in the examples because it’s quick and easy to use and understand. Roblox-supported coroutine/thread “types” work just as well. This bug was actually discovered using a BindableEvent-based spawn()-like method.