PSA: Connections can memory leak Instances!


#1

Preface

I’ve seen several experienced developers not knowing this, or not having a full understanding of what’s going on, so I thought I’d write a dedicated post on the issue of accidentally leaking Instances (so that they are not GCed even when there are no references to them) through lingering event connections.

What circumstances cause a leak

do -- All good, this part will be GCed just fine
    local p = Instance.new('Part')
    p.Touched:connect(function() print("Touched") end)
end

do -- OOPS!... this part will hang around in memory forever!
    local p = Instance.new('Part')
    p.Touched:connect(function() print(p) end)
end

do -- OOPS!... still leaking, it doesn't have to be a direct reference!
    local p = Instance.new('Part')
    local dataTable = {Message = "Test"; Part = p}
    p.Touched:connect(function() print(dataTable.message) end)
end

do -- All good, we break the connection so the part is free to be GCed
    local p = Instance.new('Part')
    local cn = p.Touched:connect(function() print(p) end)
    cn:disconnect()
end

do -- Also also all good, as Destroy() implicitly disconnects all connections
    local p = Instance.new('Part')
    p.Touched:connect(function() print(p) end)
    p:Destroy()
end

Why the leak happens

The core of this problem comes down to the fact that the “connection list” of connected functions that Roblox objects store is a list opaque references, which just tell Lua “there is a reference to this function”, without telling Lua anything about where that reference is coming from. That means that you can get a circular reference which the garbage collector can’t recognize and collect. From the second example above where the part gets leaked indirectly:

  • The closure “function() print(dataTable.Message) end” has a reference to the dataTable, through the upvalue “dataTable”.

  • The dataTable has a reference the part as one of it’s members

  • The part has a connection list, which includes a reference to the closure

  • Normally you might expect this cycle to not be an issue, since the Lua garbage collector can detect cycles with no other references

  • However the connection list is a C++ list of “I have a reference to this Lua function” references, so Lua has no way of knowing that the only reference to that closure in fact comes from the very part it has a reference to (indirectly through the table)

  • RIP client/server RAM if you ran this code really frequently

How to prevent it

The solution is very simple. Always :Destroy() Instances that you’re not going to use again… unless of course they’re things that you’re not holding any script references to, then they’ll GC nicely regardless of what you do. Destroying breaks any connections that an object has, and does it recursively, so if you just destroy all of your top level objects that will clear out any of these problematic connections that would otherwise leak references.

Easy… unless, you’re like me and you use an object oriented style with a custom “Signal” class to make events on your Lua objects. If you use a BindableEvent object internally to implement that Signal class, then you’d better make damn sure that you’re actually disconnecting all of the connections that you make on any copies of those Signal objects with a lifetime shorter than that of the server/client that they’re on. @Quenty has a very nice pattern for this with his “Maid” class that you can find in multiple of his projects if you need a way to fix it. My approach is to literally put a “Destroy” method on my Lua objects which disconnects all the connections / Destroys the sub-Lua-objects and call it when I’m done with something, since I find the Maid pattern a bit obfuscating.

Edit: A note on temporary scripts

If I recall correctly (and as brought up by some other posters), there is some behavior to try to break connections made in temporary scripts (for instance, scripts under the character) when the scripts are removed. I don’t know the full details on that since I write almost all of my scripts as permanent ones to avoid race conditions, I’ve never explored the right way to write temporary variants.

No matter how you look at it though, this issue is far more important on the server, because most client visits are short enough that even if there are some small leaks there won’t be time for them to add up to a problematic level.

Closing

If you didn’t know about this, you should probably take a second look at your game code to make sure you haven’t fallen into this trap anywhere, especially if you’re using a custom Signal class, since it’s so easy to be lazy and not store the connection objects for those.


Game slowly get's more laggy the longer the server is active?
Server delays (Gets worse over time)
Featured Games Program - Getting Started & Expectations
Memory Leak Issue
Constant Game Lag
Server-side loops lag a LOT, while single actions don't
#2

Nice, that’s what I do too.

This is all a good example of how we can be learn to design our systems better. Lua falters from making all of this not very obvious. In Swift, you can explicitly mark variables as ‘weak’ to avoid this issue, which I’ve really enjoyed using.


#3

Might be a good idea to move to #tutorials cause this seems too useful to die from drowning in other development discussion posts :wink:


#4

I wouldn’t call this an issue with Lua. Really the issue causing this is that garbage collection on the Lua side is being mixed with Reference Counting (like Swift) on the C++ side.

You could theoretically use a non-mixed solution, where every object exists primarily on the Lua side under the purview of the Lua garbage collector, but that would have very significant performance implications so it isn’t really feasible. Specifically, you could allocate the C++ objects as Lua userdatas rather than allocating them on the C++ side and only allocating “Ref” objects to point to them on the Lua side.

But the mixing does have some awkward side-effects like this one.

Not sure that it’s really a “tutorial” since it doesn’t actually tell you how to solve the problem on a detailed level, just that it exists. I feel like it would take a lot of expansion to make it into a tutorial.


#5

So if one has a build-up of these connections over time, it could cause the server to lag? When you say use :Destroy(), could this be used neatly on the connection itself to detach all of the references, or do you need to use it on each individual instance?

Sorry if this seems like an odd question, I’m just trying to understand the issue and its solution.


#6

It will gradually manifest as lag, because it will create a huge amount of stuff in memory and make the garbage collector run really slowly. Eventually it will straight up crash the server as it runs out of memory.

No, connections / events don’t have a Destroy on them. You just need to disconnect the connection to the function that is holding the reference somehow. That somehow can be:

  • Explicitly disconnect()ing the connection
  • Destroying the instance that has the Event in question on it, which will disconnect all connections on all of its events
  • Destroying something further up the hierarchy that the instance is in. When you Destroy something all of its children are destroy-ed, and all of those childrens children, etc, recursively. You only need to destroy your top level objects. For instance if you have a car with Touched connections on some of its parts which reference those parts—you don’t need to Destroy the parts / disconnect the connections, you just need to Destroy the whole car model when you’re done with it, and that will indirectly break the problematic connections.

#7

Are Player Instances destroyed when they leave the game? Is special logic required to prevent memory leaks with Player related connections?


#8

Interesting, thanks!

I recommend putting this in #tutorials


#9

I’ve been using this to disconnect one-time use connections;

local con
con = something.something:connect(function() 
print("works")
con:disconnect()
end)

That way once the connection has been used for its purpose, it should no longer exist. I can’t see any immediate issues with it, anything I’ve overlooked?


#10

Wouldn’t the :Wait() method be better for single usage events?

local passedArgs = something.something:Wait()
print("works")

#11

That’s possible, but if “something” can be destroyed, that :Wait() will hang forever. And if you want to do code after it, you’ll have to wrap it in its own thread anyway, hence an event connection would be cleaner.


#12

Thanks for the reply! So just to reiterate, using :disconnect() on all connections you make (once you’re done with them) cleans up this issue?


#13

Doesn’t ROBLOX’s default Health script cause a nasty memory leak every time a player dies?

-- Gradually regenerates the Humanoid's Health over time.

local REGEN_RATE = 1/100 -- Regenerate this fraction of MaxHealth per second.
local REGEN_STEP = 1 -- Wait this long between each regeneration step.

--------------------------------------------------------------------------------

local Character = script.Parent
local Humanoid = Character:WaitForChild'Humanoid'

--------------------------------------------------------------------------------

while true do
	while Humanoid.Health < Humanoid.MaxHealth do
		local dt = wait(REGEN_STEP)
		local dh = dt*REGEN_RATE*Humanoid.MaxHealth
		Humanoid.Health = math.min(Humanoid.Health + dh, Humanoid.MaxHealth)
	end
	Humanoid.HealthChanged:Wait()
end

Since characters never get :destroy() called on them, they just get parented to nil. This is also the case for Player instances.


#14

This has been my suspicion for quite a while. It’s nice to know that the closure is aware of usage underneath it. I’d highly recommend disconnecting all events you don’t intend to be permanent.

General notes:

  • :Destroy() is recursive so you only have to call it on the parent hierarchy
  • If your client scripts don’t persistent when the player dies, you may be fine. This isn’t a great assumption, but Roblox definitely disconnects scripts on death. Module scripts are an exemption to this, and don’t disconnect (for good reason)

I stole this pattern from Anaminus, although my implementation does some nicer things when nested Maids and on-cleanup calls.

Lua has weak tables, but as we’ve discussed before, these don’t work out too nicely for GC.

No, these scripts in the character and definitely removed / destroyed in some manner.

while true do
	wait(0.5)
	print("Still running script", script:GetFullName())
end

Stops execution after death. This implies that signals are disconnected as well.

For players themselves, I’m not sure. I always disconnect events on players. I think scripts inside of them are also destroyed.


#15

It seems scripts specifically within the character are disabled, but events connected to the character from elsewhere are not. So if you have a HealthChanged function (for example) connected to players OnCharacterAdded from the server, this connection does not disappear just because the character dies. The garbagecollection for this event then follows the rules posted in the OP. I’d imagine this is the same case for players.


#16

This post right here will fix a lot of lag issues in my game. Great post! :slight_smile:

(You might want to consider moving this to the public tutorial section. This is seriously useful information.)


#17

Could the core issue be worked around with a backport of Lua 5.2’s ephemeron weak tables?


#18

Yes, I believe it could… you could use an ephemeron table to hold references to Lua functions in connection lists. However, it suspect that it would also make the Lua bridge much more complicated:

  • The Lua bridge would have to be “aware” of something being passed over it is being used for, since the closure passed to connect would have to be associated with object whose connection list it is being stored in when being put in the ephemeron. Currently it’s just a flat system opaquely passing values across.

  • You would probably have to hold a strong reference to the Lua-side “Ref” object associated with the instance while that instance is in the object hierarchy and it has any connections to Lua functions (Note: The connection interface is uniform between C++ and Lua, the engine can connect events to C++ functions too) so that the closure doesn’t get GCed while the object still exists on the C++ side (currently unless you get a reference to a given instance, that instance doesn’t exist on the Lua side at all—even though it may have connections to Lua functions which don’t reference it)

  • There is probably additional complexity involved in getting RBXScriptSignal::Wait to function correctly in all circumstances with this ephemeron implementation.

TL;DR: It would definitely be a non-trivial project even if the backporting part were free.


#19

connect() is deprecated, you’re spreading false info!111!1!!

Jk, good guide and definitely something I never thought about.


#20

I’m an old fogey… so uppercase Workspace and lowercase connect it is.