A Beginner's Guide to Lua Garbage Collection

This article is designed to introduce complete beginners to the concept of Lua garbage collection and how Lua handles automatic memory management.

A Beginner’s Guide to Lua Garbage Collection


References and Variables
[Section Abstract: This section will focus entirely on the distinct difference between objects and the references that point to them. By the end of this section you should be able to understand “copy-type” and “complex-type” objects as well.]

The first concept we will be covering is the concept of “references” in programming and what it means. Luau programmers, specifically on Roblox, usually call any reference a “variable”. When you think of a variable, you probably think of the following code.

You are not incorrect to call this a variable; in fact, it aligns pretty well with what variable means to a scripting language like Lua. However, if I were to tell you that this is also a “reference”, you may not understand what I mean.

When you assign a variable in Lua (at least in this situation), you are creating a new object in your code’s Lua heap. This object (of type number, and value 10) did not exist before this line of code ran. You have created it. It works like this.

This may seem obvious, but the important distinction that this image is attempting to show is that a reference name and what object that reference contains are two completely different things. A reference name is a name given to a specific object that occupies memory in your program.

There are two ways that these objects can occupy memory, and which way is dependent on whether or not Lua and C dignify an object with its own identity. The first type of object will be considered a “copy-type” object, and the second type of object will be considered a “complex-type” object. Here are examples of both.


Note. The terms “copy-type” and “complex-type” are not official terms, but will be refered to in this way throughout the article to help with understanding.

The “copy-type” object does not have Lua identity. When created it will contain that value, but if a separate reference were to write to that value, they do not equate to the same object in memory (the new reference copies the value of the original reference instead of pointing to the same one).

The “complex-type” object has identity, and has a sense of “selfness”. When another reference points to it, the value is not copied, it just points to the exact same object. This is why the exact same table still exists under the second reference in the complex-type example.

Copy-type objects consist of strings, numbers, booleans, and nil. Complex-type objects consist of functions, tables, threads, and any non-native type that contains identity (you can test this on your own in Roblox Studio).


Intro to Lua 5.1’s Incremental Garbage Collection
[Section Abstract: This section will completely cover the fundamentals of Lua garbage collection and how it works internally. This section will also cover strong and weak references, as well as weak references in regard to weak tables in Lua.]

You may have heard the term “GC” or “garbage collection”, but had no idea what it meant. This section will cover this entirely.

Now that you understand that a distinction exists between created objects and the references which point to them, you will be able to understand garbage collection and its function.

In Lua 5.1, garbage collection info can be accessed through collectgarbage(); however, this is deprecated on Roblox, and your primary interaction with the collector is through gcinfo() instead.

Garbage collection, simply put, is Lua’s automatic memory management system. It handles objects that are obviously no longer needed by your program and frees that memory so it can be used elsewhere. This is nice in an embedded language like Lua because the scripter can focus more heavily on how the program works instead of constantly being under pressure of whether or not memory is being cleaned up.

There are two types of references to objects in programming- “weak” references and “strong references”. A weak reference is a reference that does not protect an object from garbage collection. A strong reference is a reference that does protect an object from garbage collection. As a Luau programmer you don’t need to know much about functionally programming with weak references. The references that you create in your programs on Roblox are always strong unless declared under a weak table (which is really the only way that Lua allows for explicit weak references). A reference will become weak only when it: goes out of scope, is declared weak under a weak table, or the program ends.

Up until Lua 5.0, there was a traditional mark-and-sweep collection system. This simply means:

  • A marking step where the collector traverses through the program and marks objects to collect and leave based on what kind of references exist to those objects.

  • A sweeping step where objects which were marked for collection are collected, and the GC process is completed.

The biggest problem with this system, understandably, is that in order for the collector to run, long pauses could exist in your program and managing memory and references was a more tedious process because of this. In Lua 5.0 this changed.

Lua 5.0 introduced the incremental garbage collector, which solved this problem by running GC incrementally. Some of a program runs, some GC happens. Whether or not GC happens is dependent on how much memory has been allocated since its last iteration. In native Lua, you can manipulate those settings with collectgarbage(“setpause”) and collectgarbage(“setstepmul”); but, as previously stated collectgarbage() is deprecated and unnecessary on Roblox. You can instead use gcinfo() to access information about memory usage. gcinfo() works the same in Luau as it does in Lua 5.1.

image
Note. This figure shows the definition of what gets pushed in C as a result of calling gcinfo().

Incremental collection brought a lot of improvements to Lua, and works more seamlessly with “the mutator” (which just means the program itself in all GC literature, and will be referred to as such moving forward).

The incremental collector contains a tri-colored collection algorithm. When garbage collection happens, the heap is divided up into three sets. The set of black objects, or objects which are not considered for collection as they are accessible. The set of gray objects, which have been visited but remain untraversed. And, the set of white objects, which have not yet been visited (or traversed).


Note. Objects that are accessible from Lua’s root set are immediately colored black or gray, and the rest of the graph is traversed from that point. Gray objects act like a barrier between black and white objects.

In the beginning of this process, all nodes are colored white (bar Lua root set objects which are colored gray or black at all times). The “root set” is the set of immediately accessible data in your program without following any pointers. The collector starts here, coloring these objects gray or black depending on their exact hierarchy.

The collector discovers reachable nodes by finding an intersection between a gray and white node and coloring that white node gray. In simpler terms, the marking phase will continuously find a gray object and mark its white children as gray. A gray node is colored black when all of its children are also gray. A marking cycle ends when there are no more gray objects remaining as this signifies having reached all accessible children in the available graph.

Once this traversal has happened, the “atomic step” of GC occurs signaling the end of the marking phase. The atomic step will re-traverse all gray objects, clear weak tables twice, and separate objects which will be finalized (collected and discarded).

There are some asterisks to pay attention to under the marking phase, which go as follows (these are important because collection co-exists with the mutator):

  • Assignment of tables move black tables back to gray.
  • Assignment of a metatable moves a white metatable forward to gray.
  • Objects moved back to gray are kept in a separate list, only to be returned to during the atomic step of the marking phase.
  • Generally, gray objects create a border so that black and white objects do not touch; but, the mutator can surpass this expectation as it runs incrementally alongside the collector.
  • Stacks are kept gray.

The sweep phase then commences. During the sweep phase, objects which have been selected for finalization and have a metatable attached will attempt to execute their __gc metamethod.

The incremental collector reduces the length of pauses in the mutator, but does not decrease the overall GC overhead. There will be links which discuss this concept more in-depth at the bottom of this article.

Lua is heading towards a more generational collector in the future, but this isn’t worth understanding the logistics of because Luau is built from the Lua 5.1 codebase.

Let’s look at and understand a garbage collection cycle from a more traditional scripter’s standpoint now:


Note. This figure shows usage of collectgarbage() to manually initiate garbage collection in Lua 5.1.

As discussed earlier, an object will be collected when no more strong references exist to that object (which is what the tri-color mark phase works to determine). In the first collection that we force with collectgarbage(), our table object is still considered important because the mutator maintains a strong reference to it. Then, we nullify the only strong reference to this object, which makes it suitable for collection.

An object is not immediately collected just because no strong references exist to it. Many Roblox scripters seem to believe that when they nullify a reference, that object no longer exists. That is simply not the case. It will, however, be collected once garbage collection occurs.

Before this section closes out, I will briefly cover weak references in regard to weak tables (as to create as complete of a beginner’s guide as possible). Lua implements weak references exclusively in the form of weak tables, where either the keys “k”, values “v”, or keys and values “kv” are considered weak as determined by its metatable’s __mode metafield. An example of this is shown below.

image

In Lua and Luau there are very limited reasons that you would need to declare a weak table. Unless you’re caching results of some statement or function that is consistently called you probably don’t need one, never the less it is the only way to declare weak references at all in Lua and worth understanding. Weak tables for object caching, when used correctly, can result in major benefit depending on how intense a Lua structure is on its call-stack.

The following example shows caching of some account system, where accounts that are no longer active are garbage collected (because the references to them in the weak table are weak, and when a user discards their strong reference, nothing else remains to hold it from collection).

image

It is also good practice to never make changes to __mode on an active metatable. The behavior will be unstable and is not well documented. You should always create a weak table’s __mode field during its inception.


How Garbage Collection Works in Roblox and Luau
[Section Abstract: This section will entirely cover the differences between garbage collection with Luau on Roblox and garbage collection from Lua 5.1. It will also cover good practices when working with engine internalized objects (instances).]

Now that we understand how garbage collection, objects, and references work- we can look at the differences between GC in an IDE running native Lua vs. GC on the Roblox engine with Luau.

The first thing to note is that your interaction with the collector itself is heavily limited on the Roblox engine, the only available function for direct interaction being gcinfo().

On Roblox, garbage collection is a natural part of its cyclical “task scheduler”. It happens just before the simulation job in each cycle. This means that the only way to actually force a garbage collection cycle is to yield your code until the next task scheduler heartbeat occurs (because Heartbeat is prompted after a collection cycle).

Once we are sure that a new heartbeat has occurred, we can also be sure that the task scheduler has passed its collection step (this doesn’t necessarily mean your object was collected, but in most cases it does).

It’s worth noting that references to Roblox instances aren’t weak (because these objects are maintained internally and through the Lua/C exchange stack). In order to properly dispose of an object that the engine itself is keeping track of you should always be calling :Destroy() on that instance. Not understanding this concept can lead to serious memory leaks in your game that you may not realize exist at all.

image

There are also noteworthy differences in best practice when it comes to weak tables on Roblox. You should not expect instances to be collected just because a weak table contains the only remaining Lua reference to them. The engine itself, even in this situation, is keeping track of this object- which means the collection is halted.

image

You should always be calling :Destroy() on instances that you no longer intend to use. :Destroy() is recursive, and when called on a parent instance, will also be called on the child instances of that parent.

Do not fall into the trap of assuming that just because you nullify instance references that the instance no longer exists. You should, for absolute safety and memory leak avoidance, both call :Destroy() on instances and manually remove them from weak tables. This is something that can be done easily under a paradigm of object-oriented programming.

You can write your own cleanup function to be assured that your instances are being properly disposed of. An example is shown below.


Note. Example of a “Life” class which supports a call to dispose itself (but assumes manual disposal of any strong references remaining to its table object.

A Roblox scripter who does not carefully handle instances which they are finished with is asking for memory leaks. Find out what paradigm you like to program under, and implement disposal accordingly for instances.

There are also differences between the native Lua 5.1 garbage collection and the garbage collection that exists in Luau.

The Luau garbage collector still runs incrementally with memory allocation acting as its time pacer (once a certain amount of memory is used, a certain amount of collection happens). Similarly to Lua 5.1 (but different in the way that through aggressive allocation this can be forced in Lua 5.1) the collector will never traverse the entire heap at once. Luau has a “proportional–integral–derivative” controller which helps aim for a target heap size.

The Luau garbage collector still contains its atomic step, which can still be cause for semi-short pauses in execution of the mutator or collection process. The atomic step is given far greater emphasis in the Luau codebase while still working under the same traditional principles that Lua 5.0 contained. The sweep step also contains significant improvements in Luau, which can be read about here.

To explicitly call for garbage collection in C (which we briefly spoke about earlier in this article), in a non-Roblox Luau environment you must include lua.h (where definitions contain the prefix lua_). This defines Luau’s basic functions provided by Luau. You can then manipulate GC through int lua_gc(lua_State* L, int what, int data);.


Best Practices & Closing Remarks

  • If you aren’t sure whether or not your object will be garbage collected, nullify all in-scope references and upvalues.
  • Always call :Destroy() on objects that the game engine itself also keeps track of (instances).
  • In an OOP situation, always write a custom function for table and newproxy objects to assure destruction (where that function disconnects all important connections and destroys the instance objects).
  • Yield for a garbage collection cycle by calling RunService.Heartbeat:Wait() or task.wait(). Be mindful of doing this in loops- where attempts to modify garbage collected objects becomes much easier.

If you found the logistical sections of this article confusing, do not be alarmed. You do not need to understand how garbage collection works internally in order to appeal to it from within your own Lua code.

Lua is an extensible (and an extension) language. You do not need to understand C (or garbage collection) in order to apply good scripting form when dealing with objects and references. It is, however, important to understand what garbage collection is so that you are not programming in the dark.

Many Roblox scripters care very little for why things work and what happens behind the scenes, but practicing and understanding these more foundational concepts is what improves your habits and overall quality as a game developer.

I recommend anyone who read this entire article to equally teach themselves about the Roblox task scheduler at large and what happens in every step of its cyclical process. These are the concepts that genuinely improve codebases.


Works Cited

Lua-Users.Org.” Lua Users , lua-users.org. Accessed 15 Apr. 2022.

“Lua - Garbage Collection.” Tutorialspoint , Lua - Garbage Collection. Accessed 15 Apr. 2022.

Soldevila, Mallku, et al. “Understanding Lua’s Garbage Collection.” 22nd International Symposium on Principles and Practice of Declarative Programming , 2020. Crossref , Understanding Lua’s Garbage Collection | Proceedings of the 22nd International Symposium on Principles and Practice of Declarative Programming. Accessed 16 Apr. 2022.

“The Programming Language Lua.” Lua.Org , www.lua.org. Accessed 16 Apr. 2022.

Stravant. “PSA: Connections Can Memory Leak Instances!” DevForum | Roblox , 3 Aug. 2021, PSA: Connections can memory leak Instances!. Accessed 17 Apr. 2022.

“Tri-Color Garbage Collector [Sean.Cm].” Sean.Cm , sean.cm/a/tricolor-garbage-collector. Accessed 18 Apr. 2022.


Keep Learning

Understanding Lua’s Garbage Collection
Cyclical reference counting with lazy mark-scan
“No Silver Bullet” - Garbage Collection for Java in Embedded Systems
Garbage Collection Lecture
Garbage Collection on tutorialspoint
My Video Tutorial on Weak Tables vs. GC

166 Likes

Great post on Lua Garbage Collection. I come from a C/C++ background and found this very useful!

1 Like

Is a table inside of another table a strong reference?

You would need to be more specific in your question. The real answer here is that in vanilla Lua (not necessarily Luau because this will muddy the water a little) every single reference that you create is a strong reference. The only time a reference is not strong is when that reference exists specifically under a weak table’s specified weak array (either Key, Value, or Both). When under a weak table, the garbage collector will no longer consider these references during its collection cycle. See example below.

local A = {}
local B = {}

A.B = B
B = nil --> Only remaining reference exists inside of the A table.

collectgarbage()
print(A.B) --> table: 0x...

setmetatable(A, {__mode = "kv"})
collectgarbage()

print(A.B) --> nil

If I have

dict = {
 [InstanceHEXCODE] = {
    Speed = 25,
    Colors = { 
       Piece1 = {2, 6, 88}
       Piece2 = {5,1,62}
    }
 }
}

and then I do 
table.clear(dict)

will it collect garbage?

no only collectgarbage() would collect it