Interested in learning more about the heap in Lua

C_Sharper · May 18, 2019, 7:45pm

Hey, so more recently I’ve been re-studying C#. Since C# is object-oriented, it’s a lot different to learn than something like Lua. I was just wondering if somebody could explain to me how the heap is implemented/works in Roblox specifically. I know Lua doesn’t support classes and instantiation of them as objects, but I’m guessing that metamethods is sort of like a replacement for classes? Might be mistaken here, but that’s what I assume.

Anyways, let’s think of this example:

local table = {1,2,3,4}

If I were to do this

 table = nil

Would the actual data for “table” still be located in the heap, and be obligated for garbage collection? Or does Lua handle it a bit differently as compared to something like the .NET framework in C#, as if Lua does this automatically when I set it to nil.

Again, I’m not even 100% sure my logic is even correct in this matter. I’m not entirely sure what happens behind the scenes when I declare a table to my identifier.

I hope I don’t sound stupid. In a nutshell, I’m just trying to learn more about how Lua handles the heap, and what happens behind the scenes when I run code like this, that seemingly is stored on the heap.

Thank you!

ViktorCorvinus · May 18, 2019, 7:54pm

Well here’s some information on heap and the functions of heap and its objects from a wiki:

Heap contains the following functions:
.new(comparator) - Creates a new Heap
comparator: Uses this function to compare values. If none is given, will assume values are numbers and will find smallest value
Comparator should accept two values and return true if a should be further up the heap than b and false otherwise
:Heapify(oldTable, comparator) - Converts a table to a Heap - Will destroy the provided table
comparator: The comparator to pass to Heap.new(comparator)
:Meld(heap1, heap2) - Creates a new Heap using the two provided Heaps
A Heap object has the following functions:
:Insert(newValue) - Adds an element to the Heap
:Pop() - Removes the first element in the Heap and returns it
:Peek() - Returns the first element in the Heap but does not remove it
:GetAsTable() - Returns a table of the elements in the Heap
:Clear() - Removes all values from the Heap
:Print() - Prints out all the values in the Heap
:Size() - Returns the size of the Heap
:Clone() - Creates and returns a new copy of the Heap

ViktorCorvinus · May 18, 2019, 7:57pm

So you could use GetAsTable then iterate through the elements of the table to see if what you’re looking for is there, hope this helps!

goldenstein64 · May 18, 2019, 8:18pm

Is this the source of the information?

github.com

Roblox/Wiki-Lua-Libraries/blob/master/StandardLibraries/Heap.lua

-- Heap contains the following functions:
--		.new(comparator) - Creates a new Heap
--			comparator: Uses this function to compare values. If none is given, will assume values are numbers and will find smallest value
--						Comparator should accept two values and return true if a should be further up the heap than b and false otherwise
--		:Heapify(oldTable, comparator) - Converts a table to a Heap - Will destroy the provided table
--			comparator: The comparator to pass to Heap.new(comparator)
--		:Meld(heap1, heap2) - Creates a new Heap using the two provided Heaps
-- A Heap object has the following functions:
--		:Insert(newValue) - Adds an element to the Heap
--		:Pop() - Removes the first element in the Heap and returns it
--		:Peek() - Returns the first element in the Heap but does not remove it
--		:GetAsTable() - Returns a table of the elements in the Heap
--		:Clear() - Removes all values from the Heap
--		:Print() - Prints out all the values in the Heap
--		:Size() - Returns the size of the Heap
--		:Clone() - Creates and returns a new copy of the Heap

Heap = {}
Heap.__index = Heap

This file has been truncated. show original

I googled "heap contains the following functions", and that was the first result.

Amiaa16 · May 18, 2019, 8:33pm

Lua’s garbage collector removes data from memory if it is no longer referenced anywhere (weak tables are an exception).

Quote from lua pil:

garbage collector can collect only what it can be sure is garbage; it cannot know what you consider garbage (…) it is up to you (i.e., your program) to assign nil to these positions so that they do not lock an otherwise free object

Fortunately roblox handles all garbage collection stuff you would normally have to handle yourself in pure lua (hence why you can only use count of collectgarbage()), so you don’t have to worry about it.

IdiomicLanguage · May 18, 2019, 10:02pm

That’s a different type of heap. Those heaps satisfy the min-heap or max-heap properties that parents nodes are either respectively smaller or larger than their children. Siblings and cousins have no guarantee of structure but the smallest or largest node is always on top. To find the next smallest/largest one need only look at its children.

The name collision between heap data structures and the heap is truly unfortunate. Look at a StackOverflow question I see that no one else can see a clear reason for the name collision besides a “heap” in the traditional sense being a messy pile of something. A heap data structure isn’t completely ordered and the heap is a very, very messy place.

As for Lua’s heap… A better question would probably be how Lua’s garbage collector works. I found a nice article here that explained a lot of it. Every object in Lua is part of two structures, a graph and a linked list. The graph is the graph of tables that make up the Lua environment with the connections being the links between tables. The linked list is a single, long linked list between every object in the Lua state. Every once in awhile the garbage collector pauses Lua and traverses the table graph, recursively marking objects. If it encounters an already marked object then it doesn’t mark it again or visit its connections. Once every object connected to the root of the graph is marked, it runs through the linked list of every object in Lua, removing every object from the linked list that isn’t marked.

At least that is how Lua version 5.0 worked. 5.1 introduced an incremental garbage collector. Basically it had three marks instead of two, marked, unmarked, and fringe. Marked nodes and unmarked nodes have to be separated by the fringe nodes, and any mutation that occurs between steps needs to keep this true. This means that every assignment needs to check if it violates this rule and if so the garbage collector runs to correct it.

This method of garbage collection means that Lua doesn’t have to keep track of the number of references to objects like some scripting languages do, meaning that there is hopefully less overhead and in general Lua runs faster. One of the problems with this method of garbage collection is that it is latency bound by pointer chasing, being heavily graph based, whereas traditional reference counting takes place when the object is presumably already in a CPU cache. The advantage over reference counting is that extra work isn’t done as the reference count fluctuates in between collection cycles.

More specifically for tables, when a table is created a Table structure is filled out:

typedef struct Table {
  CommonHeader; // This contains the linked list of all objects
  lu_byte flags;  // I'm not entirely sure what these tags are used for
  lu_byte lsizenode;  /* log2 of size of `node' (hash table) array */
  struct Table *metatable;
  TValue *array;  /* array part */
  Node *node; // the hash table
  Node *lastfree;  /* any free position is before this position */
  GCObject *gclist;
  int sizearray;  /* size of `array' array */
} Table;

The Node structures are a key and value pair, both of which have garbage collection links too. The Key is composed of a value and a optional link to another key in case of hash collisions. Check it out here:
https://www.lua.org/source/5.1/lobject.h.html#Table

Autterfly · May 18, 2019, 11:12pm

Expanding on this; these aren’t used widely in 5.1, but they were a start to an attempt to optimize metamethods lookups. The idea (as far as I can tell) was to mark a single bit flag for commonly used metamethods when they are set/unset in a table.

That way, when you were to look up if the metatable contained a metamethod every time you were going to call it (say, for __index), you could first check if the bit was set before performing a search in the metatable’s nodes for the __index field.

Example of it used is luaT_gettm.