In my java class, we’re learning about single/double/circular linked lists and hash maps. AFAIK tables in lua have all of these functionalities. From reading here on the forum, trees seem to be an important data structure to define ourselves, but would these be of any benefit? What are some beneficial custom data structures?
Intrusive doubly linked lists can be useful at times (basically, using the Lua objects you already have as the nodes in a linked list by adding previous and next fields to them). The reason is that an object can remove itself from an intrusive linked list without any searching, which is a nice property to have.
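Here is a minimal sketch of what that can look like (the sentinel table and the `_prev` / `_next` field names are just one possible convention, not something prescribed above):
local list = {}
list._prev, list._next = list, list -- circular list with a sentinel node

local function insert(object)
    -- Link the object in just before the sentinel (i.e. at the back of the list).
    object._prev, object._next = list._prev, list
    list._prev._next = object
    list._prev = object
end

local function remove(object)
    -- The object unlinks itself; no search through the list is needed.
    object._prev._next = object._next
    object._next._prev = object._prev
    object._prev, object._next = nil, nil
end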
Hash maps: Yes, every Lua table is just a hash map. There’s nothing to implement here, but there’s a lot of common coding patterns on Roblox using hash map behavior since Lua gives it to you for free.
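A small example of that pattern, using a table keyed by object as a set (the helper names here are purely illustrative):
local tagged = {}

local function tag(part)
    tagged[part] = true -- insert: average O(1)
end

local function untag(part)
    tagged[part] = nil -- remove: average O(1)
end

local function isTagged(part)
    return tagged[part] == true -- lookup: average O(1)
end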
Heaps: Be very wary of implementing these in Lua. I have yet to see a Lua heap implementation that was actually faster than just using a flat array; without JIT compilation, the math / function calls to do heap management / traversal are just too slow compared to linearly iterating an array.
In every computer science program that I know of, there is at least one class dedicated to data structures. The study of data structures is based around “asymptotic runtimes,” which is a fancy way of asking how an algorithm scales as we add more data. The main notation for runtime is big-O, “O(n)”, which describes how the runtime grows in the worst case. It doesn’t consider the constant cost of individual operations, only how the cost scales. Thus, as @stravant said, heaps, which have a better asymptotic runtime than arrays for insertion into ordered data, may still be less performant in some languages because of those constant costs. Big-O is a theoretical measure meant to be independent of the machine the code runs on, which allows algorithms to be compared fairly.
Since tables are the only data structure in Lua, all other structures in Lua must be built with tables (although technically you could implement custom ones using upvalues and closures, but we’ll ignore that).
There are many, many data structures, and each has its own properties that make specific applications more efficient. For example, given an array A of Vector3 points, how would you find all the points within distance d of a given point b? You would have to perform a scan over the entire array. If the array had n elements, this would result in a runtime of O(n). If you use an octree, the search takes O(log(n)), MUCH faster for larger numbers of points.
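For reference, the brute-force O(n) version of that query is just a loop like the following (`pointsWithin` is a made-up name, and `A` is assumed to be a plain array of Vector3 values):
local function pointsWithin(A, b, d)
    -- Check every point against the query point b.
    local found = {}
    for _, point in ipairs(A) do
        if (point - b).Magnitude <= d then
            found[#found + 1] = point
        end
    end
    return found
end
An octree avoids most of this work by only descending into the cells that could possibly contain points within d of b.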
Spatial structures like the octree allow fast spatial operations. Priority queues keep an ordered collection of elements for quick access and insertion. Graphs have a HUGE number of applications, although they are sometimes not considered a “data structure”. Graph theory finds its fingers in neural networks / algorithmic differentiation and the related optimization of loss functions, pathfinding, computer vision, electric potentials, symbolic AI, compilers, and MUCH more. There are many ways to represent graphs. Problems can be solved on them directly, as in pathfinding or image segmentation for computer vision, or graphs can be manipulated to optimally solve other problems, as in neural networks and compilers. Graphs can efficiently store data, operations, 3D objects, regions, or pretty much anything else. Trees are a subset of graphs with no cycles and a specified root node.
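As a concrete example of one representation, an adjacency list in Lua is just a table mapping each node to a list of its neighbors (a sketch, not a full graph library):
local graph = {
    A = {"B", "C"},
    B = {"C"},
    C = {"A"},
}

-- Visit the neighbors of node A.
for _, neighbor in ipairs(graph.A) do
    print("A ->", neighbor)
end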
But to get back to the benefit of data structures in Lua… it depends on what you want to do. Many on this forum are of the opinion that more resource-intensive work should be done in the engine in C++ rather than in Lua, and the cases where an intensive algorithm needs to run in a script to make a game are few, so these data structures are often not needed. Most of the cases I’ve seen are preprocessing maps / parts to automatically generate relationships, building a custom pathfinder to get features not included in Roblox’s, or plugins. By far, most applications are offline, when a server is not running.
Challenge accepted! What problem / conditions did you have in mind?
Results
Okay, so I’ve written a heap implementation, verified it, and tested it. Here is the code:
Code
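-- getMin: linear scan over a flat array, returning the smallest element strictly greater than `last` (the array itself is never modified).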
local function getMin(array, last)
local min_element = math.huge
for _, element in ipairs(array) do
if element < min_element and element > last then
min_element = element
end
end
return min_element
end
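-- heapInsert: sift-up insertion into an array-backed binary min-heap; the parent of index i is floor(i / 2), computed here as (i - i % 2) / 2.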
local function heapInsert(heap, element)
local i = #heap + 1
local p = (i - i % 2) / 2
while p > 0 do
if element <= heap[p] then
heap[i] = heap[p]
i = p
p = (i - i % 2) / 2
else
break
end
end
heap[i] = element -- if i == 1, this will initialize the heap.
end
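-- popMin: removes and returns the root (the minimum), then sifts the former last element down to restore the heap property.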
local function popMin(heap)
local i = 1
local c = 2 * i
local min = heap[1]
local element = heap[#heap]
heap[#heap] = nil
if #heap == 0 then
return min
end
while c < #heap do
if heap[c + 1] < heap[c] then
c = c + 1
end
if element <= heap[c] then
heap[i] = element
return min
else
heap[i] = heap[c]
i = c
c = 2 * i
end
end
if c == #heap and element > heap[c] then
heap[i] = heap[c]
i = c
end
heap[i] = element
return min
end
local scale = {[0] = '', 'milli', 'micro', 'nano', 'pico', 'femto', 'atto', 'zepto', 'yocto'}
local function toMetric(v, t)
local i = 0
while v < 1 do
v = v * 1000
i = i + 1
end
return ('%.3f %s%s'):format(v, scale[i], t)
end
local function avgTime(f, n, ...)
local runs = {}
wait()
for i = 1, n do
local start = tick()
f(...)
local finish = tick()
runs[i] = finish - start
end
wait()
local avg = 0
for i, v in ipairs(runs) do
avg = avg + v
end
avg = avg / n
local std = 0
for i, v in ipairs(runs) do
std = std + (v - avg) ^ 2
end
std = math.sqrt(std / (n - 1))
return toMetric(avg, 'seconds'), toMetric(std, 'seconds')
end
local table_insert = table.insert
local function verify(num_el, get_first_k, insert, pop)
local a = {}
for i = 1, num_el do
insert(a, math.random())
end
local last = 0
for i = 1, get_first_k do
local new = pop(a, last)
print(new)
assert(new >= last)
last = new
end
end
local function test(num_el, get_first_k, insert, pop)
local a = {}
for i = 1, num_el do
insert(a, math.random())
end
local last = 0
for i = 1, get_first_k do
last = pop(a, last)
end
end
The code to verify the implementations is at the bottom. Here is how to run it in studio:
wait(3)
local num_runs = 100
local num_elements = 1
local num_to_get = 1
assert(num_to_get <= num_elements)
print(('Num Runs: %d\nNum Elements: %d\nNum to Get: %d'):format(
num_runs, num_elements, num_to_get))
print(('Table: %s, std = %s'):format(
avgTime(test, num_runs, num_elements, num_to_get, table.insert, getMin)))
print(('Heap: %s, std = %s'):format(
avgTime(test, num_runs, num_elements, num_to_get, heapInsert, popMin)))
Inserting a single element, the array obviously beats the heap by a little:
Num Runs: 1000
Num Elements: 1
Num to Get: 0
Table: 751.257 nanoseconds, std = 51.777 nanoseconds
Heap: 778.913 nanoseconds, std = 29.530 nanoseconds
Interestingly, even inserting and then getting a single element, my heap outperforms the array. This pattern of the heap outperforming the array is consistent when the number of inserted elements is equal to the number of elements retrieved.
Num Runs: 1000
Num Elements: 1
Num to Get: 1
Table: 890.255 nanoseconds, std = 41.224 nanoseconds
Heap: 858.784 nanoseconds, std = 40.408 nanoseconds
Performing 7 insertions then 6 pops allows the heap to outperform the array.
Num Runs: 1000
Num Elements: 7
Num to Get: 6
Table: 4.586 microseconds, std = 86.836 nanoseconds
Heap: 4.372 microseconds, std = 98.535 nanoseconds
This shows that when the number of retrievals is close to the number of insertions, heaps outperform arrays even at small sizes.
As the number of elements increases, though, the array’s performance degrades quickly. Take for example 1000 insertions and just 6 get-minimum operations:
Num Runs: 1000
Num Elements: 1000
Num to Get: 6
Table: 297.271 microseconds, std = 842.197 nanoseconds
Heap: 280.652 microseconds, std = 862.436 nanoseconds
However, if we perform 12 get-minimum operations:
Num Runs: 1000
Num Elements: 1000
Num to Get: 12
Table: 462.817 microseconds, std = 1.129 microseconds
Heap: 286.938 microseconds, std = 831.686 nanoseconds
The general pattern is that the heap outperforms the array when the number of elements removed nears the number of elements inserted. Insertion into an array is O(1), while insertion into a heap is O(log(n)). However, removing the minimum from an array is O(n), where a heap is O(log(n)). As the number of elements increases, the heap becomes more attractive, and fewer removals per insertion are needed before the heap wins out.
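To make that concrete (ignoring constant factors, which the timings above show matter a great deal in Lua): with i insertions and r removals of the minimum, the flat array costs on the order of i + r·n operations, while the heap costs on the order of (i + r)·log(n), so the heap pulls ahead once r·n outgrows (i + r)·log(n).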
Since decrease-key operations are very commonly used with binary heaps, they are worth considering. However, since decrease-key has the same asymptotic runtime as insertion for both structures (given that the position of the value is already known), we can replace ‘insertion’ with it in the previous paragraph and all the statements still hold. In other words, the ratio of removals to inserts and decrease-key operations determines whether a heap is more efficient, with the required ratio becoming lower as the number of elements increases (and the number of elements increases with each insert).
As shown in the trivial cases, heaps are not vastly outperformed by arrays and are still viable in Roblox Lua.
Edit: I’ve corrected the results with proper standard deviation, more tests, and a corrected version of the heap.
The particular case is for pathfinding. I was unable to get more perf out of my A* implementation using a heap for the open list than just using a flat array the last time I attempted it. This was for a case where my open list peaked at about 100 elements.
Also it’s possible that Luau has changed things. Did you have the new VM enabled in these tests?
No, this is the old VM; I unfortunately haven’t tried out the new one yet. The last time I ran A* was on up to a 60 by 60 hexagonal maze. At the time I did a lot of digging into the performance, since I made it to demonstrate how using information available to scripts lets developers write pathfinders that vastly outperform the built-in pathfinder. I’ve posted the video on YouTube now for ease of viewing:
The original post is here and includes the place file with the pathfinder:
It averaged around 1000 open nodes, and in my experience a heap yielded better performance. I’d have to double check whether this was the case for the smaller grid sizes as well; I was more worried about large grids. I’ll pop it open next time I’m in Windows; I’m currently running a process I can’t stop for a while.
I assume the reason the heap performs better is that the number of nodes removed is nearly equal to the number of nodes inserted. For every grid size tested in the video, the number of nodes removed from the queue is 20 to 50 less than the number of nodes inserted. The number of nodes visited ranged from 206 to 2347, with the average number of open nodes during insertion ranging from 139 to 1300.
BTW,
I noticed in the test code above that the standard deviation is calculated incorrectly. What you see is roughly the standard deviation squared, so the actual deviation is greater. I’ll also rerun those tests when I get into Windows, unless someone else beats me to it and wants to post their results.
Nice results. That’s pretty conclusive, it looks like I might just be wrong on this.
So I did some digging into the A* implementation I posted above. I realized that I had actually rolled an ordered linked list for the priority queue, so I did some testing with an unordered array, an ordered linked list, and a min heap.
I realized a couple of things while making this. First, my previous implementation of the ordered linked list never removed a node when it was visited; it simply moved on to the next node. This is normally fine, but it artificially inflated the average number of nodes in the fringe during insertion. I resolved this issue for these tests.
Second, this test on a maze doesn’t reflect many A* use cases. Since the maze is “perfect” (it contains no loops), a node is never reached from different directions, and thus its cost is never decreased once it is inserted into the fringe. So these results, although impressive, are still not enough for me to feel completely confident saying that heaps always outperform arrays in A*. It may be that in your case there were many decrease-key operations, which are free in an unordered array but cost logarithmic time in a min heap.
Third, this is a hexagonal map instead of a rectangular one. I’m not entirely sure whether this increases the average number of inserts per node or not (since it still can’t create loops), but it may be a factor.
Also, I discovered a bug in the heap code that may have affected the above tests. I’ll rerun them with the fixed heap and fixed standard deviation, and then update that post.
The following results are on the 20x20 and 60x60 maps seen in the video above and 100x100 / 200x200 maps to see how the algorithms scale. Here is a picture of a 200x200 maze:
Interestingly, the heap was able to perform decently on this large maze with only ~0.07 seconds to find a path from one corner to another on my machine.
My Machine
Roblox Studio – old VM, not in beta
The tests:
Ordered Linked List
Here is the Linked List code:
Code
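-- Note: this is an excerpt; `fringe` (the head of the list), `len`, `c`, `i`, and `t` (time accumulated in queue operations) are defined in the surrounding benchmark script (not shown). The ordered list presumably ends in a sentinel node whose value is math.huge so that the insert loop always terminates.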
local function remove(key)
local s = tick()
len = len - 1
key.next.prev = key.prev
key.prev.next = key.next
t = t + tick() - s
end
local function getMin()
local s = tick()
local key
key = fringe.key
remove(fringe)
fringe = fringe.next
t = t + tick() - s
return key
end
local function insert(key)
c = c + len
i = i + 1
len = len + 1
local s = tick()
local cur = fringe
while key.value > cur.value do
cur = cur.next
end
key.prev = cur.prev
key.next = cur
cur.prev = key
key.prev.next = key
if cur == fringe then
fringe = key
end
t = t + tick() - s
end
local function decrease(key)
remove(key)
insert(key)
end
And here are the results:
20x20 results
Num Nodes: 400
Run time (seconds): 0.00085234642028809
Fraction of time spent in queue ops: 0.10769230769231
Num inserts: 205
Avg nodes in queue during insert: 6.2585365853659
Num nodes visited: 197
60x60 results
Num Nodes: 3600
Run time (seconds): 0.0082039833068848
Fraction of time spent in queue ops: 0.24739901191514
Num inserts: 2634
Avg nodes in queue during insert: 22.60782080486
Num nodes visited: 2618
100x100 results
Num Nodes: 10000
Run time (seconds): 0.024052381515503
Fraction of time spent in queue ops: 0.25522635131786
Num inserts: 7797
Avg nodes in queue during insert: 25.584840323201
Num nodes visited: 7771
200x200 results
Num Nodes: 40000
Run time (seconds): 0.14127492904663
Fraction of time spent in queue ops: 0.26282001518859
Num inserts: 31729
Avg nodes in queue during insert: 47.935894607457
Num nodes visited: 31657
Unordered Array
Here is the Unordered Array code:
Code
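-- Note: as above, `fringe`, `len`, `c`, `i`, and `t` are defined in the surrounding benchmark script (not shown). remove() uses the swap-with-last trick: the removed slot is overwritten with the final element so no shifting is needed, and each node tracks its own position in the array through its `index` field.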
local function remove(key)
local s = tick()
len = len - 1
local last = fringe[#fringe]
fringe[key.index] = last
fringe[last.index] = nil
last.index = key.index
key.index = nil
t = t + tick() - s
end
local function getMin()
local s = tick()
if #fringe == 0 then
return
end
local min = fringe[1]
local ind = 2
while ind <= #fringe do
if fringe[ind].value < min.value then
min = fringe[ind]
end
ind = ind + 1
end
remove(min)
t = t + tick() - s
return min
end
local function insert(key)
c = c + len
i = i + 1
len = len + 1
local s = tick()
local ind = #fringe + 1
fringe[ind] = key
key.index = ind
t = t + tick() - s
end
local function decrease(key)
-- This operation is free.
end
And the Unordered Array results:
20x20 results
Num Nodes: 400
Run time (seconds): 0.0010368824005127
Fraction of time spent in queue ops: 0.39572315474822
Num inserts: 283
Avg nodes in queue during insert: 6.3462897526502
Num nodes visited: 280
60x60 results
Num Nodes: 3600
Run time (seconds): 0.010080814361572
Fraction of time spent in queue ops: 0.46859183576936
Num inserts: 2383
Avg nodes in queue during insert: 18.085606378514
Num nodes visited: 2356
100x100 results
Num Nodes: 10000
Run time (seconds): 0.047679424285889
Fraction of time spent in queue ops: 0.57080137212349
Num inserts: 9716
Avg nodes in queue during insert: 29.694112803623
Num nodes visited: 9703
200x200 results
Num Nodes: 40000
Run time (seconds): 0.27895498275757
Fraction of time spent in queue ops: 0.53525318327348
Num inserts: 36671
Avg nodes in queue during insert: 44.058956668757
Num nodes visited: 36628
Minimum Heap
Here is the Minimum Heap code:
Code
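-- Note: as above, `fringe`, `len`, `c`, `i`, and `t` are defined in the surrounding benchmark script (not shown). Each node stores its current heap position in its `index` field, which is what lets decrease() sift a node up from wherever it currently sits.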
local function getMin()
local s = tick()
len = len - 1
local cur = 1
local child = 2 * cur
local min = fringe[1]
local element = fringe[#fringe]
fringe[#fringe] = nil
if #fringe == 0 then
t = t + tick() - s
return min
end
while child < #fringe do
if fringe[child + 1].value < fringe[child].value then
child = child + 1
end
if element.value <= fringe[child].value then
fringe[cur] = element
element.index = cur
t = t + tick() - s
return min
else
fringe[cur] = fringe[child]
fringe[cur].index = cur
cur = child
child = 2 * cur
end
end
if child == #fringe and element.value > fringe[child].value then
fringe[cur] = fringe[child]
fringe[cur].index = cur
cur = child
end
fringe[cur] = element
element.index = cur
t = t + tick() - s
return min
end
local function insert(element)
c = c + len
i = i + 1
len = len + 1
local s = tick()
local cur = #fringe + 1
local p = (cur - cur % 2) / 2
while p > 0 do
if element.value <= fringe[p].value then
fringe[cur] = fringe[p]
fringe[cur].index = cur
cur = p
p = (cur - cur % 2) / 2
else
fringe[cur] = element
element.index = cur
t = t + tick() - s
return
end
end
fringe[cur] = element
element.index = cur
t = t + tick() - s
end
local function decrease(element)
local s = tick()
local cur = element.index
local p = (cur - cur % 2) / 2
while p > 0 do
if element.value <= fringe[p].value then
fringe[cur] = fringe[p]
fringe[cur].index = cur
cur = p
p = (cur - cur % 2) / 2
else
fringe[cur] = element
element.index = cur
t = t + tick() - s
return
end
end
t = t + tick() - s
end
And here are the Minimum Heap results:
20x20 results
Num Nodes: 400
Run time (seconds): 0.00061559677124023
Fraction of time spent in queue ops: 0.27498063516654
Num inserts: 161
Avg nodes in queue during insert: 7.2360248447205
Num nodes visited: 144
60x60 results
Num Nodes: 3600
Run time (seconds): 0.01081109046936
Fraction of time spent in queue ops: 0.36372257139707
Num inserts: 3012
Avg nodes in queue during insert: 27.928618857902
Num nodes visited: 2982
100x100 results
Num Nodes: 10000
Run time (seconds): 0.01996636390686
Fraction of time spent in queue ops: 0.33850379127112
Num inserts: 5525
Avg nodes in queue during insert: 26.226787330317
Num nodes visited: 5483
200x200 results
Num Nodes: 40000
Run time (seconds): 0.077650547027588
Fraction of time spent in queue ops: 0.32077435598268
Num inserts: 18392
Avg nodes in queue during insert: 35.562200956938
Num nodes visited: 18361
To answer the original question, different data structures are faster in different cases. For smaller fringes, an ordered linked list is faster as the A* priority queue than either an array or a heap. For large fringes it seems that heaps are the best option. If there are many ways to reach a node (such as in a very open map with a grid representation), then an array may be more performant (yet to be tested).
Anyways, I’m losing interest in this problem, so I probably won’t get around to testing on an open map (the maze one was quickly available to me). If anyone else would like to test it, I’d be appreciative.
(ffs, it is difficult to earn a “solution” from the topic author :P)