AnirioTable - comprehensive & optimized array

AnirioTable

The source & test codes can be found in AnirioUtillity - GitHub.
The module is also on Wally and Creator Store.

Brief Introduction to Tables

As game developers, we often use tables for storing complex data.
There are two types of tables, arrays and dictionaries, and both are useful in different situations.

array
dictionary
both

Array and Set of Values

While most of the time, we can store data using just one type of table, sometimes, we might want to utilize both.

Let’s say that we’re creating the game Wordle.

A key part of the game is a word list, which we use to achieve two goals:

  1. generate a random secret word from word list
  2. validate that the player’s input is a valid word from word list
To achieve 1)

We should use an array word list.
We can generate a random word by calling wordList_array[random(1, #wordList_array)] where random()can be either math.random() or Random.NextInteger().

To achieve 2)

We should use a dictionary word list.
Given inputWord from the player, wordList_dict[inputWord] will tell us if the player’s input is a valid word. This is equivalent to the Set.contains() method.

Note:
While an array is fine, we would need to use table.find(), and that is a less efficient approach (see the comparison table).

In short, we have to do both array operation and dictionary indexing with this wordList table.
To accomplish this, we can create a table of the other type by converting array values to a dictionary set (or the reverse). Specifically, the dictionary is a value-boolean hash map representing if a value “exists” in the array.

This leaves us with two tables: an array for getting a random value and a dictionary set for checking if a value exists.

Insert and Remove

So far, we have two tables representing the same data: an array of values and a dictionary of value-boolean mappings.

Now, let’s say that we want to perform basic table modifications like table.insert() and table.remove() to add or remove words from the word list.
In our case, a new word doesn’t need to be inserted if it’s already in the word list.

Inserting a word

Given a newWord to insert, we first validate if the word is not already in our wordList.
After inserting newWord, we also need to update the dictionary table by mapping newWord to true.

-- Inserting a word
if not wordList_dict[newWord] then
    table.insert(wordList_array, newWord)
    wordList_dict[newWord] = true
end
Removing a word

Given a word to remove, we first check if the word is already in our wordList.
We can then remove word from the array table by calling table.remove() with the position from table.find().

Note:
This isn’t an efficient operation to call; more on this in the next section.

Afterwards, we can maintain the state of the dictionary table by mapping word to nil.

-- Removing a word
if wordList_dict[word] then
    table.remove(wordList_array, table.find(wordList_array, word))
    wordList_dict[word] = nil
end

As you might have noticed, when inserting and removing values, we’ll do some operations on both the array and the dictionary table.
Once these operations are performed, we can then proceed to generate a random word and validate a player’s input word without problem.

A New Data Structure

In our Wordle, we’ve been accessing, inserting, and updating our wordList by using both the array and the dictionary table.

Why not create a new object that internally stores an array and a dictionary table, and when tasked with a table operation, makes the necessary changes to both?

That’s why I made AnirioTable, a data structure that internally stores an array and a dictionary table and provides methods for inserting values and removing values.

Time complexity & optimization

Let’s talk about the time complexity of the operations we implemented in our Wordle:

  • Getting a random value is essentially constant time given we have an array table for our data (though the length operator # may be logarithmic time).
  • Checking value existence is constant time since Luau’s dictionaries are hash maps, which have constant search time on average.
  • Insert value is constant time since both inserting an element to the end of an array and setting an element of a dictionary are constant time.
  • Removing value is linear time since both table.find() and table.remove() are linear time.
Optimizing remove

Removing value is the only operation in our data structure that’s inefficient.
Like arrays in other languages, there are several ways to make value removal efficient.

  • Swap the given value with the value at the end of array, then remove the last element.
    • This can be done in constant time if we keep track of the indexes of the values.
    • Since the value at the end of the array is moved to the middle, array order is modified.
  • Using a Linked List or another data structure to remove values.
    • To use array operations, we need to first convert the data structure to an array table.
    • While all four operations above can be constant time, the new conversion to array process will be linear time.
  • Lazy evaluation - store the values to be removed, and delay the removal operations until they need to be done.
    • We can store the values to be removed in a dictionary in constant time.
    • Evaluating the delayed remove operations is linear time.

Overall, lazy evaluation seems to be the most efficient way to remove values while maintaining the overall array order.

After using lazy evaluation to remove values, removing values is now constant time, and only the non-remove operation immediately after removing values is linear time.

Multiset dictionary

For completeness, the dictionary in AnirioTable is implemented as a multiset. Thus, duplicated values can be stored in the data structure.
However, you can also use AnirioTable with a normal set by storing only unique values by calling AnirioTable.insertUnique().

AnirioTable methods
  • new(): AnirioTable.TableType<T>
    • Create a new empty AnirioTable
  • setLazy(useLazy: boolean)
    • Modify whether the table delays evaluating removeValues()
  • insertUnique(values: {T})
    • Insert values from the given table into the stored table; already-stored values won’t be inserted
    • Full: insertUnique(values: {T}, validateTypeOf: string?, validateIsA: string?)
  • insertAll(values: {T})
    • Insert all values from the given table into the stored table
    • Full: insertAll(values: {T}, validateTypeOf: string?, validateIsA: string?)
  • removeValues(values: {T})
    • Remove values in the given table from the stored table
    • Full: removeValues(values: {T}, validateTypeOf: string?, validateIsA: string?)
  • getArray(): {T}
    • Access the stored values table
  • getValueCount(value: T): number
    • Access the count state of a value; equivalent to the count of the value in the stored table
  • contains(value: T): boolean
    • Check whether a value is found in the stored table
  • compareTableChanges(newTable: {T}): ({T}, {T})
    • Compare the stored table with the given table, returning the added and removed values as two tables
  • clear()
    • Clear the stored table
  • destroy()
    • Clean up the object

Bonus: can you think of other situations where we might use both the properties of array and dictionary?

Why efficiency?

From the explanations above, you might notice that the new data structure is only used for efficiency. That is true, but for a game with many objects to run smoothly at high fps, efficiency is important! The effect might seem small, but every efficiency made (without much tradeoff in memory use) is never something bad.

Things to keep in mind

  • When sending an AnirioTable remotely, keep in mind of argument limitations.
    • To achieve this successfully, we can send the content of AnirioTable.getArray() and reconstruct an AnirioTable from that.
    • Alternatively, we can remove metatables entirely and instead use utility functions as shown in sleitnick’s Fixing Lua OOP video.