What is the difference between these two Dictionaries?

I am pretty new to this, so apologies if my terminology is not accurate or if this question is just plain stupid; I’m trying to learn as I go.

I have been working on making a system that spawns objects (trees, rocks, bushes, etc) with randomized settings.

I figured out I can organize my settings in two different ways.

The first way is just storing all settings for each object in a single dictionary. This was hard to read so I added a comment line that helps me know where to reference. (for the sake of this example, I am only using one object and many of the values are simply placeholders)

objects = {
	--  ["Object"] = {[1]name,[2]type,[3]model,[4]size,[5]rotation,[6]HMin,[7]HMax,[8]AMin,[9]AMax,[10]ColorA1,[11]ColorA2,[12]ColorA3,[13]ColorA4,[14]ColorB1,[15]ColorB2}
	["PineTree"] = {"Pine Tree","Tree",objj.Tree_Pine,"currently unused",math.random(0,360),0,0,0,0,"Earth green","Sea green","Dark green","Parsley green","Brown1","Brown2"},
}

The other way is to put each individual setting ‘one level deeper’ (not sure terminology here)

objects = {
	["PineTree"] = {
		["Name"] = "Pine Tree",
		["Type"] = "Tree",
		["Model"] = objj.Tree_Pine,
		["Size"] = "currently unused",
		["Rotation"] = math.random(0,360),
		["HMin"] = 0,
		["HMax"] = 0,
		["AMin"] = 0,
		["AMax"] = 0,
		["ColorsA"] = {"Earth green", "Sea green", "Dark green", "Parsley green"},
		["ColorsB"] = {"Brown1", "Brown2"},
	},
}

which is way easier to read/edit. I even have all the random colors options yet another level deep for even easier organization.
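To make the comparison concrete, here is a rough sketch of how a lookup reads under each layout. The `treeModel` string is a stand-in for `objj.Tree_Pine` so the snippet runs outside Roblox, and the names `objectsFlat` and `objectsNamed` are made up for this example.

```lua
-- Placeholder for objj.Tree_Pine (assumption, so this runs in plain Lua).
local treeModel = "Tree_Pine model placeholder"

-- Option 1: positional array; the reader must remember what each index means.
local objectsFlat = {
	["PineTree"] = {"Pine Tree", "Tree", treeModel, "currently unused", 0, 0, 0, 0, 0,
		"Earth green", "Sea green", "Dark green", "Parsley green", "Brown1", "Brown2"},
}

-- Option 2: named keys, with the color lists nested one level deeper.
local objectsNamed = {
	["PineTree"] = {
		["Name"] = "Pine Tree",
		["Type"] = "Tree",
		["Model"] = treeModel,
		["ColorsA"] = {"Earth green", "Sea green", "Dark green", "Parsley green"},
		["ColorsB"] = {"Brown1", "Brown2"},
	},
}

-- Same data, two ways of reaching it.
local flatType = objectsFlat.PineTree[2]                   -- index 2 is "type"
local flatFirstColorA = objectsFlat.PineTree[10]           -- index 10 is "ColorA1"
local namedType = objectsNamed.PineTree.Type
local namedFirstColorA = objectsNamed.PineTree.ColorsA[1]
```

Both layouts return the same values; the difference is whether the meaning lives in the key or in a comment.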

So, aside from the 2nd option being easier to read/edit as a developer, is there much of a difference between them? If so, what? Is there any reason to organize like option 1? Does the 1st option access values any faster or something?.. I dunno…

Any discussion is appreciated!

Edit: I would like to add, if it is relevant, that these settings are not going to be changed once they have been ‘perfected’ so the visibility/readability factor is not that important in the long run.
Also, there could be tens-of-thousands of objects that all spawn on server start, so if option 1 is indeed faster, I would rather have performance over human-readability in the long run.

I would use the second option for sure, as it’s easier to read and much more organisable.
I’m not really into experimenting how fast would it get accessed, but I don’t think there will be any big difference!

Believe it or not, your post intrigued me enough that I went and read a Stack Overflow article on this. It is about Python, but there string keys do seem to be slightly faster than their numeric counterparts; as the article mentions, Python dictionaries are optimized for string keys. I don’t know how true this is for Lua, and the article also says the difference in lookup speed is very small. Intriguing post, though.

Here is the link to the Stack Overflow article: python - Is it always faster to use string as key in a dict? - Stack Overflow.

This is interesting. It would mean that option 2 not only looks better to my brain, but could also perform better, which would make option 1 just plain wrong.

I prefer the latter for one simple reason: ease.

Let’s say we have a table that tells us all about a chocolate chip cookie.

local ChocolateChipCookies = {
    Name = "Chocolate Chip Cookies";
    Type = "Baked";
    Ingredients = {"Sugar", "Vanilla Extract", "Butter", "Baking Soda"};
    Quality = "Fresh";
}

Now, if I wanted to add an ingredient, I can just do:

table.insert(ChocolateChipCookies.Ingredients, "Salt")

Okay, now let’s see the other option:

local ChocolateChipCookies = {"Chocolate Chip Cookies", "Baked", "Sugar", "Vanilla Extract", "Butter", "Baking Soda", "Fresh"}

Well now, it’s much more difficult. Where am I even going to insert the value? Even if I were to manually keep track of what’s what, you’re going to end up doing something arguably unnecessary.

Even if I added the ingredients in an array, I’d still have to keep track of which index the array is located at. If I were ever to add more things into the table, like maybe an expiry date, I might end up losing track of what each index refers to.

Though it looks more compact, it’s much trickier to handle data in this way. This is why I would personally choose dictionaries as I can always reference data if I know the key associated with it.

I also understand that you say that data will not be changed. Even so, if you ever did change your data (manually), you would have to restate variables, and it’s just more work on your end. You never know :man_shrugging:

Hello, I know that this topic is rather dated, however, I was particularly drawn to it as it concerns the implementation of tables in Lua.

First and foremost, Roblox uses a ‘modified’ version of Lua 5.1.5. So we will have to briefly look at the implementation of tables for this particular version.

With this in mind, we must first understand that a table is a very, very special data structure in Lua. It’s made up of two parts: a ‘hash’ part (I’ll explain this a little more in a bit) and an ‘array’ part.

The array part stores entries with integer keys BUT the keys are implicit (i.e. there is no need to physically put the keys somewhere in some piece of memory as the data is already sequentially stored).

For the ‘hash’ part, think of another separate array now but each value in the array stores both the key and the value together as a pair instead of just storing the values.

test_table = { ["key1"] = value1, ["key2"] = value2, ["key3"] = value3, value4, value5, value6 }

--[[ Generates two arrays:
hash_part_array = { {"key1", value1}, {"key3", value3}, {"key2", value2} } (the exact order of this hashed array is dependent on the hashing algorithms but for this example I've made it arbitrary)
array_part = { value4, value5, value6 } ]]--
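The split can't be inspected directly from a script, but its effect shows up in how the standard iteration functions behave. A small sketch, using made-up string values in place of `value1` through `value6`:

```lua
-- Three string-keyed entries and three positional entries in one constructor.
local t = { key1 = "a", key2 = "b", key3 = "c", "x", "y", "z" }

-- The length operator and ipairs only see the sequential integer keys 1..n.
local arrayCount = #t
local arrayValues = {}
for i, v in ipairs(t) do
	arrayValues[i] = v
end

-- pairs visits every entry, string-keyed and positional alike,
-- in an order that is not guaranteed for the string keys.
local totalCount = 0
for _ in pairs(t) do
	totalCount = totalCount + 1
end

print(arrayCount, totalCount)
```

Here `arrayCount` is 3 and `totalCount` is 6: the positional values behave like a plain array, while the string-keyed entries are only reachable by key or through `pairs`.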

With regard to hashing itself, you can think of it as something akin to converting any piece of data (e.g. tables, strings, etc.) to an integer by applying some function to it. This hashed integer is then put through a second function that depends on the ‘size’ of the ‘hash’ part to generate a final integer. This final integer is then used to index the key-value pair in the hash part array.
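As a rough illustration of those two steps, here is a toy version. This is not Lua's actual hashing algorithm; `toyHash` and `slotFor` are hypothetical names, and the multiply-by-31 scheme is just a common textbook choice.

```lua
-- Step 1: turn a string into an integer (toy algorithm, NOT Lua's real one).
local function toyHash(s)
	local h = 0
	for i = 1, #s do
		-- Keep the running value below 2^32 so it stays an exact integer.
		h = (h * 31 + string.byte(s, i)) % 4294967296
	end
	return h
end

-- Step 2: reduce that integer to a slot, based on the size of the hash part.
-- The +1 is only there because Lua arrays conventionally start at 1.
local function slotFor(key, size)
	return toyHash(key) % size + 1
end

local hashPartSize = 4
for _, key in ipairs({"Name", "Type", "Model"}) do
	print(key, toyHash(key), slotFor(key, hashPartSize))
end
```

Two different keys can land in the same slot, which is why a real implementation also has to compare keys after finding the slot.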

Coming back to the question of which is faster for looking up a value with a specific key/index (“does the 1st option access values any faster”): option 1 certainly is faster, as Lua simply needs to look up the position in the ‘array’ part according to the index provided and return the value. For option 2, Lua has to go through the key’s hash to find a slot in the ‘hash’ part and then perform equality or identity checks on the keys it finds there. (Note: this is for lookups; inserting values into tables is a different ball game.)
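If you want to see how large the gap is in practice, a quick micro-benchmark along these lines is one way to check. It is only a sketch: `timeIt`, `arr`, and `dict` are names invented here, the iteration count is arbitrary, and the numbers will vary with the runtime (Roblox's modified Lua may optimize string-key access differently from stock Lua 5.1).

```lua
local N = 1000000

local arr = {"Pine Tree", "Tree"}                  -- option 1 style
local dict = {Name = "Pine Tree", Type = "Tree"}   -- option 2 style

-- Runs fn once and returns the CPU time it took, in seconds.
local function timeIt(fn)
	local start = os.clock()
	fn()
	return os.clock() - start
end

local arrayHits, dictHits = 0, 0

local arrayTime = timeIt(function()
	for _ = 1, N do
		if arr[2] == "Tree" then arrayHits = arrayHits + 1 end
	end
end)

local dictTime = timeIt(function()
	for _ = 1, N do
		if dict.Type == "Tree" then dictHits = dictHits + 1 end
	end
end)

print(string.format("array index: %.4fs, string key: %.4fs", arrayTime, dictTime))
```

For settings that are read once per spawned object, either lookup is likely to be a tiny fraction of the cost of actually creating the object, so measure before restructuring anything.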

But, as mentioned in the other replies, I’d firmly recommend using dictionaries for that particular situation, where readability comes into play. You never know when you will be debugging or wanting to modify the structure of the data, and string-literal keys provide that much more readability to help you there. Furthermore, optimising something for the sake of optimising can be quite an inefficient use of your time as the programmer.

As aptly put by Donald Knuth, “premature optimization is the root of all evil (or at least most of it) in programming.”

The link below might provide some information as to how hashing works:

Lua 5.1 Internals: Tables I

Edit (These two links might be of some use too):

Lua Performance Tips
The Implementation of Lua 5.0

Actually Lua 5.1.4, but the only differences you’ll see between those versions are bug fixes.

This is such a valuable reply. Thank you!
