Does key length affect dictionary performance?

A while ago, I tried to check and see what can cause a dictionary to slow down. I’ve compared dictionaries with a number as a key against another one with a lengthy string as a key, and found that the results were inconsistent. My code might’ve had a flaw in it, and I don’t believe I understand what I can do differently to get valid results.

If anyone has any experience with optimizing dictionaries, what would you recommend doing when creating and using dictionaries?

  1. How does the key affect memory usage?
  2. How does the key affect performance?
  3. If keys matter, what is the fastest key to use? (number, instance, etc.)
  4. If keys matter, are the differences negligible enough when compared to the readability and convenience of keys that make sense?

I’m aware that having a readable code base is a good practice, but sometimes when you have large amounts of data that stays in memory, a tiny optimization can go a long way.

2 Likes

In this case it will be not a tiny optimization, it will be extremely insignificant, molecular amount of optimization.

Yes, the longer the dictionary, more time it requires to for example go through all of its keys in “for pairs”.
But we are talking here about maybe 0.0000000001 seconds difference. So if you are looking to optimize your code, you need some other tactics instead of making dictionaries shorter or using different keys.

Usually tho anywhere (be it dictionaries or otherwise), strings use more memory than integers.

2 Likes
Running Lua 5.5.3 in WSL I get these results:
local n = 2 * 10e7
local t0 = os.clock()
local o = {}
local t = {}
for i = 1, n do
        --t[1] = i --1.42s
        t[o] = i --2.66s
        --t.v = i --1.42s
        --t.testtestestestestsetestestsetsetest = i --1.42s
end
print(os.clock() - t0)
And in Studio I get these results:
local n = 2 * 10e7
local t0 = tick()
local o = {}
local t = {}
for i = 1, n do
	--t[1] = i --1.31s
	--t[o] = i --2.43s
	--t.v = i --1.31s
	--t.testtestestestestsetestestsetsetest = i --1.31s
end
print(tick() - t0)

Roblox’s version seems to be slightly faster, but it tells the same story: using tables as dictionary keys takes almost twice as long compared to integers and strings. String length doesn’t seem to matter, but that might be different if it wasn’t the same string being reused over and over again: if Lua keeps around the hash of a string to reuse, then that could be the explanation. It only actually gets hashed once. I don’t know for sure that’s what’s going on, and I can’t think of a way of testing it. Here’s an approach that definitely doesn’t work:

t[tostring(i)] = i -- > 200s, > 8 GB

The overhead of converting a number into a string takes over completely. I didn’t finish after several minutes on my PC, probably because it uses way too much memory.

The test was unsuccessful, but it does teach us something important: in a situation where you’re using lots of different strings as dictionary keys, you’re probably doing something wrong in the first place. Look for a solution that doesn’t involve creating 1000s of strings.

More optimization stuff is covered here: https://www.lua.org/gems/sample.pdf, although it’s quite old and Roblox’s version of Lua / Luau may have some important differences. As always, make sure you test the different approaches to actually know which one is faster, because theory will only help you so much.

1 Like