Collision-resistant IDs (CUIDs) are an alternative to GUIDs and UUIDs. With a good implementation, they can be a very solid and sometimes better choice than GUIDs.
Here’s what a 24-character CUID looks like:
V0d4qRKRlxcFH9yvR20CgKyS
A CUID is the better option wherever relative data size matters. Networking and data persistence are the areas where I intend to use a CUID over a GUID. This is because a CUID is shorter and is already in a compressed state, making it a smaller and faster option in these areas.
My particular implementation is optimized for speed, although it is still reasonably collision resistant. In my benchmarking I found that my CUID implementation was actually a little faster than GUID generation (although the speed difference is probably negligible, since both are extremely fast).
Why should you use this?
It’s probably not going to be the most valuable thing in the world. But if you find that you’re sending a lot of GUIDs across the network or saving a lot of GUIDs, this might benefit you in a very small way. If you find that you’re compressing GUIDs to send them across the network and then decompressing them on the other side, then CUIDs will save you some processing time there for sure. A compressed GUID is typically 24 characters; the default size in this CUID implementation is also 24.
The default size I’ve used in my CUID implementation is 24, which seems to be the standard for CUIDs. I tested with sizes as low as 12 and saw no duplicates after generating and checking 5,000,000 CUIDs. It’s important to note that shorter CUIDs are more likely to collide, so use lower values with caution.
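For a rough sense of how length and alphabet size trade off against collision risk, the birthday approximation can be sketched as below. This assumes every character is drawn uniformly at random, which is not how this module builds its IDs (it mixes a timestamp, a salt from a pre-generated pool, and a counter), so treat the result as a ballpark figure only. The function name `collisionProbability` is made up for this example.

```lua
-- Rough birthday-bound estimate for purely random IDs.
-- Assumption: every character is uniformly random, which is NOT how the
-- CUID module builds its IDs, so this is only a ballpark figure.
local function collisionProbability(count, idLength, base)
	local space = base ^ idLength -- number of possible IDs
	local x = (count * (count - 1)) / (2 * space)
	-- For tiny x, 1 - exp(-x) is approximately x; returning x directly
	-- avoids the result rounding to 0 in floating point.
	if x < 1e-6 then
		return x
	end
	return 1 - math.exp(-x)
end

print(collisionProbability(5000000, 12, 62))
print(collisionProbability(5000000, 24, 62))
```

Halving the length squares away most of the ID space, which is why the estimate climbs quickly as the length shrinks.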
I made this mostly for fun, and to help with network traffic in my games. Nothing fancy, or particularly good. Just thought I’d share.
Here’s the Module, if you’re interested.
CUID.luau
--!native
--!strict

export type CUIDString = string

local CUID = {}

local default_alphabet = '0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ' -- Base 62
local length = 24
local salt_length = 12

local alphabet = {}
local base = 0
local log_num_base = 0
local base_sub_one = 0

local function initialize_alphabet(alphabet_string: string)
	alphabet = table.create(#alphabet_string)
	-- The alphabet is stored zero-based so a digit value indexes it directly.
	for i = 1, #alphabet_string do
		alphabet[i - 1] = alphabet_string:sub(i, i)
	end
	base = #alphabet_string
	log_num_base = math.log(base)
	base_sub_one = base - 1
end

local function randomString(length: number): string
	local result: { string } = table.create(length)
	for i: number = 1, length do
		result[i] = alphabet[math.random(0, base_sub_one)]
	end
	return table.concat(result)
end
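
local counter = 0
local lastTime = 0

local function getTimestamp(): number
	-- Floor to whole milliseconds so that IDs generated within the same
	-- millisecond compare equal here and bump the counter.
	local time = math.floor(os.clock() * 1000)
	counter = (time == lastTime and counter + 1 or 0)
	lastTime = time
	return time
end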

-- Digits are appended least significant first.
local function getBase(num: number): string
	local result: { string } = table.create((math.log(num) // log_num_base) + 1)
	repeat
		table.insert(result, alphabet[math.floor(num % base)])
		num = num // base
	until num == 0
	return table.concat(result)
end

-- Convert a number to a base string
local function toBase(num: number): string
	return num < base and alphabet[num] or getBase(num)
end

-- Simple 32-bit string hash. Nothing in this module calls it at the moment.
local function hash(str: string)
	local h = 0
	for i = 1, #str do
		local char = str:byte(i)
		h = bit32.band(bit32.bxor(bit32.lshift(h, 5), h) + char, 0xFFFFFFFF)
	end
	return h
end

local oid = ''
local salts: { string } = table.create(10000)
local salt_index = 1
local over_length = false

function CUID.new(): CUIDString
	local first_letter: string = alphabet[math.random(0, base_sub_one)]
	local timestamp = getTimestamp()
	local time_s = toBase(timestamp)
	local count = toBase(counter)
	local disp_oid = oid
	local salt = salts[(salt_index % #salts) + 1]
	salt_index += 1
	-- Occasionally, the length of the generated id might exceed the target length, so trim it.
	return (first_letter..time_s..salt..count..disp_oid):sub(1, length)
end

function CUID:SetAlphabet(alphabet_string: string)
	initialize_alphabet(alphabet_string)
	CUID:SetSaltLength(salt_length)
end

function CUID:SetLength(value: number)
	length = value
	-- Avoid making the oid string too long to help avoid the need to trim the CUID string to length.
	local size = value - (salt_length + 7)
	if (size > 0) then
		oid = toBase(math.round((tick() - os.clock()) / 3) * 3):sub(1, size)
		local extra_size = value - (salt_length + 7 + #oid)
		if (extra_size > 0) then
			oid = oid .. randomString(extra_size)
		end
	else
		oid = ''
	end
end

function CUID:SetSaltLength(value: number)
	-- Salt length needs to leave space for the timestamp and the random letter in the front.
	-- If the CUID length is relatively short, then the OID can be forfeit.
	local max_length = length - (#(toBase(getTimestamp())) + 2)
	local original_value = value
	value = math.min(max_length, value)
	if (value < original_value) then
		-- Avoid making the salt string too long to help avoid the need to trim the CUID string to length.
		warn(`Salt length exceeded maximum possible for CUID length. Maximum salt length: {max_length}.`)
	end
	salt_length = value
	for i = 1, 10000 do
		salts[i] = randomString(value)
	end
	salt_index = 1
	CUID:SetLength(length)
end

initialize_alphabet(default_alphabet)
CUID:SetSaltLength(salt_length)

-- The setters are defined with ':' above, so their types take `self` first.
export type CUID = {
	new: () -> CUIDString,
	SetAlphabet: (self: CUID, alphabet_string: string) -> (),
	SetLength: (self: CUID, length: number) -> (),
	SetSaltLength: (self: CUID, length: number) -> (),
}

return CUID
Usage Example
local CUID = require(path.to.CUID)
print(CUID.new())
Output:
V0d4qRKRlxcFH9yvR20CgKyS
Benchmarking:
I certainly didn’t do anything fancy for benchmarking on this. Just ran the CUID generator millions of times versus the GUID generator in the same number of runs. I found that my CUID generation was consistently taking about 0.000299ms to run (best out of 100,000 runs) where a GUID typically took 0.000399ms to generate. It’s tough to say how accurate these numbers are, as we’re down into the 10,000ths of a millisecond which are small enough that I’m simply unsure of their accuracy. But it seems to track with about how long it takes to generate 5,000,000 CUIDs (1.878 seconds) versus 5,000,000 GUIDs (2.834 seconds).
This doesn’t really get into the time savings of compressing (encoding) GUIDs for network or data persistence, because I’ve not written a compression/decompression algorithm.
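Purely as an illustration of what that extra encode/decode step might look like, here is a minimal sketch that packs the 32 hex digits of a GUID into 22 characters and back. It is not part of the CUID module and was not benchmarked; `packGuid`, `unpackGuid`, and the 64-character alphabet are all made up for this example. It packs 3 hex digits (12 bits) into 2 characters, so it needs no big-number math. Standard Base64 with padding would give 24 characters instead of 22.

```lua
-- Hypothetical sketch: pack a 32-hex-digit GUID into 22 URL-safe characters.
-- Not part of the CUID module; names and alphabet are assumptions.
local ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-_"

local function packGuid(guid)
	-- Strip hyphens/braces, then pad 32 hex digits to 33 so they split
	-- evenly into 11 groups of 3 hex digits (12 bits each).
	local hex = guid:gsub("[^%x]", "") .. "0"
	assert(#hex == 33, "expected 32 hex digits")
	local out = {}
	for i = 1, 33, 3 do
		local value = tonumber(hex:sub(i, i + 2), 16)
		local high = math.floor(value / 64) -- upper 6 bits
		local low = value % 64 -- lower 6 bits
		out[#out + 1] = ALPHABET:sub(high + 1, high + 1) .. ALPHABET:sub(low + 1, low + 1)
	end
	return table.concat(out)
end

local function unpackGuid(packed)
	local out = {}
	for i = 1, #packed, 2 do
		local high = ALPHABET:find(packed:sub(i, i), 1, true) - 1
		local low = ALPHABET:find(packed:sub(i + 1, i + 1), 1, true) - 1
		out[#out + 1] = string.format("%03x", high * 64 + low)
	end
	-- Drop the padding digit added by packGuid.
	return table.concat(out):sub(1, 32)
end

local packed = packGuid("3F2504E0-4F89-11D3-9A0C-0305E82C3301")
print(#packed, unpackGuid(packed))
```

Every GUID sent this way pays for one `packGuid` call and one `unpackGuid` call, which is the cost a CUID avoids by being short to begin with.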
It’s pretty crazy that Lua can actually do this much work in 3 ten-thousandths of a millisecond though!
Benchmark Example
--!strict
local CUID = require(script.Parent.CUID)

local speed_tests = 5000000
local collision_tests = 5000000

local c1 = os.clock()
for i = 1, speed_tests do
	CUID.new()
end
local c2 = os.clock()
print(`SPEED TEST: Generated {speed_tests} CUIDs in {c2 - c1} seconds, ~{((c2 - c1) / speed_tests) * 1000}ms per CUID`)
task.wait()

local HTTPService = game:GetService('HttpService')
local c1 = os.clock()
for i = 1, speed_tests do
	HTTPService:GenerateGUID(false)
end
local c2 = os.clock()
print(`SPEED TEST: Generated {speed_tests} GUIDs in {c2 - c1} seconds, ~{((c2 - c1) / speed_tests) * 1000}ms per GUID`)
task.wait()

-- Collision test: remember every generated CUID and count repeats.
local map = {}
local duplicates = 0
local c1 = os.clock()
for i = 1, collision_tests do
	local c = CUID.new()
	if (map[c]) then
		duplicates += 1
	else
		map[c] = 1
	end
end
local c2 = os.clock()
print(`COLLISION TEST: Generated {collision_tests} CUIDs in {c2 - c1} seconds and got {duplicates} duplicates`)
task.wait()

-- Single-call timing: keep the best (lowest) time over 100,000 calls.
local best = math.huge
for i = 1, 100000 do
	local c1 = os.clock()
	CUID.new()
	local c2 = os.clock()
	best = math.min(best, c2 - c1)
end
print(`SINGLE TEST: Generated CUID in {(best) * 1000} milliseconds`)

local best = math.huge
for i = 1, 100000 do
	local c1 = os.clock()
	HTTPService:GenerateGUID(false)
	local c2 = os.clock()
	best = math.min(best, c2 - c1)
end
print(`SINGLE TEST: Generated GUID in {(best) * 1000} milliseconds`)

for i = 1, 1 do
	print(CUID.new())
end
Output (on my PC):
SPEED TEST: Generated 5000000 CUIDs in 1.878216000040993 seconds, ~0.0003756432000081986ms per CUID
SPEED TEST: Generated 5000000 GUIDs in 2.834328299970366 seconds, ~0.0005668656599940732ms per GUID
COLLISION TEST: Generated 5000000 CUIDs in 4.934043899993412 seconds and got 0 duplicates
SINGLE TEST: Generated CUID in 0.0002998858690261841 milliseconds
SINGLE TEST: Generated GUID in 0.0003998866304755211 milliseconds
V0d4qRKRlxcFH9yvR20CgKyS
Final notes:
If you have any suggestions for this module, please feel free to share them.
By default I’m using a base-62 alphabet. This is customizable, should you feel like changing it up. Typical CUID implementations use base-36, which is likely more optimizable in a database. Keep in mind that a smaller alphabet would increase the chances of a collision. I also tested with base-36, performing 5,000,000 tests, and had 0 collisions.
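To make the alphabet trade-off a bit more concrete, the sketch below encodes the same number with a base-36 and a base-62 alphabet and compares the lengths. Each base-36 character carries less information, so for a fixed ID length there are fewer possible IDs. The `encode` helper is a stand-in written for this example, not the module's `toBase` (which emits the least significant digit first), and the timestamp is just an arbitrary sample value.

```lua
-- Illustration only: how alphabet size changes the encoded length of a number.
-- `encode` is a hypothetical helper, not the module's toBase.
local BASE62 = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"
local BASE36 = "0123456789abcdefghijklmnopqrstuvwxyz"

local function encode(num, alphabet)
	local base = #alphabet
	local digits = {}
	repeat
		local digit = num % base
		-- Insert at the front so the most significant digit comes first.
		table.insert(digits, 1, alphabet:sub(digit + 1, digit + 1))
		num = math.floor(num / base)
	until num == 0
	return table.concat(digits)
end

local timestamp = 1700000000000 -- arbitrary sample millisecond timestamp
print(#encode(timestamp, BASE36), #encode(timestamp, BASE62))
```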
If you’re curious, my relevant PC specs are:
- CPU: Intel i7-13700F
- RAM: 64 GB DDR5 5600 MHz