CUID Generator (Alternative to GUIDs)

Collision-resistant IDs (CUIDs) are an alternative to GUIDs and UUIDs. A good implementation can be a solid contender to GUIDs, and sometimes the better choice.

Here’s what a 24-character CUID looks like:
V0d4qRKRlxcFH9yvR20CgKyS

CUIDs are the better option where relative data size matters. Networking and data persistence are the areas where I intend to use a CUID over a GUID. This is because a CUID is shorter and already in a compact form, which makes it a smaller and faster option in these areas.
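As a rough illustration of the size argument, the sketch below compares how many bits each text form can carry. It assumes the canonical GUID string (32 hex digits plus 4 hyphens, 36 characters in total) and the default 24-character base-62 CUID. The helper name `capacity_bits` is made up for this example, and the figure is raw capacity only: part of a CUID is a timestamp, so not every bit is random.

```lua
-- Rough information capacity of an ID, in bits: length * log2(alphabet size).
local function capacity_bits(length, alphabet_size)
	return length * (math.log(alphabet_size) / math.log(2))
end

local cuid_bits = capacity_bits(24, 62) -- 24 base-62 characters
local guid_bits = capacity_bits(32, 16) -- 32 hex digits; the hyphens carry no data

print(string.format("24 base-62 characters: %.1f bits", cuid_bits))
print(string.format("32 hex digits:         %.1f bits", guid_bits))
```

So 24 base-62 characters have room for slightly more than the 128 bits of a GUID while using a third fewer characters than its 36-character string form.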

My particular implementation is optimized for speed, although it is still reasonably collision resistant. In my benchmarking I found that my CUID implementation was actually a little faster than GUID generation (although the difference is probably negligible, since both are extremely fast).

Why should you use this?

It’s probably not going to be the most valuable thing in the world. But if you find that you’re sending a lot of GUIDs across the network or saving a lot of GUIDs, this might benefit you in a small way. If you compress GUIDs before sending them across the network and decompress them on the other side, CUIDs will save you that processing time. A compressed GUID is typically 24 characters, and the default size of this CUID implementation is also 24.
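One reading of the 24-character figure, assuming the 16 raw bytes of a GUID are Base64-encoded with padding (the encoding is not specified above, so treat this as an assumption). The helper `base64_length` is a hypothetical name used only for this sketch:

```lua
-- Length of a padded Base64 string for a given number of raw bytes.
-- Every group of 3 bytes (or part of one) becomes 4 characters.
local function base64_length(byte_count)
	return math.ceil(byte_count / 3) * 4
end

local guid_bytes = 16 -- a GUID is 128 bits
print(base64_length(guid_bytes)) -- 24 (22 data characters plus "==" padding)
```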

The default size I’ve used in my CUID implementation is 24, which seems to be the standard for CUIDs. I tested with sizes as low as 12 and saw no duplicates after generating and checking 5,000,000 CUIDs. It’s important to note that shorter CUIDs are more likely to collide, so use lower values with caution.
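To get a feel for how quickly collision risk grows as IDs get shorter, here is a birthday-bound estimate. It assumes every character is drawn uniformly at random, which is an idealisation: this module mixes a timestamp, a counter, and a pool of pre-generated salts, so its real collision behaviour will differ and the empirical test above remains the actual evidence. The function name `collision_probability` is invented for this sketch.

```lua
-- Birthday-bound estimate for uniformly random IDs (an idealised assumption).
local function collision_probability(id_count, length, alphabet_size)
	local space = alphabet_size ^ length -- number of possible IDs
	local pair_count = id_count * (id_count - 1) / 2
	return 1 - math.exp(-pair_count / space)
end

-- Chance of at least one duplicate among 5,000,000 random base-62 IDs.
for _, length in ipairs({ 6, 8, 12, 24 }) do
	print(length, collision_probability(5000000, length, 62))
end
```

For very long IDs the estimate underflows to 0 in floating point, which only means the probability is too small to represent this way.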

I made this mostly for fun, and to help with network traffic in my games. Nothing fancy, or particularly good. Just thought I’d share.

Here’s the module, if you’re interested.

CUID.luau
--!native
--!strict

export type CUIDString = string

local CUID = {}

local default_alphabet = '0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ' -- Base 62

local length = 24
local salt_length = 12

local alphabet = {}
local base = 0
local log_num_base = 0
local base_sub_one = 0

local function initialize_alphabet(alphabet_string: string)
	alphabet = table.create(#alphabet_string)
	for i = 1, #alphabet_string do
		alphabet[i - 1] = alphabet_string:sub(i, i)
	end

	base = #alphabet_string
	log_num_base = math.log(base)
	base_sub_one = base - 1
end

local function randomString(length: number): string
	local result: { string } = table.create(length)
	
	for i: number = 1, length do
		result[i] = alphabet[math.random(0, base_sub_one)]
	end
	return table.concat(result)
end

local counter = 0
local lastTime = 0
local function getTimestamp(): number
	local time = os.clock() * 1000
	counter = (time == lastTime and counter + 1 or 0)
	lastTime = time
	return time
end

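-- Digits are emitted least significant first. That is fine for building IDs,
-- but the result is not the conventional positional representation of num.
local function getBase(num: number): string
	-- Clamp the size hint so values below 1 cannot request a negative capacity.
	local result: { string } = table.create(math.max(0, (math.log(num) // log_num_base) + 1))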
	repeat
		table.insert(result, alphabet[math.floor(num % base)])
		num = num // base
	until num == 0

	return table.concat(result)
end

-- Convert a number to a base string
local function toBase(num: number): string
	return num < base and alphabet[num] or getBase(num)
end

-- Simple 32-bit string hash. Not currently used by CUID.new.
local function hash(str: string): number
	local value = 0
	for i = 1, #str do
		local char = str:byte(i)
		value = bit32.band(bit32.bxor(bit32.lshift(value, 5), value) + char, 0xFFFFFFFF)
	end
	return value
end

local oid = ''
local salts: { string } = table.create(10000)
local salt_index = 1
local over_length = false

function CUID.new(): CUIDString
	local first_letter: string = alphabet[math.random(0, base_sub_one)]
	
	local timestamp = getTimestamp()
	local time_s = toBase(timestamp)
	local count = toBase(counter)
	local disp_oid = oid
	local salt = salts[(salt_index % #salts) + 1]
	
	salt_index += 1
	
	-- Occasionally, the length of the generated id might exceed the configured length, so trim it here.
	return (first_letter..time_s..salt..count..disp_oid):sub(1, length)
end 

function CUID:SetAlphabet(alphabet_string: string)
	initialize_alphabet(alphabet_string)
	
	CUID:SetSaltLength(salt_length)
end

function CUID:SetLength(value: number)
	length = value

	local size = value - (salt_length + 7)
	if (size > 0) then
		oid = toBase(math.round((tick() - os.clock()) / 3) * 3):sub(1, size)

		local extra_size = value - (salt_length + 7 + #oid)
		if (extra_size > 0) then
			oid = oid .. randomString(extra_size)
		end
	else
		oid = ''
	end
	
	-- Avoid making the oid string too long to help avoid the need to trim the CUID string to length.
end

function CUID:SetSaltLength(value: number)
	-- Salt length needs to leave space for the timestamp and the random letter in the front.
	-- If the CUID length is relatively short, then the OID can be forfeit.
	local max_length = length - (#(toBase(getTimestamp())) + 2)
	
	local original_value = value
	value = math.min(max_length, value)
	
	if (value < original_value) then
		-- Avoid making the salt string too long to help avoid the need to trim the CUID string to length.
		warn(`Salt length exceeded maximum possible for CUID length. Maximum salt length: {max_length}.`)
	end
	
	salt_length = value
	
	for i = 1, 10000 do
		salts[i] = randomString(value)
	end
	
	salt_index = 1
	
	CUID:SetLength(length)
end

initialize_alphabet(default_alphabet)
CUID:SetSaltLength(salt_length)

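-- The Set* functions are defined with ':' above, so they take self and must be called as CUID:SetLength(16).
export type CUID = {
	new: () -> CUIDString,
	SetAlphabet: (self: CUID, alphabet_string: string) -> (),
	SetLength: (self: CUID, length: number) -> (),
	SetSaltLength: (self: CUID, length: number) -> (),
}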

return CUID
Usage Example
local CUID = require(path.to.CUID)
print(CUID.new())

Output:

V0d4qRKRlxcFH9yvR20CgKyS

Benchmarking:

I didn’t do anything fancy for benchmarking. I just ran the CUID generator millions of times against the GUID generator for the same number of runs. CUID generation consistently took about 0.000299 ms (best of 100,000 single runs), where a GUID typically took 0.000399 ms. It’s tough to say how accurate these numbers are: at ten-thousandths of a millisecond, I’m simply unsure the timer can be trusted. They do roughly track the bulk runs, though: 5,000,000 CUIDs took 1.878 seconds (about 0.000376 ms each) versus 2.834 seconds for 5,000,000 GUIDs (about 0.000567 ms each).
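Because a single call is close to the resolution of the timer, one way to get steadier per-call numbers is to time whole batches and divide. The sketch below is only an illustration of that idea: `time_per_call_ms` and `workload` are hypothetical names, and the workload is a stand-in so the snippet runs outside Studio. In a real test you would pass `CUID.new` or a small wrapper around `GenerateGUID` instead.

```lua
-- Hypothetical helper: average milliseconds per call, best of several batches.
local function time_per_call_ms(fn, calls_per_batch, batches)
	local best = math.huge
	for _ = 1, batches do
		local started = os.clock()
		for _ = 1, calls_per_batch do
			fn()
		end
		local elapsed = os.clock() - started
		best = math.min(best, elapsed / calls_per_batch)
	end
	return best * 1000 -- seconds to milliseconds
end

-- Stand-in workload so the example is self-contained.
local function workload()
	return tostring(math.random(0, 61))
end

print(time_per_call_ms(workload, 100000, 5))
```

Taking the best batch rather than the best single call keeps the "best of N" idea while averaging out the clock granularity.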

This doesn’t get into the time saved by not having to compress (encode) GUIDs for network or data persistence, as I’ve not written a compression/decompression algorithm.

It’s pretty crazy that Lua can actually do this much work in 3 ten-thousandths of a millisecond though!

Benchmark Example
--!strict

local CUID = require(script.Parent.CUID)

local speed_tests = 5000000
local collision_tests = 5000000

local c1 = os.clock()
for i = 1, speed_tests do
	CUID.new()
end
local c2 = os.clock()

print(`SPEED TEST: Generated {speed_tests} CUIDs in {c2 - c1} seconds, ~{((c2 - c1) / speed_tests) * 1000}ms per CUID`)

task.wait()

local HTTPService = game:GetService('HttpService')

local c1 = os.clock()
for i = 1, speed_tests do
	HTTPService:GenerateGUID(false)
end
local c2 = os.clock()

print(`SPEED TEST: Generated {speed_tests} GUIDs in {c2 - c1} seconds, ~{((c2 - c1) / speed_tests) * 1000}ms per GUID`)

task.wait()

local map = {}
local duplicates = 0

local c1 = os.clock()
for i = 1, collision_tests do
	local c = CUID.new()
	if (map[c]) then
		duplicates += 1
	else
		map[c] = 1
	end
end
local c2 = os.clock()

print(`COLLISION TEST: Generated {collision_tests} CUIDs in {c2 - c1} seconds and got {duplicates} duplicates`)

task.wait()

local best = math.huge

for i = 1, 100000 do
	local c1 = os.clock()
	CUID.new()
	local c2 = os.clock()
	best = math.min(best, c2 - c1)
end	

print(`SINGLE TEST: Generated CUID in {(best) * 1000} milliseconds`)

local best = math.huge

for i = 1, 100000 do
	local c1 = os.clock()
	HTTPService:GenerateGUID(false)
	local c2 = os.clock()
	best = math.min(best, c2 - c1)
end	
	
print(`SINGLE TEST: Generated GUID in {(best) * 1000} milliseconds`)

print(CUID.new())

Output (on my PC):

SPEED TEST: Generated 5000000 CUIDs in 1.878216000040993 seconds, ~0.0003756432000081986ms per CUID
SPEED TEST: Generated 5000000 GUIDs in 2.834328299970366 seconds, ~0.0005668656599940732ms per GUID
COLLISION TEST: Generated 5000000 CUIDs in 4.934043899993412 seconds and got 0 duplicates
SINGLE TEST: Generated CUID in 0.0002998858690261841 milliseconds
SINGLE TEST: Generated GUID in 0.0003998866304755211 milliseconds
V0d4qRKRlxcFH9yvR20CgKyS

Final notes:

If you have any suggestions for this module, please feel free to share them.

By default I’m using a base-62 alphabet. This is customizable, should you feel like changing it up. Typical CUID implementations use base-36, which is likely easier to optimize for in a database. In this implementation, a smaller alphabet would increase the chances of a collision. I also tested with base-36, performing 5,000,000 tests, and had 0 collisions.
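To make the alphabet trade-off concrete, here is a small sketch that encodes the same number in base-36 and base-62 and compares the lengths. The `encode` function and the sample value are made up for illustration. Unlike the module's `getBase`, which emits the least significant digit first, this version writes digits in the conventional order.

```lua
local BASE62 = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"
local BASE36 = BASE62:sub(1, 36) -- digits plus lowercase letters only

-- Encode a non-negative integer, most significant digit first.
local function encode(value, alphabet)
	local base = #alphabet
	local digits = {}
	repeat
		local digit = value % base
		table.insert(digits, 1, alphabet:sub(digit + 1, digit + 1))
		value = math.floor(value / base)
	until value == 0
	return table.concat(digits)
end

local sample = 1700000000000 -- an arbitrary millisecond-scale number
print(#encode(sample, BASE36)) -- 8 characters
print(#encode(sample, BASE62)) -- 7 characters
```

A smaller alphabet needs more characters for the same value, so at a fixed ID length it leaves fewer distinct IDs available.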

If you’re curious, my relevant PC specs are:

  • CPU: Intel i7-13700F
  • RAM: 64 GB DDR5 5600 MHz