BufferTemplates | A super-aggressive compression library

SpaceDice999 · June 23, 2023, 6:16pm

Creator Marketplace | GitHub | Documentation

WARNING: BufferTemplates is in alpha and a proof of concept. DO NOT USE THIS IN PRODUCTION OR LIVE EXPERIENCES.

What is BufferTemplates?

BufferTemplates is a declarative abstraction layer above BitBuffers that provides a readable library for data compression while giving developers the maximum amount of control possible. Instead of reading and writing data on a buffer by hand, you define something called a “template”, an instruction table that tells the library how to compress the specified type of data. It offers extreme compression that outperforms generic algorithms, even at small data sizes of less than a kilobyte.

Compression in Roblox

Ever since Roblox introduced DataStores, there has been a need to compress save data, since the size limit back then was significantly smaller than it is today. If Roblox developers wanted to store anything big, like huge player-made builds, they had to use some form of compression. Various generic methods were used over the years, such as LZ4, zlib, and LZW, but arguably the most intriguing was the BitBuffer, first released sometime during 2014. Several more optimized and modern versions were eventually developed as well, towards 2021.

To understand why BitBuffers work, it’s important to understand how Roblox saves things. Data is converted into a JSON string before it is written to a DataStore, which is an issue because text takes up a lot of bytes. For example, the number 100 needs only 7 bits, but Roblox stores it as the characters "100", which take up 24 bits. That is a whole 17 bits wasted! BitBuffers were created to let developers pack data at the bit level instead of relying on this less efficient method. But most people still don’t use BitBuffers at all. So what’s the catch?
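The arithmetic above can be checked with a few lines of plain Lua. This is only a sketch: `bitsForUInt` is a hypothetical helper written for this example, not something from BitBuffer or BufferTemplates.

```lua
-- Hypothetical helper: smallest number of bits that can hold an unsigned integer.
local function bitsForUInt(value)
	local bits = 1
	while value >= 2 ^ bits do
		bits = bits + 1
	end
	return bits
end

local number = 100
local asText = tostring(number)     -- the characters "100", as they appear in JSON
local textBits = #asText * 8        -- one byte (8 bits) per character
local rawBits = bitsForUInt(number) -- 100 < 128 = 2^7, so 7 bits are enough

-- Prints the raw width, the text width, and the difference: 7, 24 and 17.
print(rawBits, textBits, textBits - rawBits)
```

The gap grows with every extra digit, since each character costs 8 bits while each doubling of the value costs only one more bit.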

The Problem with BitBuffers

BitBuffers have vast potential for compression on Roblox, but they are hard to work with because data is simply appended to a bitstream. That approach is not a problem in itself. It’s how BitBuffer manages to condense so much information! But a one-dimensional representation like this is hard to visualize, so maintaining and debugging the code becomes more of a nightmare as the data grows more complicated. Another problem is that serialization and deserialization are written separately with no connection between the two, so every change to one function has to be mirrored by hand in the other, which makes maintenance even more tedious. Add more complex structures such as variable-length arrays and nested tables, and a disaster is on the horizon! The root cause of all this is BitBuffer’s minimal abstraction.

Templates!

BitBuffer presents its inner workings raw to the developer. Write bits, read bits. But it is not really necessary to do that ourselves. Once upon a time programmers manipulated raw bits too, but today’s programming languages abstract most of the bit manipulation away. It turns out we can do the same with BitBuffers. That is where BufferTemplates comes in. Instead of using BitBuffers like this:

function compress(data)
	local bitBuffer = BitBuffer.new()
	
	bitBuffer:WriteUInt(8, data._version)
	
	bitBuffer:WriteUInt(18, data.stats.hp)
	bitBuffer:WriteUInt(18, data.stats.mp)
	bitBuffer:WriteUInt(18, data.stats.speed)
	bitBuffer:WriteUInt(18, data.stats.charisma)
	
	bitBuffer:WriteString(data.characterName)
	
	-- dynamic array size property
	bitBuffer:WriteUInt(24, #data.inventory)
	for _, itemData in data.inventory do
		bitBuffer:WriteUInt(8, itemData.itemId)
		bitBuffer:WriteUInt(6, itemData.amount)
	end
	
	return bitBuffer:ToBase91()
end

function decompress(compressedData)
	local data = {}
	local bitBuffer = BitBuffer.FromBase91(compressedData)
	
	data._version = bitBuffer:ReadUInt(8)
	
	data.stats = {}
	data.stats.hp = bitBuffer:ReadUInt(18)
	data.stats.mp = bitBuffer:ReadUInt(18)
	data.stats.speed = bitBuffer:ReadUInt(18)
	data.stats.charisma = bitBuffer:ReadUInt(18)
	
	data.characterName = bitBuffer:ReadString()
	
	local inventory = {}
	data.inventory = inventory
	
	-- dynamic array size property
	local inventorySize = bitBuffer:ReadUInt(24)
	for i = 1, inventorySize do
		local itemData = {}
		
		itemData.itemId = bitBuffer:ReadUInt(8)
		itemData.amount = bitBuffer:ReadUInt(6)
		
		table.insert(inventory, itemData)
	end
	
	return data
end

we can use BufferTemplates to handle the BitBuffer stuff for us:

local ITEM_DATA_TEMPLATE = BufferTemplates.Table({
	itemId = BufferTemplates.UInt(8),
	amount = BufferTemplates.UInt(6),
})

local PLAYER_DATA_TEMPLATE = BufferTemplates.Table({
	_version = BufferTemplates.UInt(8),
	
	stats = BufferTemplates.Table({
		hp = BufferTemplates.UInt(18),
		mp = BufferTemplates.UInt(18),
		speed = BufferTemplates.UInt(18),
		charisma = BufferTemplates.UInt(18),
	}),
	
	characterName = BufferTemplates.String(),
	
	inventory = BufferTemplates.Array(ITEM_DATA_TEMPLATE),
})

function compress(data)
	return PLAYER_DATA_TEMPLATE:CompressIntoBase91(data)
end

function decompress(compressedData)
	return PLAYER_DATA_TEMPLATE:DecompressFromBase91(compressedData)
end

Notice that the developer does not have to write any serialization or deserialization logic themselves; compress and decompress are just thin wrappers. This is all handled by the templates!

BufferTemplates still requires the developer to specify precise types and, in some cases, bit widths. This lets BufferTemplates save as much space as possible and gives programmers a lot of control over how it compresses.
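To make the idea concrete, here is a minimal sketch of how one declarative definition can drive both writing and reading. It is not the library's implementation: `newStream`, `UInt` and `Table` below are hypothetical stand-ins written for this example, and the "bit stream" is just a Lua array of 0s and 1s rather than a real BitBuffer.

```lua
-- Toy bit stream: a plain array of 0/1 values plus a read cursor.
-- (A stand-in for a real BitBuffer so the sketch stays self-contained.)
local function newStream()
	return { bits = {}, cursor = 1 }
end

-- Hypothetical template constructor: an unsigned integer of fixed width.
local function UInt(width)
	return {
		write = function(stream, value)
			-- Emit the bits from most significant to least significant.
			for i = width - 1, 0, -1 do
				local bit = math.floor(value / 2 ^ i) % 2
				table.insert(stream.bits, bit)
			end
		end,
		read = function(stream)
			local value = 0
			for _ = 1, width do
				value = value * 2 + stream.bits[stream.cursor]
				stream.cursor = stream.cursor + 1
			end
			return value
		end,
	}
end

-- Hypothetical template constructor: a table with named fields.
-- Keys are sorted once so write and read always visit fields in the same order.
local function Table(fields)
	local keys = {}
	for key in pairs(fields) do
		table.insert(keys, key)
	end
	table.sort(keys)

	return {
		write = function(stream, data)
			for _, key in ipairs(keys) do
				fields[key].write(stream, data[key])
			end
		end,
		read = function(stream)
			local data = {}
			for _, key in ipairs(keys) do
				data[key] = fields[key].read(stream)
			end
			return data
		end,
	}
end

local ITEM = Table({ itemId = UInt(8), amount = UInt(6) })

local stream = newStream()
ITEM.write(stream, { itemId = 70, amount = 12 })
local restored = ITEM.read(stream)

-- 14 bits are written (8 + 6), and both fields round-trip unchanged.
print(#stream.bits, restored.itemId, restored.amount)
```

Because the field list lives in one place, adding or resizing a field changes the writer and the reader together, which is exactly the maintenance problem the handwritten version has.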

Documentation

BufferTemplates methods

Template BufferTemplates.UInt(bitWidth: number)
Returns a template that acts on an unsigned integer.

Template BufferTemplates.Int(bitWidth: number)
Returns a template that acts on a signed integer.

Template BufferTemplates.Float32()
Returns a template that acts on a 32 bit floating point number.

Template BufferTemplates.Float64()
Returns a template that acts on a 64 bit floating point number.

Template BufferTemplates.Char()
Returns a template that acts on a single character.

Template BufferTemplates.StaticString(length: number)
Returns a template that acts on a string with a specified length.

Template BufferTemplates.String()
Returns a template that acts on a string with any length smaller than 16,777,216.

Template BufferTemplates.Bool()
Returns a template that acts on a boolean.

Template BufferTemplates.Table(t: {[string]: Template})
Returns a template that acts on a table.

Template BufferTemplates.StaticArray(size: number, template: Template)
Returns a template that acts on an array with a set size.

Template BufferTemplates.Array(template: Template)
Returns a template that acts on an array with any size smaller than 16,777,216.

Template BufferTemplates.Enum(enum: {string})
Returns a template that acts on a user-defined enum.

Template BufferTemplates.Color3()
Returns a template that acts on a Color3.

Template BufferTemplates.Vector3()
Returns a template that acts on a Vector3.

Template BufferTemplates.Group(templates: {Template})
Returns a template that acts on data that may take different forms, choosing between different templates based on circumstances.

Template BufferTemplates.Custom(write: function(data, buffer: BitBuffer?) -> (buffer: BitBuffer), read: function(buffer: BitBuffer) -> (data: any, buffer: BitBuffer))
Returns a template with a custom read and write method.

Template methods

string Template:CompressIntoBase91(data: any)
Returns a compressed string in Base91 using the specified template. (Recommended)

string Template:CompressIntoBase64(data: any)
Returns a compressed string in Base64 using the specified template.

any Template:DecompressFromBase91(compressedData: string)
Returns decompressed data from Base91 using the specified template. (Recommended)

any Template:DecompressFromBase64(compressedData: string)
Returns decompressed data from Base64 using the specified template.
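As a rough illustration of why a larger alphabet helps, the sketch below compares how much information one character can carry in each encoding. It only computes the theoretical density from the alphabet size; the actual Base91 and Base64 encoders may pad or group bits slightly differently, and `bitsPerChar` and `estimateChars` are hypothetical helpers written for this example.

```lua
-- Theoretical information carried by one character of an N-symbol alphabet.
local function bitsPerChar(alphabetSize)
	return math.log(alphabetSize) / math.log(2)
end

-- Rough number of characters needed to carry `byteCount` bytes of payload
-- (ignores padding and any grouping a real encoder applies).
local function estimateChars(byteCount, alphabetSize)
	return math.ceil(byteCount * 8 / bitsPerChar(alphabetSize))
end

for _, base in ipairs({ 64, 91 }) do
	print(string.format("Base%d: %.2f bits/char, ~%d chars for 100 bytes",
		base, bitsPerChar(base), estimateChars(100, base)))
end
```

A Base64 character carries 6 bits while a Base91 character carries about 6.5, so the same payload needs fewer characters in Base91.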

Benchmark

Template used:

local Races = {
	"Human",
	"Elf",
	"Dwarf",
	"Dragon",
	"Demon",
	"Angel"
}

local RACE_TEMPLATE = BufferTemplates.Enum(Races)

local ITEM_DATA_TEMPLATE = BufferTemplates.Table({
	itemId = BufferTemplates.UInt(8),
	amount = BufferTemplates.UInt(6),
})

local HEADER_TEMPLATE = BufferTemplates.Table({
	version = BufferTemplates.UInt(24),
	banned = BufferTemplates.Bool(),
})

local USER_DATA_TEMPLATE = BufferTemplates.Table({
	_header = HEADER_TEMPLATE,
	
	stats = BufferTemplates.Table({
		hp = BufferTemplates.UInt(18),
		mp = BufferTemplates.UInt(18),
	}),
	
	hairColor = BufferTemplates.Color3(),
	
	characterName = BufferTemplates.String(),
	race = RACE_TEMPLATE,
	inventory = BufferTemplates.Array(ITEM_DATA_TEMPLATE),
})

The data we will compress:

local data = {
	_header = {
		version = 3,
		banned = false
	},
	
	stats = {
		hp = 679,
		mp = 440,
	},
	
	hairColor = Color3.new(.4, .6, .7),
	
	characterName = "Gandolf",
	
	race = "Angel",
	
	inventory = {
		{itemId = 4, amount = 34},
		{itemId = 70, amount = 12},
	}
}

We will compress this data with BufferTemplates, LZW, and zlib.

Uncompressed size: 191 B
BufferTemplates compressed size: 36 B (18.85% of original size)
LZW compressed size: 305 B (159.69% of original size)
zlib compressed size: 153 B (80.10% of original size)
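For a sense of where the savings come from, here is a back-of-the-envelope bit budget for the benchmark data. Several numbers are assumptions rather than documented behaviour: a 24-bit length prefix for String and Array (inferred from the 16,777,216 size limit), 8 bits per Color3 channel, 8 bits per character, and an Enum stored as the smallest index width that fits. `bitsForChoices` is a hypothetical helper.

```lua
-- Bits needed to distinguish `count` different values (assumed Enum encoding).
local function bitsForChoices(count)
	local bits = 0
	while 2 ^ bits < count do
		bits = bits + 1
	end
	return bits
end

local LENGTH_PREFIX = 24 -- assumption: 2^24 = 16,777,216 matches the documented limit

local budget = {
	header = 24 + 1,                                -- UInt(24) version + Bool
	stats = 18 + 18,                                -- hp + mp
	hairColor = 3 * 8,                              -- assumption: 8 bits per channel
	characterName = LENGTH_PREFIX + #"Gandolf" * 8, -- length prefix + 7 characters
	race = bitsForChoices(6),                       -- six races fit in 3 bits
	inventory = LENGTH_PREFIX + 2 * (8 + 6),        -- length prefix + two items
}

local totalBits = 0
for _, bits in pairs(budget) do
	totalBits = totalBits + bits
end

-- Total payload in bits, and the same figure rounded up to whole bytes.
print(totalBits, math.ceil(totalBits / 8))
```

Under these assumptions the raw payload comes to 220 bits, or 28 bytes before any text encoding, which is in the same range as the size reported above. Treat it as an estimate only, since the real per-type widths may differ.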

Download

Creator Marketplace: https://create.roblox.com/marketplace/asset/13840098917/BufferTemplates
GitHub repo: SpaceDice9/BufferTemplates

BufferTemplates uses the optimized BitBuffer from this GitHub repo: rstk/BitBuffer (Fast BitBuffer for Roblox)

metatablecatmaid · June 23, 2023, 6:28pm

Hey there, thanks for referencing my lz4 library. I’ve stated in the documentation that the compression algorithm used on it is not very, well, working. It was written mainly to decompress data streams from RBXM files.

It probably isn’t very good to use for benchmarking because of this.

AverageLua · June 23, 2023, 7:01pm

Thanks for providing this awesome resource!

This is a great introduction to compression and data management and I would love to see a production-ready version of this module sometime in the future!

Cristiano100 · June 29, 2023, 12:08am

What’s the time cost for this?

SpaceDice999 · June 29, 2023, 1:19am

If you’re talking about performance then it’s pretty fast. It can compress 100,000 tables in around 0.2 seconds or about one table every 2 microseconds.

SpaceDice999 · June 29, 2023, 7:28pm

I finally created the GitHub repo for this that contains the source code. It’s my first time making one so go easy on me.

SpaceDice999 · June 30, 2023, 11:43pm

Small Update

Significantly optimized the performance of Group and Enum. They should no longer exhaust script execution time when compressing large amounts of data.

PR0XlM · July 29, 2024, 6:59pm

Nice work! I like the overall idea and I’m going to try to use it in one of my projects. Good job on this.

SpaceDice999 · July 30, 2024, 12:57am

Thank you for the encouragement! This was an old project of mine that existed before Roblox added the buffer datatype so it uses its own buffer class that somebody else made. I haven’t really made any updates to it recently but it should still work as expected. Also, make sure to get the library from GitHub since that is more recent and has new templates that haven’t been released onto the Creator Store.

123marble · August 26, 2024, 10:41pm

Awesome module!

Would you be able to explain why you chose Base91 as the compressed format? UTF-8 uses only 1 byte for each of the first 128 characters, so I’m thinking there may still be storage gains from a higher base?

SpaceDice999 · August 27, 2024, 12:05am

Not all the characters in UTF-8 can be properly stored.