How to serialize and deserialize dictionaries using BitBuffer?

I want to serialize and deserialize a dictionary, but the dictionary is dynamic. The BitBuffer library I am using is rstk/BitBuffer (GitHub: Fast BitBuffer for Roblox).

-- example array I want to save
local data = {
	{
		a = 1,
		b = true,
		c = "hello",
		d = {
			e = "world",
			f = 2,
		},
	},
	{
		a = 3,
		b = false,
		c = "qwerty",
		d = {
			g = 4,
			h = false,
			f = 3,
		},
	},
	... -- more
}

Table d is different in every table in the array. How do I serialize and deserialize them? They all have different keys and value types.

local function serialize()
	local buffer = BitBuffer.new()

	local length = #data

	buffer:WriteUInt(8, 1) -- version
	buffer:WriteUInt(12, length) -- length of data array

	for _, v in ipairs(data) do
		buffer:WriteUInt(5, v.a)
		buffer:WriteBool(v.b)
		buffer:WriteString(v.c)
		
		-- what to do for v.d?
		for k, v2 in pairs(v.d) do
			-- ???
		end
	end

	return buffer:ToBase91()
end

local function deserialize(string)
	local buffer = BitBuffer.FromBase91(string or "")
	local version = buffer:ReadUInt(8)
	local length = buffer:ReadUInt(12)

	local newData = {}
	for _ = 1, length do
		local v = {
			a = buffer:ReadUInt(5),
			b = buffer:ReadBool(),
			c = buffer:ReadString(),

			-- what to do for the dictionary?
		}
		table.insert(newData, v)
	end

	return newData
end

How do I serialize the dictionary if it has different keys and value types?

You could add a key/field to each “d” dictionary that acts as a marker which indicates the deserialized value the serialized data represents.

When it comes to serialization you’d check this marker and act accordingly.

I have slightly edited the example code.

d can have any combination of keys and values. d can also be empty. Some more examples of d

d = {
	e = "world",
	f = 2,
},
-- or
d = {
	e = "hello",
	h = true
},
-- or
d = {
	e = "world",
	f = 3,
	h = true,
},

I am confused as to what you mean by this. d is already the deserialized version and you want to have a key in d that points to itself?

No, the key/field would indicate whatever class the serialized data represents, e.g. g = "Color3" or some other class.

That would not work. d.e will always be a string. d.f will always be a number.


After thinking for a bit, I feel like I am describing my problem wrong.

When I deserialize the data I need to know which parts are for the d dictionary. After I call buffer:Read,

a = buffer:ReadUInt(5),
b = buffer:ReadBool(),
c = buffer:ReadString(),

How do I know which :Read to call next, and how many times I need to call it? If I call it too many times, it will read the data for the next table incorrectly.


I think I have an idea of how to serialize the dictionary: count the entries in the dictionary and save that number. The problem is deserializing, because you need to :Read in the same order you :Write.

local function serialize()
	local buffer = BitBuffer.new()
	
	local length = #data
	
	buffer:WriteUInt(8, 1) -- version
	buffer:WriteUInt(12, length) -- length of data array
	
	for _, v in ipairs(data) do
		buffer:WriteUInt(5, v.a)
		buffer:WriteBool(v.b)
		buffer:WriteString(v.c)

		local amount = 0
		for k, v2 in pairs(v.d) do
			amount += 1
		end

		buffer:WriteUInt(12, amount) -- the amount of keys
		
		for k, v2 in pairs(v.d) do
			local type = typeof(v2)
			if type == "boolean" then
				buffer:WriteBool(v2)
			elseif type == "" then
				-- and so on for the types
			end
		end
	end
	
	return buffer:ToBase91()
end
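One way to complete this idea is to write the key and a small type tag in front of every value, so the reader knows how many entries to expect and which read to perform for each one. The sketch below is only an illustration under stated assumptions: it uses a plain Lua string as a stand-in for the buffer so it runs anywhere, the tag numbers and the names `packString`, `writeDictionary` and `readDictionary` are made up for this example, keys and strings are assumed to be shorter than 256 bytes, and numbers are assumed to be integers from 0 to 65535. With BitBuffer, the same sequence would map onto WriteUInt, WriteString and WriteBool calls in the same order.

```lua
-- Type tags written before each value so the reader knows what to read next.
local TAG_BOOLEAN, TAG_NUMBER, TAG_STRING = 1, 2, 3

-- Length-prefixed string (assumes fewer than 256 bytes).
local function packString(s)
	assert(#s < 256, "string too long for a 1-byte length prefix")
	return string.char(#s) .. s
end

local function writeDictionary(dict)
	local parts = {}
	local count = 0
	for key, value in pairs(dict) do
		count = count + 1
		parts[#parts + 1] = packString(key) -- key first, so the pair can be rebuilt
		local valueType = type(value)
		if valueType == "boolean" then
			parts[#parts + 1] = string.char(TAG_BOOLEAN, value and 1 or 0)
		elseif valueType == "number" then
			-- Assumes an integer in 0..65535, stored as two bytes (high, low).
			parts[#parts + 1] = string.char(TAG_NUMBER, math.floor(value / 256), value % 256)
		elseif valueType == "string" then
			parts[#parts + 1] = string.char(TAG_STRING) .. packString(value)
		else
			error("unsupported value type: " .. valueType)
		end
	end
	assert(count < 256, "too many entries for a 1-byte count")
	return string.char(count) .. table.concat(parts) -- entry count goes first
end

-- Returns the dictionary and the position of the first byte after it,
-- so the caller can keep reading the next record from there.
local function readDictionary(bytes, position)
	position = position or 1
	local function readByte()
		local byte = string.byte(bytes, position)
		position = position + 1
		return byte
	end
	local function readString()
		local length = readByte()
		local s = string.sub(bytes, position, position + length - 1)
		position = position + length
		return s
	end

	local dict = {}
	local count = readByte()
	for _ = 1, count do
		local key = readString()
		local tag = readByte()
		if tag == TAG_BOOLEAN then
			dict[key] = readByte() == 1
		elseif tag == TAG_NUMBER then
			local high = readByte()
			dict[key] = high * 256 + readByte()
		elseif tag == TAG_STRING then
			dict[key] = readString()
		else
			error("unknown tag: " .. tostring(tag))
		end
	end
	return dict, position
end

local encoded = writeDictionary({ e = "world", f = 3, h = true })
local decoded = readDictionary(encoded)
print(decoded.e, decoded.f, decoded.h)
```

Because the count is written before the entries, the reader never consumes bytes that belong to the next table, and an empty d is simply a count of zero.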

Alright, here’s a basic way you can do it!

Sorry, I don’t really have the time to do a complete breakdown of how it works, but I’ve commented this code for you to take a look at.

This version only works with string indexes and values. However, I bet you could expand it to work with numbers, bools, etc.

local BitBuffer = require(game.ServerScriptService.BitBuffer) 


local BinaryDefintions = {
	ValueType = {
		-- Conversion 
		[0x01] = "Table", 
		[0x02] = "String",
		-- NormalIndex values
		Table = 0x01, -- Indicator for tables
		String = 0x02, -- indicator for our only type.
	}
}

-- ValueTypeIndicators 

function RecursiveStore(tbl) 
	local Buffer = BitBuffer.new()
	
	for Index,Value in pairs(tbl) do 
		if typeof(Value) == "table" then  
			-- Use our function to store again!
			-- This is the idea of recursion.
			local ConvertBuffer = RecursiveStore(Value)
			
			--WriteIndexTable
			Buffer:WriteUInt(8,BinaryDefintions.ValueType.Table)
			Buffer:WriteString(Index)	
			Buffer:WriteString(ConvertBuffer)
			continue
		end
		-- This is just a basic implementation that relies on tostring() to cast the value to a string.

		Buffer:WriteUInt(8,BinaryDefintions.ValueType.String)
		Buffer:WriteString(Index)	
		Buffer:WriteString(tostring(Value))
	end
	
	return Buffer:ToString()
end	

function RecursiveRead(String) 
	
	local Table = {} 
	
	local Buffer = BitBuffer.FromString(String)

	-- Keep reading entries until no valid type marker is found
	repeat
		-- Index to retrieve the type that was written
		local Type = BinaryDefintions.ValueType[Buffer:ReadUInt(8)] 
		
		-- Associate Types with reading methods
		if Type == "Table" then 
			local Index,RawValue = Buffer:ReadString(),Buffer:ReadString()
			
			local Value = RecursiveRead(RawValue) 
			
			-- Store values now! 
			Table[Index] = Value
		end
		
		-- String
		if Type == "String" then 
			local Index,Value = Buffer:ReadString(),Buffer:ReadString()
			
			-- Store values now! 
			Table[Index] = Value
		end
		
		-- So we know when we're done reading!
	until not Type 
	
	
	return Table 
end 


local OriginalTable = {
	test = "Working?",
	test2 = "working?",
	upper = {
		test = "Working?"
	}
}


-- Serialize Method
local Data = RecursiveStore(OriginalTable)

-- Deserialize Method
local Table = RecursiveRead(Data)
	
print("==== Compare ====")
print("Original     :",OriginalTable)
print("Deserialized :",Table)

Things to note,

When serializing table data, use a header value: a value that indicates how to read the next stretch of the stream. Here is a representation of what I mean.

[ Header 1 ] : [ String ] [ Number ] 
[ Header 2 ] : [ Number ] [ Number ]

So when you read that first number, you’ll know how to read the next bit of information.

How do you know you’ve reached the end of the data? When reading past the end of the buffer, values will return nil. Another way is to keep track of the bytes read and compare that with the total size (I’ve done this in my modules), and it works pretty well.
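As a rough illustration of the second approach, the sketch below keeps a cursor and stops when it passes the total size. It works on a plain Lua string rather than a BitBuffer, and `packStrings` and `unpackStrings` are hypothetical names; strings are assumed to be shorter than 256 bytes.

```lua
-- Writes each string as one length byte followed by its characters.
local function packStrings(list)
	local parts = {}
	for _, s in ipairs(list) do
		assert(#s < 256, "string too long for a 1-byte length prefix")
		parts[#parts + 1] = string.char(#s) .. s
	end
	return table.concat(parts)
end

-- Reads strings back until the cursor has moved past the last byte.
local function unpackStrings(bytes)
	local result = {}
	local position = 1
	while position <= #bytes do
		local length = string.byte(bytes, position)
		result[#result + 1] = string.sub(bytes, position + 1, position + length)
		position = position + 1 + length
	end
	return result
end

local restored = unpackStrings(packStrings({ "hello", "", "world" }))
print(#restored)
```

Comparing the cursor with the size means a legitimate zero byte in the data (here, the length of the empty string) is not mistaken for the end of the stream, which can happen when the loop stops on the first value that fails a lookup.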

Use recursion: think of table data as holding base-level information to decode later. If you’re confused about how recursion works, I’d recommend watching a video about it.
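A minimal sketch of the recursive idea, under a few assumptions: it writes into a plain Lua string instead of a BitBuffer, keys are strings, non-table values are stored as text via tostring, and every string is shorter than 256 bytes. The names `writeTable` and `readTable` and the tag values are made up for this example. Unlike the code above, the nested table is written straight into the same output instead of being encoded to a string first.

```lua
local TAG_STRING, TAG_TABLE = 1, 2

-- Appends the encoding of tbl to parts; nested tables recurse into the same list.
local function writeTable(parts, tbl)
	local count = 0
	for _ in pairs(tbl) do
		count = count + 1
	end
	parts[#parts + 1] = string.char(count) -- number of entries at this level

	for key, value in pairs(tbl) do
		parts[#parts + 1] = string.char(#key) .. key
		if type(value) == "table" then
			parts[#parts + 1] = string.char(TAG_TABLE)
			writeTable(parts, value)
		else
			local text = tostring(value)
			assert(#text < 256, "value too long for a 1-byte length prefix")
			parts[#parts + 1] = string.char(TAG_STRING, #text) .. text
		end
	end
end

-- Returns the decoded table and the position just after it.
local function readTable(bytes, position)
	local function readByte()
		local byte = string.byte(bytes, position)
		position = position + 1
		return byte
	end
	local function readString()
		local length = readByte()
		local s = string.sub(bytes, position, position + length - 1)
		position = position + length
		return s
	end

	local result = {}
	local count = readByte()
	for _ = 1, count do
		local key = readString()
		local tag = readByte()
		if tag == TAG_TABLE then
			local child
			child, position = readTable(bytes, position) -- continue where the child ended
			result[key] = child
		else
			result[key] = readString()
		end
	end
	return result, position
end

local parts = {}
writeTable(parts, { test = "hello", upper = { test = { test2 = "world" } } })
local decoded = readTable(table.concat(parts), 1)
print(decoded.upper.test.test2)
```

Each level costs one count byte plus one tag byte per entry, and the child data is never wrapped in another layer of text encoding.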

Use header identification: use enum matching to know what you’re reading as literals.


Your method seems to increase the string length a lot.

Comparing JSONEncode vs BitBuffer on your method: This chart shows the string lengths using Buffer:ToBase91 and Buffer.FromBase91 but it also applies to Buffer:ToString and Buffer.FromString.

I kept adding more elements to OriginalTable.upper like so:

local OriginalTable = {
	test = "hello",
	test2 = "world",
	upper = {
		test = {
			test = "hello",
			test2 = "world",
		},
		test2 = {
			test = "hello",
			test2 = "world",
		},
		test3 = {
			test = "hello",
			test2 = "world",
		},
		test4 = {
			test = "hello",
			test2 = "world",
		},
		test5 = {
			test = "hello",
			test2 = "world",
		},
	}
}

Modifications I made in the testing:

print("==== Compare ====")
local json = game.HttpService:JSONEncode(OriginalTable)
print("Original:\n", json, #json)

warn()

-- Serialize Method
local Data = RecursiveStore(OriginalTable)

-- Deserialize Method
local Table = RecursiveRead(Data)

print("Serialized:\n", Data, #Data)
The whole script with my modifications:
local BitBuffer = require(game.ServerStorage.BitBuffer) 

local BinaryDefintions = {
	ValueType = {
		-- Conversion 
		[0x01] = "Table", 
		[0x02] = "String",
		-- NormalIndex values
		Table = 0x01, -- Indicator for tables
		String = 0x02, -- indicator for our only type.
	}
}

-- ValueTypeIndicators 

function RecursiveStore(tbl) 
	local Buffer = BitBuffer.new()

	for Index,Value in pairs(tbl) do 
		if typeof(Value) == "table" then  
			-- Use our function to store again!
			-- This is the idea of recursion.
			local ConvertBuffer = RecursiveStore(Value)

			--WriteIndexTable
			Buffer:WriteUInt(8,BinaryDefintions.ValueType.Table)
			Buffer:WriteString(Index)	
			Buffer:WriteString(ConvertBuffer)
			continue
		end
		-- This is just a basic implementation that relies on tostring() to cast the value to a string.

		Buffer:WriteUInt(8,BinaryDefintions.ValueType.String)
		Buffer:WriteString(Index)
		Buffer:WriteString(tostring(Value))
	end

	return Buffer:ToBase91()
end	

function RecursiveRead(String) 

	local Table = {} 

	local Buffer = BitBuffer.FromBase91(String)

	-- Keep reading entries until no valid type marker is found
	repeat
		-- Index to retrieve the type that was written
		local Type = BinaryDefintions.ValueType[Buffer:ReadUInt(8)] 

		-- Associate Types with reading methods
		if Type == "Table" then 
			local Index,RawValue = Buffer:ReadString(),Buffer:ReadString()

			local Value = RecursiveRead(RawValue) 

			-- Store values now! 
			Table[Index] = Value
		end

		-- String
		if Type == "String" then 
			local Index,Value = Buffer:ReadString(),Buffer:ReadString()

			-- Store values now! 
			Table[Index] = Value
		end

		-- So we know when we're done reading!
	until not Type 


	return Table 
end 


local OriginalTable = {
	test = "hello",
	test2 = "world",
	upper = {
		test = {
			test = "hello",
			test2 = "world",
		},
		test2 = {
			test = "hello",
			test2 = "world",
		},
		test3 = {
			test = "hello",
			test2 = "world",
		},
		test4 = {
			test = "hello",
			test2 = "world",
		},
		test5 = {
			test = "hello",
			test2 = "world",
		},
	}
}

print("==== Compare ====")
local json = game.HttpService:JSONEncode(OriginalTable)
print("Original:\n", json, #json)

warn()

-- Serialize Method
local Data = RecursiveStore(OriginalTable)

-- Deserialize Method
local Table = RecursiveRead(Data)

print("Serialized:\n", Data, #Data)

When a binary format contains large amounts of ordinary printable ASCII data such as strings, encoding it inflates the structure so much that it may not even be worth storing it in a binary format.

Optimizing further:
Encoding methods for binary data aren’t optimal, because they are limited to visible characters [32-128].

To work around this, we can simply dump the values that need to be represented as plain values, instead of as encoded binary:

First, dumping raw binary data automatically to the closest byte size is kind of complicated, but here it is.

local ByteSizeDecimal = function(bytes) return 2^(bytes*8) end 
local GetNearestByte = function(decimal) 
	-- Try 1 to 8 bytes; 8 bytes (values below 2^64) is the largest size considered
	for i=1,8 do 
		if decimal < ByteSizeDecimal(i) then 
			return i 
		end
	end 
	-- Int overflow!
	return 8
end 

This can be used to turn a decimal number into raw bytes and then into Base91.
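A minimal sketch of the number-to-raw-bytes step, assuming non-negative integers small enough to be represented exactly. The names `toRawBytes` and `fromRawBytes` are hypothetical, and the Base91 step is not shown.

```lua
-- Converts a non-negative integer into the fewest bytes that hold it (big-endian).
local function toRawBytes(n)
	local bytes = {}
	repeat
		table.insert(bytes, 1, string.char(n % 256)) -- lowest byte goes last
		n = math.floor(n / 256)
	until n == 0
	return table.concat(bytes)
end

-- Rebuilds the integer from its big-endian bytes.
local function fromRawBytes(raw)
	local n = 0
	for i = 1, #raw do
		n = n * 256 + string.byte(raw, i)
	end
	return n
end

local raw = toRawBytes(300)
print(#raw, fromRawBytes(raw))
```

The length of the result matches what GetNearestByte reports for the same value, so a reader still needs that length (for example as a one-byte prefix) to know how many bytes to consume.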

However, here is the main issue: once we want an optimization like this, it is no longer feasible to use BitBuffer, because null-terminating characters are used to indicate the end of strings.

Honestly, you’d probably want to make your own custom encoder and parser if you’re really looking to cut down the JSON size. But unless you are:

  1. Storing your tables in raw data and not encoding them,
  2. Or caching common indexes and values.

You are not going to get an object structure (in binary) smaller than JSON.
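One possible reading of "caching common indexes" is to store each distinct key once and refer to it by a small integer id afterwards. The sketch below only illustrates that idea on plain Lua tables; `compact` and `expand` are hypothetical names, and it assumes every record is a flat dictionary.

```lua
-- Replaces repeated string keys with integer ids and keeps one shared key list.
local function compact(records)
	local idByKey, keyById = {}, {}
	local rows = {}
	for i, record in ipairs(records) do
		local row = {}
		for key, value in pairs(record) do
			if not idByKey[key] then
				keyById[#keyById + 1] = key -- first time this key is seen
				idByKey[key] = #keyById
			end
			row[idByKey[key]] = value
		end
		rows[i] = row
	end
	return { keys = keyById, rows = rows }
end

-- Restores the original key names from the shared key list.
local function expand(compacted)
	local records = {}
	for i, row in ipairs(compacted.rows) do
		local record = {}
		for id, value in pairs(row) do
			record[compacted.keys[id]] = value
		end
		records[i] = record
	end
	return records
end

local compacted = compact({
	{ test = "hello", test2 = "world" },
	{ test = "hello", test2 = "world" },
})
print(#compacted.keys)
```

In a binary stream, each id could then be written as a small unsigned integer instead of repeating the full key string in every record.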

Here is an example of a structure that I actually started but never released.

Index"Value"Number"300#Table"Index\"Value\"^

The same table in JSON:

{"Table":{"Index":"Value"},"Index":"Value","Number":300}