This module seems pretty helpful for decreasing the amount of data being replicated at once. However, it's unclear to me why this would be preferable over the usual table. What are some specific use cases that you've found Sera useful for? When should we use this and when should we not?
Roblox games have a modest budget for networking and data storage, so serialization only shines for projects with a lot of data throughput.
Serialization doesn't make less efficient systems obsolete.
It would be interesting to see how the performance of this paired with remote events differs from popular networking resources such as Bytenet or Warp. Do you think this could be faster or possibly result in lower bandwidth due to its simplicity?
While making this library I've been running individual benchmarks, like creating new buffer objects vs. running the serialization code again. I'm afraid of making any wrong claims, but generally speaking the buffer library seems faster than a handful of table lookups in most cases.
Your average game wouldn't even benefit much from a network library with serialization, so I thought: why not make a schema serdes library that doesn't try to be a solve-all library for everyone, but a high-speed serdes utility that still requires the developer to work with buffers.
This module is my swing at what I understand as "hella fast" in terms of Roblox Lua, but I haven't been working with buffers for that long, so I might be wrong.
Regardless, my goal for this project is to create a super performant serializer that does not bother with minuscule compression gains.
For anybody interested, I've created a few custom SeraTypes to use.
Read "More Info" to see important behavior.
More Info
When using these types, it's important that you know the following:
Dictionaries are defined as key-value pairs
Arrays are defined like a stack: a list of items with no gaps
Dictionaries store the following:
- Value count (table length)
- Index/Key
- Value
Arrays store the following:
- Value count (table length)
- Value
By default, all of the values are stored as 8-bit unsigned integers; this can be configured by editing the code.
Dictionary
local u8_Ser, u8_Des = module.Uint8.Ser, module.Uint8.Des

--Key of u8, value of u8
--Intended for small, variable-size dictionaries with keys.
--Has a base cost of 1 byte to store the value count
--Each key + value pair takes 2 bytes
module.u8_u8_dict = table.freeze({
	Ser = function(b: buffer, offset: number, value: {number}): number
		local curr_offset
		do --Count the inner values
			local value_count = 0
			for _ in value do
				value_count += 1
			end
			--Write the value count
			curr_offset = u8_Ser(b, offset, value_count)
		end
		--Store the inner values
		for i, v in value do
			curr_offset = u8_Ser(b, curr_offset, i)
			curr_offset = u8_Ser(b, curr_offset, v)
		end
		return curr_offset --Return the final offset after writing all values
	end,
	Des = function(b: buffer, offset: number): ({number}, number)
		local value_count, curr_offset = u8_Des(b, offset)
		local t = {}
		for _ = 1, value_count do
			local i, v
			i, curr_offset = u8_Des(b, curr_offset)
			v, curr_offset = u8_Des(b, curr_offset)
			t[i] = v
		end
		return t, curr_offset --Return the final offset after reading all values
	end,
})
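For anyone unsure how these plug in, a round trip with the dictionary type might look like this (a sketch; the 64-byte buffer size and the sample data are arbitrary):

```lua
local b = buffer.create(64) -- arbitrary size, large enough for this payload

-- Serialize a small dictionary of u8 keys to u8 values
local data = { [1] = 10, [2] = 20, [5] = 30 }
local final_offset = module.u8_u8_dict.Ser(b, 0, data)
print(final_offset) --> 7 (1 count byte + 3 pairs * 2 bytes)

-- Deserialize it back from the same starting offset
local copy, read_offset = module.u8_u8_dict.Des(b, 0)
print(copy[5], read_offset) --> 30, 7
```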
Array
--Index of u8, value of u8
--Intended for small, variable-size arrays
--Has a base cost of 1 byte to store the value count
--Each value takes 1 byte
module.u8_u8_array = table.freeze({
	Ser = function(b: buffer, offset: number, value: {number}): number
		local curr_offset
		do --Count the inner values
			local value_count = 0
			for i in value do
				value_count += 1
				if value_count ~= i then
					error("Invalid array")
				end
			end
			--Write the value count
			curr_offset = u8_Ser(b, offset, value_count)
		end
		--Store the inner values
		for _, v in value do
			curr_offset = u8_Ser(b, curr_offset, v)
		end
		return curr_offset --Return the final offset after writing all values
	end,
	Des = function(b: buffer, offset: number): ({number}, number)
		local value_count, curr_offset = u8_Des(b, offset)
		local t = {}
		for count = 1, value_count do
			local v
			v, curr_offset = u8_Des(b, curr_offset)
			t[count] = v
		end
		return t, curr_offset --Return the final offset after reading all values
	end,
})
Array (Converted to f32 values)
local u8_Ser, u8_Des = module.Uint8.Ser, module.Uint8.Des
local f32_Ser, f32_Des = module.Float32.Ser, module.Float32.Des

--Index of u8, value of f32
--Intended for small, variable-size arrays
--Has a base cost of 1 byte to store the value count
--Each value takes 4 bytes
module.u8_f32_array = table.freeze({
	Ser = function(b: buffer, offset: number, value: {number}): number
		local curr_offset
		do --Count the inner values
			local value_count = 0
			for i in value do
				value_count += 1
				if value_count ~= i then
					error("Invalid array")
				end
			end
			--Write the value count
			curr_offset = u8_Ser(b, offset, value_count)
		end
		--Store the inner values
		for _, v in value do
			curr_offset = f32_Ser(b, curr_offset, v)
		end
		return curr_offset --Return the final offset after writing all values
	end,
	Des = function(b: buffer, offset: number): ({number}, number)
		local value_count, curr_offset = u8_Des(b, offset)
		local t = {}
		for count = 1, value_count do
			local v
			v, curr_offset = f32_Des(b, curr_offset)
			t[count] = v
		end
		return t, curr_offset --Return the final offset after reading all values
	end,
})
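A quick round trip with the f32 array type, as a sketch (buffer size chosen by hand; the sample values are exactly representable as f32 so the round trip is lossless):

```lua
local b = buffer.create(64)

local values = { 1.5, -2.25, 3.75 }
local final_offset = module.u8_f32_array.Ser(b, 0, values)
print(final_offset) --> 13 (1 count byte + 3 values * 4 bytes)

local copy = module.u8_f32_array.Des(b, 0)
print(copy[2]) --> -2.25
```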
I feel like you could get rid of needing to specify the size of each datatype by simply using null terminators (a zero-valued u8 byte at the end of each dynamically-sized datatype), which is what I did for my own buffer serde, BufferConverter.
This completely removes the string and table size limit at the cost of not being able to directly get the size of a datatype (which no one would probably be doing anyway)
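In Sera's Ser/Des shape, a null-terminated string type might look roughly like this (a sketch of the idea, not BufferConverter's actual code; it assumes the string itself contains no zero bytes):

```lua
module.cstring = table.freeze({
	Ser = function(b: buffer, offset: number, value: string): number
		-- Write the raw bytes, then a single 0 byte as the terminator
		buffer.writestring(b, offset, value)
		buffer.writeu8(b, offset + #value, 0)
		return offset + #value + 1
	end,
	Des = function(b: buffer, offset: number): (string, number)
		-- Scan forward until the 0 terminator is found
		local curr_offset = offset
		while buffer.readu8(b, curr_offset) ~= 0 do
			curr_offset += 1
		end
		return buffer.readstring(b, offset, curr_offset - offset), curr_offset + 1
	end,
})
```

The tradeoff is visible in Des: reading requires a byte-by-byte scan for the terminator instead of a single length read.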
Honestly, when it comes to buffers it's fast whatever you do (unless you code really, really badly). For example, my buffer serde BufferConverter takes 8 milliseconds max and 3 milliseconds mean to serialize a table with 500 members (1000 repeats), and for 100 members it takes just 0.5 milliseconds mean and 2 milliseconds max!
The buffer library is really fast; you don't really have to worry about performance anytime soon, just focus on compression, since that's what buffers excel at!
By the way, you do NOT need 48 bytes for a CFrame. Keep the position as three f32s and quantize the rotation: since R00 through R22 are guaranteed to be between -1 and 1, each rotation component can be represented as a fraction of -127 to 127 and stored in a single signed byte. This way it's only 22 bytes!
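The quantization idea can be sketched roughly like this (my own reconstruction, not the post's exact code; with three f32s for position plus nine signed bytes this particular layout lands at 21 bytes, so the exact total depends on how you pack it):

```lua
local function writeCFrame(b: buffer, offset: number, cf: CFrame): number
	local x, y, z, R00, R01, R02, R10, R11, R12, R20, R21, R22 = cf:GetComponents()
	buffer.writef32(b, offset, x)
	buffer.writef32(b, offset + 4, y)
	buffer.writef32(b, offset + 8, z)
	local rot = { R00, R01, R02, R10, R11, R12, R20, R21, R22 }
	for i, r in rot do
		-- Each component is in [-1, 1], so scale it into a signed byte
		buffer.writei8(b, offset + 11 + i, math.round(r * 127))
	end
	return offset + 21
end

local function readCFrame(b: buffer, offset: number): (CFrame, number)
	local x = buffer.readf32(b, offset)
	local y = buffer.readf32(b, offset + 4)
	local z = buffer.readf32(b, offset + 8)
	local rot = {}
	for i = 1, 9 do
		rot[i] = buffer.readi8(b, offset + 11 + i) / 127
	end
	return CFrame.new(x, y, z, table.unpack(rot)), offset + 21
end
```

Note the dequantized matrix is only approximately orthonormal, which is where the rotation imprecision discussed below comes from.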
Does Replica support Sera for serialization?
There's also Sera.LossyCFrame, which costs 28 bytes and has perfect coordinate precision and 0.0005-ish degree precision for rotations.
I have a feeling your implementation might lead to rotation imprecision of up to a degree or maybe even more. At that point you could just convert to Euler angles and risk gimbal lock at 24 bytes.
Good point, I'll add an option to use the rotation matrix or axis-angle.
I wrote my SerDes library entirely from my own 15 years of Lua expertise in what I understand as "fast code", so I haven't been doing much comparison with other libraries. For fun, I've taken your module for comparison. Here are the results:
First of all, it's a bit of an apples-to-oranges comparison between schematized and non-schematized SerDes libraries, since they have different goals in mind, but we can try to see just how different they can be when they're trying to do the same thing:
By defining strict types and anticipating value order with a schema, the serialized result becomes pretty compact. Your library defaulting to non-f64 numbers destroyed the UserId field - Roblox floats support integer precision, and UserIds have surpassed 2^32,
which means the only native datatype in Roblox that can hold UserIds is f64, aka the generic Lua number type. You can take that into consideration when deciding whether a non-schema SerDes should default to f64 numbers or something else.
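The precision cliff is easy to demonstrate with the buffer library directly (a quick sketch; 2^32 + 1 stands in for a modern UserId):

```lua
local b = buffer.create(16)
local userId = 4294967297 -- 2^32 + 1

buffer.writef32(b, 0, userId)
buffer.writef64(b, 8, userId)

print(buffer.readf32(b, 0)) --> 4294967296 (rounded; the f32 mantissa ran out)
print(buffer.readf64(b, 8)) --> 4294967297 (f64 holds integers exactly up to 2^53)
```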
As for the speed:
There's over a 30x speed difference when serializing this type of table. I created Sera for a project where I will need tons of serialization at runtime for replicating game state, so I'm planning to push Roblox to its limits lol.
Dang, this is definitely one hell of a wake-up call… that speed difference is really big, I assume because of more loops in mine. Though, for the UserId I would store them as strings instead of numbers.
Also, you can specify a number size type (idk what to call them) by doing Converter.Serialize(…, {numbersAs = "f32"}) (for example), though this does it for all numbers and would benefit from a schema system (like yours). I guess it really is comparing apples to oranges.
Also also, I would still like you to implement null terminators to completely eliminate the size limitation. I'm honestly considering switching to this from my own module if you do implement it.
I want to avoid operations where my module would have to look for a null terminator - my goal with Sera is to do as few Lua operations as possible and let the native code behind Lua do the heavy lifting. Using Sera.String32 would give you limitless string size with negligible serialized size impact.
Well, alright. You learn something new every day…
Also, 15 years?? That's longer than I've been alive!
Holy?? It is that fast? I was planning to use Squash, but I'm switching now, thank you for this godly resource
Quite fast, but caching and/or splitting the workload speeds it up a hell of a lot more. I use mine for exporting or importing large game assets, so even then it still takes a bit of time.
Also, for a frame of reference on how much data/bandwidth this can save, I checked the size of the serialized and unserialized versions of this data:
and the results were:
Unserialized (Roblox default): 83 bytes
Serialized: 27 bytes!!
If you want to know how I saw how much data they took up, check this post: Introducing UnreliableRemoteEvents - #110 by Luaction
I was doing a few tests benchmarking Sera against Squash - although I don't feel like my tests were super high quality, I do believe Sera would be up to 1.7x faster or, at worst, just as fast as Squash. Schematized SerDes is easier to write than manually filling the buffer with Squash. I also think Squash might've been written better in some regards (I expected it to run faster than Sera), but I'm personally only interested in schematized SerDes.
Benchmarks are meant to compare the actual processing time of pieces of code - putting processing in new coroutines or threads during the benchmark defeats the purpose of a benchmark.
Putting code into coroutines can make throttling easier, but it doesn't prevent overloading the processor, since Lua normally runs on a single processor core and simply resumes a coroutine immediately after a previous one yields.
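As a concrete illustration, a minimal benchmark loop times the work itself with no coroutines involved (a sketch; `serialize` and `payload` are hypothetical stand-ins for whatever you're measuring):

```lua
local REPEATS = 1000

local function bench(name: string, fn: () -> ())
	local start = os.clock()
	for _ = 1, REPEATS do
		fn()
	end
	local elapsed = os.clock() - start
	print(`{name}: {elapsed / REPEATS * 1000} ms mean over {REPEATS} runs`)
end

-- Wrapping fn in coroutine.wrap and resuming it here would measure the
-- same work plus coroutine overhead: resume runs the body on the same
-- core, synchronously, so nothing is actually offloaded.
bench("serialize", function()
	serialize(payload) -- hypothetical function and data under test
end)
```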