This module seems pretty helpful for decreasing the amount of data being replicated at once. However, it's unclear to me why this would be preferable over the usual table. What are some specific use cases that you've found Sera useful for? When should we use this and when should we not?
Roblox games have a modest budget for networking and data storage, so serialization only shines for projects with a lot of data throughput.
Serialization doesn't make less efficient systems obsolete.
It would be interesting to see how the performance of this paired with remote events differs from popular networking resources such as Bytenet or Warp. Do you think this could be faster or possibly result in lower bandwidth due to its simplicity?
While making this library I've been running individual benchmarks, like creating new buffer objects vs. running the serialization code again. I'm afraid of making any wrong claims, but generally speaking the buffer library seems faster than a handful of table lookups in most cases.
Your average game wouldn't even benefit much from a network library with serialization, so I thought: why not make a schema serdes library that doesn't try to be a solve-all library for everyone, but a high-speed serdes utility that still requires the developer to work with buffers.
This module is my swing at what I understand as "hella fast" in terms of Roblox Lua, but I haven't been working with buffers for that long, so I might be wrong.
Regardless, my goal for this project is to create a super performant serializer that does not bother with minuscule compression gains.
For anybody interested, I've created a few custom SeraTypes to use.
Read "More Info" to see important behavior.
More Info
When using these types, it's important that you know the following:
Dictionaries are defined as key-value pairs
Arrays are defined like a stack: a list of items with no gaps
Dictionaries store the following:
- Value count (table length)
- Index/Key
- Value
Arrays store the following:
- Value count (table length)
- Value
By default, all of the values are stored as 8-bit unsigned integers; this can be configured by editing the code.
Dictionary
local u8_Ser, u8_Des = module.Uint8.Ser, module.Uint8.Des

--Key of u8, value of u8
--Intended for small, variable-size dictionaries with keys.
--Has a base cost of 1 byte to store the value count
--Each key + value pair takes 2 bytes
module.u8_u8_dict = table.freeze({
	Ser = function(b: buffer, offset: number, value: {number}): number
		local curr_offset
		do --Count the inner values
			local value_count = 0
			for _ in value do
				value_count += 1
			end
			--Write the value count
			curr_offset = u8_Ser(b, offset, value_count)
		end
		--Store the inner values
		for i, v in value do
			curr_offset = u8_Ser(b, curr_offset, i)
			curr_offset = u8_Ser(b, curr_offset, v)
		end
		return curr_offset --Return the final offset after writing all values
	end,
	Des = function(b: buffer, offset: number): ({number}, number)
		local value_count, curr_offset = u8_Des(b, offset)
		local t = {}
		for _ = 1, value_count do
			local i, v
			i, curr_offset = u8_Des(b, curr_offset)
			v, curr_offset = u8_Des(b, curr_offset)
			t[i] = v
		end
		return t, curr_offset --Return the final offset after reading all values
	end,
})
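For anyone unsure how these plug in, a round trip with the dictionary type might look like this (a sketch; the 64-byte buffer size and the sample data are arbitrary):

```lua
local b = buffer.create(64) -- arbitrary size, large enough for this payload

-- Serialize a small dictionary of u8 keys to u8 values
local data = { [1] = 10, [2] = 20, [5] = 30 }
local final_offset = module.u8_u8_dict.Ser(b, 0, data)
print(final_offset) --> 7 (1 count byte + 3 pairs * 2 bytes)

-- Deserialize it back from the same starting offset
local copy, read_offset = module.u8_u8_dict.Des(b, 0)
print(copy[5], read_offset) --> 30, 7
```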
Array
--Index of u8, value of u8
--Intended for small, variable-size arrays
--Has a base cost of 1 byte to store the value count
--Each value takes 1 byte
module.u8_u8_array = table.freeze({
	Ser = function(b: buffer, offset: number, value: {number}): number
		local curr_offset
		do --Count the inner values
			local value_count = 0
			for i in value do
				value_count += 1
				if value_count ~= i then
					error("Invalid array")
				end
			end
			--Write the value count
			curr_offset = u8_Ser(b, offset, value_count)
		end
		--Store the inner values
		for _, v in value do
			curr_offset = u8_Ser(b, curr_offset, v)
		end
		return curr_offset --Return the final offset after writing all values
	end,
	Des = function(b: buffer, offset: number): ({number}, number)
		local value_count, curr_offset = u8_Des(b, offset)
		local t = {}
		for count = 1, value_count do
			local v
			v, curr_offset = u8_Des(b, curr_offset)
			t[count] = v
		end
		return t, curr_offset --Return the final offset after reading all values
	end,
})
Array (Converted to f32 values)
local u8_Ser, u8_Des = module.Uint8.Ser, module.Uint8.Des
local f32_Ser, f32_Des = module.Float32.Ser, module.Float32.Des

--Index of u8, value of f32
--Intended for small, variable-size arrays
--Has a base cost of 1 byte to store the value count
--Each value takes 4 bytes
module.u8_f32_array = table.freeze({
	Ser = function(b: buffer, offset: number, value: {number}): number
		local curr_offset
		do --Count the inner values
			local value_count = 0
			for i in value do
				value_count += 1
				if value_count ~= i then
					error("Invalid array")
				end
			end
			--Write the value count
			curr_offset = u8_Ser(b, offset, value_count)
		end
		--Store the inner values
		for _, v in value do
			curr_offset = f32_Ser(b, curr_offset, v)
		end
		return curr_offset --Return the final offset after writing all values
	end,
	Des = function(b: buffer, offset: number): ({number}, number)
		local value_count, curr_offset = u8_Des(b, offset)
		local t = {}
		for count = 1, value_count do
			local v
			v, curr_offset = f32_Des(b, curr_offset)
			t[count] = v
		end
		return t, curr_offset --Return the final offset after reading all values
	end,
})
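A quick round trip with the f32 array type, as a sketch (buffer size chosen by hand; the sample values are exactly representable as f32 so the round trip is lossless):

```lua
local b = buffer.create(64)

local values = { 1.5, -2.25, 3.75 }
local final_offset = module.u8_f32_array.Ser(b, 0, values)
print(final_offset) --> 13 (1 count byte + 3 values * 4 bytes)

local copy = module.u8_f32_array.Des(b, 0)
print(copy[2]) --> -2.25
```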
I feel like you could get rid of needing to specify the size of each datatype by simply using null terminators (a zero-valued u8 byte at the end of each dynamically-sized datatype), which is what I did for my own buffer serde, BufferConverter.
This completely removes the string and table size limit at the cost of not being able to directly get the size of a datatype (which no one would probably be doing anyway)
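In Sera's Ser/Des shape, a null-terminated string type might look roughly like this (a sketch of the idea, not BufferConverter's actual code; it assumes the string itself contains no zero bytes):

```lua
module.cstring = table.freeze({
	Ser = function(b: buffer, offset: number, value: string): number
		-- Write the raw bytes, then a single 0 byte as the terminator
		buffer.writestring(b, offset, value)
		buffer.writeu8(b, offset + #value, 0)
		return offset + #value + 1
	end,
	Des = function(b: buffer, offset: number): (string, number)
		-- Scan forward until the 0 terminator is found
		local curr_offset = offset
		while buffer.readu8(b, curr_offset) ~= 0 do
			curr_offset += 1
		end
		return buffer.readstring(b, offset, curr_offset - offset), curr_offset + 1
	end,
})
```

The tradeoff is visible in Des: reading requires a byte-by-byte scan for the terminator instead of a single length read.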
Honestly, when it comes to buffers it's fast whatever you do (unless you code really, really badly). For example, my buffer serde BufferConverter takes 8 milliseconds max and 3 milliseconds mean to serialize a table with 500 members (1000 repeats), and for 100 members it takes just 0.5 milliseconds mean and 2 milliseconds max!
The buffer library is really fast; you don't really have to worry about performance anytime soon, just focus on compression, since that's what buffers excel at!
By the way, you do NOT need 48 bytes for a CFrame. Keep the position as three f32s and quantize the rotation: since R00 through R22 are guaranteed to be between -1 and 1, each rotation component can be represented as a fraction of -127 to 127 and stored in a single signed byte. This way it's only 22 bytes!
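The quantization idea can be sketched roughly like this (my own reconstruction, not the post's exact code; with three f32s for position plus nine signed bytes this particular layout lands at 21 bytes, so the exact total depends on how you pack it):

```lua
local function writeCFrame(b: buffer, offset: number, cf: CFrame): number
	local x, y, z, R00, R01, R02, R10, R11, R12, R20, R21, R22 = cf:GetComponents()
	buffer.writef32(b, offset, x)
	buffer.writef32(b, offset + 4, y)
	buffer.writef32(b, offset + 8, z)
	local rot = { R00, R01, R02, R10, R11, R12, R20, R21, R22 }
	for i, r in rot do
		-- Each component is in [-1, 1], so scale it into a signed byte
		buffer.writei8(b, offset + 11 + i, math.round(r * 127))
	end
	return offset + 21
end

local function readCFrame(b: buffer, offset: number): (CFrame, number)
	local x = buffer.readf32(b, offset)
	local y = buffer.readf32(b, offset + 4)
	local z = buffer.readf32(b, offset + 8)
	local rot = {}
	for i = 1, 9 do
		rot[i] = buffer.readi8(b, offset + 11 + i) / 127
	end
	return CFrame.new(x, y, z, table.unpack(rot)), offset + 21
end
```

Note the dequantized matrix is only approximately orthonormal, which is where the rotation imprecision discussed below comes from.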
Does Replica support Sera for serialization?
There's also Sera.LossyCFrame, which costs 28 bytes and has perfect coordinate precision and 0.0005-ish degree precision for rotations.
I have a feeling your implementation might lead to rotation imprecision of up to a degree or maybe even more. At that point you could just convert to Euler angles and risk gimbal lock at 24 bytes.
Good point, I'll add an option to use the rotation matrix or axis-angle.
I wrote my SerDes library entirely from my own 15 years of Lua expertise in what I understand as "fast code", so I haven't been doing much comparison with other libraries. For fun, I've taken your module for comparison. Here are the results:
First of all, it's a bit of an apples-to-oranges comparison between schematized and non-schematized SerDes libraries, since they have different goals in mind, but we can try to see just how different they can be when they're trying to do the same thing:
By defining strict types and anticipating value order with a schema, the serialized result becomes pretty compact. Your library defaulting to non-f64 numbers destroyed the UserId field - Roblox floats support integer precision, and UserIds have surpassed 2^32,
which means the only native datatype in Roblox that can hold UserIds is f64, aka the generic Lua number type. You can take that into consideration when deciding whether a non-schema SerDes should default to f64 numbers or something else.
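The precision cliff is easy to demonstrate with the buffer library directly (a quick sketch; 2^32 + 1 stands in for a modern UserId):

```lua
local b = buffer.create(16)
local userId = 4294967297 -- 2^32 + 1

buffer.writef32(b, 0, userId)
buffer.writef64(b, 8, userId)

print(buffer.readf32(b, 0)) --> 4294967296 (rounded; the f32 mantissa ran out)
print(buffer.readf64(b, 8)) --> 4294967297 (f64 holds integers exactly up to 2^53)
```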
As for the speed:
There's over a 30x speed difference when serializing this type of table. I created Sera for a project where I will need tons of serialization at runtime for replicating game state, so I'm planning to push Roblox to its limits lol.
Dang, this is definitely one hell of a wake-up call… that speed difference is really big, I assume because of more loops in mine. Though, for the UserId I would store them as strings instead of numbers.
Also, you can specify a number size type (idk what to call them) by doing Converter.Serialize(…, {numbersAs = "f32"}) (for example), though this does it for all numbers and would benefit from a schema system (like yours). I guess it really is comparing apples to oranges.
Also also, I would still like you to implement null terminators to completely eliminate the size limitation. I'm honestly considering switching to this from my own module if you do implement it.
I want to avoid operations where my module would have to look for a null terminator - my goal with Sera is to do as few Lua operations as possible and let the native code behind Lua do the heavy lifting. Using Sera.String32 would give you limitless string size with negligible serialized size impact.
Well, alright. You learn something new every day…
Also, 15 years?? That's longer than I've been alive!
Holy?? It is that fast? I was planning to use Squash, but I'm switching now, thank you for this godly resource
Quite fast, but caching and/or splitting the workload speeds it up a hell of a lot more. I use mine for exporting or importing large game assets, so even then it still takes a bit of time.
Also, for a frame of reference on how much data/bandwidth this can save, I checked the size of the serialized and unserialized versions of this data:
and the results were:
Unserialized (Roblox default): 83 bytes
Serialized: 27 bytes!!
If you want to know how I saw how much data they took up, check this post: Introducing UnreliableRemoteEvents - #110 by Luaction
I was doing a few tests benchmarking Sera against Squash - although I don't feel like my tests were super high quality, I do believe Sera would be up to 1.7x faster or, at worst, just as fast as Squash. Schematized SerDes is easier to write than manually filling the buffer with Squash. I also think Squash might've been written better in some regards (I expected it to run faster than Sera), but I'm personally only interested in schematized SerDes.
Benchmarks are meant to compare the actual processing time of pieces of code - putting processing in new coroutines or threads during the benchmark defeats the purpose of a benchmark.
Putting code into coroutines can make throttling easier, but it doesn't prevent overloading the processor, since Lua normally runs on a single processor core and simply resumes a coroutine immediately after a previous one yields.
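As a concrete illustration, a minimal benchmark loop times the work itself with no coroutines involved (a sketch; `serialize` and `payload` are hypothetical stand-ins for whatever you're measuring):

```lua
local REPEATS = 1000

local function bench(name: string, fn: () -> ())
	local start = os.clock()
	for _ = 1, REPEATS do
		fn()
	end
	local elapsed = os.clock() - start
	print(`{name}: {elapsed / REPEATS * 1000} ms mean over {REPEATS} runs`)
end

-- Wrapping fn in coroutine.wrap and resuming it here would measure the
-- same work plus coroutine overhead: resume runs the body on the same
-- core, synchronously, so nothing is actually offloaded.
bench("serialize", function()
	serialize(payload) -- hypothetical function and data under test
end)
```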