Serde | Schema-Based Serialization & Deserialization


serde is a serialization/deserialization library that uses schemas to convert structured data into compact binary buffers and back. Its buffer-based design enables fast processing and minimal output size, making it ideal for networking, DataStores, and any scenario where efficient data representation is critical. Although not exactly built for speed, serde aims to support whatever you throw at it.

serde currently supports all base creatable Roblox and Luau datatypes!

Use


serde is schema-based, meaning you describe how your data is structured through various data-types, and the library will perform actions according to your schema (think of it like instructions on how to put together and take out data!)

As mentioned above, serde supports all base creatable datatypes with a few exceptions. See here:

Exceptions & Caveats
  • Instance - To prevent a dangling pointer of sorts while referencing an Instance, serde will only serialize the properties of the object. You can specify which properties you wish to be serialized / deserialized.
  • Object - This datatype is not creatable.
  • Content - The Object property is not supported. Content can still be used as long as it does not use the Object property; if it does, Content.none will be used in place of it.
  • DockWidgetPluginGui - Property access on this instance seems to be disabled.
  • OverlapParams - Instances in the instances array are not serialized, so the array will be empty.
  • RaycastParams - Ditto.
  • RaycastResult - The Instance property will be ignored. Due to RaycastResults not being creatable, a fake struct resembling one will be returned instead.
  • RBXScriptConnection - The Instance is not creatable, and so a struct resembling the instance will be returned instead.
  • RBXScriptSignal - This datatype is not creatable.
  • Secret - Ditto.
  • SharedTable - Can have any content, some of which may not be serializable / deserializable, and therefore it has been excluded.

All data-types can be found under the main module. Each data-type has its own Converter<T> interface, where T is the datatype. A Converter<T> is used to serialize and deserialize data. To actually serialize and deserialize, the module exports a serialize and deserialize function which respectively do as their name says. Here’s an example of serializing a string:

local serde = require(path.to.serde)

-- `serde.string` is a `Converter<string>`, meaning it serializes the `string` datatype
local serialized = serde.serialize(serde.string, "Hello, World!") -- `buffer`
local deserialized = serde.deserialize(serde.string, serialized) -- "Hello, World!"

serde also has a few complex data-types built in ({T}, {[K]: V}, structs, tuples, etc.) for ease of use. Given the layout of the object (be that an array, dictionary, etc.), these functions return a newly made Converter matching the data. This is how it looks in action:

:warning: Important
Using a struct is almost always smaller and more efficient than using a dictionary. This is because the keys are known beforehand, so they do not need to be stored in the buffer (unlike in a dictionary). Only the value of each key is stored, which makes a struct effectively the same as a tuple.

Arrays, Dictionaries, Structs, & Tuples

Arrays

local serde = require(path.to.serde)
local string = serde.string
local array = serde.array

-- `array` takes a `Converter<T>` and returns a new `Converter<{T}>`
local arrayOfStrings = array(string)

local serialized = serde.serialize(arrayOfStrings, {"Hello, ", "World!"})
local deserialized = serde.deserialize(arrayOfStrings, serialized)
-- ^^^ `{"Hello, ", "World!"}`

Key-Value pairs (dictionaries)

local serde = require(path.to.serde)
local string = serde.string
local number = serde.number
local dictionary = serde.dictionary

-- `dictionary` takes a `Converter<K>` & `Converter<V>`
-- and returns a new `Converter<{[K]: V}>`
local newDict = dictionary(string, number) -- `{[string]: number}`
local myDict = {
    ["hello"] = 1,
    ["world"] = 2,
}

local serialized = serde.serialize(newDict, myDict)
local deserialized = serde.deserialize(newDict, serialized)
-- ^^^ `{["hello"] = 1, ["world"] = 2}`

Structs

local serde = require(path.to.serde)
local struct = serde.struct
local string = serde.string
local bool = serde.bool
local number = serde.number

-- `struct` takes a dictionary of fields and their types
local userSchema = struct({
    name = string,
    age = number,
    isCool = bool
})

local user = {
    name = "ROBLOX",
    age = 19,
    isCool = true
}

local serialized = serde.serialize(userSchema, user)
local deserialized = serde.deserialize(userSchema, serialized)
-- ^^^ The same as our `user` variable
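
The earlier note about structs being smaller than dictionaries can be checked with some quick arithmetic. The sketch below assumes a made-up layout (a string is 1 length byte plus its characters, a number is an 8-byte f64, a bool is 1 byte); this is not serde's actual wire format, it only shows where the extra bytes in a dictionary come from.

```luau
-- Hypothetical layout, NOT serde's real format:
--   string = 1 length byte + raw bytes, number = 8-byte f64, bool = 1 byte
local function stringSize(s: string): number
    return 1 + #s
end

local user = { name = "ROBLOX", age = 19, isCool = true }

-- Struct-style: the schema fixes the field order, so only values are stored
local structSize = stringSize(user.name) + 8 + 1

-- Dictionary-style: every key has to be written next to its value
-- (a real format would likely also store an entry count, ignored here)
local dictionarySize = (stringSize("name") + stringSize(user.name))
    + (stringSize("age") + 8)
    + (stringSize("isCool") + 1)

print(structSize, dictionarySize) -- 16  32
```

Under these assumptions the keys alone double the payload, and the gap grows with every extra field or longer key name.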

Tuples

local serde = require(path.to.serde)
local tuple = serde.tuple
local number = serde.number

-- Pass as many `Converter`s as you like, of any type!
local myColorSchema = tuple(number, number, number)

local serialized = serde.serialize(myColorSchema, {
    255, 127, 65 -- You can also pass a tuple as an array if it is easier. *
})

-- That means the following code is also valid:
-- local serialized = serde.serialize(myColorSchema, 255, 127, 65)

local a, b, c = serde.deserialize(myColorSchema, serialized)
print(a, b, c) -- (255, 127, 65)

* - As alluded, you can also pass an array of the values instead of the tuple itself. This makes it possible to have tuple values in dictionaries and structs:

local serde = require(path.to.serde)
local tuple = serde.tuple
local struct = serde.struct
local string = serde.string
local number = serde.number

-- A tuple of two numbers: feet and inches
local footInches = tuple(number, number)
local userSchema = struct({
    name = string,
    height = footInches
})

local serialized = serde.serialize(userSchema, {
    name = "ROBLOX",
    height = {4, 11} -- 4'11'', haha he's a shortie!
})


local deserialized = serde.deserialize(userSchema, serialized)
print(deserialized) -- `height` will be deserialized as an array,
-- meaning it will be {4, 11}!

For details on how to customise serde to serialize and deserialize your own custom converters, give the following a read:

Implementing your own Converter

:warning: Important
It will greatly help if you are well-versed in how buffers work before reading this section.

serde doesn’t limit you to its inbuilt data-types; it also allows you to register your own via the custom method. Here’s an example use case: currently, all numbers in serde infer what type of integer they should be and take up 2 or more bytes (the first byte tells the deserializer what type of integer it is, and the rest of the bytes are the integer itself). In this scenario, we assume you want your number to be stored efficiently as a u8 (0 - 255).
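
To make the "tag byte plus integer" idea concrete, here is a minimal sketch of a tagged number encoding written directly against Luau's buffer library. The tag values and the `writeTagged` / `readTagged` helpers are hypothetical (serde's real identifiers and layout are not documented in this post), and the sketch only handles non-negative integers up to 2^32 - 1.

```luau
-- Hypothetical tags; serde's real identifiers may differ
local TAG_U8, TAG_U16, TAG_U32 = 0, 1, 2

local function writeTagged(n: number): buffer
    if n <= 0xFF then
        local b = buffer.create(2) -- 1 tag byte + 1 value byte
        buffer.writeu8(b, 0, TAG_U8)
        buffer.writeu8(b, 1, n)
        return b
    elseif n <= 0xFFFF then
        local b = buffer.create(3) -- 1 tag byte + 2 value bytes
        buffer.writeu8(b, 0, TAG_U16)
        buffer.writeu16(b, 1, n)
        return b
    end
    local b = buffer.create(5) -- 1 tag byte + 4 value bytes
    buffer.writeu8(b, 0, TAG_U32)
    buffer.writeu32(b, 1, n)
    return b
end

local function readTagged(b: buffer): number
    -- The tag tells the reader how wide the integer after it is
    local tag = buffer.readu8(b, 0)
    if tag == TAG_U8 then
        return buffer.readu8(b, 1)
    elseif tag == TAG_U16 then
        return buffer.readu16(b, 1)
    end
    return buffer.readu32(b, 1)
end

local small = writeTagged(58)
local large = writeTagged(1000)
print(buffer.len(small), readTagged(small)) -- 2  58
print(buffer.len(large), readTagged(large)) -- 3  1000
```

Even the smallest value costs 2 bytes here, which is the overhead a dedicated u8 converter avoids.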

A Converter<T> consists of 3 functions: read, write, and getLengthOf. What read and write do is self-explanatory; getLengthOf returns the length in bytes a value would take up if it were allocated. All that needs to be passed is a dictionary with these 3 functions:

local serde = require(path.to.serde)
local custom = serde.custom

local u8 = custom({
    read = function()
    end,

    write = function()
    end,

    getLengthOf = function()
    end,
})

Since the length of a u8 in bytes is already known (each u8 is 1 byte), the getLengthOf function can be completed by returning 1:

-- ...
getLengthOf = function(v: number)
    return 1
end,
-- ...

As for read and write, things are slightly more complicated. Since our u8 is being written from scratch and doesn’t depend on any of the in-built converters, we must allocate and write memory ourselves using the CursorController, which is passed as the first argument to each function. Let’s start with the write function:

-- ...
-- You may have to explicitly define the 
-- type due to Roblox's typechecking not being the best...
write = function(cursor: serde.CursorController, v: number)

end,
-- ...

The cursor is an in-built utility that controls where in the buffer memory is being written. At this current moment, the cursor is conveniently placed exactly where your data needs to start being written. The first thing you do in any function is allocate memory.

:warning: Important
Always allocate memory first. If you retrieve the buffer THEN allocate, the buffer will not be updated. Allocate all the memory needed, then start writing.
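
A likely reason for this rule: Luau buffers have a fixed size, so "allocating" has to mean creating a bigger buffer and copying the old contents across. The sketch below uses a hypothetical stand-in for the CursorController (serde's real internals may differ) to show how a buffer retrieved before allocating ends up stale.

```luau
-- Hypothetical stand-in for serde's CursorController; real internals may differ
local function newCursor()
    local buff = buffer.create(0)
    local controller = {}

    function controller.allocateBytes(count: number)
        -- Buffers cannot be resized, so growing means create + copy + swap
        local grown = buffer.create(buffer.len(buff) + count)
        buffer.copy(grown, 0, buff)
        buff = grown
    end

    function controller.getBuffer(): buffer
        return buff
    end

    return controller
end

local cursor = newCursor()
local stale = cursor.getBuffer() -- retrieved BEFORE allocating
cursor.allocateBytes(1)
local fresh = cursor.getBuffer() -- retrieved AFTER allocating

print(buffer.len(stale), buffer.len(fresh)) -- 0  1
print(stale == fresh) -- false: `stale` still points at the old buffer
```

Writing into `stale` here would either error (out of bounds) or land in a buffer that never gets returned, which is why the allocation comes first.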

CursorController implements the allocateBytes function, whose name is self-explanatory:

-- ...
-- You may have to explicitly define the 
-- type due to Roblox's typechecking not being the best...
write = function(cursor: serde.CursorController, v: number)
    cursor.allocateBytes(1)
end,
-- ...

Now that all the memory needed has been allocated, all that is left is writing. Retrieve the buffer via getBuffer, and write a u8 value with buffer.writeu8. The offset of the value is the current position of the cursor (which you can get with getCursor):

-- ...
-- You may have to explicitly define the 
-- type due to Roblox's typechecking not being the best...
write = function(cursor: serde.CursorController, v: number)
    cursor.allocateBytes(1)
    local buff = cursor.getBuffer()
    -- Write `v` into `buff` with an offset of the cursor position
    buffer.writeu8(buff, cursor.getCursor(), v)
end,
-- ...

However, we’re not done yet! The cursor is still at its original position, and we should set it up for the next piece of data to be allocated! Since 1 byte was allocated, move the cursor forward once with incrementCursor (This means, effectively, for every byte you allocate, remember to move the cursor!):

-- ...
-- You may have to explicitly define the 
-- type due to Roblox's typechecking not being the best...
write = function(cursor: serde.CursorController, v: number)
    cursor.allocateBytes(1)
    local buff = cursor.getBuffer()
    -- Write `v` into `buff` with an offset of the cursor position
    buffer.writeu8(buff, cursor.getCursor(), v)
    cursor.incrementCursor(1) -- Move forward 1 byte
end,
-- ...
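
To see why moving the cursor matters, the sketch below writes two u8 values in a row, once with incrementCursor and once without. The `newCursor` mock is hypothetical: it borrows the method names used above, but it is not serde's implementation.

```luau
-- Hypothetical stand-in for serde's CursorController (same method names only)
local function newCursor()
    local buff, position = buffer.create(0), 0
    return {
        allocateBytes = function(count: number)
            local grown = buffer.create(buffer.len(buff) + count)
            buffer.copy(grown, 0, buff)
            buff = grown
        end,
        getBuffer = function(): buffer
            return buff
        end,
        getCursor = function(): number
            return position
        end,
        incrementCursor = function(count: number)
            position += count
        end,
    }
end

-- Same steps as the `write` function above; `advance` toggles the final step
local function writeU8(cursor, v: number, advance: boolean)
    cursor.allocateBytes(1)
    buffer.writeu8(cursor.getBuffer(), cursor.getCursor(), v)
    if advance then
        cursor.incrementCursor(1)
    end
end

local good = newCursor()
writeU8(good, 10, true)
writeU8(good, 20, true)
print(buffer.readu8(good.getBuffer(), 0), buffer.readu8(good.getBuffer(), 1)) -- 10  20

local bad = newCursor()
writeU8(bad, 10, false)
writeU8(bad, 20, false) -- cursor never moved, so this overwrites the 10
print(buffer.readu8(bad.getBuffer(), 0), buffer.readu8(bad.getBuffer(), 1)) -- 20  0
```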

At last, the write function is done. That’s only the serialization, however! This data still needs to be deserialized with the read function. Thankfully, since our memory is already laid out, the function just reads the buffer at the cursor’s location, moves the cursor forward for the next piece of data to be deserialized, and returns our extracted value:

read = function(cursor: serde.CursorController)
    -- Get the buffer. No memory needs to be allocated so this is the first thing we do
    local buff = cursor.getBuffer()
    -- Read the u8 value with an offset of the cursor position
    local value = buffer.readu8(buff, cursor.getCursor())
    cursor.incrementCursor(1) -- Move the cursor forward 1 byte

    return value
end,

The full implementation for the u8 data-type (without comments) should now look like this:

local u8 = custom({
    read = function(cursor: serde.CursorController)
        local buff = cursor.getBuffer()
        local value = buffer.readu8(buff, cursor.getCursor())

        cursor.incrementCursor(1)

        return value
    end,

    write = function(cursor: serde.CursorController, value: number)
        cursor.allocateBytes(1)

        local buff = cursor.getBuffer()
        buffer.writeu8(buff, cursor.getCursor(), value)
        cursor.incrementCursor(1)
    end,

    getLengthOf = function(value: number)
        return 1
    end,
})

Here, u8 is now a usable Converter<number>. Let’s try it out:

local schema = serde.struct({
    name = serde.string,
    age = u8
})

local data = {
    name = "ROBLOX",
    age = 58
}

local serialized = serde.serialize(
    schema,
    data
)

local deserialized = serde.deserialize(schema, serialized)

…And it should work: deserialized.age should have the correct value of 58 (meaning reading and writing both work). This is an optimised version of serde.number!

Why use serde over Squash, BufferEncoder or any other library?


Though some alternatives like BufferEncoder may perform marginally faster in microbenchmarks, the difference is insignificant in practical use. For example, in one test with 10,000 iterations, serde took approximately 1.19e-6 seconds per call, while BufferEncoder achieved 5.47e-7. These are microsecond-level differences, too small to meaningfully affect real-world performance (it’s still 836,351 calls per second!).
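
For anyone who wants to sanity-check those figures, here is the arithmetic as a small script. The per-call timings are the rounded ones quoted above; the "100 calls per frame at 60 FPS" scenario is an assumption added purely for illustration.

```luau
-- Per-call timings quoted in the post (seconds, rounded)
local serdeTime = 1.19e-6
local bufferEncoderTime = 5.47e-7

local serdeCallsPerSecond = math.floor(1 / serdeTime)
local bufferEncoderCallsPerSecond = math.floor(1 / bufferEncoderTime)
local ratio = serdeTime / bufferEncoderTime

-- Assumed scenario: 100 serde calls in one frame, against a 60 FPS frame budget
local frameBudget = 1 / 60
local shareOfFrame = (100 * serdeTime) / frameBudget

print(serdeCallsPerSecond, bufferEncoderCallsPerSecond)
print(string.format("%.2fx the per-call time, %.2f%% of a frame", ratio, shareOfFrame * 100))
```

With the rounded timings this lands at roughly 840,000 calls per second for serde, close to the figure quoted above (which presumably came from the unrounded measurement). The relative gap is a bit over 2x, but in the assumed scenario the absolute cost stays well under 1% of a frame.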

What makes serde different is its high-level, schema-driven design. It abstracts away any low-level concerns like whether a number is signed or unsigned, and handles serialization logic for you. While it may trade a few extra bytes (e.g. adding a type identifier to distinguish between different numeric types), this small cost results in a much cleaner and more maintainable API for the end developer. On top of that, the system is flexible enough to support complex data structures and most Luau types out of the box.

In short, serde gives up a tiny bit of size and speed for a big gain in developer experience. It’s perfect when you want a serialization system without worrying about the nitty-gritty of binary encoding.


Download the latest .rbxm from here: serde.rbxm

I don’t think this is a benefit considering BufferEncoder doesn’t require you to use schemas :thinking:

Tho yea if you wanna minimize size should probably use schemas

Fair point, BufferEncoder not requiring schemas could be thought of as a benefit. That can be great for simple or one-off use cases where size and structure aren’t a big concern.

However, the main benefit of schemas in serde is consistency, maintainability, and safety. When you’re working with any sort of structured or versioned data (like saving player progress or performing networking), having your predefined schema ensures both ends know exactly what to expect. It eliminates bugs from mismatched reads/writes, helps with versioning, and makes the code more self-documenting (if that all makes sense!).

serde shines in larger systems where structure matters. It trades a tiny bit of setup for a lot of long-term clarity and reliability :bangbang:

Networking is often something that tends to have variability in what’s passed by the same remote, in which case you’d use the ‘optional’ type in schemas. This tends to negate the benefits of schemas and may be overall slower than not using schemas.

Migrating from using RemoteEvents to a networking library - or from one library to another - takes pretty long depending on the number of remotes you have, and even longer if the library you’re migrating to uses schemas as now not only do you have to change the code in every place sending data, you also have to define schemas for every single remote.

In my opinion, it’s usually better when networking libraries handle the serialization for you without you defining how it happens, and support almost everything that you could try to send in a remote, such as cyclic tables, mixed tables, having table keys be cframes, and so on. All of those can’t normally be sent with RemoteEvents and many network libraries due to limitations, but they’re incredibly helpful and can be implemented with 0 overhead. This is actually one of the main reasons why I made BufferEncoder in the first place.

There are cases of schemas being useful, such as for data guaranteed to be structured, and when you’re sending large amounts of data to players, such as the cframe of each player’s camera or replicated body part cframes of VR players. But for general networking, I don’t see an advantage to using them that outweighs their downsides.

Yeah, that makes sense, especially for stuff where the data being sent is unpredictable or varies a lot. But I do think schemas still have a lot of value depending on the use case. When you’re dealing with structured or sensitive data, having a schema makes things way more reliable: you catch mistakes early, know exactly what’s being sent and received, and it’s easier to debug and maintain. It also helps with compression and forward compatibility, since you can safely ignore or default missing fields. I get that defining schemas can feel like extra work, especially during a migration, but long-term (for me, at least) it saves time and avoids a lot of headaches. So I wouldn’t say one approach is better across the board. A dynamic system like BufferEncoder is still super useful when you need flexibility, but for me, structured schema-based systems make sense when you care about safety, clarity, and maintainability.

In fact, it may make sense for serde to “guess” the layout of the type you give it and serialize it… I’ll look into this :eyes:

The main point of serde here is for those who wish to use schema-based serialization and deserialization while abstracting small pain points that newer devs may not understand. It’ll be hard to migrate existing projects to use this, so I don’t think existing projects should change, but for new projects, it could be worth it. Plus, you can still implement the entire networking side on your own, meaning you don’t have to make the entire schema an option.