It’s been mentioned here by Stravant, but you can actually pack the data even tighter than that.
Right now, you’re selecting the largest of the 4 numbers in the quaternion, dropping it and reconstructing it from the other three since the quaternion is normalized, which is great. But since the component you dropped is the one with the largest square, none of the remaining components can have a square larger than 1/2: the squares sum to 1, so if a remaining square were greater than 1/2, the dropped one would have to be at least as large and the total would exceed 1. That means you can assume all remaining values are between -1/sqrt(2) and 1/sqrt(2) (instead of the -1 to 1 range you currently use), which spends the same number of bits on a smaller range.
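To make the range trick concrete, here is a minimal sketch in Python (chosen only so it can run outside Roblox; the same arithmetic ports to Luau). The function names, the 15-bit default, and the sign flip are my own assumptions, not the original poster's code.

```python
import math

INV_SQRT2 = 1.0 / math.sqrt(2.0)

def quantize_quaternion(q, bits=15):
    """Drop the largest-magnitude component and quantize the other three.

    Returns (index_of_dropped_component, [three unsigned ints]).
    """
    index = max(range(4), key=lambda i: abs(q[i]))
    # q and -q describe the same rotation, so flip the sign if needed to
    # make the dropped component non-negative (its sign is not stored).
    sign = -1.0 if q[index] < 0 else 1.0
    max_int = (1 << bits) - 1
    rest = []
    for i in range(4):
        if i == index:
            continue
        # Map [-1/sqrt(2), 1/sqrt(2)] onto [0, 1], then onto [0, max_int].
        t = (q[i] * sign / INV_SQRT2 + 1.0) * 0.5
        t = min(max(t, 0.0), 1.0)
        rest.append(round(t * max_int))
    return index, rest

def dequantize_quaternion(index, rest, bits=15):
    """Rebuild the quaternion; the dropped component comes from normalization."""
    max_int = (1 << bits) - 1
    small = [(v / max_int * 2.0 - 1.0) * INV_SQRT2 for v in rest]
    largest = math.sqrt(max(0.0, 1.0 - sum(c * c for c in small)))
    out = list(small)
    out.insert(index, largest)
    return out
```

Compared with quantizing over -1 to 1, the same number of bits covers a range that is smaller by a factor of sqrt(2), so each step is finer.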
The other optimization you could make (which is probably not necessary unless you really needed to squeeze out every bit or encode a lot of data) would be to encode on the bit-level, not byte level. This way, you could encode your index in 2 bits, not 8. And, you could have more freedom to do odd bit counts like 15 for the 3 numbers, instead of sticking to multiples of 8.
If anyone’s reading this and wants to do bit-level packing, one way you could do it is to write to a table of booleans (representing your bits), and then once you’ve “encoded” all of your data, go through the table in 8-sized chunks and write each chunk with buffer.writeu8 (unsigned 8 bit); decoding reads the bytes back with buffer.readu8 and splits them into bits again. This also has the benefit that if you know the layout of your data (e.g. serializing an object with known parameters) on the sending and receiving end, you can stuff a bunch of dissimilar data types back to back in one giant “bitstring” without having to waste space on padding.
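The approach above can be sketched roughly as follows, in Python so it runs standalone. `BitWriter` and `BitReader` are hypothetical names, and the least-significant-bit-first order is an arbitrary choice of mine; in Luau the final byte step would go through buffer.writeu8 and buffer.readu8 instead of `bytes`.

```python
class BitWriter:
    """Collects individual bits, then packs them into bytes in chunks of 8."""

    def __init__(self):
        self.bits = []  # one 0/1 entry per bit, like the table of booleans

    def write(self, value, width):
        """Append `width` bits of the unsigned integer `value`, LSB first."""
        if value < 0 or value >= (1 << width):
            raise ValueError("value does not fit in the requested width")
        for i in range(width):
            self.bits.append((value >> i) & 1)

    def to_bytes(self):
        out = bytearray()
        for start in range(0, len(self.bits), 8):
            byte = 0
            # The last chunk may be shorter than 8; missing bits stay 0.
            for i, bit in enumerate(self.bits[start:start + 8]):
                byte |= bit << i
            out.append(byte)
        return bytes(out)


class BitReader:
    """Reads fields back in the same order and widths they were written."""

    def __init__(self, data):
        self.data = data
        self.pos = 0  # position in bits

    def read(self, width):
        value = 0
        for i in range(width):
            byte = self.data[self.pos >> 3]
            value |= ((byte >> (self.pos & 7)) & 1) << i
            self.pos += 1
        return value
```

For example, a 2-bit index followed by three 15-bit values is 47 bits, which packs into 6 bytes instead of the 7 needed with a whole byte for the index and 16 bits per value.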
How likely would it be for EditableImages to support buffers? I’d appreciate if you guys would add such functionality.
Buffers would only need to store 1 byte for each component, giving really large memory savings, especially when transferring large image buffers over the internet.
Also, considering the new “--!native” feature, I’m sure EditableImages would take advantage of buffers even more, as the code is likely closer to C code.
Are those 25 extra bytes really that big of a deal though? Are there any statistics or data on how many extra bytes will actually start to impact performance?
If it’s only a little bit of data it’s not a big deal really but if you have a building game where you have to save every wall, floor, door, window, etc then size starts to matter A LOT.
Smaller data is also much faster to save and load from datastores and uses less bandwidth when sent over a network.
You’re not expected to always keep data as small as possible but it’s a good practice to learn efficient data storage and structuring.
Keeping things small and compact has many benefits.
Luau buffer type support is now enabled in DataStore, MemoryStoreService, MessagingService, TeleportService (TeleportData) and HttpService (JSONEncode/JSONDecode functions).
Documentation updates for those services will follow soon.
No, the new feature is that you can have ‘buffer’ objects in the value field of SetAsync (directly or inside a table).
And you will get ‘buffer’ objects back from GetAsync etc.
I think someone worked out the answer in the Open Source Discord server but I want to ask just for everyone else’s sake: what does the expansion rate for buffers look like in datastores then? I assume they’re encoded in some capacity, so what does that look like?
Sorry if I’m asking a question that will be answered by the documentation, I’m just sure I’m not the only one who’s excited for this and I want to immediately dash everyone’s hopes and dreams with reality.
Right now, the expansion rate is approaching 4/3, so for every 3 buffer bytes, 4 bytes of the DataStore value is used.
There are also a few bytes of additional overhead.
To put it simply, a buffer value should be kept slightly below 3MB.
Buffer data is still being compressed, so the absolute maximum buffer size that can be stored is 50MB, but only if it can be compressed below 3MB by the engine.
Because compression ratio depends on the data being stored, we recommend keeping the uncompressed buffer size below 3MB to avoid unexpected failures.
And as the DataStore documentation mentions, you can use the JSONEncode function to check how large the value being stored actually is.
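As a rough check on the 4/3 figure, here is the arithmetic sketched in Python. It assumes the expansion comes from standard base64 and that the DataStore value limit is 4,194,304 bytes; both are assumptions on my part, and the helper names are made up for illustration.

```python
import base64

def base64_length(raw_bytes: int) -> int:
    """Length of standard padded base64 output for `raw_bytes` input bytes."""
    return 4 * ((raw_bytes + 2) // 3)

def max_raw_bytes(limit: int, overhead: int = 0) -> int:
    """Largest raw size whose base64 form fits in `limit` after `overhead`."""
    return ((limit - overhead) // 4) * 3

# Sanity check of the formula against the real encoder for small sizes.
for n in range(64):
    assert base64_length(n) == len(base64.b64encode(bytes(n)))

DATASTORE_LIMIT = 4 * 1024 * 1024  # assumed value size limit in bytes
print(max_raw_bytes(DATASTORE_LIMIT))  # 3145728, before any extra overhead
```

Every 3 raw bytes become 4 encoded bytes, so a limit near 4MB leaves room for about 3MB of buffer data, minus the few bytes of overhead mentioned above.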
No, it will not be possible to write an Instance into a buffer directly.
There are also no plans to have unique ids in-game at this time. If we do add them, they will most likely be represented as 36 byte strings and that’s unlikely to be used for efficient networking.
What if you want to store booleans?
Since the smallest datatype we can write is 8-bit and a boolean can be represented in 1 bit, wouldn’t that mean that if we wanted a buffer to hold booleans, we’d have to use an 8-bit integer for each one?
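One common workaround, not something confirmed in this thread, is to treat a single byte as eight flags and use bitwise operations to set and read them. A minimal sketch in Python follows (the function names are hypothetical; in Luau the bit32 library and buffer.writeu8 would play the same roles):

```python
def pack_flags(flags):
    """Pack up to 8 booleans into one unsigned byte, first flag in bit 0."""
    if len(flags) > 8:
        raise ValueError("at most 8 flags fit in one byte")
    byte = 0
    for i, flag in enumerate(flags):
        if flag:
            byte |= 1 << i
    return byte

def unpack_flags(byte, count):
    """Recover the first `count` booleans from a packed byte."""
    return [bool((byte >> i) & 1) for i in range(count)]
```

So eight booleans cost one byte instead of eight, as long as both sides agree on which bit means what.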
Could we use buffers for HttpService:RequestAsync’s body?
Can there be an option to receive HttpService:RequestAsync’s response body as a buffer?
Does MessagingService encode buffers to base 64 internally, or is it kept as a sequence of bytes? Internal JSON conversion with a limit of 1KB seems perilous.
Would it be possible to have a special frozen buffer that shares memory across threads/Actors, while keeping the improved access speed? The idea is that you have big chunks of static data that lots of actors need to be able to read efficiently. Use cases include:
Multithreaded animation systems with lots of animation data.
Behavior trees for NPC AI.
Simple machine learning demos.
For GetAttribute/SetAttribute, I’d like to see either:
Support for buffers as a type.
Option for GetAttribute to result in a buffer instead of a string. SetAttribute(name, buffer) can cast to a string automatically.
String attributes are already practically buffers internally. It may be best to keep them as strings, unless there’s some way to make the instance’s attribute actually share memory with the buffer (which would be awesome, but I’m not sure how it would work with respect to ChangeHistoryService and AttributeChanged.)
Maybe GetAttribute returns a frozen buffer that actually shares memory with the attribute (unless the attribute changes, in which case new data would be allocated for the instance.) Imagine someone’s in Studio working with >10MB chunks of data in attributes; you wouldn’t want to copy this around more than needed.
Side note: It’s disappointing that instance:GetAttributes() returns a dictionary instead of an array of keys. This is quite slow for instances with lots of data.
These are just ideas. What’s important is performance, simplicity, and the ability to use it with existing Roblox APIs.
That’s not possible right now, but the buffer library has functions to convert to a string and back (buffer.tostring and buffer.fromstring).
While that causes an extra allocation and copy, those operations should be pretty fast even on large sizes.
This API request can be posted on the feature request forum.
Yes, base64 is used today, and sometimes the buffer is compressed beforehand (if it’s compressible). That does limit the reliable buffer size to about 700 bytes.
Unfortunately no, that would be incompatible with how Luau VM data is organized.
Buffer will also be unable to point to the Instance attribute data without a copy, so this does raise a question of how much buffer attributes will be useful if they can’t cover use-cases that should avoid copies.
I’d really like to see Roblox start using a proprietary JSON format internally. It could be as simple as checking if the data starts with an illegal JSON byte (there are >200 to choose from.) If it does, try decoding using a simple-but-cleverly-optimized binary format.
It would save a fraction of file storage costs, but more importantly take load off of busy servers that would otherwise waste time checking bytes one at a time to escape strings, instead of just storing the length before the data so it can be skipped over.
If I were designing it, a value/object would be preceded by a byte, and byte ranges could be reserved for short string/buffer/array/dictionary lengths, with special cases for compact values:
0: false
1: true
2: null
3: f64
-- There are loads of ways to optimize number cases. It would need to be tuned using real data.
128-156: string (lengths 0 to 28)
157: string (u8) (lengths 29 to 284)
158: string (u16) (lengths 285 to 65820)
159: string (u32) (lengths 65821 to 2^32-1)
160-188: array (lengths 0 to 28)
189: array (u8) (lengths 29 to 284)
190: array (u16) (lengths 285 to 65820)
191: array (u32) (lengths 65821 to 2^32-1)
192-220: dictionary (lengths 0 to 28)
221: dictionary (u8) (lengths 29 to 284)
222: dictionary (u16) (lengths 285 to 65820)
223: dictionary (u32) (lengths 65821 to 2^32-1)
224-252: buffer (lengths 0 to 28)
253: buffer (u8) (lengths 29 to 284)
254: buffer (u16) (lengths 285 to 65820)
255: buffer (u32) (lengths 65821 to 2^32-1)
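To show how the length prefixes in the list above could work, here is a rough Python sketch of encoding and decoding just the tag-plus-length header. It assumes the u8 and u16 forms store the length minus the first length they cover (29 and 285), that the u32 form stores the length as-is, and that multi-byte fields are little-endian; none of that is specified above, and the function names are made up.

```python
import struct

BASES = {"string": 128, "array": 160, "dictionary": 192, "buffer": 224}
INLINE_MAX = 28                   # lengths 0..28 live in the tag byte itself
U8_MAX = INLINE_MAX + 1 + 0xFF    # 284
U16_MAX = U8_MAX + 1 + 0xFFFF     # 65820

def encode_header(kind, length):
    """Return the tag byte plus any extra length bytes for a container."""
    base = BASES[kind]
    if length < 0 or length > 0xFFFFFFFF:
        raise ValueError("length out of range")
    if length <= INLINE_MAX:
        return bytes([base + length])
    if length <= U8_MAX:
        return bytes([base + 29, length - (INLINE_MAX + 1)])
    if length <= U16_MAX:
        return bytes([base + 30]) + struct.pack("<H", length - (U8_MAX + 1))
    return bytes([base + 31]) + struct.pack("<I", length)

def decode_header(data):
    """Return (kind, length, header_size) for a header at the start of data."""
    tag = data[0]
    for kind, base in BASES.items():
        if base <= tag < base + 32:
            slot = tag - base
            break
    else:
        raise ValueError("not a string/array/dictionary/buffer tag")
    if slot <= INLINE_MAX:
        return kind, slot, 1
    if slot == 29:
        return kind, data[1] + INLINE_MAX + 1, 2
    if slot == 30:
        return kind, struct.unpack_from("<H", data, 1)[0] + U8_MAX + 1, 3
    return kind, struct.unpack_from("<I", data, 1)[0], 5
```

Because the length comes before the payload, a reader can skip over a string or buffer without scanning it byte by byte, which is the main saving compared with escaped JSON strings.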
Some binary JSON alternatives do string deduplication, which can make sense to do for dictionary keys.
If enforcing valid unicode, there are various ways codepoints can be compressed. Otherwise developers could be allowed to send or store arbitrary binary strings.
HttpService:JSONEncode/Decode could still use the "m":null trick to produce a human-readable version. There could even be support for Vector3.