Introducing Luau buffer type [Beta]

metatablecatmaid · November 30, 2023, 6:54pm

I wonder what the reason for not adding an internal offset was. The main use case for this, running through a filestream, would benefit greatly if the offset didn’t need to be tracked externally.

This datatype is also a great reason to add binary DataStores.

ATPStorages · November 30, 2023, 6:56pm

Probably for flexibility - it’s easier to implement a cursor on top of an untracked buffer than it is to go the other way around (ref. TheNexusAvenger)
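
To illustrate the point about layering a cursor on top, here is a minimal sketch of an external cursor over a plain buffer. The `Cursor` table and its methods are hypothetical names made up for this example, not part of the buffer library; only `buffer.create`, `buffer.writeu8`, `buffer.writef32`, `buffer.readu8` and `buffer.readf32` come from Luau itself.

```luau
-- Hypothetical helper: a read/write cursor layered over a plain buffer.
local Cursor = {}
Cursor.__index = Cursor

function Cursor.new(buf: buffer)
	return setmetatable({ buf = buf, offset = 0 }, Cursor)
end

function Cursor:writeu8(value: number)
	buffer.writeu8(self.buf, self.offset, value)
	self.offset += 1 -- advance past the byte just written
end

function Cursor:writef32(value: number)
	buffer.writef32(self.buf, self.offset, value)
	self.offset += 4
end

function Cursor:readu8(): number
	local value = buffer.readu8(self.buf, self.offset)
	self.offset += 1
	return value
end

function Cursor:readf32(): number
	local value = buffer.readf32(self.buf, self.offset)
	self.offset += 4
	return value
end

-- Usage: write with one cursor, then read the same buffer back with a fresh one.
local writer = Cursor.new(buffer.create(5))
writer:writeu8(7)
writer:writef32(1.5)

local reader = Cursor.new(writer.buf)
print(reader:readu8(), reader:readf32())
```

Because the offset lives in the wrapper, several independent cursors can walk the same buffer, which would be harder if the buffer tracked a single offset itself.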

Dekkonot · November 30, 2023, 7:04pm

We talked about this somewhat in the Roblox OSS Discord server and WheretIB even made a prototype that had an internal cursor. It was ultimately decided against because it complicates things like the API and the networking.

It’s a matter of tradeoffs vs utility, and in this case it was more important to keep things bare-metal for performance.

C_Corpze · November 30, 2023, 7:05pm

I’m actually really curious if buffers can possibly store Vector3 and CFrame data in a much smaller format.

If I recall, a Vector3 is 13 bytes, numbers 9 bytes and CFrames roughly 15 bytes?

So that brings the question, how small could we make a Vector or a CFrame if we wanted the tiniest possible format without sacrificing too much precision?

Dekkonot · November 30, 2023, 7:13pm

A Vector3 can be stored as 12 bytes, since the individual components are all 32-bit floats.
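
As a rough sketch of the 12-byte layout (three consecutive 32-bit floats), assuming the components are passed as plain numbers; `writeVector3` and `readVector3` are hypothetical helper names:

```luau
-- Hypothetical helpers: three consecutive 32-bit floats, 4 bytes each.
local function writeVector3(buf: buffer, offset: number, x: number, y: number, z: number)
	buffer.writef32(buf, offset, x)
	buffer.writef32(buf, offset + 4, y)
	buffer.writef32(buf, offset + 8, z)
end

local function readVector3(buf: buffer, offset: number): (number, number, number)
	return buffer.readf32(buf, offset), buffer.readf32(buf, offset + 4), buffer.readf32(buf, offset + 8)
end

local buf = buffer.create(12)
writeVector3(buf, 0, 1.5, -2.25, 1024)
print(buffer.len(buf)) -- 12
print(readVector3(buf, 0))
```

In a Roblox script you would pass `v.X, v.Y, v.Z` when writing and wrap the result in `Vector3.new(readVector3(buf, 0))` when reading.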

CFrames are… more complicated to store compactly. Naively, they’d be 48 bytes because you’d write four Vector3s (one for the position and three for the orientation). If you don’t do it naively, things get more interesting though.

Specifically, there are some orientations that are extremely common because they’re aligned with the 3D axes. An example is the default rotation. What you can do is assign each of these ‘common’ orientations an ID and, instead of writing the orientation, write the ID. If you only use 1 byte for the ID, the common cases become 13 bytes (position Vector3 + 1).

This is what Roblox does for CFrames and it’s very efficient. Implementing it yourself can be weird, but it’s not very difficult.
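
A minimal sketch of the ID idea, under stated assumptions: the ID values and the two-entry table below are made up for illustration (the IDs Roblox actually uses are not given in this thread, and a full table would list all 24 axis-aligned orientations). The position is passed as 3 numbers and the rotation as 9 numbers in row-major order, which is the order `CFrame:GetComponents()` returns them in after the position.

```luau
-- Hypothetical ID table; ID 0 is reserved to mean "full rotation follows".
local COMMON_ROTATIONS = {
	{ 1, 0, 0, 0, 1, 0, 0, 0, 1 }, -- ID 1: identity
	{ 0, 0, 1, 0, 1, 0, -1, 0, 0 }, -- ID 2: 90 degrees about the Y axis
}

local function rotationKey(rot: { number }): string
	local parts = table.create(9)
	for i = 1, 9 do
		parts[i] = rot[i] + 0 -- adding 0 turns -0 into 0 so both produce the same key
	end
	return table.concat(parts, ",")
end

local ID_BY_KEY = {}
for id, rot in COMMON_ROTATIONS do
	ID_BY_KEY[rotationKey(rot)] = id
end

-- Returns the number of bytes written: 13 for a common rotation, 49 otherwise.
local function writeCFrame(buf: buffer, offset: number, pos: { number }, rot: { number }): number
	for i = 1, 3 do
		buffer.writef32(buf, offset + (i - 1) * 4, pos[i])
	end
	local id = ID_BY_KEY[rotationKey(rot)]
	buffer.writeu8(buf, offset + 12, id or 0)
	if id then
		return 13
	end
	-- Fallback: write all nine rotation components as 32-bit floats.
	for i = 1, 9 do
		buffer.writef32(buf, offset + 13 + (i - 1) * 4, rot[i])
	end
	return 49
end

local buf = buffer.create(49)
print(writeCFrame(buf, 0, { 1, 2, 3 }, { 1, 0, 0, 0, 1, 0, 0, 0, 1 })) -- 13
```

A reader would check the ID byte first and only read the extra 36 bytes when it is 0. The exact-match lookup only works for rotations whose components are exactly 0, 1 or -1.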

tnavarts · November 30, 2023, 7:14pm

The rotation part of a CFrame should be encoded as a quaternion if you want a minimal representation with no redundancy, which is four numbers that you can pick the precision of depending on your needs.

Roblox’s replication format is also variable-length for larger types like CFrame and Vector3, with affordances to make common values such as axis-aligned rotations and integer sizes much smaller than other arbitrary values.

blinkybool · November 30, 2023, 7:21pm

Please note that buffers currently are not supported by DataStore APIs; using buffer.tostring(buf) may not produce a string that can be stored in DataStores because it will often contain non-UTF8 bytes.

Is there a plan to support this in the future? Either in the form of a .toUTF8-string(buf) or support for non-UTF8 characters in DataStores?

WheretIB · November 30, 2023, 7:26pm

Yes, we are exploring the possibilities to store buffer data in the DataStore.

Kironte · November 30, 2023, 7:28pm

Roblox has been cooking quite a bit lately, huh? No more string.pack for sending data over the network! I hope buffers get support in DataStores so their data can be saved permanently.

0MAR280 · November 30, 2023, 8:01pm

Thanks for this awesome one bro!
I now have a little bit more understanding, Thanks!

SelDraken · November 30, 2023, 10:42pm

This reminds me of the memory allocations in C, I love this stuff, keep up the great work!

As for feedback, I have to agree with others that we need a simple way to manipulate bits.

Also, the ability to directly read and write this from a data store will be great.

dannyminaya123 · November 30, 2023, 10:46pm

Wow, I hope to try it soon. I’m very interested in using this tool, since it could help me work much faster. Congratulations to the team or the person who created it.

C_Corpze · December 1, 2023, 12:18am

OH I actually have another question, before I forget!

Okay so, currently I see we have functions for writing 8-bit, 16-bit, 32-bit and 64-bit numbers, correct?

What I am missing, however, and would absolutely love to see:
can we eventually also get the option to read/write 24-bit numbers?

I feel like 24-bit is really underrated here because sometimes 16 bit doesn’t give enough precision but 32 might be just too much.

I know for a fact that things like sound and audio data (often in the FLAC format) sometimes use 24-bit precision, and it’s used in other things as well.

Ideally, we could specify ourselves exactly how many bits we want to read/write.

4 bits? 6 bits? 12 bits?

I would love having that amount of flexibility and control but I’d also already be very happy with 24 bit and 4 bit numbers.

tnavarts · December 1, 2023, 12:45am

The underlying CPU architectures don’t actually have native operations on 24 bits.

Even if such an operation were exposed, Luau would have to do an 8-bit write plus a 16-bit write (or similar) underneath, and you can already write a utility function that does that.
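
Such a utility could look roughly like this. It is a sketch, assuming unsigned values below 2^24 and a little-endian layout (low 16 bits first); `writeu24` and `readu24` are hypothetical names, not buffer library functions.

```luau
-- Hypothetical helpers: a 24-bit unsigned integer stored as a 16-bit write plus an 8-bit write.
local function writeu24(buf: buffer, offset: number, value: number)
	buffer.writeu16(buf, offset, bit32.band(value, 0xFFFF)) -- low 16 bits
	buffer.writeu8(buf, offset + 2, bit32.rshift(value, 16)) -- high 8 bits
end

local function readu24(buf: buffer, offset: number): number
	return buffer.readu16(buf, offset) + buffer.readu8(buf, offset + 2) * 65536
end

local buf = buffer.create(3)
writeu24(buf, 0, 0xABCDEF)
print(readu24(buf, 0) == 0xABCDEF) -- true
```

The same pattern extends to other widths that are a whole number of bytes. Sub-byte widths such as 4 or 6 bits need a bit-packing layer on top, since the buffer functions address whole bytes.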

Daw588 · December 1, 2023, 1:33am

@WheretIB buffer.tostring(b: buffer) has an incorrect type: the Luau type checker claims that buffer.tostring() takes no arguments, even though it does take one, which is the buffer to be converted to a string.

MrChickenRocket · December 1, 2023, 1:48am

LetsGoooooo!

The big sticking point for me with using string.pack and bitbuffer was how slow it got when you had a lot of conditional data, e.g. delta compression.

This hopefully addresses that nicely. I’ll report back.

Bam_Sori · December 1, 2023, 4:12am

Then could you also support float16?

When serializing the CFrame type, I convert its rotation matrix into a quaternion and pack it into three float16 values, not float32 values.

Like this:

-- buffer.writef16 / buffer.readf16 do not exist; this is the API being requested
local function quaternionVector3FromCFrame(cframe: CFrame)
	local axis, angle = cframe:ToAxisAngle()
	-- vector part (x, y, z) of the unit quaternion; w is rebuilt on read
	return axis * math.sin(angle / 2)
end

-- serialize
local qVector = quaternionVector3FromCFrame(value)
buffer.writef16(serializedBuffer, 0, qVector.X)
buffer.writef16(serializedBuffer, 2, qVector.Y)
buffer.writef16(serializedBuffer, 4, qVector.Z)

-- deserialize
local qx = buffer.readf16(serializedBuffer, 0)
local qy = buffer.readf16(serializedBuffer, 2)
local qz = buffer.readf16(serializedBuffer, 4)
-- clamp so rounding error cannot make the value under the square root negative
local qw = math.sqrt(math.max(0, 1 - (qx^2 + qy^2 + qz^2)))

-- x, y, z are the position components, which are serialized separately
return CFrame.new(x, y, z, qx, qy, qz, qw)

tnavarts · December 1, 2023, 4:53am

For a quaternion you probably want to normalize it; then you can get the maximum amount of precision by encoding the components as integers. Using a float16 would waste a lot of potential precision.
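
One way to sketch the integer encoding, assuming four signed 16-bit components (8 bytes total) and a non-zero input quaternion; `writeQuaternion`, `readQuaternion` and `SCALE` are hypothetical names for this example:

```luau
-- Hypothetical helpers: quantize a unit quaternion to four signed 16-bit integers.
local SCALE = 32767 -- largest int16 value; maps [-1, 1] onto [-32767, 32767]

local function writeQuaternion(buf: buffer, offset: number, qx: number, qy: number, qz: number, qw: number)
	-- Normalize first so every component is guaranteed to lie in [-1, 1].
	-- Assumes the input is not the zero quaternion.
	local length = math.sqrt(qx * qx + qy * qy + qz * qz + qw * qw)
	local components = { qx / length, qy / length, qz / length, qw / length }
	for i, c in components do
		buffer.writei16(buf, offset + (i - 1) * 2, math.round(c * SCALE))
	end
end

local function readQuaternion(buf: buffer, offset: number): (number, number, number, number)
	local qx = buffer.readi16(buf, offset) / SCALE
	local qy = buffer.readi16(buf, offset + 2) / SCALE
	local qz = buffer.readi16(buf, offset + 4) / SCALE
	local qw = buffer.readi16(buf, offset + 6) / SCALE
	-- Renormalize to undo the small length error introduced by rounding.
	local length = math.sqrt(qx * qx + qy * qy + qz * qz + qw * qw)
	return qx / length, qy / length, qz / length, qw / length
end

local buf = buffer.create(8)
writeQuaternion(buf, 0, 0, 0, 0, 2) -- unnormalized identity rotation
print(readQuaternion(buf, 0))
```

With this mapping every component has the same step size of 1/32767 across the whole [-1, 1] range. Dropping one component and rebuilding it from the unit-length constraint would shrink this to 6 bytes, at the cost of also tracking the sign of the dropped component.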

EmeraldSlash · December 1, 2023, 5:09am

Is there a reason why you don’t mention storing rotations in axis-angle format, which would be a lot easier to use for beginners unfamiliar with quaternion math? I’m not super familiar with their numerical differences myself, so I’m not sure if there are any weird gotchas with axis-angle in comparison to quaternions.

tnavarts · December 1, 2023, 5:11am

Because we’re in the buffers thread. If you’re going to the trouble of using buffers in the first place you ought to be going for the optimal solution.

Also, even if you don’t understand quaternions you can look up some boilerplate conversion / normalization code easily enough.