How to compress or send less data through a remote event?

loyanafirst · July 9, 2023, 3:42pm

Luau actually uses 64 bits. From the docs:

Note that Luau only has a single number type, a 64-bit IEEE754 double precision number (which can represent integers up to 2^53 exactly),

which means that, if you need to use 10-bit numbers, a single number can hold five of them exactly (50 bits, which is within the 2^53 integer range).
To pack the numbers you simply do a base conversion, to base 1024 (2^10) in this case. I am not sure of the performance, but it is likely to have little impact because it is pure arithmetic.

local base1024Number = x + y*1024 + z*1024^2 + v*1024^3 + w*1024^4

Make sure that x, y, z, … each fit in 10 bits (integers from 0 to 1023); otherwise one value spills into the next.
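
A minimal sketch of the packing and unpacking described above, using only arithmetic. The helper names `pack10` and `unpack10` are made up for illustration, and the values are assumed to be integers from 0 to 1023.

```lua
-- Hypothetical helpers: pack up to five 10-bit values (0..1023) into one number.
local BASE = 1024 -- 2^10

local function pack10(values)
	local packed = 0
	-- Walk from the most significant digit down (Horner's scheme).
	for i = #values, 1, -1 do
		local v = values[i]
		assert(v >= 0 and v < BASE and v % 1 == 0, "value must be an integer in 0..1023")
		packed = packed * BASE + v
	end
	return packed
end

local function unpack10(packed, count)
	local values = {}
	for i = 1, count do
		values[i] = packed % BASE          -- lowest base-1024 digit
		packed = math.floor(packed / BASE) -- drop that digit
	end
	return values
end

local packed = pack10({ 5, 1023, 0, 77, 512 })
local out = unpack10(packed, 5)
print(out[1], out[2], out[3], out[4], out[5]) -- 5  1023  0  77  512
```

Five digits of base 1024 use 50 bits, so the packed value stays inside the range where a double holds integers exactly.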

C_Corpze · July 9, 2023, 4:52pm

Lua uses 64 bit numbers?
Wow, that’s actually very useful to know, I might have to do a change of plans here then.

Should give room for putting 4 16-bit numbers in a single 64-bit number.

Using this method I could lossy compress a Vector3 plus still have 16 bits of room for additional data if needed which is also quite useful.

In theory I could give every player/entity in a game a unique ID, and whenever a pellet from a shotgun hits them I can use the first 3 x 16 bits for position data and the last 16 bits for additional data like the ID of the entity that was hit or the ID of the material, etc.

16 bits of free space left over after the positional data is enough room for extra bonus data which might be really useful.

Though, I would have to know how to mash numbers together and be able to split them apart or even read a single bit of a 64-bit value.
To be fair, arithmetic and manipulating bits isn’t something I’m an expert at so I’d have to learn a bit more about it.

How in Roblox Lua for example would I read the 12th bit of a number?
How would I get the last 4 bits of a number or the 8 bits in the middle and convert it to something I can read?

I assume it would involve some bit masking and shifting, I somewhat understand bit shifting but I’m not entirely sure how bit masks work and how to use it to read a specific sequence of bits in a big number.

What functions like bor() or band() do sometimes still seems a bit like magic to me (I just know things like 1 AND 0 = 0 and 1 AND 1 = 1, etc) and I don’t see a bit64 library in Roblox Luau.

loyanafirst · July 9, 2023, 7:24pm

Unfortunately Lua, and therefore Luau, is known to be a lousy language for bit manipulation.

The best we have is arithmetic, mainly with power-of-2 bases (2, 4, 8, …, 256, 512, 1024, …). In these bases, multiplying or dividing a number by its base shifts it by one digit. With the % modulo operator you can read off each digit of the base and do whatever you want with it.

Basically it is all arithmetic with binary numbers, which can be extended to any base that is a power of 2.

Regarding your problem, remember that the mantissa is 52 bits (53 counting the implicit leading bit), which is enough to represent every integer between approximately -9.0x10^15 and 9.0x10^15 (±2^53). On the other hand, not all decimal numbers can be represented, since there are infinitely many of them (between 0 and 1 there are infinite decimals). When an operation is done with decimal numbers, the result is almost always rounded to the nearest representable value (this does not happen with integers in that range). This in turn makes the exponent a bit unpredictable, so in this case it is not advisable to use the 11 bits of the exponent.

local n = 15671
for _ = 1,11 do 
    n = math.floor(n/2) 
end
print("12th bit:", n%2)

local n = 1219449499797184
for _ = 1, 48 do 
    n = math.floor(n/2) 
end
print("the four last bits:", n)

local n = 1219449499797184
for _ = 1, 18 do 
    n = math.floor(n/2) 
end
print("the 8 bits in the middle:", n%256)

I forgot to mention that these examples assume only the 52 bits of the mantissa are being used.
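
The loops above can also be collapsed into a single division and a modulo. This is only a sketch: `extractBits` is a made-up helper name, `shift` counts bits from 0 at the right, and the input is assumed to be a non-negative integer below 2^53.

```lua
-- Hypothetical helper: read `width` bits starting at 0-based bit position `shift`.
local function extractBits(n, shift, width)
	-- Dividing by 2^shift and flooring discards the low `shift` bits;
	-- the modulo then keeps only the next `width` bits.
	return math.floor(n / 2 ^ shift) % 2 ^ width
end

print(extractBits(15671, 11, 1))            -- the 12th bit
print(extractBits(1219449499797184, 48, 4)) -- the top four bits of a 52-bit value
print(extractBits(1219449499797184, 18, 8)) -- eight bits starting at bit 18
```

Dividing by 2^k is the same as shifting right by k bits, and taking the remainder modulo 2^w is the same as masking with w ones.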

C_Corpze · July 9, 2023, 8:13pm

Thank you for this response.

Damn.

local n = 1219449499797184
for _ = 1, 48 do 
    n = math.floor(n/2) 
end
print("the four last bits:", n)

This does seem slow considering it’s a loop and I suppose I’ll have to learn some arithmetic to understand what exactly is happening here.

I understand rounding, division, division with remainder and what power-of does,
but I struggle to understand how multiplying/dividing by X amount gets to a specific bit.

I will definitely look more into it, meanwhile I suppose using strings is also an option?
I know strings are sloooowwww but Lua optimizes them, right?

My question from the very beginning has been “how to merge and split integers” basically.
Which, there are multiple ways to do that.
What I eventually aim to achieve is having to send less data through remote events.

Firing a shotgun and sending position and hit data for 10 - 20 pellets every shot is too much data, especially for auto shotguns in a shooter game.

It has me wondering, am I looking at too specific of a solution?
Could I use strings instead and compress the data differently or more efficiently?
If I recall, a string is 1 byte per character, a byte is 8 bits.

My guess is that I’d have to convert between characters and numbers then.
Lua clearly wasn’t designed for manipulating bits so perhaps strings are a more viable option, the string library seems to have plenty of functions.

loyanafirst · July 9, 2023, 9:04pm

Yes, using strings is a good option and they are quite optimized. I have a lot to say about it, but I think your personal experience will be more valuable to you than what I can say in a few paragraphs.

In the forum itself you will find a lot of information about it.

By the way, it is not only important how much data is sent, but also at what time it is sent.

Solutions to problems in game development are usually very specific. Is your problem really specific to your game? I mean, have you made prototypes where you have noticed poor performance? We often focus on problems that have not yet occurred, or that will never occur because of a small change in game design, for example.

apictrain0 · July 9, 2023, 9:56pm

You are kinda wrong: Roblox actually uses single-precision floats, which are 32 bits and lose precision the bigger the number gets.

Assuming you meant normal numbers and not floats of any sort, wouldn’t anything that exceeded 511 jump back to -512, unless you are using an unsigned number?

Anyway, ANY single-precision float (32 bits) can be represented in a string with a length of 4;
in hexadecimal it takes 8 characters
(a string character can hold a value from 0-255, or 8 bits).

I think the best solution is to convert the Vector3 into a 12-character string.

But back to this topic, with about 10 bits per value.

You can use band on a number like, let’s say, 0b1010_0101_0101_1101 to get bits 1-10 (counting from the right): bit32.band(0b1010_0101_0101_1101,0b11_1111_1111).
You can do the same with X, Y and Z,
and because all three dimensions at 10-bit precision add up to 30 bits, which is less than 32,
you won’t lose precision using bit shifts.
So you can do

local Compressnumbers=
	bit32.lshift(
		bit32.lshift(bit32.band(Vect3.X,0b11_1111_1111),10)
			+bit32.band(Vect3.Y,0b11_1111_1111),
		10)
	+bit32.band(Vect3.Z,0b11_1111_1111)

Also, in this operation the fields never overlap (each addition only fills bits that are still zero), so you can replace + with bit32.bor().

If you convert the result into a string, it can be represented in just 4 characters.
To convert it you can do

local Copy=Compressnumbers
local CompressedString=""
for loop=1,4 do
    CompressedString..=string.char(bit32.band(Copy,0xFF))
    Copy=bit32.rshift(Copy,8)
end

You can probably figure out how to revert it.
Edit: after making a script, I managed to compress and decompress the Vector3 with 10-bit precision.

local Vect3=Vector3.new(123,434,383)
--compress
print(Vect3)
local Compressnumbers=
	bit32.lshift(bit32.lshift(
	bit32.band(Vect3.X,0b11_1111_1111),10)+
	bit32.band(Vect3.Y,0b11_1111_1111),10)+
	bit32.band(Vect3.Z,0b11_1111_1111)
local Copy=Compressnumbers
local CompressedString=""
for loop=1,4 do
	CompressedString..=string.char(bit32.band(Copy,0xFF))
	Copy=bit32.rshift(Copy,8)
end
print("Compressed to\n",CompressedString)
--decompress
local CopyCompress=CompressedString
local DecompressedNumbers=0
for loop=4,1,-1 do
	DecompressedNumbers=bit32.lshift(DecompressedNumbers,8)+string.byte(CopyCompress,loop,loop)
end
local DecompressedVector=Vector3.new(
	bit32.band(bit32.rshift(DecompressedNumbers,20),0x3FF),
	bit32.band(bit32.rshift(DecompressedNumbers,10),0x3FF),
	bit32.band(DecompressedNumbers,0x3FF)
)
print(DecompressedVector)

C_Corpze · July 9, 2023, 10:41pm

Wait, are integers in Roblox actually 32 or 64 bit?
And how do we know if they are signed or not?

If I do

local n = 2 ^ 65
print(n + 1)

I get integers larger than what should be possible with unsigned 64 bits.

But doing

print(18446744073709552000 + 1)

still prints the largest possible 64-bit integer without any difference.

It’s also possible to have the largest possible number going negative.

local n = -2 ^ 65
print(n)

This prints -36893488147419103000 but that shouldn’t be possible if it’s a signed 64-bit integer because one bit would be used as the - sign.

Roblox only has a bit32 library yet we can use 64-bit numbers it seems?
Why is Roblox Lua so confusing?

This makes me have so many doubts about how numbers work in Lua.
It’s not like what I learned from any static typed language at all.

Large numbers also don’t seem to wrap around when hitting the theoretical limit which is even more confusing.

apictrain0 · July 9, 2023, 10:54pm

I was wrong, but let me make things clear: there are no integers in Roblox, there are only floats.
Floats are different from integers; if you would like to know how, look it up on Wikipedia.
Roblox uses a double-precision float for numbers, and single-precision floats for Vector3s.
Although a double takes up 64 bits of space,
it uses only 52 bits for the fraction,
and you said it can go above 52 bits.

The thing is, it loses precision. It also has 11 bits for the exponent, which ranges from -1023 to 1024,
because the layout is [sign (1 bit)][exponent][fraction].
So what happens past 2^53?

The fraction is basically an integer with 52 bits (53 counting the implicit leading 1), and past 2^53 it is multiplied by 2.
An integer is a whole number, so no 0.5 exists, and only even numbers can be represented.

The same thing applies past 2^54:
the fraction is multiplied by 2^2, so the result can only be a number divisible by 4.

Like you said,

this should basically explain why this happens.
You would have to add 8192 or a little bit lower for it to make a difference,
and that 8192 is just 2^13.

Edit:
I know floats are confusing. It took me a lot of time to learn about them, but it’s not just Luau that has this problem. JavaScript does too, and basically any programming language with floats has this issue; it is seen in almost every video game.

C_Corpze · July 9, 2023, 11:00pm

But if Roblox uses about 52 - 64 bits for numbers, why not have a bit64 library for manipulating all those bits?
This makes me wonder if combining 4 x 16-bit values into a single 64-bit value would result in losing precision when converting it back later.

Does this mean that most compression methods that involve manipulating bits end up becoming unreliable due to imprecision (In Lua, JavaScript, etc)?

apictrain0 · July 9, 2023, 11:10pm

I do not know much of the history.
I don’t know why they don’t have a bit64 library.

But converting something into a string, which can basically go on forever and is never converted into a float, guarantees that the data will not lose precision.

But no, you will not lose precision with 64 bits unless it is a float, and we know Luau only uses floats. BUT you said you wanted it to be represented in 10 bits anyway.

Second, a lot of compression methods do lose data,
and compression methods that don’t are called lossless data compression.

Not necessarily. If you are using floats then most likely yes, which is usually a newbie developer mistake.

It is not a problem in languages where you do not have to use floats, like C. But like I said, strings don’t lose precision, so by converting the data into a string you can mess with it all you want.
AND most compression methods do not use floats at all.
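
On the question of whether four 16-bit values survive a round trip through one number, a quick check can be sketched with plain arithmetic. The values `a`, `b`, `c` and `d` are arbitrary test inputs, and the only fact relied on is that a double holds every integer up to 2^53 exactly.

```lua
-- Doubles represent every integer up to 2^53 exactly, so a packed value
-- must stay below that limit to survive a round trip.
local LIMIT = 2 ^ 53

-- Three 16-bit fields need 48 bits: safe.
local a, b, c = 65535, 12345, 40000
local packed48 = a + b * 2 ^ 16 + c * 2 ^ 32
print(packed48 < LIMIT)       -- true
print(packed48 % 2 ^ 16 == a) -- true: the low field comes back intact

-- A fourth 16-bit field pushes the value towards 64 bits,
-- and the low bits get rounded away.
local d = 65535
local packed64 = packed48 + d * 2 ^ 48
print(packed64 % 2 ^ 16 == a) -- false: precision was lost
```

So three 16-bit fields per number are safe, while a fourth is only safe if the total stays within 53 bits.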

Dudeguy90539 · July 9, 2023, 11:27pm

I initially thought this would be useless, but I suppose for something like an automatic shotgun, that is a hell of a lot of data to send.

Do we know if ROBLOX already has any kind of innate compression when sending things like Vector3s across the client/server boundary? I’d imagine so, but I’d have to ask a Roblox team member to be sure

C_Corpze · July 9, 2023, 11:29pm

Pretty interesting knowledge.

Now the 10 bits was more of an example, because the largest 10-bit number is 1023 and I don’t think that’s big or precise enough for bullets or raycast data in a game.

I will most likely be using relative positions or something since local space numbers tends to be smaller than world space numbers.
Not sure what would happen if I gave bullets/raycasts in a game let’s say… 1 decimal of precision.
And some weapons might shoot really far, possibly 3000+ studs.

Now let’s say I really don’t want to use too much unnecessary bandwidth in remote events so I do a little bit of light-weight compression.

Should I try to combine 4 x 16-bit numbers into a single 64-bit number or should I put 2 x 16-bit numbers into a 32-bit number?
I’m a bit unsure of what goes through Roblox’ remote events.
Maybe Roblox internally already compresses some data before it is sent, but I don’t know; I can’t find much about it online.

I do know for a fact that most weapons will likely send a start and end position through a RE + whatever might get hit on the way.

I suppose I could also try to combine 2 Vector3s into a single Vector3, Vector3s use 32-bits for X, Y and Z, right?
But 16-bit numbers also can’t get super large, so that would become a problem if the map of a game is any larger than about 650 studs at 2 decimals of precision. Otherwise you must be willing to use only 1 decimal of precision, but this would also cripple you by limiting map size to only about 6,500 studs, and 1/10th of a stud might become really noticeable.
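
To make that trade-off concrete, here is a rough fixed-point sketch. Everything in it is an assumption for illustration: the helper names, the 0.1 stud step, and the idea of storing an offset relative to some origin as an unsigned 16-bit integer.

```lua
-- Hypothetical fixed-point helpers: store an offset from some origin
-- as an unsigned 16-bit integer with STEP studs of resolution.
local STEP = 0.1                    -- assumed resolution: 1 decimal
local MAX_Q = 65535                 -- largest unsigned 16-bit value
local HALF_RANGE = MAX_Q * STEP / 2 -- about 3276.75 studs either side of the origin

local function quantize(offset)
	-- Shift into 0..range, round to the nearest step, then clamp to 16 bits.
	local q = math.floor((offset + HALF_RANGE) / STEP + 0.5)
	return math.max(0, math.min(MAX_Q, q))
end

local function dequantize(q)
	return q * STEP - HALF_RANGE
end

local q = quantize(1234.56)
print(q, dequantize(q)) -- the decoded value is within STEP / 2 of 1234.56
```

With these numbers the covered span is about 6,553 studs in total. Halving `STEP` halves the span, which is why relative positions help.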

apictrain0 · July 9, 2023, 11:48pm

Well, first in this post I would like to say that nothing in Roblox is truly 16, 32 or 64 bits.
But like I said, you can always find some way to convert a single-precision float to a string with 4 characters, and use those.
I do not think it is a good idea to combine 2 Vector3s into 1 Vector3.

Also, I do not think data is compressed by remote events. From what I know it uses JSON, which is not really meant to compress data and usually makes data bigger.

Also, in my code you can just add more 1’s after the 0b11_1111_1111,
and for the 0x3FF do exactly the same thing but in hexadecimal.

But I think NO map should be 2^52 studs, because at the edge of the map all the precision is basically gone.
At most it should be something like 2^14 (about 16k) studs.

So probably something like 22 bits, and 16 bits for the height of the map, because nothing really goes beyond 1024 studs, plus 1/64 (or decimal) precision.

So maybe my compression method is still valid, just a bit modified.

C_Corpze · July 10, 2023, 12:03am

I see, thank you!
I shall look into and explore some of these methods and see which one will eventually be the best choice.

If you have any tips, such as for compressing big arrays (a shotgun that fires 20 pellets per shot, for example) or for designing systems in a way where less data is required for the server to know what a player fired at, I’d love to hear them!

Maelstorm_1973 · July 10, 2023, 12:08am

I see that there is a lot of confusion here about integers and floats.

Integer

First of all, unlike decimal numbers, integers are stored as radix 2 (power of 2) numbers because each position in an integer can only be 0 or 1. For decimal numbers (radix 10), each position in a number can be 0, 1, 2, 3, 4, 5, 6, 7, 8, or 9. So if you have the number 10⁶, that is equivalent to 1,000,000. For an integer, 2¹⁶ is 65,536 and 2³² is 4,294,967,296. These are the counts of values an unsigned number of that width can hold. For signed numbers, the maximum positive value is 2^{x - 1} - 1, so for a signed 32-bit integer the value range is -2,147,483,648 to 2,147,483,647. The maximum value for any unsigned integer is 2^x - 1 because you still have to represent 0.

Note that the exponent represents the hard limit as to the maximum values that an integer can hold. Unpredictable results can occur if the limit is exceeded.

As for the OP’s question, you can split and combine integers if you can guarantee that they will fit within 8, 16, or 32 bits. Roblox does not support 64-bit integers at this time. The way to do this is as follows:

-- Splits a 32-bit integer into two 16-bit integers.
local function split32to16(x)
	local low = bit32.band(0x0000FFFF, x)
	local high = bit32.band(0x0000FFFF, bit32.rshift(x, 16))
	return low, high
end

-- Combines two 16-bit integers into a 32-bit integer.
local function comb16to32(low, high)
	return bit32.bor(bit32.band(0x0000FFFF, low), bit32.band(0xFFFF0000, bit32.lshift(high, 16)))
end

-- Splits a 16-bit integer into two 8-bit integers.
local function split16to8(x)
	local low = bit32.band(0x000000FF, x)
	local high = bit32.band(0x000000FF, bit32.rshift(x, 8))
	return low, high
end

-- Combines two 8-bit integers into a single 16-bit integer.
local function comb8to16(low, high)
	return bit32.bor(bit32.band(0x000000FF, low), bit32.band(0x0000FF00, bit32.lshift(high, 8)))
end

Disclaimer: There’s a few things that you need to keep in mind when using these.

There might be some errors to this since I did this from memory. I wrote these routines and quite a few others some time ago in C/C++.
If you try to combine numbers greater than what it’s looking for, those extra bits will be masked off, so you may get a number you weren’t expecting.
No error checking is done.

Another thing to consider is endianness, or byte order. Although Lua insulates us from this, in other languages it can be a concern when dealing with CPUs that are not Intel/AMD/Cyrix (Little Endian). ARM CPUs (most, if not all mobile devices) have the ability to set the byte order to either 1234 (Big Endian) or 4321 (Little Endian). Other CPUs such as MIPS, Sparc, and IBM’s Z-Processor are big endian devices. Furthermore, byte order on the network is also big endian. Endianness has to do with the order bytes are stored in memory for multi-byte integers with respect to increasing memory addresses. For instance, the 32-bit number 0x12345678 is stored as 0x12, 0x34, 0x56, 0x78 in memory for big endian machines. For little endian machines, it’s backwards: 0x78, 0x56, 0x34, 0x12. So make sure you get your byte order right.
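
The two byte orders for 0x12345678 can be reproduced with a small sketch. `toBytesBE` is a made-up helper name, and the input is assumed to be an unsigned 32-bit integer.

```lua
-- Hypothetical helper: split a 32-bit unsigned integer into four bytes,
-- most significant byte first (big endian / network order).
local function toBytesBE(x)
	local b4 = x % 256
	local b3 = math.floor(x / 2 ^ 8) % 256
	local b2 = math.floor(x / 2 ^ 16) % 256
	local b1 = math.floor(x / 2 ^ 24) % 256
	return b1, b2, b3, b4
end

local b1, b2, b3, b4 = toBytesBE(0x12345678)
-- Big endian puts the most significant byte at the lowest address...
print(string.format("BE: %02X %02X %02X %02X", b1, b2, b3, b4)) -- BE: 12 34 56 78
-- ...and little endian stores the same bytes in the opposite order.
print(string.format("LE: %02X %02X %02X %02X", b4, b3, b2, b1)) -- LE: 78 56 34 12
```

When you serialize numbers into a string yourself, the order only has to be the same on the sending and the receiving side.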

Floating Point

Now the floating point specification is the IEEE-754 standard. It specifies the layout of floating point numbers in 16, 32, 64, 128, and 256 bit formats, also known as precision (someone did mention that). For all the formats, the basic layout is the same regardless of the width of the fields.

The sign bit. When it’s a 1, the number is negative.
The exponent. The exponent is encoded using an offset, so a 0 exponent is not stored as 0 but as another value. For a double, it’s 0b01111111111 (0x3ff). 0 and 0x7ff have special meanings, which are mentioned in the double document on Wikipedia.
The mantissa or fraction. The leading 1 is always assumed, but the first bit of the mantissa is 1/2, the second is 1/4, the third is 1/8, and on down the line for however many bits the mantissa is.

A word of warning though. Lua does not support direct manipulation of floating point types at the bit level. I have written code in C/C++ that does do this for a big number library (numbers that are so big they do not fit into a native CPU register). It can get quite complicated depending on what you are trying to do.

Another way you can shoot yourself in the foot with floats is comparison. It is not recommended to directly compare two floats using == or != (~= in Lua). In fact, C/C++ compilers will warn you of this. The best way to handle this is as follows:

local x = 0.33298575
local y = 0.33298243

if math.abs(x - y) < 0.0000000001 then
	-- Do something
else
	-- Do something else
end
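
The same idea can be wrapped in a small reusable helper. `nearlyEqual` is a made-up name and the default tolerance is only an assumption; pick one that suits the size of your values.

```lua
-- Hypothetical helper: treat two numbers as equal when they differ by at most `epsilon`.
local function nearlyEqual(a, b, epsilon)
	epsilon = epsilon or 1e-9 -- assumed default tolerance; tune it to your data
	return math.abs(a - b) <= epsilon
end

print(0.1 + 0.2 == 0.3)            -- false: the two sides round differently
print(nearlyEqual(0.1 + 0.2, 0.3)) -- true
```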

Hopefully this helps people.

apictrain0 · July 10, 2023, 12:13am

I might have foreshadowed it, but I still forgot about integer overflow: as I said, when a signed 10-bit integer goes over 511 it wraps back to -512, which may be something people could exploit to shoot someone across the map. A way to defend against this is probably something like checking whether the value goes out of range, or using math.clamp.
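
A rough sketch of such a guard, assuming the signed 10-bit range of -512 to 511. `clampSigned10` is a made-up name, and it uses math.max and math.min so it also runs outside Roblox; in Luau, math.clamp(v, -512, 511) does the same job as that pair.

```lua
-- Hypothetical guard: force a value into the signed 10-bit range before packing,
-- so an out-of-range input cannot wrap around to the other side of the map.
local MIN_10, MAX_10 = -512, 511

local function clampSigned10(v)
	v = math.floor(v + 0.5) -- round to the nearest integer first
	return math.max(MIN_10, math.min(MAX_10, v))
end

print(clampSigned10(600))   -- 511
print(clampSigned10(-9000)) -- -512
print(clampSigned10(42.4))  -- 42
```

Clamping on the sender only prevents accidents; the server would still need to check whatever it receives.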

Maelstorm_1973 · July 10, 2023, 12:30am

@apictrain0 @C_Corpze

It’s not necessarily 64-bits. It usually is in the normal course of things, but I’ve run into situations where I had numbers like 2¹⁰²⁴ be properly represented with complete precision. Variables in Lua and Roblox are variant types, so what I think it’s doing is setting the data type of the variable on the fly to meet the needs of the data. A variant type like in PHP and JavaScript is something like the following.

One or two bytes to denote the type.
One to four bytes to denote the size.
The data.

The value of field 2 depends on what the value of 1 is. Although on the command line, when I do print(2^1022), it prints 4.49423283715579e+307. But if I copy an inf value from a constraint in the workspace, I get this:

179769313486231570814527423731704356798070567525844996598917476803157260780028538760589558632766878171540458953514382464234321326889464182768467546703537516986049910576551282076245490090389328944075868508455133942304583236903222948165808559332123348274797826204144723168738177180919299881250404026184124858368

That number is 2¹⁰²⁴ as I confirmed it on a big number calculator. I have an open bug report about that because the constant values are missing from the documentation of the math library.

And math.huge = 2¹⁰²⁴ = inf

apictrain0 · July 10, 2023, 12:57am

Thanks for correcting me, but I noticed that the numbers document says there are 3 types of numbers, and none of them is a type that could reach 2^1024.

This 2^1024 type of number doesn’t appear anywhere in the doc either.

The doc says a number is a double.
I tried using type() and typeof(), but both return number.
I had always assumed that, even according to the doc, int and int64 would convert to a double at runtime, but I appear to be wrong, as a 2^1024 number could exist.

Maelstorm_1973 · July 10, 2023, 2:20am

But so far, full representation seems to only be in the workspace. So something about how it’s represented in the workspace is different than how it’s represented in the code. You can do constraint.force = math.huge and it will show as inf in the constraint when you view it in explorer. If you happen to copy the inf value (which is what I did) you get that big number in the previous post.

To fully represent 1024 bits requires 128 bytes, which is in the big number arena (and that is an old standard for RSA crypto back in the early 1990’s). So what’s represented in the constraint is for the physics engine to use and it may require the full 128 bytes. Either way, it’s not the same datatype as what’s used in the scripts.

I think math.huge is the full constant and is a big number datatype because if you type this in the console, you get this:

  19:18:48.517  > print(2^1024)  -  Studio
  19:18:48.518  inf  -  Edit
  19:19:25.834  > print(math.huge)  -  Studio
  19:19:25.835  inf  -  Edit
  19:19:35.788  > print(math.huge == 2^1024)  -  Studio
  19:19:35.789  true  -  Edit
  19:19:39.752  > print(math.huge == 2^1023)  -  Studio
  19:19:39.752  false  -  Edit

It’s definitely a unique situation.

EDIT: On a hunch, I did this:

print(179769313486231570814527423731704356798070567525844996598917476803157260780028538760589558632766878171540458953514382464234321326889464182768467546703537516986049910576551282076245490090389328944075868508455133942304583236903222948165808559332123348274797826204144723168738177180919299881250404026184124858368)  -  Studio
  19:21:52.387  1.7976931348623157e+308  -  Edit

Yes, I told it to print that big number and it came back with what appears to be the maximum value that you can have for a double before it codes to infinity.
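
That observation can be checked from a script. The long value above matches the largest finite double, (2 - 2^-52) * 2^1023, which is slightly less than 2^1024. This is a sketch assuming ordinary IEEE 754 doubles, and the exact text that print shows for the first line depends on the Lua implementation.

```lua
-- The largest finite double has all 52 fraction bits set
-- and the largest normal exponent.
local maxDouble = (2 - 2 ^ -52) * 2 ^ 1023

print(maxDouble)                  -- about 1.7976931348623157e+308
print(maxDouble < math.huge)      -- true: still finite
print(maxDouble * 2 == math.huge) -- true: overflows to infinity
print(2 ^ 1024 == math.huge)      -- true: 2^1024 is not representable, so it becomes inf
```

So a value copied out of an inf field can still parse back as a finite number if the text is the decimal expansion of the largest double.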

Mystxry12 · July 10, 2023, 6:30am

Actually, afaik a number takes up 9 bytes, i.e. 72 bits. And yes, as @apictrain0 pointed out, they aren’t stored as integers but as IEEE signed doubles, which should technically make them 8 bytes; however, in other posts it’s stated as 9 bytes, so I guess we have to take their word.

For additional info, check out the wiki page.

Now coming back to your question, from the post I linked we know that a Vector3 takes up 13 bytes but if we were to send the components as numbers, it’d take up 9 * 3 bytes which is 27 bytes so that isn’t an option. However if you looked at the post, you will notice that a string of length 1 is 1 + 2 = 3 bytes which is pretty good. So what if we encode the components of Vector3 to a character?
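
One way this could look, as a sketch only: each component is assumed to have already been reduced to an unsigned 16-bit integer (0 to 65535), and `encode16x3` and `decode16x3` are made-up names.

```lua
-- Hypothetical encoder: three unsigned 16-bit integers -> a 6-character string.
local function encode16x3(x, y, z)
	local function two(v)
		-- high byte first, then low byte
		return string.char(math.floor(v / 256), v % 256)
	end
	return two(x) .. two(y) .. two(z)
end

local function decode16x3(s)
	local b1, b2, b3, b4, b5, b6 = string.byte(s, 1, 6)
	return b1 * 256 + b2, b3 * 256 + b4, b5 * 256 + b6
end

local s = encode16x3(123, 43400, 65535)
print(#s)            -- 6
print(decode16x3(s)) -- 123  43400  65535
```

If the byte counts quoted above hold, a 6-character string would cost 6 + 2 = 8 bytes, compared with 13 for a Vector3.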