How to compress or send less data through a remote event?

Can you explain a little bit more about separating them?

Let’s say I have the numbers 10, 2850 and 565.
I want to combine them into a single value so it’s compressed.

Once it reaches the other side (the server), I want to turn that single value back into the numbers 10, 2850 and 565 without losing important information.
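The general technique here is bit packing: give each number a fixed bit width and shift them into one larger integer, then mask and shift to get them back out. A minimal sketch in C (the 16-bit field width is an assumption for illustration; size the fields to your real value ranges):

```c
#include <stdint.h>

/* Combine three numbers into one value by giving each a fixed bit
   width: 16 bits per field here, so every value must be below 65536.
   The widths are an assumption; size them to your real value ranges. */
uint64_t pack3(uint64_t a, uint64_t b, uint64_t c)
{
    return a | (b << 16) | (c << 32);
}

/* Split the combined value back into the original three numbers. */
void unpack3(uint64_t packed, uint32_t *a, uint32_t *b, uint32_t *c)
{
    *a = (uint32_t)(packed & 0xFFFF);
    *b = (uint32_t)((packed >> 16) & 0xFFFF);
    *c = (uint32_t)((packed >> 32) & 0xFFFF);
}
```

With 10, 2850 and 565 the round trip recovers all three values exactly, since each fits comfortably in 16 bits; nothing is lost as long as every value stays inside its field.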

I plan to possibly also use this method for compressing data in datastores if it gets really big for some reason.

If you are looking to compress the data, I would suggest looking at either LZW or PKZIP-style compression methods. Both have their strengths and weaknesses. I’m sure there are Lua implementations out there you can use.

If you can read code, then 7-zip source code is available for download.

I’ve actually looked into that but it had me wondering if this would actually result in smaller data.

Because to compress data, you also need a dictionary of keys and values to decompress it later, and that dictionary takes up space too, which can make the compressed output larger than the original if the original is already small.

An array of vectors and numbers is already relatively small.

The problem is that if we want to “zip” it, the compression itself might shrink the data, but since we now also have to send a dictionary to the server for decompression, the dictionary might use more data than we saved, making the zipping inefficient and just slower.

Like I said, strengths and weaknesses. In some cases, it’s probably best to just leave things alone.

Updated title to be a bit more relevant and general because I might have to seek different solutions.

I did actually come across a library that uses remote events in a more optimized way but I don’t want to rely on 3rd party libraries since I might write my own library instead to use for multiple projects.

Trying to learn techniques instead of just copying what someone else already did.


If you’re sending a large amount of data at one time, then compression makes sense. The table involved is a Huffman-style frequency table: 256 integers containing the count for each byte value that appears in the data stream. You just send the table along with the compressed data and the receiving end recreates the tree.
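Building such a 256-entry frequency table is just a counting pass over the data; a sketch of that step (the function name is illustrative, not from any particular library):

```c
#include <stdint.h>
#include <string.h>
#include <stddef.h>

/* Count how often each byte value occurs in the data stream.
   This 256-entry table is what gets sent alongside the compressed
   payload so the receiving end can rebuild the coding tree. */
void buildFrequencyTable(const uint8_t *data, size_t len, uint32_t counts[256])
{
    memset(counts, 0, 256 * sizeof(uint32_t));
    for (size_t i = 0; i < len; i++)
        counts[data[i]]++;
}
```

Note the fixed cost: 256 counts sent as 4-byte integers is 1 KB of header, which is exactly why compressing tiny payloads can backfire.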

GZIP style algorithms work by creating a dictionary. However, to keep the dictionary small and dynamic, a new dictionary is created for every 8KB to 16KB of data or so. This allows the algorithm to adapt to content changes in the data stream. For example, take an ELF executable which has a multitude of different types of data. The binary data won’t compress very well. However, text data does compress well.

With data compression, the more data you have, the more it makes sense to use it. My understanding is that Roblox converts data to JSON before sending it on the wire. They might even compress and encrypt it. It would make sense to do so. Have you studied any materials relating to information theory and information content? I have some books on data compression. They are outdated, but they cover the basics of what you need to know.


Roblox converting stuff to JSON before sending it through a remote did leave me with questions.
I saw a post earlier that showed how much data every data type in Roblox used.

If I recall it was something like…
A number was 9 bytes.
A string was 2 bytes + its length.

A Vector3 was roughly 12 or 14 bytes, which surprised me because that implies that sending a Vector3 is cheaper than 3 separate numbers.
As 3 x 9 would be 27 bytes, which is way more than a Vector3 uses apparently.

What ESPECIALLY baffled me is that sending a boolean, a value that is normally only 1 bit and can only be 1 or 0, takes up…

4 bytes

Yes, a boolean apparently is 4 bytes, which left me wondering if the JSON theory is true,
because the word “true” on its own is already 4 characters long, which would account for the 4 bytes.

But “false” is actually 5 characters. If I recall, a boolean uses 4 bytes regardless of its value, which is really weird, because if Roblox truly does put everything in a JSON table then this should be 5 bytes, right?

I did stumble upon a public library/resource called BridgeNet2.
And apparently this library is really good at optimizing networking and I’m just trying to figure out how it manages to use less data somehow.

I’ve looked through its code on GitHub but can’t exactly find which script or which function does the “compression” or “optimizing” of data.
I don’t want to rely on 3rd party libraries because I want to develop my own at some point likely and not straight up copy what someone else wrote.
I seek to learn how things work so I can eventually pass on the knowledge.

Blank remote call: ~9 bytes

string (len 0): 2 bytes
string (len 1): 4 bytes
string (len 2): 8 bytes
string (len 3): 9 bytes
string (len 4): 10 bytes
string (len 5): 11 bytes
string (len 6): 12 bytes
string (len 8): 14 bytes
string (len 16): 22 bytes
string (len 32): 36 bytes

boolean: 2 bytes
number: 9 bytes

table (empty): 2 bytes
table (array with 4 numbers): 38 bytes

EnumItem: 4 bytes
Vector3: 13 bytes
CFrame (axis aligned): 14 bytes
CFrame (random rotation): 20 bytes

I did find this post, by @Tomarty.

Oh, here it apparently says a boolean is just 2 bytes, huh? Maybe I got 2 sources mixed up.
But that is still a lot of bytes for something that is basically only on or off.

A CFrame apparently is 20 bytes, which also absolutely blows my mind, because CFrames hold a position, which is 3 values, AND a rotation, which also has 3 or 4 values depending on whether it uses Euler angles or quaternions.

6 numbers should be 6 x 9, right? funny sum.
Wouldn’t that be 54 bytes? That’s more bytes than a string with 32 characters.
Howwwwwww?

Does this imply I could compress strings by putting characters inside CFrame components?
The more I learn, the less I seem to know about the subject.
Seems like I might not know as much as I thought I did.
The Roblox engine surely has its mysteries under the hood.


Here’s the thing. I ran into a problem where some datatypes are fixed size. A Vector3 for instance, even though its components are numbers, isn’t big enough to store a UserId (yes, I tried it). So based on this and other things, I’m thinking that it’s either a modified form of JSON or a proprietary data format. I’m leaning towards the latter.

struct dataframe
{
	int datatype;
	uint32_t size;
	char data[1];
};

Basically, the datatype determines the size of the data. Then it’s placed in a memory buffer and the entire buffer is sent. They might even be using a union to do this. In any case, this is a common hack in C/C++ when dealing with different types and lengths of data within the same memory buffer.


#define		TYPE_STRING	27

/* getDatatypeSize() is assumed to be defined elsewhere:
   it returns the fixed size of a non-string datatype. */
uint32_t getDatatypeSize(int datatype);

char buffer[65536];
uint32_t index = 0;

uint32_t packData(char *buffer, uint32_t index, int datatype, const void *data)
{
	/* Setup */
	uint32_t size;
	uint32_t i;
	struct dataframe *dfptr;
	char *charptr;
	uint32_t txa;

	/* Set the pointer to the next free spot in the buffer */
	dfptr = (struct dataframe *)(buffer + index);

	/* Determine the size of the data */
	if (datatype == TYPE_STRING)
	{
		size = (uint32_t)strlen(data);
	}
	else
	{
		size = getDatatypeSize(datatype);
	}

	/* Fill the structure */
	dfptr->datatype = datatype;
	dfptr->size = size;

	/* Copy the data over */
	charptr = dfptr->data;
	for (i = 0; i < size; i++)
	{
		charptr[i] = ((const char *)data)[i];
	}

	/* Compute the number of bytes written */
	txa = sizeof(int) + sizeof(uint32_t) + size;

	/* Return */
	return txa;
}

Something like that. It’s in C, but that’s how I would do it. The return value gets added to the index so it’s pointing at the byte after the structure in the buffer. That would be the most expeditious way to do it without compression. This works for both fixed and variable data types, although the only variable-size data type is string.
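For completeness, the receiving side would walk the buffer the same way. A hedged sketch, reusing the same struct dataframe layout and ignoring alignment concerns, just as the pack side does:

```c
#include <stdint.h>
#include <string.h>

struct dataframe {
	int datatype;
	uint32_t size;
	char data[1];   /* variable-length tail (classic C struct hack) */
};

/* Read one frame at buffer + index and return the number of bytes it
   occupies, so the caller can advance the index to the next frame. */
uint32_t readFrame(const char *buffer, uint32_t index,
                   int *datatype, uint32_t *size, const char **data)
{
	const struct dataframe *dfptr = (const struct dataframe *)(buffer + index);
	*datatype = dfptr->datatype;
	*size = dfptr->size;
	*data = dfptr->data;
	return sizeof(int) + sizeof(uint32_t) + dfptr->size;
}
```

The caller loops, adding the returned count to the index, until the index reaches the total buffer length.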

As for boolean values taking 4 bytes, there’s a reason for that. CPUs don’t normally work with individual bytes in memory; they read an entire word at a time (it’s actually more complicated than that — data moves between memory and the CPU in full cache lines, typically 64 bytes, but that’s a topic for another conversation). Because of this, compilers often assign an entire machine word to a boolean. There are also issues with memory address alignment. In many CPUs since the 80386 era, the lower-order address bits (A0, A1) aren’t present on the bus, so memory is addressed on a 32-bit boundary, and data structures are aligned on 4-byte boundaries to match. It’s actually more efficient hardware-wise because of the way memory is organized: a memory module is built from several chips working in parallel, each contributing part of a 32- or 64-bit wide word, so it makes sense to do it this way because of parallelism.
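You can observe this padding directly with sizeof: a one-byte bool followed by an int gets padded so the int lands on a 4-byte boundary. The sizes below assume a typical platform with a 4-byte int:

```c
#include <stdbool.h>
#include <stddef.h>

/* The bool is 1 byte on its own, but the compiler inserts 3 bytes of
   padding so the int that follows stays aligned on a 4-byte boundary,
   making the whole struct 8 bytes instead of 5. */
struct padded {
	bool flag;   /* 1 byte              */
	             /* 3 bytes of padding  */
	int  value;  /* 4 bytes, 4-aligned  */
};
```

So even though the bool itself only needs a bit, alignment rules round its footprint up.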

Remember that cache line? When data is read from disk, a full memory page (4096 bytes) gets read into memory, aligned on a page boundary. Once in memory, the cache hardware reads the data in cache-line-sized blocks (typically 64 bytes) depending on what the CPU is addressing. The data propagates from main memory to the L3 cache (if equipped), then the L2 cache, then the L1 cache. On older designs the L3 and L2 caches ran at FSB (Front Side Bus) speed, while the L1 cache runs at CPU core speed since it’s on the CPU die itself. Once in the L1 cache, the CPU can access the data a word at a time; once the data is in a CPU register, it can access individual bytes using byte register operands on the instructions. Furthermore, there are separate L1 caches for instructions and data.


This does have me wondering.
Since a boolean must apparently use a fixed number of bytes anyway:

Wouldn’t it be possible to map 8 - 16 booleans to just a single set of bytes though? Why not do that?
I could see maybe memory addressing and whatnot becoming a problem.

But say you have an environment in which the same 8 - 16 booleans are always present and used.
Why not let them all share the same memory address and let them each use one bit of a byte?

I feel like you could already do this by allocating just enough memory for an 8-bit or 16-bit number and manipulating/reading the individual bits in a C++ program?

My post covers some bit manipulation and how I put head rotation into a single 8-byte number and compressed it to 4 bytes, for maximum efficiency, so it can be sent through the remote event as a very small payload.

In addition, I covered binary in these messages.


In regard to the post:

Each value in the Vector3 must have a limit in order to be compressed the way you want. Since you want each value to fit in 10 bits, each value is limited to the range 0 to 1023, so you must shift your values to fit between 0 and 1023; the server, if you want, can then subtract 512 from each parsed value to turn the 0 to 1023 range into -512 to 511.

Now that each value is 10 bits, you can use bit32.replace and bit32.extract (I prefer using bit32.band and bit32.rshift myself) to get the 3 Vector3 values:

-- v3Encoded is the packed 30-bit value received over the network
local X = bit32.extract(v3Encoded, 0, 10)
local Y = bit32.extract(v3Encoded, 10, 10)
local Z = bit32.extract(v3Encoded, 20, 10)

-- subtract 512 from each value to map the 0 to 1023 range back to -512 to 511
local v3 = Vector3.new(X - 512, Y - 512, Z - 512)
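The encode side of the same layout can be sketched in C for clarity (Luau’s bit32 functions operate on the same bit positions); the ±512 bias maps the signed −512..511 range into the storable 0..1023 range:

```c
#include <stdint.h>

/* Pack three signed values in -512..511 into one 30-bit integer,
   10 bits per axis, by biasing each into the 0..1023 range first. */
uint32_t encodeV3(int x, int y, int z)
{
	uint32_t ux = (uint32_t)(x + 512);
	uint32_t uy = (uint32_t)(y + 512);
	uint32_t uz = (uint32_t)(z + 512);
	return ux | (uy << 10) | (uz << 20);
}

/* Reverse the packing: mask out each 10-bit field, then remove the bias. */
void decodeV3(uint32_t packed, int *x, int *y, int *z)
{
	*x = (int)(packed & 1023) - 512;          /* bits 0-9   */
	*y = (int)((packed >> 10) & 1023) - 512;  /* bits 10-19 */
	*z = (int)((packed >> 20) & 1023) - 512;  /* bits 20-29 */
}
```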

In another post in this thread I already went over this; then it somehow drifted off topic into floating point and its accuracy.

I didn’t know bit32.extract existed, so I used another hacky way.

I have code in C that already does that. You can fit 32 boolean values in a uint32_t. 64 values in a uint64_t. You just mask off the bits. Values like 0x80000000 will set/clear bit 31 of a 32-bit variable (bit counts start at 0).

#define MASK 0x80000000

uint32_t variable;

/* Set Bit */
variable |= MASK;

/* Clear Bit */
variable &= ~MASK;

And that’s it. Of course I have functions in a library to do this, which use a lookup table for the masks for performance reasons. C/C++ also provides bit-field definitions for structs, but the code the compiler generates for those is slow and clunky, which is why just about everyone uses the above method. It’s fast and simple because it breaks down to a single AND or OR instruction on the CPU.
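A minimal sketch of the lookup-table variant described above — the function names are mine, not from any particular library:

```c
#include <stdint.h>
#include <stdbool.h>

/* Precomputed masks: maskTable[n] has only bit n set. */
static uint32_t maskTable[32];

void initMaskTable(void)
{
	for (int i = 0; i < 32; i++)
		maskTable[i] = 1u << i;
}

/* 32 boolean flags share one uint32_t; each call is a single AND/OR. */
void setBit(uint32_t *flags, int bit)   { *flags |=  maskTable[bit]; }
void clearBit(uint32_t *flags, int bit) { *flags &= ~maskTable[bit]; }
bool testBit(uint32_t flags, int bit)   { return (flags & maskTable[bit]) != 0; }
```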

This might actually be super useful if I just need degrees and whatnot compressed into less data.
Thank you for your input!

I suppose, since the server already pretty much knows what item a player is holding, for example, that we can predict the distance it will travel.
So perhaps I could get away with just sending a start position and an angle to fire at.

For things like shotguns I might be able to use Perlin Noise.

I was thinking, since we already know where a player is gonna shoot and what direction.
We can simply use noise or RNG with a seed to get a somewhat accurate prediction of let’s say… 20 pellets?

Shotguns are one of the biggest problems I might face due to the fact they could fire 10 projectiles.
And if it’s an auto-shotgun, it might fire 30 projectiles in under a second, which is a lot of data to send through.

For this matter I was considering only giving the server the position + rotation, doing the projectile simulation on the server, and having all clients predict the projectiles using noise or seed-based RNG with fixed values to generate the random spread for every projectile.
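The seed trick works because a deterministic PRNG yields the same sequence everywhere it runs with the same seed — so the spread never has to be transmitted at all. A sketch using a simple LCG (the constants are the common Numerical Recipes ones; the pellet/angle representation is purely illustrative):

```c
#include <stdint.h>

/* A tiny deterministic LCG: the same seed always yields the same
   sequence, so client and server can generate identical pellet spread
   without sending the individual angles over the network. */
static uint32_t lcgNext(uint32_t *state)
{
	*state = *state * 1664525u + 1013904223u;
	return *state;
}

/* Fill `angles` with `pellets` spread offsets in the range
   [-maxSpread, +maxSpread] (units are up to the caller, e.g.
   thousandths of a degree). */
void generateSpread(uint32_t seed, int pellets, int maxSpread, int *angles)
{
	uint32_t state = seed;
	for (int i = 0; i < pellets; i++)
		angles[i] = (int)(lcgNext(&state) % (2u * (uint32_t)maxSpread + 1u)) - maxSpread;
}
```

Both sides call generateSpread with the seed for that shot (e.g. derived from a shot counter) and get identical pellet directions.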

Not sure if this method is ideal though since every projectile hit would still have to replicate and there might be issues where client and server predictions are slightly off.


Here’s how I would do it:

On the server, calculate 100 firing angles (which are also sent to the client and reused for every weapon).
Then calculate the player viewmodel nozzle position based on their information from playerPing/1000 seconds ago. Calculate this position based on whichever X direction the player was facing with their HumanoidRootPart at that time, plus a 4-byte number (a Y angle between 0 and 180, compressed into 4 bytes using my number module from the post I was talking about; 90 is subtracted to give a range of -90 to 90 degrees).
Then you use the predetermined firing angle the client used and raycast on the server to see what got hit.
On the server, the number of projectiles shot from each weapon is predetermined, sent to the client, and reused.

I probably wasn’t clear when explaining this; it’s really difficult to explain.


The question is why are you sending all that data to the server? In my gun kit, no matter what weapon the player is using, it sends two values to the server:

  1. The 3D world coordinates of where the mouse pointer is at (for a TPS shooter like mine, that’s at the center of the screen).
  2. The CFrame of the player’s camera.

The server takes the coordinates and generates 9 projectiles with a radial spread and raycasts to those new points. That handles the spread from the shot. Then it checks the impacts. If a pellet impacted a player, then it does damage to the player.

The reason why I did it this way was for security. Shooting games are inherently insecure because they require precise information from the client which can be spoofed, hence aimbots.

I shall give some context.

So my current project is essentially a top-down shooter, meaning most things technically only need to happen aligned to a plane.
Though I might apply some of the logic to a 3D game later since I’d just have to use one extra axis.

But this top-down shooter needs to be REALLY fast with absolutely MINIMAL latency, because it was inspired by games like Hotline Miami.

I intend for bullets to be dodge-able projectiles rather than long hitscans.
Everything in the game has to be as fast and optimized as possible since you can essentially run really fast and only have about the blink of an eye to dodge an incoming projectile.

Shotguns are supposed to fire 5 - 10 projectiles in a single shot, each of them having random spread and being able to dodge them is also essential, they move slower than machine gun / handgun projectiles.

Some projectiles might hit earlier than others as some pellets could hit the corner of a wall while the other pellets fly past it.
Projectiles might not hit in order or at the same time but are fired at the same time.

I want to optimize this project as well as possible since players will essentially die in just 2 - 3 hits.
You only have 2 - 3 hitpoints, some weapons even one-shot which is also dependent on range.

And I want to minimize input lag or frustration caused by delayed projectiles or being hit by invisible things that moved too fast for other clients to register.

I initially thought “maybe sending less data and as small as possible solves long remote event queues or slow packet delivery”.
At this point I’m kind of unsure what the most ideal solution would be for optimizing remotes here.

Players will also have states, which might be attributes since they replicate from server to client.
Basically one attribute which we’ll name “state” for now…

If this attribute is set to 0, the player is standing, if it’s set to 1, the player is stunned/laying down.
If the value is 2, the player is interacting with an object from which they cannot move.
If its value is -1, the player is dead; if -2, the player is dead + gibbed; -3 if the player burned to death, etc, etc…

And it goes on like that, a single state value that tells everyone else what the player is doing (because you cannot do or be more than 1 thing at a time).
All animations and graphic effects / sounds are client-side and whatever needs to happen is supposedly communicated through this one single state attribute.

But I wonder if there is a better and more optimal way to do it.

Unless you are doing everything as 2D GUIs, you are working in the 3D environment. I would still have the client just send the coordinates of the shot target and let the server figure out the rest. FPS games do not really suffer from input lag. You get lag for two reasons.

  1. Server Congestion
  2. Network Congestion

Both are generally self explanatory, and the only one that you need to worry about is server congestion.


I’m looking further into this method of combining numbers.
Roblox finally has buffers now which is neat and makes combining numbers easy!

However, to make the data as small as possible, I found out that I sometimes need to combine numbers that are smaller than a byte.

For a top-down shooter I’ve considered using 24-bit numbers for a projectile’s position
and 1 byte + 1 bit for its rotation (512 possible values).

Though by doing this, 1 byte remains with 7 unused bits, because rotation takes up one byte + 1 bit.

But 7 bits still leaves enough room for 128 possible values, which I can use to assign a unique ID to a projectile, since projectiles might not get destroyed in the same order that they spawn in.

So I got 2 bytes in a buffer.
I wonder if I can use the bit32 library to fit a 9-bit and a 7-bit number together and be able to separate them later.

If I do this right I could send weapon projectile data to the server in a small package that’s just 8 bytes in size!
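That 9-bit + 7-bit split fits exactly into 2 bytes. Here it is sketched in C (bit32.band/bit32.rshift in Luau do the same job on these widths; the field layout is an assumption):

```c
#include <stdint.h>

/* Pack a 9-bit rotation (0..511) and a 7-bit projectile ID (0..127)
   into a single 16-bit value: rotation in bits 0-8, ID in bits 9-15. */
uint16_t packRotId(uint16_t rotation, uint16_t id)
{
	return (uint16_t)((rotation & 0x1FF) | ((id & 0x7F) << 9));
}

/* Separate the two fields again by masking and shifting. */
void unpackRotId(uint16_t packed, uint16_t *rotation, uint16_t *id)
{
	*rotation = packed & 0x1FF;        /* low 9 bits  */
	*id       = (packed >> 9) & 0x7F;  /* high 7 bits */
}
```

The packed value can then be written into the buffer as a single 16-bit integer alongside the 24-bit position fields.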