Data Compression for DataStore & MessagingService?

TL;DR: I need a module to compress data for DataStore and MessagingService
(well, not just me, but the community too)

More Information

For most games DataStore’s String size limit wouldn’t really be a concern but in some cases you might need to save a lot of Data for Customization, House Building, Items & anything you can think of that requires a lot of space.

I know that you can use multiple keys and scopes with DataStore, but I’d rather not get involved with that because of its limitations and current reliability, so let’s not talk about that.

Indeed, you can and should optimize your data structure to decrease its size, but that’s not enough.

I have searched the DevForum and found 3 notable data compression modules, but they are old, some aren’t easy to implement into an existing game, and I’m not sure whether they are compatible with saving Instances in a certain way.

How do I build a data compression module that works well with:

  • Saving Data for House Building (Instances, CFrame, Color, Name, etc)
  • Saving Data for a lot of Tables, Strings, Numbers, Booleans
  • And, most importantly, being easy to implement in an existing game

This is also useful for MessagingService due to its current data size limits.

I’m not entirely sure how to make one, hence this thread.

3 Likes

You are going to have a hard time finding a highly generalized solution that offers both a significant compression ratio and the kind of reliability you need for a game. It is dangerous to rely on ‘black-box’ compression tools where you cannot predict if they will fail to reduce your data down to an appropriate size (which would result in data loss).

Optimizing your data structure, as you mention, is the correct approach. You need to think carefully about exactly what data you need to store. Here are some examples of things you could consider:

  • Saving the contents of placed items within a house may be done more effectively using numeric identifiers instead of strings (e.g. 12 instead of "ModernStyleCupboard");
  • Saving some data is unnecessary, such as the size of objects whose size can be calculated in-game.
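
The numeric-identifier idea from the first bullet could look something like the sketch below. The catalogue contents and the function names (`encodeItems`, `decodeItems`) are made up for illustration; the only real requirement is that an id never changes once data has been saved with it.

```lua
-- Hypothetical item catalogue; names and ids are illustrative only.
-- Append new items at the end so existing ids never change.
local ITEM_NAMES = { "ModernStyleCupboard", "OakTable", "FloorLamp" }

-- Build the reverse lookup (name -> id) once.
local ITEM_IDS = {}
for id, name in ipairs(ITEM_NAMES) do
	ITEM_IDS[name] = id
end

-- Replace each item name with its numeric id before saving.
local function encodeItems(names)
	local ids = {}
	for i, name in ipairs(names) do
		ids[i] = assert(ITEM_IDS[name], "unknown item: " .. tostring(name))
	end
	return ids
end

-- Turn saved ids back into item names after loading.
local function decodeItems(ids)
	local names = {}
	for i, id in ipairs(ids) do
		names[i] = assert(ITEM_NAMES[id], "unknown id: " .. tostring(id))
	end
	return names
end
```

Saving `{2, 1}` instead of `{"OakTable", "ModernStyleCupboard"}` costs a few characters per item rather than the full name.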

There are other considerations that may be relevant, but this depends on what kind of data you are working with (again, this is why generalized solutions like the modules you mention will likely not be helpful).

12 Likes

In addition to the above, if you want to send a table of data, you can turn it into a string for transfer or saving, and then turn it back into a table later. By default this will be done using the JSON methods in HttpService, but you can do it manually if you want to pack more stuff in.

Here’s an example where I make a really compact string to store an array of boolean values:

local array = {true, false, false, true, true, true, false}

local output = ""
for _, value in ipairs(array) do
    output = output .. (value == true and "1" or "0")
end
print(output) --> 1001110

It’s already shorter than the shortest JSON output:

[1,0,0,1,1,1,0]

You can do much, much better of course, but this is the idea.
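
As one possible way of doing better, here is a rough sketch that packs six booleans into each printable character instead of one. The alphabet, the length prefix and the function names are my own assumptions, not part of any Roblox API; treat it as a starting point rather than a finished format.

```lua
-- 64 printable characters, so each one can carry 6 bits.
local ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"

local function packBooleans(array)
	local chars = {}
	for start = 1, #array, 6 do
		local value = 0
		for offset = 0, 5 do
			value = value * 2
			-- Entries past the end of the array are nil and count as false (padding).
			if array[start + offset] then
				value = value + 1
			end
		end
		chars[#chars + 1] = ALPHABET:sub(value + 1, value + 1)
	end
	-- The length prefix lets the decoder discard the padding bits.
	return tostring(#array) .. ":" .. table.concat(chars)
end

local function unpackBooleans(packed)
	local count, body = packed:match("^(%d+):(.*)$")
	count = tonumber(count)
	local array = {}
	for i = 1, #body do
		-- Plain find (4th argument true) so "+" is not treated as a pattern character.
		local value = ALPHABET:find(body:sub(i, i), 1, true) - 1
		for bit = 0, 5 do
			local index = (i - 1) * 6 + bit + 1
			if index <= count then
				array[index] = math.floor(value / 2 ^ (5 - bit)) % 2 == 1
			end
		end
	end
	return array
end

print(packBooleans({true, false, false, true, true, true, false})) --> 7:nA
```

For seven values the saving is small, but 600 booleans would fit in about 100 characters plus the prefix.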

3 Likes

I’m working on my own algorithm to solve this issue… Sadly we’re limited to bytes <= 127 which doubles the final size. Thankfully that won’t stop compression!

If you’d like to look at the source code of my current version I can send it to you later today.

1 Like

More than doubles the final size - another reason to be wary of generalized compression solutions in the Roblox context.

1 Like

As @sircfenner stated, it’s very hard to find a single method that works for everything, so I’ll offer a specific one:

An obscure way to save space when dealing with CFrames is to convert unit vectors into spherical coordinates. This drops one number for every unit vector, so if you’re saving the orientation of 1000 objects, and assuming a float is 4 bytes, you’d be dropping 4KB. That assumes you’d be saving the data as raw floats, though, which you won’t be: the numbers get saved as text. In reality, you’ll be dropping up to 32 bytes per unit vector, or 32KB for every 1000 objects.

This works because the length of a unit vector, called rho, is always 1 and can therefore be omitted from the saved data. Here’s a close approximation of the snippet from my framework that I use to do just this:

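local acos, atan2, sin, cos = math.acos, math.atan2, math.sin, math.cos

-- Returns rho (length), theta (angle from the +Z axis), phi (angle in the XY plane).
local function cartesianToSpherical(x, y, z)
	local rho = (x^2 + y^2 + z^2)^0.5
	return rho, acos(z / rho), atan2(y, x)
end

-- Inverse of cartesianToSpherical; pass rho = 1 for a unit vector.
local function sphericalToCartesian(rho, theta, phi)
	return rho*sin(theta)*cos(phi), rho*sin(theta)*sin(phi), rho*cos(theta)
end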
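
For a unit vector specifically, only the two angles need to be stored. Here is a self-contained sketch of that round trip; the names `packUnitVector` and `unpackUnitVector` are hypothetical, and it assumes the input really is unit length.

```lua
-- math.atan2 exists in Lua 5.1 and Luau; Lua 5.3+ uses two-argument math.atan instead.
local atan2 = math.atan2 or math.atan

-- Assumes (x, y, z) is already unit length, so rho is dropped entirely.
local function packUnitVector(x, y, z)
	-- Clamp z to guard against tiny floating-point overshoot outside [-1, 1].
	local theta = math.acos(math.max(-1, math.min(1, z)))
	local phi = atan2(y, x)
	return theta, phi
end

-- Rebuilds the three components from the two stored angles (rho is implicitly 1).
local function unpackUnitVector(theta, phi)
	local sinTheta = math.sin(theta)
	return sinTheta * math.cos(phi), sinTheta * math.sin(phi), math.cos(theta)
end

local theta, phi = packUnitVector(0.6, 0, 0.8)
local x, y, z = unpackUnitVector(theta, phi)
print(x, y, z) -- approximately 0.6, 0, 0.8
```

The rebuilt values are only equal to the originals up to floating-point error, so compare with a small tolerance rather than `==`.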

Also note that JSONEncode will encode 0.2 as “0.200000000000000011102230246252”. If you don’t need the precision, round your numbers before saving them.
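
One way to cut the precision is to store numbers as scaled integers and divide them back out on load. This is only a sketch; `quantize`, `dequantize` and the choice of two decimals are assumptions for the example, so pick whatever precision your data actually needs.

```lua
-- Store a number as an integer with a fixed number of decimal places.
local function quantize(value, decimals)
	local scale = 10 ^ decimals
	return math.floor(value * scale + 0.5) -- round to the nearest step
end

-- Recover the (rounded) number from the stored integer.
local function dequantize(stored, decimals)
	return stored / 10 ^ decimals
end

local stored = quantize(0.2, 2)    -- 20, which serializes as two characters
local restored = dequantize(stored, 2)
print(stored, restored)
```

The restored value is rounded to the chosen number of decimals, so this only suits data where that loss is acceptable (positions on a building grid, for example).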

4 Likes

This may be way over the top (and may not even work if the auto-encoder can’t reconstruct your data correctly), but auto-encoder neural networks can actually help provide a solution to this. It will take time, but you can REALLY compress things with this method.

You can watch this video to see if an auto-encoder will fit your needs (timestamp 5:53 specifically):

Yes, it is about dancing, but he runs into a problem where a file is way too big, so he has to compress it. It’s a good video, trust me.

2 Likes