Hello, I’m looking for a Roblox library that can compress actual text (sentences, etc.) for datastore saving. So far, every library I’ve found is only good at compressing numbers into a text representation (and thus reducing the length of the string), but not actual text. In fact, they sometimes make the text longer than the original input.
What I thought of is the following:
Step 1) Specify the charset of allowed characters (for example, only English letters, numbers, spaces, question marks, and dots)
Step 2) Convert the string to a decimal number using the alphabet length as the base (so if the charset has 40 characters, convert from base 40 to base 10)
Step 3) Convert the number from decimal to base 93 to reduce the size.
Additional steps:
Determine frequent words/segments/letters and use Huffman codes
Use other cool compression methods I’m not aware of
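Steps 1 to 3 can be sketched roughly as follows. This is only a minimal illustration, not an existing library: the 39-symbol input charset, the choice of the 93 output characters (printable ASCII without the double quote and the backslash), and the function names encode, decode, and convertBase are all assumptions. The number is kept as a list of digits and converted with long division, because a whole sentence does not fit in a single Lua number, and a sentinel digit is prepended so that leading "a" characters (digit 0) survive the round trip.

```lua
-- Hypothetical input charset (39 symbols): lowercase letters, digits, space, "?" and "."
local INPUT_CHARSET = "abcdefghijklmnopqrstuvwxyz0123456789 ?."

-- Assumed output alphabet: printable ASCII 32..126 without '"' (34) and '\' (92), 93 symbols.
local outputChars = {}
for code = 32, 126 do
    if code ~= 34 and code ~= 92 then
        table.insert(outputChars, string.char(code))
    end
end
local OUTPUT_CHARSET = table.concat(outputChars)

-- Build a character -> digit lookup (digits are 0-based) for a charset string.
local function buildLookup(charset)
    local lookup = {}
    for i = 1, #charset do
        lookup[charset:sub(i, i)] = i - 1
    end
    return lookup
end

local INPUT_LOOKUP = buildLookup(INPUT_CHARSET)
local OUTPUT_LOOKUP = buildLookup(OUTPUT_CHARSET)

-- Re-express a big number (most significant digit first) in another base
-- by repeated long division. Simple but O(n^2) in the number of digits.
local function convertBase(digits, fromBase, toBase)
    local result = {}
    while #digits > 0 do
        local quotient, remainder = {}, 0
        for _, digit in ipairs(digits) do
            local acc = remainder * fromBase + digit
            local q = math.floor(acc / toBase)
            remainder = acc % toBase
            -- Skip leading zeros so the loop terminates.
            if #quotient > 0 or q > 0 then
                table.insert(quotient, q)
            end
        end
        table.insert(result, 1, remainder)
        digits = quotient
    end
    return result
end

local function encode(text)
    local digits = { 1 } -- sentinel digit, removed again in decode
    for i = 1, #text do
        local char = text:sub(i, i)
        local digit = INPUT_LOOKUP[char]
        assert(digit, "character not in charset: " .. char)
        table.insert(digits, digit)
    end
    local packed = {}
    for _, digit in ipairs(convertBase(digits, #INPUT_CHARSET, #OUTPUT_CHARSET)) do
        table.insert(packed, OUTPUT_CHARSET:sub(digit + 1, digit + 1))
    end
    return table.concat(packed)
end

local function decode(packed)
    local digits = {}
    for i = 1, #packed do
        table.insert(digits, OUTPUT_LOOKUP[packed:sub(i, i)])
    end
    local unpacked = convertBase(digits, #OUTPUT_CHARSET, #INPUT_CHARSET)
    local chars = {}
    for i = 2, #unpacked do -- start at 2 to drop the sentinel
        local digit = unpacked[i]
        table.insert(chars, INPUT_CHARSET:sub(digit + 1, digit + 1))
    end
    return table.concat(chars)
end

local sample = "hello world. is this shorter?"
local packed = encode(sample)
print(#sample, #packed, decode(packed))
```

With these two alphabets the best case is about log(39) / log(93), or roughly 81% of the original length, so the gain from base conversion alone is modest. The quadratic long division would also be slow on megabytes of text, so a real implementation would likely convert fixed-size blocks instead.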
PS: Please tell me someone has done it and I don’t have to reinvent the wheel
How much data are you planning on saving / what’s the purpose? Even thinking of using Huffman codes for text compression when you get 4MB of space per key sounds unnecessary.
The problem I’m facing is kind of unusual for Roblox, and maybe that’s why I can’t find solutions. What the users create (for example a home, but simpler) sits at a permanent location in the server that everyone can see. So there are two data stores: the player perspective, where I don’t have to compress anything because of the huge amount of space, and the public perspective, where I have to load information about every user’s house within a radius (something like chunking). To achieve this, I have to compress the information of each “house” as much as possible, so that the size of a chunk stays far enough below 4MB that I can be sure an overflow won’t happen (something like 2MB maximum under pressure). I’m also trying to load large chunks at once instead of small ones, so that I make fewer data store requests when loading the map around the players.
PS: Imagine a data store where each key is a chunk, a part of the map instead of a specific user.
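A chunk-keyed layout like this could be sketched as below. It is only an illustration under assumptions: the chunk size of 512 studs, the key format, and the names chunkKey, chunksInRadius, and fitsBudget are made up for the example, and the 2MB budget is the figure mentioned above. The functions are plain Lua and only compute keys and check sizes; the actual data store calls are left out.

```lua
local CHUNK_SIZE = 512                  -- hypothetical chunk edge length in studs
local MAX_CHUNK_BYTES = 2 * 1024 * 1024 -- 2MB safety budget per chunk key

-- Map a world position to the data store key of the chunk containing it.
local function chunkKey(x, z)
    local cx = math.floor(x / CHUNK_SIZE)
    local cz = math.floor(z / CHUNK_SIZE)
    return string.format("chunk_%d_%d", cx, cz)
end

-- List the keys of every chunk touched by a square of the given radius
-- around a position, so each one can be loaded with a single request.
local function chunksInRadius(x, z, radius)
    local keys = {}
    local minX = math.floor((x - radius) / CHUNK_SIZE)
    local maxX = math.floor((x + radius) / CHUNK_SIZE)
    local minZ = math.floor((z - radius) / CHUNK_SIZE)
    local maxZ = math.floor((z + radius) / CHUNK_SIZE)
    for cx = minX, maxX do
        for cz = minZ, maxZ do
            table.insert(keys, string.format("chunk_%d_%d", cx, cz))
        end
    end
    return keys
end

-- Check a serialized chunk against the budget before saving it.
local function fitsBudget(serializedChunk)
    return #serializedChunk <= MAX_CHUNK_BYTES
end

print(chunkKey(100, -30), #chunksInRadius(0, 0, 100), fitsBudget("1,-2,3,4,4,4"))
```

Larger chunks mean fewer keys per radius and therefore fewer requests, at the cost of each key sitting closer to the size budget.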
I see, you’re trying to compress as much as possible not to avoid limits but for efficiency.
I’m not proficient with any compression libraries, as the only one I’ve ever used was here. I don’t know what you’ve done so far, so I recommend focusing on heavy serialization and game choices that influence data size before thinking about compression. If you have any questions about that, I can surely answer them.
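As an illustration of what heavy serialization can mean here, the sketch below stores only the numbers of a house and drops every field name and constructor call. The house layout (position and size tables with x, y, z fields), the rounding to whole studs, and the names serializeHouse and deserializeHouse are assumptions made for the example, and plain tables stand in for Vector3 so the snippet runs outside Roblox.

```lua
-- Round to the nearest whole number; assumes whole-stud precision is enough.
local function round(value)
    return math.floor(value + 0.5)
end

-- Pack a house into "px,py,pz,sx,sy,sz": values only, in a fixed order.
local function serializeHouse(house)
    return string.format(
        "%d,%d,%d,%d,%d,%d",
        round(house.position.x), round(house.position.y), round(house.position.z),
        round(house.size.x), round(house.size.y), round(house.size.z)
    )
end

-- Rebuild the house table from the packed string, relying on the fixed order.
local function deserializeHouse(packed)
    local values = {}
    for field in packed:gmatch("[^,]+") do
        table.insert(values, tonumber(field))
    end
    return {
        position = { x = values[1], y = values[2], z = values[3] },
        size = { x = values[4], y = values[5], z = values[6] },
    }
end

local house = {
    position = { x = 1, y = -2, z = 3 },
    size = { x = 4, y = 4, z = 4 },
}
local packed = serializeHouse(house)
print(packed, #packed)
```

Because the field order is fixed, the reader never needs the keys, and a general-purpose compressor then has less redundant text to deal with.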
Oh wow! What I was failing to see is the effect the library has on the data as it grows. Using my silly base93 method, I was able to reduce 7.3MB of data down to 3.6MB (the data was a bunch of randomly generated houses), but with the module I was able to bring it down to only 900k, and at a faster speed. Thank you.
I know this has been resolved, but here’s what I managed to do:
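local str = "Position = Vector3.new(1, -2, 3), Size = Vector3.new(4, 4, 4)"
-- Split the string into a table of single characters.
local tab = {}
for char in str:gmatch(".") do
    table.insert(tab, char)
end
-- Convert each character to its byte value.
-- Note: this maps characters to numbers; joined with spaces, the output is
-- longer than the input, so it encodes the text rather than compressing it.
local function compress(to_compress)
    local compressed = {}
    for _, char in ipairs(to_compress) do
        table.insert(compressed, char:byte())
    end
    return compressed
end
-- Convert each byte value back to a character and join the result.
local function decompress(to_decompress)
    local decompressed = {}
    for _, byte in ipairs(to_decompress) do
        table.insert(decompressed, string.char(byte))
    end
    return table.concat(decompressed)
end
local compressed = compress(tab)
local decompressed = decompress(compressed)
-- Join the byte values for printing; printing the table itself only shows its address.
print(table.concat(compressed, " "), decompressed)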