Introduction
I did some digging and found a pure-Lua implementation of the zlib/deflate compression library. After forking the code and editing it a bit, I got it working in Luau. From there, I built an easy-to-use compression library that takes an input string and returns a compressed string.
I won’t go too in-depth about how zlib/deflate works (you can find many articles online), but following @1waffle1’s lead, I took the source code of all the chat and camera scripts, concatenated it into one string, and compressed 286748 characters into 58082 characters in 0.370 seconds with level 9 compression. With level 1 compression, the same 286748 characters compressed into 73702 characters in 0.08 seconds.
This library offers compression levels ranging from 0 (no compression) to 9 (most compression). Higher levels generally take more time but produce smaller output, so determine the best level for your specific use case. Here are my benchmark statistics (level : time : input size → output size, in characters):
level 0 : 10.4ms : 286748 → 286788
level 1 : 80.2ms : 286748 → 73702
level 2 : 85.5ms : 286748 → 70070
level 3 : 101.8ms : 286748 → 67777
level 4 : 137.0ms : 286748 → 62589
level 5 : 182.2ms : 286748 → 59749
level 6 : 274.1ms : 286748 → 58421
level 7 : 323.3ms : 286748 → 58160
level 8 : 383.8ms : 286748 → 58082
level 9 : 369.9ms : 286748 → 58082
As you can see, higher levels generally produce a smaller string but take longer. There are also diminishing returns: going from level 0 to level 4 saves far more than going from level 4 to level 9 (levels 8 and 9 even produce the same 58082 characters here), and there is a large jump in compression time from level 5 to level 6.
It is important to note that compression relies on repetition, so the number of characters saved depends heavily on how much of the string repeats.
Disclaimer: This uses a simplified version of the zlib algorithm in a pure-Lua implementation. Higher levels are not guaranteed to take longer, nor are they guaranteed to produce smaller output than a lower level, though it is highly probable that this will be the case. Also, do not rely on my benchmarks; benchmark your specific use case for more accurate results.
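To benchmark your own data, a small harness along these lines may help. It is only a sketch: `benchmarkLevels` is a hypothetical helper, not part of the library, and it assumes the compress function accepts a configs table containing only `level`. The compress function is passed in as an argument so the sketch makes no assumption about where the module lives.

```lua
-- Hypothetical helper (not part of the library): times compress(data, configs)
-- at every level from 0 to 9 and keeps the fastest of `runs` attempts.
-- Assumption: a configs table with only `level` set is accepted.
local function benchmarkLevels(compress, data, runs)
    runs = runs or 5
    local results = {}
    for level = 0, 9 do
        local best = math.huge
        local size
        for _ = 1, runs do
            local start = os.clock()
            local compressed = compress(data, { level = level })
            local elapsed = os.clock() - start
            if elapsed < best then
                best = elapsed
            end
            size = #compressed
        end
        results[#results + 1] = { level = level, seconds = best, size = size }
    end
    return results
end

-- Example usage inside Roblox (the require path is an assumption):
-- local Compression = require(game.ReplicatedStorage.Compression)
-- for _, r in ipairs(benchmarkLevels(Compression.Zlib.Compress, myData)) do
--     print(string.format("level %d : %.1fms : %d -> %d",
--         r.level, r.seconds * 1000, #myData, r.size))
-- end
```

Taking the fastest of several runs reduces noise from whatever else is running at the time; averaging would be an equally reasonable choice.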
Installation
This package is a single module script:
https://www.roblox.com/library/5649237524/Compression-zlib-deflate
You can also view the source code on pastebin:
Documentation
Note:
- “Compression” is the required module found under Installation.
- The initial library has been edited for ease of use, rather than functionality. That being said, you can still access the initial library through Compression.Library. Additionally, the documentation is also available inside of the Compression ModuleScript.
View method-specific (function-specific) information/usage
Configs table:
{
    level = 0; -- integer from 0 to 9, where 0 is no compression and 9 is most compression
    strategy = "" -- one of "huffman_only", "fixed", "dynamic"
}
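As a sketch of how a configs table might be built and sanity-checked before use: `validateConfigs` below is a hypothetical helper, not something the library provides. It also assumes that either field may be left out, which the documentation above does not state explicitly.

```lua
local VALID_STRATEGIES = { huffman_only = true, fixed = true, dynamic = true }

-- Hypothetical helper (not part of the library): checks a configs table
-- against the shape documented above. Treating missing fields as valid
-- is an assumption.
local function validateConfigs(configs)
    if configs == nil then
        return true -- the configs argument itself is optional
    end
    if type(configs) ~= "table" then
        return false, "configs must be a table"
    end
    local level = configs.level
    if level ~= nil then
        if type(level) ~= "number" or level % 1 ~= 0 or level < 0 or level > 9 then
            return false, "level must be an integer from 0 to 9"
        end
    end
    local strategy = configs.strategy
    if strategy ~= nil and not VALID_STRATEGIES[strategy] then
        return false, "strategy must be \"huffman_only\", \"fixed\", or \"dynamic\""
    end
    return true
end

-- Two example tables: one favouring speed, one favouring size.
local fastConfigs = { level = 1, strategy = "dynamic" }
local smallConfigs = { level = 9, strategy = "dynamic" }
```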
Method: Compression.Deflate.Compress(data, configs?):
- Description: Compresses a string using the raw deflate format
- Input:
    - String: data = The data to be compressed
    - table?: configs = The configuration table to control the compression
- Output:
    - String: compressedData = The compressed data
    - int: paddedBits = The number of bits padded at the end of the output
Method: Compression.Deflate.Decompress(compressedData):
- Description: Decompresses raw deflate compressed data
- Input:
    - String: compressedData = The data to be decompressed
- Output:
    - String: data = The decompressed data
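A round trip through the two Deflate methods could look roughly like this. `deflateRoundTrip` is a hypothetical helper; the module is passed in as a parameter so the sketch does not guess where you stored it, and the default level of 6 is an arbitrary choice for illustration.

```lua
-- Sketch: compresses `text` with raw deflate, decompresses it again, and
-- reports whether the result matches the input.
-- `Compression` is the module from Installation, passed in by the caller.
local function deflateRoundTrip(Compression, text, level)
    local compressed, paddedBits = Compression.Deflate.Compress(text, { level = level or 6 })
    local restored = Compression.Deflate.Decompress(compressed)
    return restored == text, #compressed, paddedBits
end

-- Example usage inside Roblox (the require path is an assumption):
-- local Compression = require(game.ReplicatedStorage.Compression)
-- local ok, size = deflateRoundTrip(Compression, string.rep("hello world ", 100))
-- print(ok, size)
```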
Method: Compression.Zlib.Compress(data, configs?):
- Description: Compresses a string using the zlib format
- Input:
    - String: data = The data to be compressed
    - table?: configs = The configuration table to control the compression
- Output:
    - String: compressedData = The compressed data
    - int: paddedBits = The number of bits padded at the end of the output
Method: Compression.Zlib.Decompress(compressedData):
- Description: Decompresses zlib compressed data
- Input:
    - String: compressedData = The data to be decompressed
- Output:
    - String: data = The decompressed data
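For context on how the two formats differ: the zlib format (RFC 1950) wraps a raw deflate stream in a 2-byte header and a trailing 4-byte Adler-32 checksum of the uncompressed data, so zlib output is typically 6 bytes longer than raw deflate output for the same input and settings. The checksum can be sketched in a few lines; this is an illustration of the RFC definition written for clarity, not the library's own code.

```lua
-- Adler-32 as defined in RFC 1950: two running sums modulo 65521,
-- combined into a single 32-bit value.
local function adler32(data)
    local a, b = 1, 0
    for i = 1, #data do
        a = (a + string.byte(data, i)) % 65521
        b = (b + a) % 65521
    end
    return b * 65536 + a
end

print(string.format("%08X", adler32("Wikipedia"))) -- prints 11E60398
```

The checksum lets a decompressor detect corrupted data, which raw deflate alone does not provide.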
Additional Information
Explanation of algorithm
You can view Mark Adler’s explanation of this algorithm here.
Strategies:
There are 3 strategies:
- “fixed” : emits deflate blocks that use the fixed (predefined) Huffman codes
- “dynamic” : emits deflate blocks that use dynamic Huffman codes built for the data
- “huffman_only” : uses only Huffman coding, with no LZ77 matching
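To see which strategy suits a given input, the three can be compared directly. This is a minimal sketch: `compareStrategies` is a hypothetical helper, and the compress function (for example Compression.Deflate.Compress) is passed in by the caller.

```lua
local STRATEGIES = { "huffman_only", "fixed", "dynamic" }

-- Hypothetical helper (not part of the library): compresses `data` once per
-- strategy at the given level and returns the output size for each.
local function compareStrategies(compress, data, level)
    local sizes = {}
    for _, strategy in ipairs(STRATEGIES) do
        local compressed = compress(data, { level = level, strategy = strategy })
        sizes[strategy] = #compressed
    end
    return sizes
end

-- Example usage inside Roblox (the require path is an assumption):
-- local Compression = require(game.ReplicatedStorage.Compression)
-- local sizes = compareStrategies(Compression.Deflate.Compress, myData, 6)
-- print(sizes.huffman_only, sizes.fixed, sizes.dynamic)
```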
Levels of compression:
View information about the various levels of compression
Level 0:
- uses no lazy evaluation
- no previous good length
- no max insert length or max lazy match
- no nice length
- no max hash chains
Level 1:
- uses no lazy evaluation
- no previous good length
- max insert length and max lazy match of 4
- nice length of 8
- 4 max hash chains
Level 2:
- uses no lazy evaluation
- no previous good length
- max insert length and max lazy match of 5
- nice length of 18
- 8 max hash chains
Level 3:
- uses no lazy evaluation
- no previous good length
- max insert length and max lazy match of 6
- nice length of 32
- 32 max hash chains
Level 4:
- uses lazy evaluation
- previous good length of 4
- max insert length and max lazy match of 4
- nice length of 16
- 16 max hash chains
Level 5:
- uses lazy evaluation
- previous good length of 8
- max insert length and max lazy match of 16
- nice length of 32
- 32 max hash chains
Level 6:
- uses lazy evaluation
- previous good length of 8
- max insert length and max lazy match of 16
- nice length of 128
- 128 max hash chains
Level 7:
- uses lazy evaluation
- previous good length of 8
- max insert length and max lazy match of 32
- nice length of 128
- 256 max hash chains
Level 8:
- uses lazy evaluation
- previous good length of 32
- max insert length and max lazy match of 128
- nice length of 258
- 1024 max hash chains
Level 9:
- uses lazy evaluation
- previous good length of 32
- max insert length and max lazy match of 258
- nice length of 258
- 4096 max hash chains
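The per-level settings listed above can be captured in a lookup table, which makes them easier to compare side by side. The values are transcribed from the list; the field names are made up for illustration and are not the library's own identifiers. Level 0 lists no limits, so those fields are simply left out.

```lua
-- Per-level parameters transcribed from the list above.
-- Field names are illustrative assumptions, not the library's identifiers.
local LEVEL_PARAMS = {
    [0] = { lazy = false },
    [1] = { lazy = false, maxLazy = 4, niceLength = 8, maxChain = 4 },
    [2] = { lazy = false, maxLazy = 5, niceLength = 18, maxChain = 8 },
    [3] = { lazy = false, maxLazy = 6, niceLength = 32, maxChain = 32 },
    [4] = { lazy = true, goodLength = 4, maxLazy = 4, niceLength = 16, maxChain = 16 },
    [5] = { lazy = true, goodLength = 8, maxLazy = 16, niceLength = 32, maxChain = 32 },
    [6] = { lazy = true, goodLength = 8, maxLazy = 16, niceLength = 128, maxChain = 128 },
    [7] = { lazy = true, goodLength = 8, maxLazy = 32, niceLength = 128, maxChain = 256 },
    [8] = { lazy = true, goodLength = 32, maxLazy = 128, niceLength = 258, maxChain = 1024 },
    [9] = { lazy = true, goodLength = 32, maxLazy = 258, niceLength = 258, maxChain = 4096 },
}

-- Returns the levels that use lazy evaluation, in ascending order.
local function levelsWithLazyEvaluation()
    local levels = {}
    for level = 0, 9 do
        if LEVEL_PARAMS[level].lazy then
            levels[#levels + 1] = level
        end
    end
    return levels
end
```

Laid out this way, the growth in max hash chains from level 5 (32) to level 6 (128) stands out, which may be part of the time jump between those levels in the benchmark.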