Compression for limited data (Datastore)

Well, I started off with tables named Chunk|000|000 but renamed them to 000000.
I have also changed it to save the block's ID (e.g. 1) instead of its name.
I'm currently watching the video you linked to see if I can get my head around this.
I'm also trying to see if I can get it to save everything as one whole string to help with size.

Using Huffman coding, you only need to store two things:

  • The encoded string
  • The Huffman tree
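
Here's a minimal sketch of that in Lua (the function names are made up, and this skips packing bits into bytes, so the "encoded string" comes out as literal 0s and 1s):

local function buildTree(text)
    -- Count how often each character appears.
    local freq = {}
    for ch in text:gmatch(".") do
        freq[ch] = (freq[ch] or 0) + 1
    end

    -- Start with one leaf node per character.
    local nodes = {}
    for ch, count in pairs(freq) do
        table.insert(nodes, { char = ch, weight = count })
    end

    -- Repeatedly merge the two lightest nodes until one tree remains.
    while #nodes > 1 do
        table.sort(nodes, function(a, b) return a.weight < b.weight end)
        local left = table.remove(nodes, 1)
        local right = table.remove(nodes, 1)
        table.insert(nodes, { left = left, right = right, weight = left.weight + right.weight })
    end
    return nodes[1]
end

local function buildCodes(node, prefix, codes)
    codes = codes or {}
    if node.char then
        -- Leaf: record this character's code ("0" if there is only one symbol).
        codes[node.char] = prefix ~= "" and prefix or "0"
    else
        buildCodes(node.left, prefix .. "0", codes)
        buildCodes(node.right, prefix .. "1", codes)
    end
    return codes
end

local function encode(text)
    local tree = buildTree(text)
    local codes = buildCodes(tree, "")
    local out = {}
    for ch in text:gmatch(".") do
        table.insert(out, codes[ch])
    end
    -- These two values are everything you need to save.
    return table.concat(out), tree
end

Common characters (like a 0 for air) get the shortest codes, so a layer that is mostly one value shrinks a lot.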

Just wondering, would it be better if I just saved the IDs within that layer? It would look like this:

Layer0 = "0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0:0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0:0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0:
0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0:0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0:0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0:
0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0:0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0:0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0:
0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0:0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0:0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0:
0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0:0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0:0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0:
0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0:"

Instead of this:

Table = {
    ["12341234"] = {
        ["0;0;0"] = 0,
        ["0;0;1"] = 0,
        ["0;0;2"] = 0,
        ["0;0;3"] = 0,
        ["0;0;4"] = 0,
        ["0;0;5"] = 0,
        ["0;0;6"] = 0,
    }
}
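
For illustration, building that layer string from the table form could look something like this (a sketch; serializeLayer, the 16x16 layer size, and the "x;y;z" key format are assumptions based on the examples above):

local function serializeLayer(layer, y)
    local rows = {}
    for z = 0, 15 do
        local ids = {}
        for x = 0, 15 do
            -- Read the block ID stored at this position (0 where nothing is set).
            ids[x + 1] = layer[("%d;%d;%d"):format(x, y, z)] or 0
        end
        rows[z + 1] = table.concat(ids, ";")
    end
    -- Rows are joined with ":" and the string ends with a trailing ":".
    return table.concat(rows, ":") .. ":"
end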

EDIT:
This is not actually a string; it's a set of numbers for the block positions, ranging from 1 to 16.
UPDATE:
Down from 33 keys to 23 with the string conversion.

I don’t know what you mean, or what that data is… Could you explain how that string is formatted?

So I just take all the data within that table, such as block position and ID, and add it into the string that is saved within the Chunk table.
Edit:
1 layer of a chunk is now 2 KB instead of 4 KB.
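
Going the other way, unpacking that string back into a position → ID table can look roughly like this (a sketch assuming the same ";" and ":" delimiters; deserializeLayer is a made-up name):

local function deserializeLayer(str, y)
    local layer = {}
    local z = 0
    for row in str:gmatch("[^:]+") do
        local x = 0
        for id in row:gmatch("[^;]+") do
            layer[("%d;%d;%d"):format(x, y, z)] = tonumber(id)
            x = x + 1
        end
        z = z + 1
    end
    return layer
end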

Thanks, everyone, for your help! I have reached the point where I think I'm saving with close to the least amount of data possible.
[image: comparison of the three save outputs]
The first 3 lines are the uncompressed data.
The second 3 are my string compression.
The third 3 are from a module I used from the link posted by 1waffle1.
It has brought data that used to be 59 keys down to 11! That's 18.6% of what it used to be!
I am still open to other ideas, but they may not end up being used. Thanks again!

The Huffman table is very efficient when a lot of the data is the same, but if the majority of it varies, it is not as useful.
I recommend taking the whole table, converting it into a string, converting that string into a series of numbers, and then compressing those numbers with the Huffman table. Every 1 byte of code can stand for 3 bytes of data: if 1 is encoded as 10000000, then 10000000 can expand to 10110011 10110011 10110011, and 01000000 can expand to 10110011 10110011 10011111.
Or you can set the ratio so that 2 bytes of code stand for 4 bytes of data, since 2 bytes have far more possible combinations than 1 byte: 8 bits give 256 values, 16 bits give 65,536.

And to take it a step further: if you go through the table and a majority of it is the same value, e.g. 2,049 instances of 244 (11110100), you can change the byte-to-byte ratio so that a single code like 10000000 stands for all 2,049 bytes. That is a very extreme, nearly impossible case, though.
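
That substitution idea can be sketched like this (the codebook below is invented purely for illustration):

local codebook = {
    ["\128"] = string.rep("\179", 3), -- 10000000 -> 10110011 10110011 10110011
    ["\64"]  = "\179\179\159",        -- 01000000 -> 10110011 10110011 10011111
}

local function expand(encoded)
    local out = {}
    for i = 1, #encoded do
        local byte = encoded:sub(i, i)
        -- Codes not in the book pass through unchanged in this sketch.
        table.insert(out, codebook[byte] or byte)
    end
    return table.concat(out)
end
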
And to take it a step even further: since Huffman is mainly for binary, and since we are dealing with strings, we can instead use base 36, as that covers all the alphanumeric characters.
Thus a simple "a" can stand for 20 bytes, "aa" (2 bytes) for another 20 bytes, etc.
You will have to make a bizarre Huffman table where some characters serve only as stepping stones to longer codes, just like how no Huffman code is allowed to be the prefix of another:
1 2 3 4 5 6 7 8 9 a b c d e f g h i j k l m n o p q r s t u v w x y z A B C D E F G H I J K L M N O P Q R S T U V W X Y Z ~ ` ! @ # $ % ^ & * ( ) - _ + = etc…
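
Here's a sketch of such a prefix-free code over printable characters (the codes and block names are invented):

-- No code is a prefix of another, so the stream decodes greedily
-- without any separators. "z" here is a stepping stone: it is never
-- a code by itself, only the start of longer codes.
local codes = {
    ["a"]  = "STONE",
    ["b"]  = "DIRT",
    ["zA"] = "GOLD",
    ["zB"] = "DIAMOND",
}

local function decode(stream)
    local out, i = {}, 1
    while i <= #stream do
        local one = stream:sub(i, i)
        local two = stream:sub(i, i + 1)
        if codes[one] then
            table.insert(out, codes[one])
            i = i + 1
        elseif codes[two] then
            table.insert(out, codes[two])
            i = i + 2
        else
            error("bad code at position " .. i)
        end
    end
    return out
end

print(table.concat(decode("abzAa"), ",")) -- STONE,DIRT,GOLD,STONE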

The only downside is that the code table you store may end up way larger than the actual thing you're trying to compress.

You’re 6 months late ._.
That’s pretty much what I said about the Huffman table, though…

Has it really been 6 months… wow.
I also used this system for my building game; it worked very well.
I also realised that you can save a place state, so I won’t need to worry about saving the world.