Compression for limited data (Datastore)

Generally speaking, you would not save the whole game, only the changes made by the player.

As I have already said, I know. But what if a player edited that many blocks, such as by using TNT? I would run into the same issue.

You would track the TNT position then remove the blocks?

Still, what if someone edited that many blocks? This is only one layer of each chunk.
Edit:
For one chunk layer (256 blocks), it's 4KB.

You are not limited in the number of keys you can use. It might help to use multiple keys in this case.

There really are a lot of solutions available to you. I would first get a system in place then look at optimization.

It sounds like the chunks aren’t necessary to the saving process if you can just get which chunk a part belongs to based on its position. Chunks can be reconstructed in the loading process. If you were just storing positions and material types, then the data could be pretty concise; the extra information complicates that. If you’re trying to store 640k blocks and the limit per datastore key is 260k characters, then you would be lucky to only need 3 keys, and only if you could store one block per character. Depending on how repetitive your data is, actually compressing it should greatly reduce the size. I’ve shared a resource for that here: https://devforum.roblox.com/t/text-compression/163637
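
To make the arithmetic above concrete, here is a minimal sketch in plain Lua. The 260,000-character limit is the figure quoted in this thread (check the current documented limit), and `keysNeeded` and `splitIntoKeys` are hypothetical helper names, not part of any Roblox API. It only shows how one long serialized string could be cut into pieces that each fit in one key.

```lua
-- Per-key character limit as quoted in this thread (an assumption, not verified here).
local KEY_LIMIT = 260000

-- Number of keys needed to hold `length` characters.
local function keysNeeded(length, limit)
	return math.ceil(length / limit)
end

-- Cut one long serialized string into pieces that each fit in a single key.
local function splitIntoKeys(data, limit)
	local pieces = {}
	for start = 1, #data, limit do
		pieces[#pieces + 1] = string.sub(data, start, start + limit - 1)
	end
	return pieces
end

-- 640k blocks at one character per block:
print(keysNeeded(640000, KEY_LIMIT)) -- 3
```

Each piece would then be saved under its own key (for example a base name plus an index), and joined back together with `table.concat` when loading.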

Yes, but it's best to try to compress the data so that it's efficient and small. Unless this ends up being the last option.

I can attempt removing the chunks table and will reply with the results in an edit to this message.
EDIT:
It went from 4KB per chunk for one layer down to 3KB.
Will edit again with results on 50 by 50.
EDIT 2:
From 45 keys I got 44. I will not be keeping this change, as it involves more maths for chunk generation and it's only a small gain.

Compression like the one @1waffle1 posted works better the larger the data is, as it builds the word dictionary over time. I would also take a look at LZSS if your data is repetitive, or simply have a hard-coded word list.

I removed the table for data to see the result. 45 keys down to 40!
I am going to attempt to use the string compression and will edit this message with results.
EDIT:
Failed to compress. Game script timeout.
I will fix this tomorrow, as I have to go.

I’m really late to this thread, but if they haven’t already been implemented, I would consider the idea of a Huffman tree, or you could count consecutive identical blocks and save the ID and the length of each run in order to reduce size (run-length encoding).

The second suggestion was applied in this video, which may help: https://www.youtube.com/watch?v=fqdTj27xVMM

This YouTuber is working on a voxel game and tackles problems very similar to yours.
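
For the "count consecutive blocks" idea, a rough sketch in plain Lua could look like this. The `count x id` text format and the function names `rleEncode` and `rleDecode` are made up for illustration, and it assumes the blocks are already flattened into a list of non-negative integer IDs.

```lua
-- Run-length encode a flat list of block IDs into "<count>x<id>" pairs joined by ";".
local function rleEncode(ids)
	local parts = {}
	local i = 1
	while i <= #ids do
		local id = ids[i]
		local run = 1
		-- Extend the run while the following entries hold the same ID.
		while ids[i + run] == id do
			run = run + 1
		end
		parts[#parts + 1] = run .. "x" .. id
		i = i + run
	end
	return table.concat(parts, ";")
end

-- Expand the encoded string back into the flat list of IDs.
local function rleDecode(encoded)
	local ids = {}
	for count, id in string.gmatch(encoded, "(%d+)x(%d+)") do
		for _ = 1, tonumber(count) do
			ids[#ids + 1] = tonumber(id)
		end
	end
	return ids
end

print(rleEncode({0, 0, 0, 0, 1, 1, 0})) -- 4x0;2x1;1x0
```

This pays off when long runs of the same block are common (air, stone). On very mixed data every run has length 1 and the output can end up longer than the input.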

So from what I'm getting, I'd have a string that holds the values of the blocks, say 0 for air and 1 for grass, within a 16 by 16 layer?

Do you need to store info about the blocks? If so, a Huffman tree isn’t possible. If not, it is still quite a complicated process, and storing the tree isn’t easy either, so you only really want to do this if it creates massive gains. Huffman coding is especially effective if there are very many blocks of the same type and very few blocks of other types in the world, e.g. a chunk made entirely of stone would be very efficient, while a very mixed chunk would be less efficient (but still better than without Huffman).

I can store the information of the blocks in a different way. But how can I add this?

Update on current size.
59 keys was what I got at the start of this post.
It has basically halved to 33 keys!
I still have not added in compression and most of it remains as tables.

Are you sure that the other way will not take up too much space, then?
Anyway, you can either listen to my (probably bad) explanation
or watch the video I linked before, if you haven’t done that already.
Basically, Huffman coding is a way to compress characters (or other “single” token stuff, like block IDs) into a long binary string. First you have to construct a Huffman tree; I’ll explain that a bit later on. It will look something like this:

How to decode the binary string:

  1. Read the 1s and 0s one at a time, starting from the beginning and from the top of the tree.
  2. If it’s a 1, take the right part of the tree; if it’s a 0, take the left part.
  3. Keep reading until you reach a character in the tree.
  4. Add this character to the decoded string (or for blocks, place it in the world), then go back to the top of the tree.
  5. Repeat this until you are at the end of the binary string.
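
The decoding steps above could be sketched in plain Lua roughly like this. The tree is a small hand-built example and its layout (`left`/`right` for junctions, `id` for leaves) is just an assumption for illustration. The bits are kept as a string of "0" and "1" characters for readability; a real save would need to pack them to actually gain space.

```lua
-- Hypothetical hand-built tree: ID 1 gets code "0", ID 0 gets "10", ID 2 gets "11".
local tree = {
	left = { id = 1 },
	right = {
		left = { id = 0 },
		right = { id = 2 },
	},
}

-- Follow the bits down the tree; every time a leaf is reached, emit its ID and restart at the top.
local function huffmanDecode(bits, root)
	local ids = {}
	local node = root
	for i = 1, #bits do
		local bit = string.sub(bits, i, i)
		if bit == "1" then
			node = node.right
		else
			node = node.left
		end
		if node.id ~= nil then
			ids[#ids + 1] = node.id
			node = root
		end
	end
	return ids
end

print(table.concat(huffmanDecode("0010110", tree), ",")) -- 1,1,0,2,1
```

Note this sketch assumes the tree has at least one junction; a world with only a single block type would need special handling.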

How to encode the binary string:

  1. For each character (or block), go from its location upwards, until you reach the top.
  2. Every time you get to a “junction” from the left side, write a 0 in your string.
    Every time you get to a junction from the right side, write a 1.
  3. Reverse the string, and add it to the encoded binary string.
  4. Repeat until you have encoded all the blocks in the world

(This may not be efficient or easy to implement in Lua; I’ll see if there’s a better way.)
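
One possible alternative to walking upward for every block: walk the tree once from the top, remember the path to every leaf in a lookup table, and then encode each block with a table lookup. This is a sketch in plain Lua and not the exact procedure from the steps above; the tree layout (`left`/`right`/`id`) and the names `buildCodes` and `huffmanEncode` are assumptions for illustration.

```lua
-- Hypothetical hand-built tree: ID 1 gets code "0", ID 0 gets "10", ID 2 gets "11".
local tree = {
	left = { id = 1 },
	right = {
		left = { id = 0 },
		right = { id = 2 },
	},
}

-- Walk the tree once, recording the path ("0" = left, "1" = right) to each leaf.
local function buildCodes(node, prefix, codes)
	if node.id ~= nil then
		codes[node.id] = prefix
	else
		buildCodes(node.left, prefix .. "0", codes)
		buildCodes(node.right, prefix .. "1", codes)
	end
	return codes
end

-- Encode a flat list of block IDs; assumes every ID appears somewhere in the tree.
local function huffmanEncode(ids, root)
	local codes = buildCodes(root, "", {})
	local parts = {}
	for i = 1, #ids do
		parts[i] = codes[ids[i]]
	end
	return table.concat(parts)
end

print(huffmanEncode({1, 1, 0, 2, 1}, tree)) -- 0010110
```

Because the path is recorded top-down, nothing has to be reversed afterwards.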

How to create a Huffman tree:

  1. For each block type, count how many times it is used, and put this count, linked to the block, in a list.
  2. Pick the two items with the lowest counts in the list, and connect them to a “junction”.
    Add up their combined occurrences, and put the junction back in the list with that count.
  3. Repeat step 2 until there is only one junction remaining.
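
A minimal sketch of those three steps in plain Lua, assuming the blocks are given as a flat list of numeric IDs. The node layout (`id`/`count` for leaves, `left`/`right`/`count` for junctions) and the name `buildHuffmanTree` are made up for illustration. Re-sorting the list on every pass is not the fastest approach, but with a handful of block types it should not matter.

```lua
local function buildHuffmanTree(ids)
	-- Step 1: count how often each block ID occurs.
	local counts = {}
	for _, id in ipairs(ids) do
		counts[id] = (counts[id] or 0) + 1
	end

	-- One leaf per distinct ID goes into the list.
	local nodes = {}
	for id, count in pairs(counts) do
		nodes[#nodes + 1] = { id = id, count = count }
	end

	-- Steps 2 and 3: keep joining the two least-used nodes under a new junction.
	while #nodes > 1 do
		-- Sort descending so the two lowest counts sit at the end of the list.
		table.sort(nodes, function(a, b)
			return a.count > b.count
		end)
		local a = table.remove(nodes)
		local b = table.remove(nodes)
		nodes[#nodes + 1] = { left = a, right = b, count = a.count + b.count }
	end

	return nodes[1] -- the root (nil if `ids` was empty)
end

-- Five stone (1), two air (0), one grass (2):
local root = buildHuffmanTree({1, 1, 1, 1, 1, 0, 0, 2})
print(root.count) -- 8
```

In this example the rare blocks (air and grass) end up deeper in the tree, so the common stone block gets the shortest code. When two counts are equal the exact shape of the tree can vary, which is fine as long as the same tree is used for encoding and decoding.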

Again, Tom Scott (the YouTuber I linked) is much better at explaining this.
If you need any help, just ask!

Just curious, which reductions have you implemented now? How effective were they?

Well, I started off with tables named Chunk|000|000 but renamed them to 000000.
I have also changed from saving the block's name to saving a numeric ID, such as 1.
I'm currently watching the video you linked to see if I can get my head around this.
I'm also trying to see if I can get it to save as one whole string to help with size.

Using Huffman coding, you only need to store 2 things:

  • The encoded string
  • The Huffman tree

Just wondering, would it be better if I just saved the IDs within that layer? It would look like:

Layer0 = "0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0:0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0:0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0:
0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0:0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0:0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0:
0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0:0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0:0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0:
0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0:0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0:0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0:
0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0:0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0:0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0:
0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0:"

Instead of this:

Table = {
    ["12341234"] = {
        ["0;0;0"] = 0,
        ["0;0;1"] = 0,
        ["0;0;2"] = 0,
        ["0;0;3"] = 0,
        ["0;0;4"] = 0,
        ["0;0;5"] = 0,
        ["0;0;6"] = 0,
    }
}

EDIT:
This is not a string. It's a set of numbers for the block position ranging from 1 - 16.
UPDATE:
From 33 keys to 23 with string conversion.
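
For reference, converting a layer to and from the string form shown above could be sketched like this in plain Lua. It assumes the layer is held as `layer[row][column] = id` with numeric IDs, uses ";" between blocks and ":" between rows as in the example, and the names `layerToString` and `stringToLayer` are hypothetical.

```lua
local SIZE = 16 -- blocks per row and rows per layer, as in the example above

-- Join each row with ";" and the rows with ":".
local function layerToString(layer)
	local rows = {}
	for r = 1, #layer do
		rows[r] = table.concat(layer[r], ";")
	end
	return table.concat(rows, ":")
end

-- Parse the string back into layer[row][column] = id.
local function stringToLayer(text)
	local layer = {}
	for rowText in string.gmatch(text, "[^:]+") do
		local row = {}
		for id in string.gmatch(rowText, "[^;]+") do
			row[#row + 1] = tonumber(id)
		end
		layer[#layer + 1] = row
	end
	return layer
end

-- An all-air layer (every ID is 0):
local air = {}
for r = 1, SIZE do
	air[r] = {}
	for c = 1, SIZE do
		air[r][c] = 0
	end
end

print(#layerToString(air)) -- 511
```

About half of those 511 characters are separators. If every ID fits in a single character, the separators could be dropped and the position inferred from the index, which would bring a layer down to 256 characters before any compression.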