Data efficiency

Misinformater · November 8, 2022, 5:26am

I’m having problems with a project, so I came up with these questions:

What’s the most efficient way to store keys? I want to store UserIds without taking up much memory. Should I use utf8?
How do IncrementAsync and UpdateAsync compare in priority? I’m not sure which of them is better for saving data.
What’s the most efficient data compression method?

And some advice (it’s optional; if you don’t feel comfortable with it, don’t worry):

Don’t use keys: they take up data. Instead, use tables as concepts every time you want to load/save data.
Use JSON or any compression: I always use HttpService:JSONEncode()/JSONDecode(); this one is your choice.
Use faster methods: Use modules to save/load/modify data.

Thank you all, salutations and goodnight/day.

7z99 · November 11, 2022, 8:55am

First thing, this probably would fit better in #help-and-feedback:scripting-support

Could you elaborate on what you mean by keys? Like DataStore keys? Those are limited to 50 characters and you can use new APIs like :ListKeysAsync/ListDataStoresAsync to get a list. Are you talking about metadata which is a list of user IDs?

I’m not too sure what you mean by your second question. They are two different things and have different purposes. IncrementAsync is to, well, increment an integer that is saved to a certain key. UpdateAsync is to update some entry or to completely overwrite data inside of the data store, and it’s often used interchangeably with SetAsync. But I would say that UpdateAsync is the best way to go when you factor in things like getting up-to-date data, because its callback receives the value currently stored under the key.
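To make the difference concrete, here is a minimal sketch of both calls. The helper names (makeCoinAdder, addCoins, bumpVisits), the key prefixes, and the { coins = ... } data shape are all made up for illustration; only UpdateAsync and IncrementAsync themselves are real data store methods. The store is passed in as a parameter, which in a real game would come from something like game:GetService("DataStoreService"):GetDataStore("PlayerData").

```lua
-- Builds the transform function handed to UpdateAsync.
-- UpdateAsync calls it with the value currently stored (nil for a new key)
-- and saves whatever it returns.
local function makeCoinAdder(amount)
    return function(oldData)
        local data = oldData or { coins = 0 } -- hypothetical data shape
        data.coins = data.coins + amount
        return data
    end
end

-- Read-modify-write on a whole table: a job for UpdateAsync.
-- Returns pcall's results: success flag, then the new value or an error message.
local function addCoins(store, userId, amount)
    local key = "Player_" .. tostring(userId) -- hypothetical key format
    return pcall(function()
        return store:UpdateAsync(key, makeCoinAdder(amount))
    end)
end

-- A single integer counter: a job for IncrementAsync.
local function bumpVisits(store, userId)
    local key = "Visits_" .. tostring(userId) -- hypothetical key format
    return pcall(function()
        return store:IncrementAsync(key, 1)
    end)
end
```

Both calls are wrapped in pcall because data store requests can fail, and you generally want to handle that instead of letting the script error.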

There are many data compression algorithms, and several are available on this forum as well. But the first step would be to optimize your data by hand, for example by using identifiers instead of full strings. In a script or module, assign individual items to numerical identifiers instead of saving the full item name. Likewise, if you’re saving properties of objects (like a part’s colour, material…), write out a table of property names once and store each object’s values by position instead of by name.

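-- Properties to save, in a fixed order. Each part is stored as a plain array,
-- so the property names are never repeated in the saved data.
local props = {
    'ClassName';
    'Material';
    'Color';
}

-- to serialize:
local data = {}
for _, v in ipairs(workspace:GetDescendants()) do
    -- only parts have Material and Color; Terrain is skipped because it cannot be recreated
    if not v:IsA('BasePart') or v:IsA('Terrain') then continue end
    local thisPart = {}
    for i, property in ipairs(props) do
        local value = v[property]
        -- data stores cannot hold Enum or Color3 values, so convert them to strings
        if typeof(value) == 'EnumItem' then
            value = value.Name
        elseif typeof(value) == 'Color3' then
            value = value:ToHex()
        end
        thisPart[i] = value
    end
    table.insert(data, thisPart)
end

-- to deserialize:
for _, v in ipairs(data) do
    local newInstance = Instance.new(v[1])
    for index, propertyName in ipairs(props) do
        if index == 1 then continue end -- ClassName is read-only and was used above
        local value = v[index]
        if propertyName == 'Material' then
            value = Enum.Material[value]
        elseif propertyName == 'Color' then
            value = Color3.fromHex(value)
        end
        newInstance[propertyName] = value
    end
    newInstance.Parent = workspace
end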
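The same idea applies to item names. Here is a minimal sketch of mapping names to numerical identifiers; the item names and the helper functions (encodeInventory, decodeInventory) are invented for the example. One assumption to keep in mind: once data has been saved, the order of ITEM_NAMES must not change, so new items should only ever be appended.

```lua
-- Hypothetical item list; an item's position in this array is its identifier.
local ITEM_NAMES = { "WoodenSword", "IronShield", "HealthPotion" }

-- Reverse lookup (name -> id), built once.
local ITEM_IDS = {}
for id, name in ipairs(ITEM_NAMES) do
    ITEM_IDS[name] = id
end

-- Turns a list of item names into a list of small integers for saving.
local function encodeInventory(names)
    local ids = {}
    for i, name in ipairs(names) do
        local id = ITEM_IDS[name]
        if not id then
            error("unknown item: " .. tostring(name))
        end
        ids[i] = id
    end
    return ids
end

-- Turns saved identifiers back into item names after loading.
local function decodeInventory(ids)
    local names = {}
    for i, id in ipairs(ids) do
        names[i] = ITEM_NAMES[id]
    end
    return names
end

local saved = encodeInventory({ "HealthPotion", "WoodenSword" }) --> { 3, 1 }
```

Saving { 3, 1 } takes far fewer characters than saving the two full names, and the saving grows with every item a player owns.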

After that, you can look into data compression algorithms. You want a lossless data compression method, meaning that decompressing gives you back exactly the data you started with. There are many such algorithms, but some that I can think of off the top of my head are LZW, ZIP (yup, like the .zip file format), and gzip. And as I have mentioned, some guides and full resources exist on this forum.
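To show what "lossless" means in practice, here is a sketch of run-length encoding. To be clear, this is not LZW or gzip; it is a much simpler technique swapped in because it fits in a few lines. The function names are made up, and the sketch assumes the input string contains no digit characters, since digits are used to store the run lengths.

```lua
-- Run-length encoding: each run of a repeated character becomes "<count><char>".
-- Assumption: the input contains no digits.
local function rleEncode(s)
    local out = {}
    local i = 1
    while i <= #s do
        local ch = s:sub(i, i)
        local j = i
        -- extend j to the last character of the current run
        while j < #s and s:sub(j + 1, j + 1) == ch do
            j = j + 1
        end
        out[#out + 1] = tostring(j - i + 1) .. ch
        i = j + 1
    end
    return table.concat(out)
end

-- Reverses rleEncode by expanding every "<count><char>" pair.
local function rleDecode(s)
    local out = {}
    for count, ch in s:gmatch("(%d+)(%D)") do
        out[#out + 1] = ch:rep(tonumber(count))
    end
    return table.concat(out)
end

local packed = rleEncode("aaaabbbcc") --> "4a3b2c"
local grown = rleEncode("abc")        --> "1a1b1c", longer than the input
```

The second call is worth noticing: on data without repetition the "compressed" string is twice as long as the original, so it pays to measure the result on your real data before committing to an algorithm.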

Do bear in mind that you have 4 MB of space per key. That is a lot of data: about 4 million characters. Depending on what you’re trying to compress, some algorithms will be more effective than others, and some will not help at all and will actually increase the size of the string. All in all, you will find data compression most useful if you’re saving a lot of the same thing, which is where identifiers come into play. If you do reach the limit, though, you can always break up your data between keys.
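Breaking data up between keys could look something like the sketch below. The function names and the "baseKey_index" naming scheme are assumptions made for the example, not an established convention, and note that #s in Lua counts bytes, so leave some headroom under the per-key limit if your data contains non-ASCII text.

```lua
-- Splits an already serialized string into pieces of at most chunkSize bytes.
local function splitIntoChunks(s, chunkSize)
    local chunks = {}
    for i = 1, #s, chunkSize do
        chunks[#chunks + 1] = s:sub(i, i + chunkSize - 1)
    end
    return chunks
end

-- Hypothetical key naming scheme: "Save_123_1", "Save_123_2", ...
local function chunkKey(baseKey, index)
    return baseKey .. "_" .. tostring(index)
end

-- Rebuilds the original string from chunks that were loaded in order.
local function joinChunks(chunks)
    return table.concat(chunks)
end
```

You would save each chunk under chunkKey(baseKey, i) and also store the number of chunks somewhere (for example under the base key itself) so that loading knows how many keys to read.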