Data Store usage and best way to store scaleable data

Lite_Lion · August 19, 2024, 10:50am

I’m creating a chunk-based, infinitely-ish (Roblox has its limits…) expanding, and procedurally generated game. I’d like to know how I could store such amounts of data efficiently, which includes inventories of all players who have ever joined a world, chunk data (the biggest problem), objects within the world (probably integrated within chunk data), entities, and status effects. It’s a tall order and that’s why I’m asking for help. I plan to use ProfileService.

My current plan is to store chunk data in 4MB chunks (keys), which would be automatically created to accommodate more chunk data. Player and entity data would be stored in separate keys (but still related to the world). The structure could be something similar to “x y z top bottom right left front back {objects} {structure data} |” repeated. (e.g. 3 5 1 7 11 2 13 2 9 {item.fire_axe: {pos: 3, 1, 2}} {struc.0.1} |)

I’ve done some projections and found that it could be very inefficient to parse chunk data from a single string, especially when worlds consist of several chunk data keys. It could lead to data loss (since chunks in previous keys might be changed and removed/extended) as well. The problem could happen with player and entity data (they would be stored similarly).

I’d like to discuss more efficient ways to do this, so what to avoid or expect.
(just realised this might be in the wrong category)

IHexoDev · August 19, 2024, 1:43pm

I think you can use “Deflate compressor” it’s module best to compress huge string but he work slow for you it’s best solve

Solar_Eos · August 19, 2024, 2:06pm

This is one of the rare use cases where you won’t expect to hit a REQUEST limit - rather, you’re likely to hit a THROUGHPUT limit.

As in, you’re literally sending, not so much storing, more than what datastores can handle.

Let me break down some things you can do, and you’ll see what I mean.

If your world size is fixed, you can cut them up into bigger chunks and store them each in their own keys. (eg: key 1 is in charge of chunk 1, so on…)
You need to make sure you have just enough space for all that though. Each key can only store up to 4,194,304 characters.
The more concerning part for you is the read-write limits. You can only write up to 4MB worth of characters to datastore per minute, exceeding which will throw an error.
The same goes for READING, but at least the limits are more forgiving; at 20MB per minute.
As someone else has mentioned, please compress your data. This is the primary way to address the issues I stated above.
This is also one of the rare cases where it’s okay to compress your data like crazy. Compression takes more time as it compresses further, but players have been conditioned to accept waiting for some time for a world to load in.
Be smart with your serialization - is it really necessary to save EVERYTHING? Or can you save only the data that has changed during the playthrough?
Speaking of, you may also consider saving smaller chunks; the smaller the chunk, the less you may have to write to datastores, but the higher the chances of hitting a REQUEST limit instead (60 Get/Set requests per minute).
ProfileService is actually okay enough for uses like these, but then you may have to modify some of its autosaving logic so that you don’t hit a read write limit.

WizulousThe2nd · August 19, 2024, 2:08pm

ProfileService was designed to create different profiles for players, but yes it can be used to store data for other things.

If you’re using profileservice I’d recommend splitting every 6-8 chunks into a separate profile (Based on how big they are)

And you can basically separate all your data via profiles, which is smart, but Data Compression as @IHexoDev was saying, you’ll definitely need to compress your data somehow.

And @Solar_Eos explains the limits very well.

Solar_Eos · August 19, 2024, 2:22pm

Not exactly.

ProfileService is a profile object abstraction detached from the Player instance - this allows the developer to create profiles for entities other than players, such as: group-owned houses, savable multiplayer game instances, etc.

Also before OP asks, you really wouldn’t like a truly infinitely generated world, because of the null zone.

This is a part of the map (that is really far away from {0, 0, 0}) where everything flickers and bugs out, because of the floating point error and how Roblox handles that with their Vector3 coordinate system.

If you’ve heard of the Farlands on Minecraft, that’s exactly the kind of problem Roblox has as well - and it’s not even Roblox’s fault, it’s just a fault with computing in general.

And not to mention, the smaller your world, the less you have to save. Duh.

Lite_Lion · August 19, 2024, 2:35pm

To address your feedback:

I mentioned creating additional keys as previous keys fill up, so I believe that should solve the key limit problem.
Read-write limits shouldn’t be too much of any issue due to ProfileService saving locally. I’ll have to modify its autosave period to something higher, thanks for reminding me.
Compression is a good idea.
What I plan to save is pretty much the bare minimum. Any less and one might lose valuable data.
Reducing chunk size is not a good idea, since it’s not a voxel-based game. It could actually increase the amount of data needed to save, since players may travel through more chunks faster. (it’s a Backrooms-style game, sides [walls, doors, stairs, etc.] have to be saved, not voxels, cubes or other primitive parts)
I appreciate your help!

Onto my other questions:
Has anyone done tests or analytics to determine if it’s better to, for example, save 2 keys of 2MB each, or one key of 4MB? With so much compression and processing going on, it should be as performant as possible. Also, some ideas you’ve provided I already had, and I’m glad to know others have the same thoughts about this. And yes, I know about the null zone. But considering the distance at which it becomes a problem, you can consider the world practically infinite.

Lite_Lion · August 19, 2024, 2:37pm

One chunk might be only, say, 40 bytes, while another could be hundreds of bytes long (due to items and objects). There is no extremely complex generation to save, yet it can vary a lot.

Solar_Eos · August 19, 2024, 2:46pm

But then that means you’ll have to potentially access every single key in your datastore at any point in time - and with your scale, you may end up hitting a request limit.

I’m going to assume here that the world itself is static (meaning, no destruction whatsoever); but if you want a REALLY cheesy approach that solves every single datastore headache you have:

Create models of rooms (each slightly different from one another), and throw them into ServerStorage. Go ham here, the sky’s the limit.
That way, all you have to write or store are the indexes or names of those models, and you hit neither limits.
If you are clever enough with the designs of your rooms (and if you’re tenacious enough to design hundreds if not thousands of rooms), a player would be none the wiser.

If you’re only saving data that changes, that’s totally fine.

The latter. 2 calls = twice the chance a datastore call may drop itself and mess things up.

I’ve already mentioned this, but players are really okay with waiting for their worlds to FULLY load in. Loading performance isn’t a concern as long as the process doesn’t take ages.

Can you really call a world like that infinite though? Just a thought.

“infinite” is a pretty big word in the eyes of many players - and if they find out it isn’t, I’d imagine it won’t look very good on you.

Amritss · August 19, 2024, 2:51pm

Not sure about the data-storing but for the chunk-loading to keep it optimized and not laggy. You definitely need to utilize parallel luau.

Lite_Lion · August 19, 2024, 3:03pm

I’m pretty sure each key has its own throughput limit, right? The server goes through each key, parses it, adds it to a local chunks table, and autosaves every few minutes. Isn’t that possible?

I’m not exactly in the mood to create hundreds of different rooms, sorry… I do have another cheesy approach, and that is to make each world predetermined by their seed. However, I have literally no idea how I’m supposed to do that. My current idea would result in different worlds depending on which way the player travels, and it doesn’t even cover the side (walls and stuff) generation. If you have any info about this, could you point me in the right direction? I’ll drop the vertical axes (different floors) from worlds if I really need to.

Saving only data that changes is a bit of a hassle, since data would be stored in large keys. If a single chunk changes, the whole key would need to be overwritten.

I mean, why not? You can move through the null zone with no problem (until a certain point), but I really doubt people would go so far. The maze-like structure stops direct travel, requiring you to walk around walls and dead ends. The current code doesn’t even guarantee there is a way to go so far! And honestly, enough people know about the null zone anyway. They’ll surely cut me some slack… right? (if it ever gets popular)

Lite_Lion · August 19, 2024, 3:04pm

Yep, that’s my plan. Although, I do have to synchronise to create the parts present in the game. The heavy math can be done in parallel.

Solar_Eos · August 19, 2024, 3:41pm

No, throughput limits are shared within the server, regardless of the key you’re writing to.

And you misunderstand - I’m talking about REQUEST limits, not THROUGHPUT limits; as in, the amount of times you can make a request to the datastore, not the amount of data you can send at any point to the datastore.

Hence why I was suggesting to save smaller chunks in the first place - if you dislike that idea, then this is an evil you’ll have to deal with.

It’s just a thought I had.

Besides, I think world-gen would also mess up itself if you’re talking about numbers in the millions anyways, exactly like the Farlands in Minecraft.

To be fair, both approaches are just as tedious, it’s just what kind of tedious you’re more inclined towards.

Besides you can outsource that task to a bunch of other people. You don’t have to sit through the painful task of constructing out every single room on your own.

You can do that, by generating the worlds beforehand in Studio, and then saving it into a dedicated datastore. From there on your game only has to fetch a randomly selected world within that datastore.

I’m not inclined on this approach though, at least not until you get your saving logic optimized down to the tee.

Lite_Lion · August 19, 2024, 3:52pm

As found in:
Throughput Limits
“The following table describes per-key throughput limits”
Either this is really bad wording or the truth. I hope it’s the truth.
Thanks for the support, though. I’ll see what I can do using your suggestions.

Also, I’m not exactly concerned about key request limits. There’s one world in one server, so that should give plenty of room to work with. I can attempt to implement custom throttling if it gets out of hand.

Solar_Eos · August 19, 2024, 3:59pm

Oh, my mistake. It is on a per-key basis.

A little hard to remember the minute details off the back of my mind, I’m afraid.

system · September 2, 2024, 4:00pm

This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.