Hello there, I’ve been a developer for a long time but recently started poking around the Roblox ecosystem. I found the Data Store limitations an interesting problem to solve and built a proof-of-concept implementation for my game project.
I don’t know if this is the right forum, or whether it is considered good form, but I have questions about someone else’s (publicly available, published) code, specifically DataStore2.
Prior to discovering the DataStore2 package, I built my own simple caching solution that throttles updates to the store. It seems to work reasonably well as a proof-of-concept. Later, I discovered DataStore2 exists and given the number of references to it and claims of reliability, I started digging into the implementation.
Functionally, our caching mechanisms are very similar (trivially, both hold a per-player table in memory), but beyond that, the designs for preventing data loss diverge.
My concept was this: hold data store updates in memory just long enough to avoid hitting API quota limits, then write them. The frequency of writes is then tuned to account for the number of data stores you want to write to.
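To make the idea concrete, here is a minimal, language-neutral sketch in Python (not Luau, and not my actual implementation). All names here (`ThrottledWriteCache`, `write_fn`, `interval_for`, the per-minute budget) are hypothetical and only illustrate the concept: keep the latest value in memory, and write a key out at most once per throttle window.

```python
import time


class ThrottledWriteCache:
    """Keeps the latest value per key in memory and writes each key out
    at most once per `min_interval` seconds. Hypothetical sketch only."""

    def __init__(self, write_fn, min_interval, clock=time.monotonic):
        self._write_fn = write_fn          # stands in for the real store write
        self._min_interval = min_interval  # throttle window, in seconds
        self._clock = clock                # injectable so it can be tested
        self._cache = {}                   # key -> latest value
        self._dirty = set()                # keys changed since their last write
        self._last_write = {}              # key -> time of last write

    def set(self, key, value):
        # Updates only touch memory; repeated sets collapse into one write.
        self._cache[key] = value
        self._dirty.add(key)

    def get(self, key, default=None):
        return self._cache.get(key, default)

    def flush_due(self):
        """Write dirty keys whose throttle window has elapsed.
        Returns the number of writes performed."""
        now = self._clock()
        written = 0
        for key in list(self._dirty):
            last = self._last_write.get(key)
            if last is None or now - last >= self._min_interval:
                self._write_fn(key, self._cache[key])
                self._last_write[key] = now
                self._dirty.discard(key)
                written += 1
        return written

    def flush_all(self):
        """Force-write everything dirty, e.g. on player leave or shutdown."""
        for key in list(self._dirty):
            self._write_fn(key, self._cache[key])
            self._last_write[key] = self._clock()
            self._dirty.discard(key)


def interval_for(num_stores, writes_per_minute_budget):
    """Seconds between writes per store so that all stores together stay
    within an assumed per-minute write budget."""
    return 60.0 * num_stores / writes_per_minute_budget
```

A periodic loop would call `flush_due()` every few seconds, and `flush_all()` would run when the player leaves. The budget passed to `interval_for` is an assumed number, not a real Roblox quota.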
DataStore2 takes a different approach: it stores everything in memory until the player leaves or the server shuts down, and only then writes (unless explicitly asked to save). It also uses the so-called “OrderedBackups” or “berezaa’s method” of saving data, which appears to save every version of a player’s data as it changes over time.
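As I understand the general idea (this is my own simplified model in Python, not DataStore2’s actual code), every save goes under a new version key, and an ordered index records which versions exist so that the newest one can be found on load. The class and method names below are hypothetical, and both "stores" are plain in-memory structures.

```python
class OrderedBackupStore:
    """Simplified per-player model of versioned saves with an ordered index.
    Hypothetical sketch; not DataStore2's implementation."""

    def __init__(self):
        self._data = {}   # stands in for a regular store: version -> data
        self._index = []  # stands in for an ordered index of version numbers

    def save(self, value):
        # Each save gets a fresh, increasing version number.
        version = (self._index[-1] + 1) if self._index else 1
        self._data[version] = value   # write the data first
        self._index.append(version)   # then publish the version in the index
        return version

    def load(self):
        # Walk from newest to oldest; fall back if a version's data is missing.
        for version in reversed(self._index):
            if version in self._data:
                return self._data[version]
        return None
```

In this model old versions are never overwritten, which is what gives the fallback behaviour in `load()`, at the cost of an extra index write per save.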
This raised some questions in my mind that I cannot find answers to.
NOTE: These questions are asked out of curiosity. I assume I may be missing something, and/or that DataStore2 was developed with requirements or limitations that I do not understand or that are no longer relevant (it looks like it was developed a few years ago).
- Why is it a good idea to write data to the store only when the player leaves (again, unless the save API is explicitly invoked, which is clearly not the “default/encouraged” way to use it) when the server could potentially shut down at any time? While it may be rare, I assume servers sometimes crash for any number of reasons (developer error, Roblox service problems), and a system that holds updates only in memory until the player leaves seems to have a significant hole in reliability. In theory, a trivial hedge against this risk is to occasionally write the cached data.
- Why is “OrderedBackups” or “berezaa’s method” of saving data needed at all when Versioning appears to do the same thing? I assume Versioning is more performant given that it leverages internal APIs, and there is now more support for managing the store through the DataStore cloud API.
- The DataStore2 docs claim:
“In normal data stores, you’d save all your data into one giant player data table to minimize data loss/throttling.”
I believe this claim rests on the assumption that DB writes happen very infrequently, given that, in a trivial example, writing 256KB once to one store is less efficient than writing 16KB to three stores (assuming the bottleneck is moving the bits, not the connection setup/teardown). Is that the case? Are data store connections so fragile that making 1 write is that much more reliable than making 3 writes?
Expanding on this a bit (perhaps this is related), I’ve never quite understood this piece of the Roblox Data Stores documentation, “Create Fewer Data Stores”. It says:
“Data stores behave similarly to tables in databases. Minimize the number of data stores in an experience and put related data in each data store. This approach allows you to configure each data store individually (versioning and indexing/querying) to improve the service’s efficiency to operate the data. As more features become available, this approach also allows you to more easily manage data stores.”
Specifically, why “Minimize the number of data stores…”? Maybe I’m overthinking this, but if data stores act just like tables in a database, I would find it strange to read a database doc saying, “Minimize the number of tables you make…”. The number of tables an app needs is dictated by the requirements of the app and the database design, nothing more and nothing less. So the only translation of this that makes sense to me is, “Do good database design”, which feels out of place here at best. But maybe this documentation is just geared toward folks who are inexperienced with databases. Or, again, maybe I don’t understand.
That’s all I’ve got for now… again, take this in the spirit it is intended - it is not a criticism of DataStore2. I am trying to learn more about the limitations and constraints Roblox developers must take into account when building a reliable interface to data stores. These are questions I’ve been unable to find answers to in documentation or other forum posts.
Thank you!
@Kampfkarren if you have a minute, would love your thoughts
Edit: here’s a link to a review of my own (possibly naive) caching implementation