As a Roblox developer, it is currently too hard to use datastores to any capacity other than simply storing player-specific data. This is due to a Throughput Limit which is imposed on each key of a datastore at an experience-wide level.
This limit is problematic because it does not scale with player count or server count: no matter how popular your experience becomes, you are left with the exact same (strict) limit of 25MB read throughput and 4MB write throughput per key per minute.
This is absolutely abysmal for any experience that uses datastores to store experience-wide configuration data or otherwise allows players to load data that may not necessarily be connected specifically to them. This is because multiple servers may need to query the data at any point in time (this is especially the case for configuration data after a game shutdown), leading to the limit being reached and subsequent requests failing. This is especially concerning when the write quota is reached as data loss may occur in these cases.
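To put rough numbers on this, here is a minimal sketch of how many full reads or writes of a single key fit into one minute under the limits quoted above. It is written in Python rather than Luau purely so it can be run outside Roblox. The 256 KiB configuration size and the binary-megabyte interpretation of "MB" are assumptions for illustration, not figures from Roblox.

```python
MB = 1024 * 1024  # assumption: "MB" is treated as binary megabytes

READ_LIMIT_BYTES_PER_MIN = 25 * MB   # per-key read throughput quoted in this post
WRITE_LIMIT_BYTES_PER_MIN = 4 * MB   # per-key write throughput quoted in this post


def max_ops_per_minute(value_size_bytes: int, limit_bytes_per_min: int) -> int:
    """Whole reads (or writes) of one key that fit in a single minute's budget."""
    if value_size_bytes <= 0:
        raise ValueError("value_size_bytes must be positive")
    return limit_bytes_per_min // value_size_bytes


config_size = 256 * 1024  # hypothetical 256 KiB experience-wide configuration value
reads = max_ops_per_minute(config_size, READ_LIMIT_BYTES_PER_MIN)
writes = max_ops_per_minute(config_size, WRITE_LIMIT_BYTES_PER_MIN)
print(reads, writes)  # 100 16
```

Under these assumptions, the 101st server to read that key within the same minute would be the first to be throttled, regardless of how many servers the experience is running.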
MemoryStores are not a solution to this, as their memory size is far too small to handle the amount of data that a datastore key may hold. The only real solution developers have right now is to duplicate data: create multiple keys which hold the same data, then read from those keys at random or co-ordinate via a MemoryStore which key to read. This is very inefficient and not ideal. It also does not solve the case where the data may be changed by a player, as individual set requests to the duplicated keys may fail, leading to different servers getting different values.
If Roblox is able to address this issue, it would improve my development experience because I would be able to more freely make experiences which rely on querying datastores for non-player specific data without needing to mass-duplicate data.
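The duplication workaround and its failure mode can be sketched as follows. This is only an illustration in Python with a plain dictionary standing in for the datastore. The key naming scheme (`config#0`, `config#1`, ...) and the `failing` parameter used to simulate a throttled write are hypothetical.

```python
import random


def replica_keys(base_key, replica_count):
    """Names of the duplicated keys (hypothetical naming scheme)."""
    return [f"{base_key}#{i}" for i in range(replica_count)]


def write_all(store, base_key, replica_count, value, failing=frozenset()):
    """Write value to every replica; keys listed in `failing` simulate throttled writes."""
    failed = []
    for key in replica_keys(base_key, replica_count):
        if key in failing:
            failed.append(key)  # this replica keeps its old value
        else:
            store[key] = value
    return failed


def read_random(store, base_key, replica_count, rng):
    """Spread read load by picking one replica at random."""
    return store.get(rng.choice(replica_keys(base_key, replica_count)))


store = {}  # stand-in for the datastore
write_all(store, "config", 3, "v1")
failed = write_all(store, "config", 3, "v2", failing={"config#1"})
values = {store[key] for key in replica_keys("config", 3)}
print(failed, sorted(values))  # ['config#1'] ['v1', 'v2']
```

After the partial failure, a server calling `read_random` may get either `v1` or `v2` depending on which replica it happens to pick, which is the divergence described above.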
This isn’t to say there isn’t a feature request to make, but the throughput limit exists for a reason. It’s a real practical limit which exists because of the way the DataStore API is structured. Is it on the conservative side? Yes, but it could not be increased by the orders of magnitude needed to fit the usage patterns I’m guessing you’re thinking of here.
It would be more useful to discuss the particular use cases you want to fill, because “raise the limit on the existing API” isn’t a solution here. More likely, differently structured DataStore APIs need to be added to fill your needs.
I sort of wanted to stay away from directly addressing really specific use-cases (outside of my few examples) in the OP, in order to encourage others to share theirs. Guiding the feature request towards my specific set-up doesn’t benefit the wider developer community, which may use a variety of approaches to datastore structure.
But to dive into my current problematic set-up: I have an experience where users may claim rooms. Upon claiming a room, the user has their UserId stored to the datastore, with the key being the room number and the value being the UserId. However, users can access multiple rooms in quick succession, so storing each room in its own key is infeasible with the server limits. That is why I decided to store rooms in batches of 100 based on room number instead. The catch is that when a user claims a room, I need to add their entry to the bigger (batch of rooms) key, so every claim rewrites the whole batch. I end up using a lot of my write throughput and will experience data loss if my experience blows up at any point. This data loss would be due to requests stalling in the queue, possibly indefinitely, and not completing before a server shutdown.
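The batching scheme described here can be sketched roughly as below, in Python for illustration only. The key format, the assumption that rooms are numbered from 0, and the use of JSON length as a stand-in for stored size are all hypothetical. The point is that a one-entry change costs a write of the whole batch.

```python
import json

BATCH_SIZE = 100  # rooms per key, as described in the post


def batch_key(room_number: int) -> str:
    """Key of the batch holding this room (hypothetical key format)."""
    start = (room_number // BATCH_SIZE) * BATCH_SIZE
    return f"rooms_{start}_{start + BATCH_SIZE - 1}"


def claim_room(batch: dict, room_number: int, user_id: int) -> int:
    """Record the claim and return the byte size of the value that must be rewritten."""
    batch[str(room_number)] = user_id
    return len(json.dumps(batch).encode("utf-8"))


# 50 rooms in the 200-299 batch are already claimed (made-up UserIds).
batch = {str(n): 1000 + n for n in range(200, 250)}
written = claim_room(batch, 257, 987654)
single = len(json.dumps({"257": 987654}).encode("utf-8"))
print(batch_key(257), written, single)
```

Here `written` counts against the batch key's write throughput, while `single` is what the claim would have cost if each room had its own key.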
This also becomes an issue when I decide to restart servers. Not only are all the servers fighting to get their updates heard and not be left throttled by the limit, but upon the new servers starting up, I’m going to hit the read-throughput limit, throttling not only the datastore but my entire experience start-up phase.
You sure? Player data should be the benchmark you’re using for the fixed throughput limits. It is terrifying on paper and especially for large MMO-type experiences on Roblox but it’s sufficiently understood that DataStores are meant for long, persistent, low throughput operations.
I develop an RPG dungeon crawler experience; although we have next to no CCUs, we have a dedicated and loyal playerbase that regularly plays the game and helps us test our next updates. Our most dedicated player (several thousand hours and a large inventory) is only reaching around 0.13MB per get operation (the total size of their data as of 10/25/2023 @ 1:28 AM EST). Worth mentioning that we have teleports to reach each place of our experience. No compression.
Despite the key throughput limitations and considerably hulking data sets compared to most experiences, I have not run into throttling. I know the thread says “non player-specific data” and I’m talking about player data, but I don’t think there’s any difference. A lot of the nuance here depends on how you architect your DataStores, including your use case.
For a schema that’s a large table of 100 elements per store key with a pretty simple key-value structure, I really do not think that the throughput limits are genuinely going to affect you. You should scrutinise the actual size of your data under heavy pressure and use that to determine whether you would actually benefit from raised limits or if you should just redesign this feature.
There is a considerable difference: non player-specific data is more likely to be queried on multiple servers at once, leading to an increased usage of the quota. You also cannot predict when a key may be throttled, since the last read / write may have been from a different server, making it much more difficult to architect around the limit. Changing datastore structure isn’t going to help either; you cannot simply architect around this limit without removing game features or making a ton of unnecessary datastore requests to duplicate data. For a platform that builds itself around social experiences, this limit is sad to see.
I totally understand this, but there is simply no alternative. As of now, DataStoreService is the only way to have data persist indefinitely without having to set up an external web server.
I think this errs on the side of lack of tooling for dedicated servers on Roblox’s end more so than it does the throughput limits (i.e. script content running at the experience-level so you can batch updates across all alive instances and perform calls to DataStore periodically). Same goes with extending budgeting APIs to cover how much of the quota you have remaining.
Roblox has already been incredibly generous with what they offer with DataStores. As developers, we are guaranteed (nigh-)infinite free data storage and tools already set up to work with them, not to mention the increase from 256KB/value to 4MB/value that we got some years back. The limits exist for a number of reasons, including to relieve pressure on cloud services, which is especially a problem when a lot of experiences are working with poor data architectures.
The throughput limits help prevent cloud services from utterly overheating; they are a safety net for everyone. They encourage better architecture and game design, while for Roblox they make it possible to keep the promise of free unlimited data storage. Unfortunately it does mean that some experiences’ features are not workable by design or need a different structure. Roblox will always remain a social-first experience and the limits of DataStores are almost entirely irrelevant.
Are you absolutely sure that the limits deeply affect your game design? Have you performed any analytics or projections on the use of your data stores and actively concluded that you would not be able to fly under these limits? Global configurations are absolutely a valid use case for DataStores but it sounds like you’re trying to achieve medium throughput here and the limits are scaring you even though your data set sounds like it would barely reach 1-2% of the allotted per-minute key limitations, if I understand it correctly.
I’ve been considering something.
I am making a game that requires a generous amount of storage capacity, as it lets players construct their own maps, accessories, trains, and other features. Because of this, the game quickly exceeds the 25MB limit (even with compression) when loading “workshop” assets, and data loading suffers as a result. Players end up waiting in a virtual queue because other players are loading saved files at the same time. This makes the game less enjoyable and necessitates the use of MemoryStores to manage the queue.
To work around this problem, I store the commonly used assets on a dedicated storage server outside Roblox and retrieve them using HttpService and MemoryStores. Nevertheless, it’s not perfect.
It’s confusing why the storage limit affects all servers, rather than being set individually for each player or server, which is how other services in Roblox are usually treated. I wonder how large tycoon games like Theme Park Tycoon 2 manage their extensive data storage. Do they have ways to work around limitations such as this or do they just never reach this limit at all?
While I agree with you, you must admit that it doesn’t make sense to have a game-wide restriction on a platform with player counts ranging from none to loads.
And that’s why I think a limit per player or server would be superior. It would force those games with a bad data structure to have a good data structure (and thus remove the pressure on cloud services that you mentioned) because they would hit the limit rather quickly, while not punishing games that rely on data storage.
It’s precisely because of this that such restrictions are implemented, among other reasons. Roblox did not previously have these restrictions. However, that can result in incredibly heavy pressure and costs on cloud services, especially for experiences with higher player counts and bad data architecture, which a non-trivial number of experiences have (and there are resources out there that help resolve that issue).
With what Roblox offers its developers while not requiring them to pay a cent for any of that, they also need to be able to continue to support their promise of free and infinite storage for their developers while also working towards improving cloud services uptime. Heavy pressure on cloud services can cause outages or undue expenses, which we see frequently when very popular experiences update.
This already exists via the API budgeting. The limitation that OP is experiencing is an experience-wide per-key query limitation, which the vast majority of developers will never encounter even with mediocre-to-poor data architecture. Realistically, fewer than 2% of experiences (and that’s a generous, assumed figure) will consume more than 5% of the current per-key value limit and need to query that frequently, leading to consumption of the key’s experience-wide query limit. The experiences that do encounter that appropriately employ their own measures (from compression to sharding) to make sure the limit doesn’t affect them.
The limitation mostly only affects global configurations use cases and calls for either better design or developer-driven analytics/testing revealing that this is a real problem that they can’t get around through alternative measures (OP has provided no analytics and did not start off with a use case, offloading that to potential responders to make cases for).
See: “Using multiple data stores for more capacity?” One of the respondents is Coeptus, who describes how Bloxburg, a UGC-driven home building experience, can save large amounts of custom build data; that primarily happens through sharding.
DataStoreService does not provide any way to retrieve these analytics.
I’ve projected that I will exceed the limit upon having to shut down all servers for an update if my experience ever exceeds 1,000 servers. That is certainly a lot of servers, but it is not entirely unrealistic if my experience blows up, since private servers are heavily encouraged for the experience. Any player who attempts to claim a room during this ‘shutdown time’ will only add to that load, causing even more delays and issues.
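For readers unfamiliar with the term, sharding here means splitting one large value across several keys so that each key stays small and draws on its own per-key budget. The following is a generic sketch in Python with a dictionary standing in for the datastore. It is not Bloxburg's actual scheme, and the key names (`build/part0`, `build/count`) and chunk size are hypothetical.

```python
def shard(payload: bytes, max_chunk: int) -> list:
    """Split payload into chunks of at most max_chunk bytes."""
    if max_chunk <= 0:
        raise ValueError("max_chunk must be positive")
    chunks = [payload[i:i + max_chunk] for i in range(0, len(payload), max_chunk)]
    return chunks or [b""]  # keep one (empty) chunk for an empty payload


def shard_keys(base_key: str, count: int) -> list:
    """Key names for each chunk (hypothetical naming scheme)."""
    return [f"{base_key}/part{i}" for i in range(count)]


def save(store: dict, base_key: str, payload: bytes, max_chunk: int) -> None:
    """Write each chunk under its own key, plus a small key recording the chunk count."""
    chunks = shard(payload, max_chunk)
    for key, chunk in zip(shard_keys(base_key, len(chunks)), chunks):
        store[key] = chunk
    store[f"{base_key}/count"] = len(chunks)


def load(store: dict, base_key: str) -> bytes:
    """Read the chunk count, then reassemble the chunks in order."""
    count = store[f"{base_key}/count"]
    return b"".join(store[key] for key in shard_keys(base_key, count))


store = {}  # stand-in for the datastore
payload = bytes(range(256)) * 10  # 2560 bytes of made-up build data
save(store, "build", payload, max_chunk=1000)
print(store["build/count"], load(store, "build") == payload)  # 3 True
```

The trade-off is more requests per save or load, plus the need to handle a save where only some of the chunks were written.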
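A projection like this can be reproduced with a few lines of arithmetic. The sketch below is in Python and uses a deliberately simplified model: every server reads the same key once after the restart, throttled reads are retried until budget frees up, and "MB" is treated as a binary megabyte. The 256 KiB value size is a hypothetical figure, not one taken from this thread.

```python
MB = 1024 * 1024  # assumption: "MB" is treated as binary megabytes
READ_LIMIT_BYTES_PER_MIN = 25 * MB  # per-key read limit quoted earlier in the thread


def minutes_to_serve_all(server_count: int, value_size_bytes: int,
                         limit: int = READ_LIMIT_BYTES_PER_MIN) -> int:
    """Minutes until every server has read the key once (simplified model)."""
    total = server_count * value_size_bytes
    return -(-total // limit)  # ceiling division


def max_value_size_for_one_minute(server_count: int,
                                  limit: int = READ_LIMIT_BYTES_PER_MIN) -> int:
    """Largest value, in bytes, that all servers could read within one minute."""
    if server_count <= 0:
        raise ValueError("server_count must be positive")
    return limit // server_count


print(minutes_to_serve_all(1000, 256 * 1024))   # 10
print(max_value_size_for_one_minute(1000))      # 26214
```

Under this model, 1,000 servers restarting at once would need about ten minutes to all read a 256 KiB key, or the key would have to stay under roughly 26 KB for every server to read it within the first minute.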
When I say analytics, I mean something you produce yourself rather than something the cloud services provide; as in, some testing/projections on the numbers here. It’d be more accurate to say results of real testing: you can pressure the services yourself by making a volume of calls that should be accurate to the high CCU you’re expecting to receive, and see if you encounter problems there. If you do, and there are absolutely no other solutions you can look at, then you may feel an increase in the per-key limit is warranted.
I strongly feel that the experience-wide per-key limitations are a huge boogeyman here and that you won’t actually encounter the throttling in question, or that some parts of the game’s design need to be worked around the limitations. I get that’s sort of a big ask, but it’s a more monumental ask to have these scale further unless it affects a large number of developers, which it won’t. Niche use cases like this typically don’t get picked up by engineers without strong persuasion.
It’s valid, but it doesn’t feel like a strong case.