Table of Contents
- Introduction
- Request Flow
- Overview of Limits & Budgeting
- Budget Consumption
- Budget Limits through Command Bar / Plugins
- Notes on Caching
- Notes on UpdateAsync
- Notes on OnUpdate
- Conclusion
1. Introduction
This is a complete, comprehensive overview of all sorts of implementation details and gotchas regarding Roblox’s datastores from the perspective of the DataStoreService API that are not properly described on the Developer Hub at time of writing.
The reason for writing is that I recently wrote the MockDataStoreService module, which is a near-perfect emulation of datastores in Roblox-Lua. While making this system, I found a ton of subtleties and issues with datastores that I haven’t seen properly explained/documented anywhere, which may be useful to know about for certain edge cases or if you are pushing the boundaries with the Datastore limits.
Moreover, some of the information on Datastores on the Developer Hub is clearly incorrect. I will file some documentation requests after posting this thread in an attempt to fix that.
2. Request Flow
This section discusses in detail how Datastore requests, request throttling, and request throwing (=erroring) works.
Flowchart for Datastore requests
The main path of a successful/failing Datastore request is given below.
First of all, the input parameters of the request are checked. If these are invalid, the request fails immediately without yielding and without consuming any budget.
Secondly, read caching constraints are checked. If the request is a get request, and the key was recently touched within the cache cooldown (see Overview of Limits & Budgeting), the request immediately returns the recently obtained value, without yielding or consuming budget.
Thirdly, throttling checks are performed. If budget for the request is not available, the request will be throttled until budget is available. Similarly, if the request is a write request (set/remove/increment/update), and the key was recently written to within the write cooldown (see Overview of Limits & Budgeting), the request is also throttled until the write cooldown has worn off.
If there is budget available and no write cooldown is violated (anymore), appropriate budget is consumed and the remote call is performed, and the thread will yield until a response is received from the external data layer. (see the next subsection for more details on throttling)
If the remote call failed due to whatever reason there may be (service unavailable, service congested, corrupted data, etc), the request will throw an error.
If the remote call was successful, then depending on what kind of call was used, the read cache for that key will be updated (see Notes on Caching), and finally the result of the request will be returned.
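Since a failed remote call throws, callers normally wrap requests in pcall. The following is a minimal sketch of that pattern with a simple retry. The names `tryRequest`, `maxAttempts`, `waitFn` and `fakeStore` are made up for illustration; the wait function is injected so the snippet also runs outside Roblox (in a live game you would pass something like `task.wait`).

```lua
-- Hypothetical helper: runs `request` under pcall and retries on error.
-- `waitFn` is injected so the sketch runs outside Roblox.
local function tryRequest(request, maxAttempts, waitFn)
    local lastError
    for attempt = 1, maxAttempts do
        local ok, result = pcall(request)
        if ok then
            return true, result
        end
        lastError = result
        if attempt < maxAttempts and waitFn then
            waitFn(attempt) -- simple linear backoff: wait longer each attempt
        end
    end
    return false, lastError
end

-- Stand-in for a datastore whose first two calls fail (not a real Roblox object).
local calls = 0
local fakeStore = {}
function fakeStore:GetAsync(key)
    calls = calls + 1
    if calls < 3 then
        error("simulated service failure")
    end
    return "value for " .. key
end

local ok, value = tryRequest(function()
    return fakeStore:GetAsync("TestKey")
end, 5, function() end)
print(ok, value, calls)
```

Keep in mind that every attempt that reaches the remote call consumes budget, even if it errors, so retrying aggressively can itself lead to throttling.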
Request throttling
When a request does not pass the throttling checks and therefore cannot be completed right away, it is “throttled”. This means the request will yield for longer than usual until the right conditions are met.
A request can be throttled for the following reasons:
- There is no budget to complete the request – a warning will be thrown specifying this, and the request yields until budget is sufficient and no other throttling checks are violated.
- The write cooldown is violated (see Overview of Limits & Budgeting) – a warning will be thrown specifying this, and the request yields until the write cooldown is no longer violated, and no other throttling checks are violated.
Throttling queues
When requests are throttled as described above, they are placed in a so-called throttling queue. The throttling queue is a backlog of requests that are yielded and should be performed as soon as they can be.
Throttling queues operate on the Leaky Bucket Principle: throttled requests will flow out at some rate defined by the throttling checks (i.e. budgeting and write cooldown), and the bucket has a limited size. Once the size of the bucket has been reached, no more throttled requests can be added.
Every actual budget type (GetAsync, SetIncrementAsync, GetSortedAsync, OnUpdate, SetIncrementSortedAsync) has its own throttling queue. Each of these five throttling queues has a maximum size of 30 throttled requests. Throttled requests are added to the queue of the budget type that they consume.
Requests that are currently being executed do not take up space in the throttling queue. Only the requests that are still waiting for budget or for the write cooldown to pass take up space in the queues. Therefore, if you have 100 GetAsync budget and you send out 100 GetAsync requests simultaneously, this will not overflow any throttling queue: there is budget for each request and no write cooldown is violated, so all of the requests execute immediately and are never placed in the queue.
Request throwing due to throttling
If a throttled queue is full, and another throttled request is attempted to be added, that request will throw an error immediately instead and is thus not executed. The error will indicate that the request was discarded due to the throttle queue being full.
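To make the queue behaviour concrete, here is a toy model of a single budget type. It is only a sketch of the rules described above, not Roblox's implementation: it ignores budget refill and the write cooldown, and the names `newBudgetModel` and `submit` are made up.

```lua
local QUEUE_LIMIT = 30 -- per budget type, as stated above

-- Toy model of one budget type: requests run while budget is left,
-- then wait in the throttling queue, then error once the queue is full.
local function newBudgetModel(budget)
    return { budget = budget, queued = 0, executed = 0, dropped = 0 }
end

local function submit(model)
    if model.budget > 0 then
        model.budget = model.budget - 1
        model.executed = model.executed + 1
        return "executed"
    elseif model.queued < QUEUE_LIMIT then
        model.queued = model.queued + 1
        return "throttled"
    else
        model.dropped = model.dropped + 1
        return "error: throttle queue full"
    end
end

-- 140 simultaneous requests against a budget of 100.
local model = newBudgetModel(100)
for _ = 1, 140 do
    submit(model)
end
print(model.executed, model.queued, model.dropped) -- 100 30 10
```

With exactly 100 requests, nothing would be queued at all, which matches the example in the previous subsection.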
Request throwing due to other reasons
Apart from errors related to throttling, Datastores can also throw errors for invalid input (i.e. storing an invalid value, invalid key names, invalid scopes, invalid Datastore names, invalid parameter types, etc). These kinds of errors will not consume the corresponding budget of the request.
Datastores may also cause errors if the remote server determines that the request is malformed. For example, if you are trying to call GetSortedAsync with a minimum value that is higher than the maximum value, the remote server will reject the request for invalid input. However, as it is a remote call, Datastore budget will be consumed for such erroneous requests. Another example is if you are trying to call IncrementAsync on a key whose stored value is not an integer; this will need to perform a remote call before it realizes that the request is erroneous, and therefore will also consume budget but will still throw an error.
Errors can also be thrown due to service failures or malformed data. These requests will also consume budget as they ultimately attempt a remote call before erroring.
This hub article gives a decent overview of some of the errors that the API can throw:
https://developer.roblox.com/articles/Datastore-Errors
Throttled requests may (and probably will) be processed out of order
It is called a throttle queue, but it does not actually adhere to queuing principles like first-in-first-out or even last-in-first-out. Requests may be resumed from the throttling queue in any order; no order is guaranteed, and they will commonly be resumed out of order. So just because throttled request A was called earlier than throttled request B doesn’t mean A is resumed before B.
I don’t expect this to be an issue in practice for most applications, because you shouldn’t be relying on the order of these requests being LIFO or FIFO.
3. Overview of Limits & Budgeting
This section gives an overview of all limits regarding Datastores, and describes how exactly budgeting is initialized and updated.
Size limits
These are the size limits for various fields.
Property | Size Limit |
---|---|
Name | 50 |
Scope | 50 |
Key | 50 |
Data* | 4194303 |
*) Note on data size: it is recommended to still check for something like <= 4,000,000 on the data size, because the little bit of additional space is probably to account for overhead for storing the data, which could be changed in the future.
If the data on a key is a boolean or number, it can always be stored. You cannot store nil on a key using SetAsync or UpdateAsync; this will throw an error. You need to use RemoveAsync to set the data to nil on a specific key.
If the data is a table, you should check if the JSON-encoded representation of that table does not exceed the data limit (but preferably still use 4,000,000).
Notes regarding string length and string characters:
- For string length checking, this is a bit more involved than with tables. Strings can contain non-regular characters that are escaped as \uXXXX. This could mean that the actual size of your string is much longer than the string length that Lua observes (i.e. a non-regular character can take up 6 characters of space).
- Datastores cannot accept string characters with an ASCII value of over 127 unless they are part of a valid UTF-8 character. For most developers this should not be an issue they typically run into, but it’s good to know.
- Datastores correctly handle \0 characters, unlike most of the Lua API.
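The checks above could be bundled into a pre-flight validation, sketched below. This is my own illustration under a few assumptions: `checkRequest` is a made-up name, the encoder is injected (in Roblox you could pass a function that calls HttpService:JSONEncode), and I assume the JSON-encoded length is a reasonable estimate of the stored size for both tables and strings.

```lua
local MAX_KEY_LENGTH = 50
local SAFE_DATA_LIMIT = 4000000 -- conservative bound suggested above

-- Hypothetical pre-flight check. `encode` turns a value into its JSON string;
-- it is injected so this sketch runs outside Roblox.
local function checkRequest(key, value, encode)
    if type(key) ~= "string" or #key > MAX_KEY_LENGTH then
        return false, "invalid key"
    end
    if value == nil then
        return false, "nil cannot be stored; use RemoveAsync"
    end
    local valueType = type(value)
    if valueType == "boolean" or valueType == "number" then
        return true -- always small enough
    end
    if valueType == "string" or valueType == "table" then
        -- Escaped characters (\uXXXX) are counted at their escaped length here.
        if #encode(value) > SAFE_DATA_LIMIT then
            return false, "data too large"
        end
        return true
    end
    return false, "unsupported type"
end

-- Stand-in encoder for demonstration only: quotes strings, and fakes a
-- table encoding of `value.size` characters.
local function fakeEncode(value)
    if type(value) == "string" then
        return '"' .. value .. '"'
    end
    return string.rep("x", value.size)
end

print(checkRequest("Player_1", { size = 1200 }, fakeEncode))
print(checkRequest("Player_1", { size = 4000001 }, fakeEncode))
print(checkRequest(string.rep("a", 51), 1, fakeEncode))
```

Doing these checks yourself gives you a descriptive error before the request is even made.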
Cache cooldown
The cache cooldown is not the same as the Developer Hub reports. The Developer Hub reports 4 seconds at the time of writing, but it is actually 5 seconds. The cache cooldown affects how GetAsync works. Refer to Notes on Caching.
Write cooldown
The write cooldown is the same as the Developer Hub reports, 6 seconds. The write cooldown affects how SetAsync, IncrementAsync, UpdateAsync, and RemoveAsync work.
If you try to write (set/increment/update/delete) a key multiple times within the write cooldown time, then those additional requests will be throttled until they can be executed without violating the write cooldown. If too many requests of that type are currently throttled, it is possible for the request to throw an error instead. Refer to Request Flow.
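If you want to avoid running into the write cooldown in the first place, you can track the last write time per key yourself. Below is a minimal sketch; `newWriteGate` and its methods are made-up names, and the clock value is passed in (in Roblox you could use os.clock()). Whether Roblox measures the cooldown from the start or the end of a write is not something I verified, so treat the timing as approximate.

```lua
local WRITE_COOLDOWN = 6 -- seconds, per the text above

-- Hypothetical per-key tracker of the last write time.
local function newWriteGate()
    local lastWrite = {} -- key -> time of the last write
    local gate = {}

    -- Returns 0 if the key can be written now, else the seconds left to wait.
    function gate.timeUntilWritable(key, now)
        local last = lastWrite[key]
        if last == nil then
            return 0
        end
        return math.max(0, WRITE_COOLDOWN - (now - last))
    end

    function gate.markWritten(key, now)
        lastWrite[key] = now
    end

    return gate
end

local gate = newWriteGate()
gate.markWritten("Player_1", 100)
print(gate.timeUntilWritable("Player_1", 102)) -- 4 seconds left
print(gate.timeUntilWritable("Player_1", 106)) -- cooldown has passed
print(gate.timeUntilWritable("Player_2", 102)) -- never written, so no wait
```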
Datastore budgeting table
The following table gives an overview of how the budgets are initialized, incremented on a base rate and a per-player rate per minute, and what the maximum backlogged budget is for each type.
Budget Type | Start | Base rate | Per-player rate | N Player Max (for N >= 0) |
---|---|---|---|---|
GetAsync | 100 | 60 | 40 | 3 * (60 + N * 40) |
SetIncrementAsync | 100 | 60 | 40 | 3 * (60 + N * 40) |
GetSortedAsync | 10 | 5 | 2 | 3 * (5 + N * 2) |
SetIncrementSortedAsync | 100 | 30 | 5 | 3 * (30 + N * 5) |
OnUpdate | 30 | 30 | 5 | 1 * (30 + N * 5) |
“UpdateAsync” | 100 | 60 | 40 | 3 * (60 + N * 40) |
The following subsections discuss how exactly these values are used.
Initial budgets
Once DataStoreService starts existing (i.e. is obtained through GetService), its budgeting is kickstarted. At this point, all budget types will return the Start value.
Budgets do not start updating over time until you do the GetService call.
Incrementing budgets over time
Budgets are not updated once every minute, but rather smoothly increased over time. For example, if the rate per minute is 60, you would get approximately +1 request per second on the budget, rather than +60 at once every minute.
Maximum backlogged budgets
If you don’t use up your Datastore budget, you will develop a backlog of requests that you can use at a later point. This is displayed in the last column. For example, if you have 0 players in a game, you can accumulate up to 3 * 60 = 180 GetAsync budget. Similarly, with 5 players online in a game, you can accumulate up to 3 * (5 + 5*2) = 45 GetSortedAsync budget.
The common formula is that it is a factor of 3 times the total per-minute increment rate at that given moment (depending on base rate, per-player rate, and number of players at that moment). In other words, you reach the maximum backlogged budget if you don’t do any Datastore calls for 3 minutes.
In the case that you are at the maximum backlogged budget, and a player leaves, the budget is cut down to the maximum allowed backlogged budget with the new amount of players. For example, if the GetAsync budget is maxed out at 3 * (60 + 3 * 40) = 540 with 3 players, and one player leaves, the budget is instantly decreased to 3 * (60 + 2 * 40) = 420 to match the new player count.
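The budgeting rules in this section can be written down as a small model. This is a sketch of the formulas in the table, not Roblox's code; the function names are made up, and whether Roblox tracks fractional budget internally is an assumption on my part.

```lua
local BACKLOG_FACTOR = 3 -- live servers; see the table above

local function ratePerMinute(base, perPlayer, players)
    return base + players * perPlayer
end

local function maxBudget(base, perPlayer, players)
    return BACKLOG_FACTOR * ratePerMinute(base, perPlayer, players)
end

-- Advance a budget by `dt` seconds at the current rate, then clamp it to
-- the maximum for the current player count.
local function step(budget, base, perPlayer, players, dt)
    local grown = budget + ratePerMinute(base, perPlayer, players) * dt / 60
    return math.min(grown, maxBudget(base, perPlayer, players))
end

-- GetAsync with 0 players: the maximum is 3 * 60.
print(maxBudget(60, 40, 0))
-- GetSortedAsync with 5 players: the maximum is 3 * (5 + 5 * 2).
print(maxBudget(5, 2, 5))
-- Starting at 100 GetAsync budget, 30 seconds at 60 per minute adds 30.
print(step(100, 60, 40, 0, 30))
-- A maxed-out GetSortedAsync budget of 45 is cut to 3 * (5 + 4 * 2)
-- as soon as one of the 5 players leaves.
print(step(45, 5, 2, 4, 0))
```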
Extra budgets when the server closes (OnClose)
When the server closes, you may use more Datastore requests in that time period than you would otherwise to store, for example, remaining player data. To accommodate for this, Roblox moves your budget up to a lower bound if it is below that value currently. These are the bounds:
Budget Type | Base rate | OnClose Minimum Bound |
---|---|---|
GetAsync | 60 | 2.5 * 60 = 150 |
SetIncrementAsync | 60 | 2.5 * 60 = 150 |
GetSortedAsync | 5 | 2.5 * 5 = 12.5, rounded down to 12 |
SetIncrementSortedAsync | 30 | 2.5 * 30 = 75 |
OnUpdate* | 30 | 2.5 * 30 = 75 |
“UpdateAsync” | 60 | 2.5 * 60 = 150 |
For example, if your GetAsync budget is currently 23, it will be bumped up to 150 before OnClose calls run. If your GetSortedAsync budget is currently 15, it won’t be bumped down to 12, since 15 is already higher than the boundary.
The general formula is that the minimum bound is 2.5 times the normal rate-per-minute without players, rounded down to the nearest integer.
*) Note that the OnUpdate bound is technically useless, since you wouldn’t be connecting new OnUpdate connections in OnClose anyway.
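Expressed as code, the OnClose bump is just a floor and a max. A minimal sketch (the function name `onCloseBudget` is made up):

```lua
-- Budget after the OnClose bump: never below floor(2.5 * base rate),
-- and never lowered if it is already above that bound.
local function onCloseBudget(current, baseRate)
    local minimum = math.floor(2.5 * baseRate)
    return math.max(current, minimum)
end

print(onCloseBudget(23, 60)) -- 150: GetAsync is bumped up
print(onCloseBudget(15, 5))  -- 15: GetSortedAsync is already above 12
```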
4. Budget Consumption
This section describes how each call to the API reduces the Datastore budgets as retrieved by GetRequestBudgetForRequestType.
The Developer Hub reports the following table for which Datastore requests use which budgets:
https://developer.roblox.com/api-reference/enum/DataStoreRequestType
However, at the time of writing, this table is absolutely wrong. The table claims, for example, that GlobalDataStore::IncrementAsync uses SetIncrementSortedAsync budget, and that SetIncrementAsync budget is only used by the SetAsync and IncrementAsync calls.
The following subsections provide the correct budget consumption tables for all Datastore requests.
Consumption table for GlobalDataStores (all non-OrderedDataStores)
Method | GetAsync | SetIncrementAsync | GetSortedAsync | SetIncrementSortedAsync | OnUpdate | UpdateAsync |
---|---|---|---|---|---|---|
GetAsync | 1 | | | | | N/A |
SetAsync | | 1 | | | | N/A |
IncrementAsync | | 1 | | | | N/A |
UpdateAsync | 0/1* | 1 | | | | N/A |
RemoveAsync | | 1 | | | | N/A |
OnUpdate | | | | | 1** | N/A |
*) UpdateAsync uses 1 GetAsync budget (in conjunction with 1 SetIncrementAsync budget) when the specified key has never been obtained by the server before.
**) OnUpdate only consumes budget when first connecting it; triggering the connection any number of times afterwards consumes no budget whatsoever.
Consumption table for OrderedDataStores
Method | GetAsync | SetIncrementAsync | GetSortedAsync | SetIncrementSortedAsync | OnUpdate | UpdateAsync |
---|---|---|---|---|---|---|
GetAsync | 1 | | | | | N/A |
SetAsync | | | | 1 | | N/A |
IncrementAsync | | | | 1 | | N/A |
UpdateAsync | 0/1* | | | 1 | | N/A |
RemoveAsync | | | | 1 | | N/A |
GetSortedAsync | | | 1 | | | N/A |
OnUpdate | | | | | 1** | N/A |
*) UpdateAsync uses 1 GetAsync budget (in conjunction with 1 SetIncrementSortedAsync budget) when the specified key has never been obtained by the server before.
**) OnUpdate only consumes budget when first connecting it; triggering the connection any number of times afterwards consumes no budget whatsoever.
Consumption table for DataStorePages:
Method | GetAsync | SetIncrementAsync | GetSortedAsync | SetIncrementSortedAsync | OnUpdate | UpdateAsync |
---|---|---|---|---|---|---|
AdvanceToNextPageAsync | | | 1 | | | N/A |
Why “N/A” for UpdateAsync?
There is actually no “UpdateAsync” request budget, it is a fake value.
Instead, what you get back when you request this budget, is just the minimum of the GetAsync and SetIncrementAsync budgets. There is no separate counter for an “UpdateAsync” budget internally, it is always returned as the minimum of those two budgets.
Why is that the case?
An UpdateAsync call for GlobalDataStores may need to consume a unit from both GetAsync and SetIncrementAsync budget, if the key has not been recently fetched through another GetAsync/UpdateAsync/IncrementAsync call on that same key.
Therefore, to ensure that developers can always perform an UpdateAsync call when the “UpdateAsync” budget is > 0, this “UpdateAsync” budget must be the minimum of the GetAsync/SetIncrementAsync budgets. If one of them is 0 while the other is above 0, that means the “UpdateAsync” budget will be 0, which is correct, since if an UpdateAsync needs to be performed that needs to consume from both budgets, this request would be throttled.
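The relationship between the real counters and the reported “UpdateAsync” value can be illustrated with a toy model for GlobalDataStores. All names here are made up, and the model assumes that a get is only charged the first time a key is touched, as described in the footnote of the consumption table.

```lua
-- Toy model of the two real counters behind GlobalDataStore::UpdateAsync.
local function newBudgets(get, set)
    return { GetAsync = get, SetIncrementAsync = set, seenKeys = {} }
end

-- The reported "UpdateAsync" budget is just the minimum of the two.
local function updateAsyncBudget(budgets)
    return math.min(budgets.GetAsync, budgets.SetIncrementAsync)
end

local function consumeUpdateAsync(budgets, key)
    if not budgets.seenKeys[key] then
        budgets.GetAsync = budgets.GetAsync - 1 -- first touch also costs a get
        budgets.seenKeys[key] = true
    end
    budgets.SetIncrementAsync = budgets.SetIncrementAsync - 1
end

local budgets = newBudgets(10, 3)
print(updateAsyncBudget(budgets)) -- 3
consumeUpdateAsync(budgets, "TestKey") -- costs 1 get + 1 set
consumeUpdateAsync(budgets, "TestKey") -- costs 1 set only
print(budgets.GetAsync, budgets.SetIncrementAsync) -- 9 1
print(updateAsyncBudget(budgets)) -- 1
```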
5. Budget Limits through Command Bar / Plugins
This section describes how budgeting differs when DataStoreService is used in Studio via plugins or the command bar.
As can be observed by comparing the table below to the tables in Overview of Limits & Budgeting, the starting budget and the rate per minute are the same as in live instances with 0 players (except for the OnUpdate starting budget, which is 15 rather than 30). However, the maximum backlogged budget is not 3 times the total increment rate, but rather 100 times. Obviously, this is intended so that there are fewer restrictions on using Datastores through the Studio command bar / plugins.
Budget Type | Start | Rate | Max in Studio (N = 0) |
---|---|---|---|
GetAsync | 100 | 60 | 100 * (60 + 0 * 40) = 6000 |
SetIncrementAsync | 100 | 60 | 100 * (60 + 0 * 40) = 6000 |
GetSortedAsync | 10 | 5 | 100 * (5 + 0 * 2) = 500 |
SetIncrementSortedAsync | 100 | 30 | 100 * (30 + 0 * 5) = 3000 |
OnUpdate* | 15 | 30 | 100 * (30 + 0 * 5) = 3000 |
“UpdateAsync” | 100 | 60 | 100 * (60 + 0 * 40) = 6000 |
*) See Notes on OnUpdate for an issue with budgeting due to which these budget values are not actually achieved for OnUpdate in Studio. The other values in this table are actually achieved.
Effectively, this means you need to wait 100 minutes without doing any Datastore requests to reach the maximum backlogged budgets in Studio.
6. Notes on Caching
This section describes how caching works in Datastores.
Cache cooldown revisited
As discussed in Request Flow, if a get request is performed on a key, and the key was recently touched within the cache cooldown (see Overview of Limits & Budgeting), then the get request will return instantly with the cached value of that key.
The cache cooldown currently is 5 seconds. So for example, if I perform a GetAsync request on “TestKey” right now, and again 2 seconds later, that second request will return instantly because the value for “TestKey” was cached.
Datastore requests that set the read cache
The following requests on a key will set the read cache for that key:
- GetAsync
- IncrementAsync
- UpdateAsync
This means that if I perform an operation on a key with any of these methods, and then do a GetAsync on that key afterwards (within the cache cooldown, 5 seconds), the latter will return immediately without yielding at all and consumes no budget.
It makes sense that GetAsync sets the read cache. However, it does not make sense that IncrementAsync sets the read cache, as that method never consumes GetAsync budget. Therefore, you could technically get a value with only 1 SetIncrementAsync budget using IncrementAsync followed immediately by a GetAsync (the latter would be free).
Moreover, while it does make sense that UpdateAsync sets the read cache if that key was never obtained before (since UpdateAsync will then also consume from the GetAsync budget), it does not make sense that UpdateAsync also sets the read cache when it does not consume from GetAsync budget. You can perform a similar trick here as with IncrementAsync (see previous paragraph).
It is unclear why it was decided that the IncrementAsync and UpdateAsync methods should set the read cache.
Datastore requests that don’t set the read cache
The other requests do not set the read cache for that key:
- RemoveAsync
- SetAsync
- OnUpdate
Performing these methods on a key will not set the read cache for that key that GetAsync uses. In other words, performing a GetAsync request on that same key within 5 seconds will still cause the GetAsync to be a regular remote call that does consume budget and is subject to other throttling checks.
It makes sense that none of these methods set the read cache time for a key, as none of them are able to consume GetAsync budget.
Obviously, requests that don’t take a key, such as OrderedDataStore::GetSortedAsync and DataStorePages::AdvanceToNextPageAsync, do not set any read cache times for any key either.
What caching does
Caching is useful because it lets you fetch the new value only every so often rather than every time. This saves on budgeting and unnecessary yielding due to excessive remote calls. This is especially useful for less experienced developers who would, perhaps, write code like this:
local money = data:GetAsync(playerKey).Money
local items = data:GetAsync(playerKey).Items
local pets = data:GetAsync(playerKey).Pets
-- (...)
Due to caching, only the first GetAsync here will perform a remote call and use budget. The other calls will not use budget and go through instantly with the cached value.
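Rather than relying on the cache, the same data can be read with a single call. A minimal sketch follows; `loadPlayerData` is a made-up helper, and `fakeStore` is a stand-in object so the snippet runs outside Roblox (in a game, `store` would be your actual datastore).

```lua
-- Hypothetical helper: fetch the key once and read all fields from the result.
local function loadPlayerData(store, playerKey)
    local ok, saved = pcall(function()
        return store:GetAsync(playerKey)
    end)
    if not ok then
        return nil, saved -- `saved` holds the error message here
    end
    saved = saved or {} -- a key that was never written yields nil
    return { money = saved.Money, items = saved.Items, pets = saved.Pets }
end

-- Stand-in store that counts how often GetAsync is called.
local getCalls = 0
local fakeStore = {}
function fakeStore:GetAsync(key)
    getCalls = getCalls + 1
    return { Money = 50, Items = { "Sword" }, Pets = {} }
end

local playerData = loadPlayerData(fakeStore, "Player_1")
print(playerData.money, #playerData.items, getCalls) -- 50 1 1
```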
Obviously, you want it to be possible for the read cache to be busted, such that you are not stuck getting cached values all the time. You would expect that after 5 seconds, the cached value for that key would be timed out, and you can fetch new values again after that.
How Roblox implemented caching
EDIT: Roblox has since fixed Datastore caching. See the thread “Datastores: Caching is fundamentally broken (with details on how to fix logic-wise)”.
The cache time is actually set every time that you touch a key with a method that sets the read cache.
For example, consider this code:
local oldValue = nil
while wait(1) do
    local value = data:GetAsync("TestKey")
    if oldValue ~= value then
        print("value changed!")
        oldValue = value
        -- (other non-yielding code here)
    end
end
As we know, the cache cooldown should prevent remote calls for GetAsync requests for 5 seconds, as that is the cache cooldown. You might expect this code to perform 1 remote call and thus get the remote value every 5 iterations of the while loop for that reason. You might expect that if another server would change the key, then the if-statement would be entered whenever that new value is observed here as soon as possible…
… However, what actually happens is that every time when you call GetAsync, whether remote or not, the cache time is set again. That means every 1 second, the key is set as “please don’t fetch a new value for this key for the next 5 seconds”.
The result is that this piece of code only ever consumes one single get request from the budget. After that, every single loop iteration will get the old, cached value, and the read cache will never be busted because the loop interval is less than 5 seconds. If you set the value of the key from another server, the if-statement is never entered.
Side-note: if you write the key on the same server where you are running this code, the new value will actually be returned by the GetAsync call in the code, because that is now the new cached value for that key on that server. Updates from other servers however would not be received.
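The difference between the expected and the observed behaviour can be reproduced with a small simulation. This is only a model of the rules as I described them, not Roblox's code; `countRemoteCalls` and `refreshOnHit` are made-up names.

```lua
local CACHE_COOLDOWN = 5 -- seconds

-- Counts the remote calls made when polling one key at the given times.
-- refreshOnHit = true models the behaviour described above, where a cache
-- hit also resets the cache time; false models the behaviour you would expect.
local function countRemoteCalls(pollTimes, refreshOnHit)
    local cachedAt = nil
    local remoteCalls = 0
    for _, now in ipairs(pollTimes) do
        local hit = cachedAt ~= nil and (now - cachedAt) < CACHE_COOLDOWN
        if hit then
            if refreshOnHit then
                cachedAt = now -- the cache time is pushed forward again
            end
        else
            remoteCalls = remoteCalls + 1
            cachedAt = now
        end
    end
    return remoteCalls
end

-- Poll once per second for 20 seconds, like the loop above.
local everySecond = {}
for t = 1, 20 do
    everySecond[#everySecond + 1] = t
end

print(countRemoteCalls(everySecond, true))  -- 1
print(countRemoteCalls(everySecond, false)) -- 4
```

In the model, polling at intervals longer than the cooldown (for example every 6 seconds) results in a remote call every time, regardless of the refresh behaviour.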
Implications of broken caching on game logic
This deficiency in caching has some serious caveats that you should take into account in your code. You should never attempt to read a key repeatedly at intervals shorter than 5 seconds in your game, because if you do, you will not be able to receive any new values on that same key.
Normally, you should not run into issues with caching, because typically games do not call GetAsync on a key very repetitively in a loop or call it often enough across multiple threads to cause a never-ending cache refresh.
Make sure to also not call any other methods in a loop on a key that set the cache time (see a previous subsection in this section for a list of such methods).
Note on concurrently dispatched requests
The cache time is only set once GetAsync, IncrementAsync or UpdateAsync returns from the call, not while the call is still being performed.
Consider the following code:
for i = 1, 100 do
    spawn(function() -- fire off 100 threads at once
        local value = data:GetAsync("TestKey")
    end)
end
While this code performs 100 get requests on the same key, it will not trigger the read cache because each call is made before the first of the bunch finishes (they are dispatched at the same time). Therefore, this code actually performs 100 remote calls and consumes 100 GetAsync budget, rather than one remote call and 99 cache hits. Unfortunately caching is not implemented very cleverly.
7. Notes on UpdateAsync
This section discusses some quirky properties of UpdateAsync and the “UpdateAsync” budgeting, both for Global and Ordered Datastores.
UpdateAsync will only consume GetAsync budget if not fetched before by any other request, but will still fetch new values every time if available
As discussed before, UpdateAsync will get the value of the key if it has never been obtained before on that server. In such cases, UpdateAsync will consume both a get and a set request from the respective budgets.
You might expect UpdateAsync to respect the cache cooldown such as discussed in Notes on Caching and perform another get+set request when you perform another UpdateAsync 5+ seconds later.
However, this is not the case. UpdateAsync will only ever do a get+set request for the first time that a key is touched. Afterwards, UpdateAsync will only ever consume a single set request on that key. This is a bit strange as it allows you to get an updated value of a key without consuming a get request but rather consuming a set request, similar to IncrementAsync.
UpdateAsync budget is fake
As described in Budget Consumption, the budget of UpdateAsync is not an actually tracked value, but is simply the minimum of the GetAsync and SetIncrementAsync budgets. This is to ensure that, in case an UpdateAsync needs to consume both budgets, it is possible to execute this request if the “UpdateAsync” budget > 0.
Watch out with UpdateAsync budget and OrderedDataStore::UpdateAsync!
OrderedDataStore::UpdateAsync uses SetIncrementSortedAsync budget, rather than using SetIncrementAsync budget like GlobalDataStore::UpdateAsync does. However, the “UpdateAsync” budget is still defined as the minimum of the GetAsync and SetIncrementAsync budgets. Therefore, if SetIncrementSortedAsync is 0 and the other two are above 0, the “UpdateAsync” budget will be above 0 too, despite an OrderedDataStore::UpdateAsync not being possible.
Therefore, to check whether OrderedDataStore::UpdateAsync is possible at any given time, replace:
if DataStoreService:GetRequestBudgetForRequestType(Enum.DataStoreRequestType.UpdateAsync) > 0 then
    -- do OrderedDataStore::UpdateAsync, wrong! there might be no SetIncrementSortedAsync budget...
end
With:
if DataStoreService:GetRequestBudgetForRequestType(Enum.DataStoreRequestType.GetAsync) > 0
    and DataStoreService:GetRequestBudgetForRequestType(Enum.DataStoreRequestType.SetIncrementSortedAsync) > 0 then
    -- do OrderedDataStore::UpdateAsync, correct!
end
For UpdateAsync on GlobalDataStores, it is of course enough to check that the “UpdateAsync” budget > 0.
8. Notes on OnUpdate
This section discusses issues with the OnUpdate API.
OnUpdate core functionality is broken
While OnUpdate requests will complete successfully if there is enough budget available, the update connection is actually not triggered unless the key was either updated on the same server, or a get request is performed on a key that was updated. This significantly reduces the usability of OnUpdate.
This has been filed here as a bug report:
https://devforum.roblox.com/t/onupdate-not-being-fired-cross-server-until-getasync-used/126316
OnUpdate budgeting does not function correctly in Studio
In the command bar and in plugins, the Datastore budget of OnUpdate calls is initialized to 15, but unlike other budget types it is never incremented over time, even if the budget is reduced by making requests. This may be intentional or may be a bug.
Moreover, OnUpdate calls will never throttle or throw errors in Studio, even if the budget reaches 0 and it should technically be impossible to connect more events. This seems to be a bug.
OnUpdate only consumes budget when you connect it, not when triggered
OnUpdate does not consume any budget relative to the amount of times that the key it is connected to updates. It only consumes 1 OnUpdate request at the moment you connect to that key.
Disconnecting an active OnUpdate connection has no effect on budget. The Developer Hub suggests that you should disconnect these connections when they are not used anymore, but there is no observed difference in budgets/throttling/overall functionality of Datastores between keeping a bunch of connections alive or not.
I am not sure whether OnUpdate only consuming budgets when connecting is a bug. For now, I would consider it a “gotcha” rather than a “bug”.
9. Conclusion
I hope that this article clears up some misconceptions and gives advanced developers more details on how Datastores work internally. If you have any specific questions that this hasn’t answered or if you’re unsure about anything, feel free to drop a reply below.
I also hope this shows how Datastores are somewhat broken in several ways for any engineers that happen to be reading this, particularly in regards to budgeting and caching old values. Datastores also fail to provide descriptive error messages most of the time. I intend to make follow-ups for every documentation/engineering issue I discovered with Datastores at the time of writing.
Thanks for reading!