Details on DataStoreService for Advanced Developers


#1

Table of Contents

  1. Introduction
  2. Request Flow
  3. Overview of Limits & Budgeting
  4. Budget Consumption
  5. Budget Limits through Command Bar / Plugins
  6. Notes on Caching
  7. Notes on UpdateAsync
  8. Notes on OnUpdate
  9. Conclusion

1. Introduction

This is a complete, comprehensive overview of all sorts of implementation details and gotchas regarding Roblox’s datastores from the perspective of the DataStoreService API that are not properly described on the Developer Hub at the time of writing.

The reason for writing is that I recently wrote the MockDataStoreService module, which is a near-perfect emulation of datastores in Roblox-Lua. While making this system, I found a ton of weird subtleties and issues with datastores that I haven’t seen properly explained/documented anywhere, which may be useful to know about for certain edge cases or if you are pushing the boundaries with the Datastore limits.

Moreover, some of the information on Datastores on the Developer Hub is clearly wrongly documented. I will file some documentation requests after posting this thread in an attempt to fix that.



2. Request Flow

This section discusses in detail how Datastore requests, request throttling, and request throwing (= erroring) work.

Flowchart for Datastore requests

The main path of a successful or failing Datastore request is described below.

First of all, the input parameters of the Datastore request are checked. If these are invalid, the request fails immediately without yielding and without consuming any Datastore budget.

Secondly, read caching constraints are checked. If the request is a get request, and the key was recently touched within the cache cooldown (see Overview of Limits & Budgeting), the request immediately returns the recently obtained value, without yielding or consuming budget.

Thirdly, throttling checks are performed. If budget for the request is not available, the request will be throttled until budget is available. Similarly, if the request is a write request (set/remove/increment/update), and the key was recently written to within the write cooldown (see Overview of Limits & Budgeting), the request is also throttled until the write cooldown has worn off.

If there is budget available and no write cooldown is violated (anymore), appropriate budget is consumed and the remote call is performed, and the thread will yield until a response is received from the external data layer. (see the next subsection for more details on throttling)

If the remote call failed due to whatever reason there may be (service unavailable, service congested, corrupted data, etc), the request will throw an error.

If the remote call was successful, then depending on what kind of call was used, the read cache for that key will be updated (see Notes on Caching), and finally the result of the request will be returned.
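The checks above can be sketched as a small decision function. This is only a hypothetical model of the flow as described in this section, not Roblox's actual implementation; the names `nextStep`, `request` and `state` are made up for illustration, and the cooldown values are the ones reported in this article.

```lua
-- Hypothetical model of the request flow; not Roblox's real code.
local CACHE_COOLDOWN = 5 -- seconds, as measured in this article
local WRITE_COOLDOWN = 6 -- seconds

-- request: { kind = "get" or "write", inputValid = boolean }
-- state:   { budget = number,
--            sinceCacheSet = seconds since the read cache was set (or nil),
--            sinceLastWrite = seconds since the key was last written (or nil) }
local function nextStep(request, state)
    if not request.inputValid then
        return "throw" -- fails immediately: no yield, no budget consumed
    end
    if request.kind == "get" and state.sinceCacheSet
        and state.sinceCacheSet < CACHE_COOLDOWN then
        return "cached" -- returns the cached value: no yield, no budget consumed
    end
    if state.budget < 1 then
        return "throttle" -- wait until budget is available
    end
    if request.kind == "write" and state.sinceLastWrite
        and state.sinceLastWrite < WRITE_COOLDOWN then
        return "throttle" -- wait until the write cooldown has worn off
    end
    return "remote" -- consume budget and perform the remote call
end
```

Note that the cache check comes before the budget check, which is why a cached get succeeds even with 0 budget.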

Request throttling

When a request does not pass the throttling checks, it cannot be completed right away and is “throttled”. This means the request will yield for longer than usual, until the right conditions are met.

A request can be throttled for the following reasons:

  • There is no budget to complete the request – a warning will be thrown specifying this, and the request yields until budget is sufficient and no other throttling checks are violated.

  • The write cooldown is violated (see Overview of Limits & Budgeting) – a warning will be thrown specifying this, and the request yields until the write cooldown is no longer violated, and no other throttling checks are violated.

Throttling queues

When requests are throttled as described above, they are placed in a so-called throttling queue. The throttling queue is a backlog of requests that are yielded and should be performed as soon as they can be.

Throttling queues operate on the Leaky Bucket Principle: throttled requests will flow out at some rate defined by the throttling checks (i.e. budgeting and write cooldown), and the bucket has a limited size. Once the size of the bucket has been reached, no more throttled requests can be added.

Every actual budget type (GetAsync, SetIncrementAsync, GetSortedAsync, OnUpdateAsync, SetIncrementSortedAsync) has its own throttling queue. Each of these five throttling queues has a queue size of 30 throttled requests max. Throttled requests are added to the queue of the corresponding budget type that it consumes.

Requests that are currently being executed do not take up space in the throttling queue. Only requests that are still waiting for budget or for the write cooldown to pass take up space in the queues. Therefore, if you have 100 GetAsync budget and you send out 100 GetAsync requests simultaneously, this will not overflow any throttling queue: there is budget for each request and no write cooldown is violated, so all of the requests execute immediately and never enter the queue.

Request throwing due to throttling

If a throttled queue is full, and another throttled request is attempted to be added, that request will throw an error immediately instead and is thus not executed. The error will indicate that the request was discarded due to the throttle queue being full.
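To make the "full queue throws" behavior concrete, here is a minimal sketch of a bounded backlog in plain Lua. It is a toy model under the assumptions stated in this section (30 slots per budget type); `newThrottleQueue` and `enqueue` are hypothetical names, not part of the Roblox API.

```lua
local MAX_QUEUE_SIZE = 30 -- per budget type, as described above

local function newThrottleQueue()
    return { pending = {} }
end

-- Adds a throttled request to the backlog, or raises an error if the
-- backlog already holds MAX_QUEUE_SIZE requests.
local function enqueue(queue, request)
    if #queue.pending >= MAX_QUEUE_SIZE then
        error("request discarded: throttle queue is full", 2)
    end
    table.insert(queue.pending, request)
end
```

In a real game this means a burst of more than 30 throttled requests of the same budget type will start erroring, so calls should be wrapped in pcall.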

Request throwing due to other reasons

Apart from errors related to throttling, Datastores can also throw errors for invalid input (i.e. storing an invalid value, invalid key names, invalid scopes, invalid Datastore names, invalid parameter types, etc). These kinds of errors will not consume the corresponding budget of the request.

Datastores may also cause errors if the remote server determines that the request is malformed. For example, if you are trying to call GetSortedAsync with a minimum value that is higher than the maximum value, the remote server will reject the request for invalid input. However, as it is a remote call, Datastore budget will be consumed for such erroneous requests. Another example is if you are trying to call IncrementAsync on a non-integer key; this will need to perform a remote call before it realizes that the request is erroneous, and therefore will also consume budget but will still throw an error.

Errors can also be thrown due to service failures or malformed data. These requests will also consume budget as they ultimately attempt a remote call before erroring.

This hub article gives a decent overview of some of the errors that the API can throw:
https://www.robloxdev.com/articles/Datastore-Errors
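Since any request can throw, it is common to wrap calls in pcall and retry a few times. Below is a rough sketch of such a wrapper; `withRetries` is a hypothetical helper, the backoff schedule is an arbitrary choice, and the wait function is passed in as a parameter so the snippet does not depend on any Roblox global.

```lua
-- Calls `request` (a function performing one datastore call) up to
-- `maxAttempts` times. `waitFn(seconds)` is injected by the caller.
-- Returns true and the result on success, or false and the last error.
local function withRetries(request, maxAttempts, waitFn)
    local lastError
    for attempt = 1, maxAttempts do
        local ok, result = pcall(request)
        if ok then
            return true, result
        end
        lastError = result
        if attempt < maxAttempts then
            waitFn(2 ^ attempt) -- simple exponential backoff: 2, 4, 8... seconds
        end
    end
    return false, lastError
end

-- Possible usage inside Roblox (assuming `store` and `key` exist):
-- local ok, value = withRetries(function() return store:GetAsync(key) end, 3, wait)
```

Keep in mind that, as described above, failed remote calls still consume budget, so retrying aggressively can itself push you into throttling.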

Throttled requests may (and probably will) be processed out of order

It is called a throttle queue, but it does not actually adhere to queuing principles like first-in-first-out (FIFO) or last-in-first-out (LIFO). Requests may be resumed from the throttling queue in any order: no order is guaranteed, and they are commonly resumed out of order. So just because throttled request A was called earlier than throttled request B doesn’t mean A is resumed before B.

I don’t expect this to be an issue in practice for most applications, because you shouldn’t be relying on the order of these requests being LIFO or FIFO.



3. Overview of Limits & Budgeting

This section gives an overview of all limits regarding Datastores, and describes how exactly budgeting is initialized and updated.

Size limits

These are the size limits for various fields.

| Property | Size limit (characters) |
| --- | --- |
| Name | 50 |
| Scope | 50 |
| Key | 49 |
| Data* | 262143 |

*) Note on data size: it is recommended to still check for <= 260,000 on the data size, because the little bit of additional space is probably to account for overhead for storing the data, which could be changed in the future.

The key limit is 49 characters rather than the 50 that the documentation suggests.

If the data on a key is a boolean or number, it can always be stored. You cannot store nil on a key using SetAsync or UpdateAsync; this will throw an error. Use RemoveAsync to set the data on a specific key to nil.

If the data is a table, you should check if the JSON-encoded representation of that table does not exceed the data limit (but preferably still use 260,000).

Notes regarding string length and string characters:

  • Checking string length is a bit more involved than checking tables. Strings can contain non-regular characters that are escaped as \uXXXX. This means the stored size of your string can be much larger than the string length that Lua observes (i.e. a non-regular character can take up 6 characters of space).
  • Datastores cannot accept string characters with a byte value over 127 unless they are part of a valid UTF-8 character. Most developers will not run into this, but it’s good to know.
  • Datastores correctly handle \0 characters, unlike most of the Lua API.
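As a rough sketch, a pre-flight size check could look like the following. `checkSizes` is a hypothetical helper; it expects the already JSON-encoded text of the value (in Roblox this could be obtained with HttpService:JSONEncode), and it uses the conservative 260,000 limit recommended above. Note that `#` counts bytes, which I am assuming is close enough for a sanity check.

```lua
local SAFE_DATA_LIMIT = 260000 -- conservative limit (hard limit: 262143)
local MAX_KEY_LENGTH = 49

-- key:     the key name as a string
-- encoded: JSON text of the value to store
-- Returns true, or false plus a short reason.
local function checkSizes(key, encoded)
    if #key > MAX_KEY_LENGTH then
        return false, "key is longer than " .. MAX_KEY_LENGTH
    end
    if #encoded > SAFE_DATA_LIMIT then
        return false, "encoded data is too large: " .. #encoded
    end
    return true
end
```

Checking the encoded text rather than the raw string means escaped characters are counted at their stored size.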

Thanks @Anaminus for contributing to this section!

Cache cooldown

The cache cooldown is not the same as the Developer Hub reports. The Developer Hub reports 4 seconds at the time of writing, but it is actually 5 seconds. The cache cooldown affects how GetAsync works.

However, the Developer Hub claims that if you repeatedly try to GetAsync a key, you should receive a new value every 5 seconds due to this mechanism. This is actually wrong, because caching is fundamentally broken in Datastores. Refer to Notes on Caching.

Write cooldown

The write cooldown is the same as the Developer Hub reports, 6 seconds. The write cooldown affects how SetAsync, IncrementAsync, UpdateAsync, and RemoveAsync work.

If you try to write (set/increment/update/delete) a key multiple times within the write cooldown time, then those additional requests will be throttled until they can be executed without violating the write cooldown. If too many requests of that type are currently throttled, it is possible for the request to throw an error instead. Refer to Request Flow.
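If you want to avoid hitting the write cooldown in the first place, you could track your own writes per key. The sketch below is one hypothetical way to do that in plain Lua; `recordWrite` and `secondsUntilWritable` are made-up names, and `now` is any monotonically increasing time in seconds supplied by the caller. It only knows about writes made through it on this server.

```lua
local WRITE_COOLDOWN = 6 -- seconds, as reported above

local lastWrite = {} -- key -> time (in seconds) of the last recorded write

-- Remember that `key` was written at time `now`.
local function recordWrite(key, now)
    lastWrite[key] = now
end

-- How long to wait before writing `key` again without being throttled
-- by the write cooldown (0 if it can be written right away).
local function secondsUntilWritable(key, now)
    local last = lastWrite[key]
    if not last then
        return 0
    end
    return math.max(0, WRITE_COOLDOWN - (now - last))
end
```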

Datastore budgeting table

The following table gives an overview of how the budgets are initialized, incremented on a base rate and a per-player rate per minute, and what the maximum backlogged budget is for each type.

| Budget Type | Start | Base rate | Per-player rate | Max with N players (N >= 0) |
| --- | --- | --- | --- | --- |
| GetAsync | 100 | 60 | 10 | 3 * (60 + N * 10) |
| SetIncrementAsync | 100 | 60 | 10 | 3 * (60 + N * 10) |
| GetSortedAsync | 10 | 5 | 2 | 3 * (5 + N * 2) |
| SetIncrementSortedAsync | 100 | 30 | 5 | 3 * (30 + N * 5) |
| OnUpdate | 30 | 30 | 5 | 1 * (30 + N * 5) |
| “UpdateAsync” | 100 | 60 | 10 | 3 * (60 + N * 10) |

The following subsection discusses how exactly these values are used.

Initial budgets

Once DataStoreService starts existing (i.e. is obtained through GetService), the budgeting of DataStoreService is kickstarted. At this point, all budget types will return the Start value.

Budgets do not start updating over time until you do the GetService call.

Incrementing budgets over time

Budgets are not updated once every minute, but rather smoothly increased over time. For example, if the rate per minute is 60, you would get approximately +1 request per second on the budget, rather than +60 at once every minute.

Maximum backlogged budgets

If you don’t use up your Datastore budget, you will develop a backlog of requests that you can use at a later point. This is displayed in the last column. For example, if you have 0 players in a game, you can accumulate up to 3 * 60 = 180 GetAsync budget. Similarly, with 5 players online in a game, you can accumulate up to 3 * (5 + 5*2) = 45 GetSortedAsync budget.

The common formula is that it is a factor of 3 times the total per-minute increment rate at that given moment (depending on base rate, per-player rate, and number of players at that moment). In other words, you reach the maximum backlogged budget if you don’t do any Datastore calls for 3 minutes.

In the case that you are at the maximum backlogged budget, and a player leaves, the budget is cut down to the maximum allowed backlogged budget with the new amount of players. For example, if the GetAsync budget is maxed out at 270 with 3 players, and one player leaves, the budget is instantly decreased to 240 to match the new player count.
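The arithmetic in this subsection can be written down as two small functions. This is just a sketch of the formulas as I described them, not how Roblox computes budgets internally; `maxBudget` and `accrue` are hypothetical names.

```lua
-- Maximum backlogged budget: factor * (base rate + players * per-player rate).
-- The factor is 3 for most budget types (1 for OnUpdate, per the table).
local function maxBudget(baseRate, perPlayerRate, players, factor)
    return (factor or 3) * (baseRate + players * perPlayerRate)
end

-- Budget after `dt` seconds of smooth growth, clamped to the current maximum.
-- Passing dt = 0 shows the instant cut-down when the player count drops.
local function accrue(budget, baseRate, perPlayerRate, players, dt)
    local ratePerMinute = baseRate + players * perPlayerRate
    local grown = budget + ratePerMinute * dt / 60
    return math.min(grown, maxBudget(baseRate, perPlayerRate, players))
end
```

For example, `maxBudget(60, 10, 3)` gives the 270 GetAsync budget mentioned above, and `accrue(270, 60, 10, 2, 0)` gives the 240 you are left with after one of the three players leaves.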

Extra budgets when the server closes (OnClose)

When the server closes, you may use more Datastore requests in that time period than you would otherwise to store, for example, remaining player data. To accommodate for this, Roblox moves your budget up to a lower bound if it is below that value currently. These are the bounds:

| Budget Type | Base rate | OnClose Minimum Bound |
| --- | --- | --- |
| GetAsync | 60 | 2.5 * 60 = 150 |
| SetIncrementAsync | 60 | 2.5 * 60 = 150 |
| GetSortedAsync | 5 | 2.5 * 5 = 12.5, rounded down to 12 |
| SetIncrementSortedAsync | 30 | 2.5 * 30 = 75 |
| OnUpdate* | 30 | 2.5 * 30 = 75 |
| “UpdateAsync” | 60 | 2.5 * 60 = 150 |

For example, if your GetAsync budget is currently 23, it will be bumped up to 150 before OnClose calls run. If your GetSortedAsync budget is currently 15, it won’t be bumped down to 12, since 15 is already higher than the boundary.

The general formula is that the minimum bound is 2.5 times the normal rate-per-minute without players, rounded down to the nearest integer.

*) Note that the OnUpdate bound is technically useless, since you wouldn’t be connecting new OnUpdate connections in OnClose anyway.
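As a quick sketch of the OnClose rule (again just the formula from this subsection, with a hypothetical function name):

```lua
-- Budget after the OnClose bump: raised to floor(2.5 * base rate) if it is
-- currently lower, left unchanged otherwise.
local function onCloseBudget(current, baseRate)
    local bound = math.floor(2.5 * baseRate)
    return math.max(current, bound)
end
```

So `onCloseBudget(23, 60)` is 150, while `onCloseBudget(15, 5)` stays at 15 because it is already above the bound of 12.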



4. Budget Consumption

This section describes how each call to the API reduces the Datastore budgets as retrieved by GetRequestBudgetForRequestType.

The Developer Hub reports the following table for which Datastore requests use which budgets:
https://www.robloxdev.com/api-reference/enum/DataStoreRequestType

However, at the time of writing, this table is absolutely wrong. The table claims, for example, that GlobalDataStore::IncrementAsync uses SetIncrementSortedAsync budget, and that SetIncrementAsync budget is only used by the SetAsync and IncrementAsync calls.

The following subsections provide the correct budget consumption tables for all Datastore requests.


Consumption table for GlobalDataStores (all non-OrderedDataStores)

| Method | GetAsync | SetIncrementAsync | GetSortedAsync | SetIncrementSortedAsync | OnUpdate | UpdateAsync |
| --- | --- | --- | --- | --- | --- | --- |
| GetAsync | 1 | - | - | - | - | N/A |
| SetAsync | - | 1 | - | - | - | N/A |
| IncrementAsync | - | 1 | - | - | - | N/A |
| UpdateAsync | 0/1* | 1 | - | - | - | N/A |
| RemoveAsync | - | 1 | - | - | - | N/A |
| OnUpdate | - | - | - | - | 1** | N/A |

*) UpdateAsync uses 1 GetAsync budget (in addition to 1 SetIncrementAsync budget) when the specified key has never been obtained by the server before.
**) OnUpdate only consumes budget when first connecting it; triggering the connection any number of times afterwards consumes no budget whatsoever.


Consumption table for OrderedDataStores

| Method | GetAsync | SetIncrementAsync | GetSortedAsync | SetIncrementSortedAsync | OnUpdate | UpdateAsync |
| --- | --- | --- | --- | --- | --- | --- |
| GetAsync | 1 | - | - | - | - | N/A |
| SetAsync | - | - | - | 1 | - | N/A |
| IncrementAsync | - | - | - | 1 | - | N/A |
| UpdateAsync | 0/1* | - | - | 1 | - | N/A |
| RemoveAsync | - | - | - | 1 | - | N/A |
| GetSortedAsync | - | - | 1 | - | - | N/A |
| OnUpdate | - | - | - | - | 1** | N/A |

*) UpdateAsync uses 1 GetAsync budget (in addition to 1 SetIncrementSortedAsync budget) when the specified key has never been obtained by the server before.
**) OnUpdate only consumes budget when first connecting it; triggering the connection any number of times afterwards consumes no budget whatsoever.


Consumption table for DataStorePages:

| Method | GetAsync | SetIncrementAsync | GetSortedAsync | SetIncrementSortedAsync | OnUpdate | UpdateAsync |
| --- | --- | --- | --- | --- | --- | --- |
| AdvanceToNextPageAsync | - | - | 1 | - | - | N/A |

Why “N/A” for UpdateAsync?

There is actually no “UpdateAsync” request budget, it is a fake value.

Instead, what you get back when you request this budget, is just the minimum of the GetAsync and SetIncrementAsync budgets. There is no separate counter for an “UpdateAsync” budget internally, it is always returned as the minimum of those two budgets.

Why is that the case?

An UpdateAsync call for GlobalDataStores may need to consume a unit from both GetAsync and SetIncrementAsync budget, if the key has not been recently fetched through another GetAsync/UpdateAsync/IncrementAsync call on that same key.

Therefore, to ensure that developers can always perform an UpdateAsync call when the “UpdateAsync” budget is > 0, this “UpdateAsync” budget must be the minimum of the GetAsync/SetIncrementAsync budgets. If one of them is 0 while the other is above 0, that means the “UpdateAsync” budget will be 0, which is correct, since if an UpdateAsync needs to be performed that needs to consume from both budgets, this request would be throttled.



5. Budget Limits through Command Bar / Plugins

This section describes how budgeting differs when DataStoreService is used in Studio via plugins or the command bar.

As can be observed by comparing the table below to the tables in Overview of Limits & Budgeting, the starting budget and the rate per minute are the same as in live instances with 0 players (except for OnUpdate, which starts at 15 in Studio). However, the maximum backlogged budget is not 3 times the total increment rate, but rather 100 times. Presumably, this is intended so that there are fewer restrictions on using Datastores through the Studio command bar / plugins.

| Budget Type | Start | Rate | Max in Studio (N = 0) |
| --- | --- | --- | --- |
| GetAsync | 100 | 60 | 100 * (60 + 0 * 10) = 6000 |
| SetIncrementAsync | 100 | 60 | 100 * (60 + 0 * 10) = 6000 |
| GetSortedAsync | 10 | 5 | 100 * (5 + 0 * 2) = 500 |
| SetIncrementSortedAsync | 100 | 30 | 100 * (30 + 0 * 5) = 3000 |
| OnUpdate* | 15 | 30 | 100 * (30 + 0 * 5) = 3000 |
| “UpdateAsync” | 100 | 60 | 100 * (60 + 0 * 10) = 6000 |

*) See Notes on OnUpdate for an issue with budgeting due to which these budget values are not actually achieved for OnUpdate in Studio. The other values in this table are actually achieved.

Effectively, this means you need to wait 100 minutes without doing any Datastore requests to reach the maximum backlogged budgets in Studio.



6. Notes on Caching

This section describes how caching works in Datastores, how it is fundamentally broken at the time of writing, and how to avoid running into issues with it.

Cache cooldown revisited

As discussed in Request Flow, if a get request is performed on a key, and the key was recently touched within the cache cooldown (see Overview of Limits & Budgeting), then the get request will return instantly with the cached value of that key.

The cache cooldown currently is 5 seconds. So for example, if I perform a GetAsync request on “TestKey” right now, and again 2 seconds later, that second request will return instantly because the value for “TestKey” was cached.

Datastore requests that set the read cache

The following requests on a key will set the read cache for that key:

  • GetAsync
  • IncrementAsync
  • UpdateAsync

This means that if I perform an operation on a key with any of these methods, and then do a GetAsync on that key afterwards (within the cache cooldown, 5 seconds), the latter will return immediately without yielding at all and consumes no budget.

It makes sense that GetAsync sets the read cache, however, it does not make sense that IncrementAsync sets the read cache as that method never consumes GetAsync budget. Therefore, you could technically get a value with only 1 SetIncrementAsync budget using IncrementAsync followed immediately by a GetAsync (the latter would be free).

Moreover, while it does make sense that UpdateAsync sets the read cache if that key was never obtained before (since UpdateAsync will then also consume from the GetAsync budget), it does not make sense that UpdateAsync also sets the read cache when it does not consume from GetAsync budget. You can perform a similar trick here as with IncrementAsync (see previous paragraph).

It is unclear why it was decided that the IncrementAsync and UpdateAsync methods should set the read cache.

Datastore requests that don’t set the read cache

The other requests do not set the read cache for that key:

  • RemoveAsync
  • SetAsync
  • OnUpdate

Performing these methods on a key will not set the read cache for that key that GetAsync uses. In other words, performing a GetAsync request on that same key within 5 seconds will still cause the GetAsync to be a regular remote call that does consume budget and is subject to other throttling checks.

It makes sense that none of these methods set the read cache time for a key, as none of them are able to consume GetAsync budget.

Obviously, requests that don’t take a key, such as OrderedDataStore::GetSortedAsync and DataStorePages::AdvanceToNextPageAsync, do not set any read cache times for any key either.

What caching is meant to do

Caching is useful because it only allows you to fetch the new value every so often rather than every time. This saves on budgeting and unnecessary yielding due to excessive remote calls. This is especially useful for less experienced developers who would, perhaps, write code like this:

local money = data:GetAsync(playerKey).Money
local items = data:GetAsync(playerKey).Items
local pets = data:GetAsync(playerKey).Pets
-- (...)

Due to caching, only the first GetAsync here will perform a remote call and use budget. The other calls will not use budget and go through instantly with the cached value.

Obviously, you want it to be possible for the read cache to be busted, such that you are not stuck getting cached values all the time. You would expect that after 5 seconds, the cached value for that key would be timed out, and you can fetch new values again after that.

How Roblox implemented caching (wrong!)

The cache time is actually set every time that you touch a key with a method that sets the read cache.

For example, consider this code:

local oldValue = nil
while wait(1) do
    local value = data:GetAsync("TestKey")
    if oldValue ~= value then
        print("value changed!")
        oldValue = value
        -- (other non-yielding code here)
    end
end

As we know, the cache cooldown should prevent remote calls for GetAsync requests for 5 seconds, as that is the cache cooldown. You might expect this code to perform 1 remote call and thus get the remote value every 5 iterations of the while loop for that reason. You might expect that if another server would change the key, then the if-statement would be entered whenever that new value is observed here as soon as possible…

… However, what actually happens is that every time when you call GetAsync, whether remote or not, the cache time is set again. That means every 1 second, the key is set as “please don’t fetch a new value for this key for the next 5 seconds”.

The result is that this piece of code only ever consumes one single get request from the budget. After that, every single loop iteration will get the old, cached value, and the read cache will never be busted because the loop interval is less than 5 seconds. If you set the value of the key from another server, the if-statement is never entered for that change.

Side-note: if you write the key on the same server where you are running this code, the new value will actually be returned by the GetAsync call in the code, because that is now the new cached value for that key on that server. Updates from other servers however would not be received.
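To illustrate the effect, here is a tiny simulation in plain Lua. It is a hypothetical model of the behavior I observed (every read resets the cache timer), not Roblox's real code, and `countRemoteCalls` is a made-up name.

```lua
local CACHE_COOLDOWN = 5 -- seconds

-- Counts how many remote calls happen when one key is read at the given
-- times (in seconds), assuming every read, cached or not, resets the timer.
local function countRemoteCalls(pollTimes)
    local lastTouched = nil
    local remoteCalls = 0
    for _, t in ipairs(pollTimes) do
        if lastTouched == nil or t - lastTouched >= CACHE_COOLDOWN then
            remoteCalls = remoteCalls + 1 -- cache expired: real fetch
        end
        lastTouched = t -- the timer is reset on every read
    end
    return remoteCalls
end

-- Polling every second for 30 seconds vs. every 6 seconds for 30 seconds.
local everySecond, everySixSeconds = {}, {}
for t = 1, 30 do everySecond[#everySecond + 1] = t end
for t = 6, 30, 6 do everySixSeconds[#everySixSeconds + 1] = t end
```

Under this model, polling every second only ever performs the very first remote call, while polling every 6 seconds fetches a fresh value each time.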

Implications of broken caching on game logic

This flaw in caching has some serious implications that you should take into account in your code. You should never read a key repeatedly at intervals shorter than the 5-second cache cooldown, because if you do, you will not be able to receive any new values for that key.

Normally, you should not run into issues with caching, because typically games do not call GetAsync on a key very repetitively in a loop or call it often enough across multiple threads to cause a never-ending cache refresh.

Make sure to also not call any other methods in a loop on a key that set the cache time (see a previous subsection in this section for a list of such methods).

Note on concurrently dispatched requests

The cache time is only set once GetAsync, IncrementAsync or UpdateAsync returns from the call, not while the call is still being performed.

Consider the following code:

for i = 1, 100 do
   spawn(function() -- fire off 100 threads at once
      local value = data:GetAsync("TestKey")
   end)
end

While this code performs 100 get requests on the same key, it will not trigger the read cache because each call is made before the first of the bunch finishes (they are dispatched at the same time). Therefore, this code actually performs 100 remote calls and consumes 100 GetAsync budget, rather than one remote call and 99 cache hits. Unfortunately caching is not implemented very cleverly.



7. Notes on UpdateAsync

This section discusses some quirky properties of UpdateAsync and the “UpdateAsync” budgeting, both for Global and Ordered Datastores.

UpdateAsync will only consume GetAsync budget if not fetched before by any other request, but will still fetch new values every time if available

As discussed before, UpdateAsync will get the value of the key if it was never yet obtained before in that server. In such cases, UpdateAsync will consume both a get and set request from the respective budgets.

You might expect UpdateAsync to respect the cache cooldown such as discussed in Notes on Caching and perform another get+set request when you perform another UpdateAsync 5+ seconds later.

However, this is not the case. UpdateAsync will only ever do a get+set request for the first time that a key is touched. Afterwards, UpdateAsync will only ever consume a single set request on that key. This is a bit strange as it allows you to get an updated value of a key without consuming a get request but rather consuming a set request, similar to IncrementAsync.

UpdateAsync budget is fake

As described in Budget Consumption, the budget of UpdateAsync is not an actually tracked value, but is simply the minimum of the GetAsync and SetIncrementAsync budgets. This is to ensure that, in case an UpdateAsync needs to consume both budgets, it is possible to execute this request if the “UpdateAsync” budget > 0.

Watch out with UpdateAsync budget and OrderedDataStore::UpdateAsync!

OrderedDataStore::UpdateAsync uses SetIncrementSortedAsync budget, rather than using SetIncrementAsync budget like GlobalDataStore::UpdateAsync does. However, the “UpdateAsync” budget is still defined as the minimum of the GetAsync and SetIncrementAsync budgets. Therefore, if SetIncrementSortedAsync is 0 and the other two are above 0, the “UpdateAsync” budget will be above 0 too, despite a OrderedDataStore::UpdateAsync not being possible.

Therefore, to check whether OrderedDataStore::UpdateAsync is possible at any given time, replace:

if DataStoreService:GetRequestBudgetForRequestType(Enum.DataStoreRequestType.UpdateAsync) > 0 then
   -- do OrderedDataStore::UpdateAsync, wrong! there might be no SetIncrementSortedAsync budget...
end

With:

if DataStoreService:GetRequestBudgetForRequestType(Enum.DataStoreRequestType.GetAsync) > 0
and DataStoreService:GetRequestBudgetForRequestType(Enum.DataStoreRequestType.SetIncrementSortedAsync) > 0 then
   -- do OrderedDataStore::UpdateAsync, correct!
end

For UpdateAsync on GlobalDataStores, it is of course enough to check that the “UpdateAsync” budget > 0.
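Both cases can be folded into one check. The sketch below is a hypothetical helper in plain Lua: `budgets` is assumed to be a table mapping budget type names to their current values, which in Roblox you could fill using DataStoreService:GetRequestBudgetForRequestType.

```lua
-- Returns true if an UpdateAsync call could run right now without being
-- throttled for lack of budget, for either kind of datastore.
local function canUpdateAsync(budgets, isOrdered)
    -- Ordered datastores write against SetIncrementSortedAsync budget,
    -- regular ones against SetIncrementAsync budget.
    local writeType = isOrdered and "SetIncrementSortedAsync" or "SetIncrementAsync"
    return budgets.GetAsync > 0 and budgets[writeType] > 0
end
```

This is deliberately conservative: it always requires GetAsync budget, even though UpdateAsync only consumes it the first time a key is touched.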



8. Notes on OnUpdate

This section discusses issues with the OnUpdate API.

OnUpdate core functionality is broken

While OnUpdate requests will complete successfully if there is enough budget available, the update connection is actually not triggered unless the key was either updated on the same server, or a get request is performed on a key that was updated. This significantly reduces the usability of OnUpdate.

This has been filed here as a bug report:
https://devforum.roblox.com/t/onupdate-not-being-fired-cross-server-until-getasync-used/126316

OnUpdate budgeting does not function correctly in Studio

In the command bar and in plugins, the Datastore budget of OnUpdate calls is initialized to 15, but unlike other budget types it is never incremented over time, even if the budget is reduced by making requests. This may be intentional or may be a bug.

Moreover, OnUpdate calls will never throttle or throw errors in Studio, even if the budget reaches 0 and it should technically be impossible to connect more events. This seems to be a bug.

OnUpdate only consumes budget when you connect it, not when triggered

OnUpdate does not consume any budget relative to the number of times that the key it is connected to updates. It only consumes 1 OnUpdate request at the moment you connect to that key.

Disconnecting an active OnUpdate connection has no effect on budget. The Developer Hub suggests that you should disconnect these connections when they are not used anymore, but there is no observed difference in budgets/throttling/overall functionality of Datastores between keeping a bunch of connections alive or not.

I am not sure whether OnUpdate only consuming budgets when connecting is a bug. For now, I would consider it a “gotcha” rather than a “bug”.



9. Conclusion

I hope that this article clears up some misconceptions and gives advanced developers more details on how Datastores work internally. If you have any specific questions that this hasn’t answered or if you’re unsure about anything, feel free to drop a reply below.

I also hope this shows how Datastores are somewhat broken in several ways for any engineers that happen to be reading this, particularly in regards to budgeting and caching old values. Datastores also fail to provide descriptive error messages most of the time. I intend to make follow-ups for every documentation/engineering issue I discovered with Datastores at the time of writing.

Thanks for reading!



#2

That is very problematic. So just to clarify:

for i = 1,3 do
   dataStore:SetAsync("Test", i)
end

In that code, you would expect to have 3 as the final value saved. But if all those SetAsync requests get throttled, then it could actually be 1, 2, or 3?


#3

That’s not quite what that section you quote is about, because these requests you are showing here are not throttled requests and they are not called from separate threads. Let me explain below what I meant with that quoted section.

Suppose you have 0 SetIncrementAsync budget at the moment, and then do this:

for i = 1,3 do
   spawn(function()
      print("start", i)
      dataStore:SetAsync("TestKey"..i, i)
      print("end", i)
   end)
end

This will lead to all three requests to be throttled, because there is no budget. The output could look like this:

start 1
start 2
start 3
(... insert 3 warnings about throttling here ...)
end 2                   -- 1 second later
end 3                   -- 1 second later
end 1                   -- 1 second later

(the requests leave the queue out of order, not 1-2-3)

In other words, once requests enter the throttling queue (i.e. no budget, or violating write cooldown of 6 seconds), those threads may be resumed in random order not respective to the order in which they entered the queue.

I don’t even know if this could ever be an issue practically. Just including it in the text for completeness. :slight_smile:


#4

I modified that subsection a bit to make sure no one else misunderstands it the same way. It’s not a critical issue, just a very specific implementation detail I wanted to note.


#5

Cool, thanks!


#6

I really wish I had had this information about 2 years ago, because DataStore was extremely frustrating to work with when I would rather have spent my time on the game than on saving mechanics and debugging.


#7

Data stores have always been a tricky subject for me. Thank you for devoting the time to arrange this information! I’ll definitely make time to read through it soon!


#8

Great tutorial, please post more content like this in the future, it’s pretty awesome.


#9

And this is why I would rather store my data on my custom database implementation.

Thank you for writing this wonderful guide! I’m quite saddened by both the fact that this behavior was not previously documented by Roblox engineers and by the fact that it has yet to be corrected.

Most developers rely fully on DataStoreService to store all of their game-related data. I cannot fathom the difficulties faced by a beginner developer who is attempting to fulfill the seemingly simple task of storing player data. As such, it’s no surprise that data-loss errors are so common in the developer community.


#10

Edited the figure: I was informed Roblox uses Amazon DynamoDB instead of what I had written there first.


#11

Excellent research. I hope we migrate this information into the developer hub.


#12

Sweet, this really helped me understand the DataStoreService much better. I have done Datastores before but they were really basic; this is a really good tutorial.


#13

okey dokey


The actual size limit for values appears to be 2^18-1, but only after the value has been converted to JSON.

-- JSON-encoded string is enclosed in double-quotes, so 2 characters must be subtracted.
print(pcall(function() game:GetService("DataStoreService"):GetDataStore("test"):SetAsync("test", string.rep("A",2^18-3)) end))
--> true
print(pcall(function() game:GetService("DataStoreService"):GetDataStore("test"):SetAsync("test", string.rep("A",2^18-2)) end))
--> false 105: Serialized value converted byte size exceeds max size 64*1024 bytes.

The error message is also off by a factor of 4: it reports 64*1024 (2^16) bytes, while the observed limit is about 2^18.


The key limit is apparently 49 rather than 50:

print(pcall(function() game:GetService("DataStoreService"):GetDataStore("test"):GetAsync(string.rep("A",50)) end))
--> false 102: Key name exceeds the 50 character limit.
print(pcall(function() game:GetService("DataStoreService"):GetDataStore("test"):GetAsync(string.rep("A",49)) end))
--> true

There is certainly a lonely < searching for its missing = friend.
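Until that is fixed, a pre-check along these lines avoids hitting error 102 on a 50-character key. This is only a sketch based on the two tests above: the constant and function names are made up for illustration, and the value 49 is the observed behaviour, not a documented limit.

```lua
-- Observed above: 49 characters is accepted, 50 is rejected,
-- even though the error message talks about a 50 character limit.
local MAX_ACCEPTED_KEY_LENGTH = 49

-- Hypothetical helper: returns true if the key is short enough to be
-- accepted according to the observations above. It checks length only.
local function isKeyLengthAccepted(key)
    return #key <= MAX_ACCEPTED_KEY_LENGTH
end
```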


Getting an accurate size can be a lot more involved than it seems. All non-regular characters appear to be escaped as \uXXXX, which can inflate the length of a character to 6 times its original size in some cases. Verifying string lengths isn’t helped by JSONEncode only accepting tables, either.
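For a rough pre-check of a single string value, something like the following can give an upper bound. It is a sketch built on guesses, not the real encoder: it assumes printable ASCII is stored as-is, that double quotes and backslashes take 2 characters each, and that every other byte costs a full 6-character \uXXXX escape (which over-counts multi-byte characters). The function names are made up for illustration.

```lua
-- Observed limit from the tests above: 2^18 - 1 characters after JSON encoding.
local MAX_ENCODED_LENGTH = 2^18 - 1

-- Hypothetical helper: pessimistic estimate of the JSON-encoded length of a string.
-- ASSUMPTION: bytes 32..126 are kept as-is, except '"' and '\' (2 characters each);
-- every other byte is counted as one 6-character \uXXXX escape.
local function estimateEncodedLength(s)
    local length = 2 -- the surrounding double quotes
    for i = 1, #s do
        local b = string.byte(s, i)
        if b == 34 or b == 92 then
            length = length + 2
        elseif b >= 32 and b <= 126 then
            length = length + 1
        else
            length = length + 6
        end
    end
    return length
end

-- Hypothetical helper: true if the estimate stays within the observed limit.
local function fitsInDataStore(s)
    return estimateEncodedLength(s) <= MAX_ENCODED_LENGTH
end
```

With these assumptions, a string of 2^18-3 "A" characters passes and one of 2^18-2 does not, which matches the two SetAsync tests above. For strings with many special characters the estimate may reject values that would actually fit.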


Datastore functions don’t accept strings with characters above 127 (this looks like a bug).

print(pcall(function() return game:GetService("DataStoreService"):GetDataStore("test"):SetAsync("test","\128") end))
--> false 104: Cannot store string in DataStore

Perhaps the string is (incorrectly) assumed to be Unicode-encoded text rather than raw bytes?


It should also be mentioned that, unlike most of the Lua API, datastore functions properly handle null characters within strings.

print(pcall(function() local ds = game:GetService("DataStoreService"):GetDataStore("test") ds:SetAsync("test","hello\0world") return #ds:GetAsync("test") end))
--> true 11

#14

Oops, I threw in that size limit section last minute while writing the draft just to have it there, and I forgot I had that whole section labeled as “complete and accurate”. The subsections after that one are complete :grin:

EDIT: I edited/changed all of this information, thanks!


#15

Is this going to be fixed anytime soon?

I thought I should ask because you might know, or be able to find out, more about this broken feature that everyone relies on and wants fixed.


Very Informative Thread! :+1:


#16

This isn’t a bug, strings in data stores explicitly only store valid UTF-8.
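A string can be checked for this before saving. The sketch below assumes the utf8 library is available (it is in Roblox Lua and in Lua 5.3+); isValidUtf8 is a made-up name, and passing the check only means the string is valid UTF-8, not that every other datastore limit is met.

```lua
-- Hypothetical helper: true if the string is valid UTF-8.
-- utf8.len returns nil (plus the position of the first invalid byte)
-- when the string is not valid UTF-8.
local function isValidUtf8(s)
    return utf8.len(s) ~= nil
end

print(isValidUtf8("\128"))         -- a lone continuation byte, like the rejected example earlier in the thread
print(isValidUtf8("hello\0world")) -- null characters are still valid UTF-8
```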


#17

I developed a global marketplace about a month ago for my game, but couldn’t ever ship it because it only worked if both players were in the same server (i.e. not global, which defeats the purpose). This OnUpdate bug seems to be the issue. No wonder. I hope this gets fixed soon. If it does, someone please @ me. :+1:t2:


#18

You can subscribe to notifications on the linked bug report and you’ll get an email/browser notification when someone replies there.


#19

This is quite possibly the best guide I’ve ever read on anything related to ROBLOX.
Awesome job, I wish I had this years ago, it would’ve saved me a lot of time.


#20

Heads up, there’s a bug going around right now related to potential data loss if you try to call the same key in succession within a short period of time (<4s between attempts), more info here.