Upcoming Changes to Data Stores Versioning

Hi Creators!

Our vision for the Data Stores versioning system is to provide reliable, easy-to-use access to previous versions of data when things go wrong in your experiences. Since the system launched in 2021, we’ve faced two main issues:

  • Storing unlimited versions for every key is excessive and unsustainable.
  • When versions are needed, they are often unavailable or difficult to access.

We’ve looked at how your experiences are using versioning today, and we’ve found that the vast majority of versions created are never used again. We also know that, when needed, having access to versions across a wide range of times is critical for data recovery.

To optimize the number of versions we store and increase their usefulness, the following changes will be launching on July 29th, 2024:

  1. Only one version will be saved per hour for each key.
  2. Versions will be available longer.
  3. Data Stores Snapshots can be taken to guarantee backups for every key in an experience.

Later this year, the following will also go live:

  4. New APIs to retrieve a version by timestamp instead of by version ID.

Change 1: Fewer, Optimized Versions

Launching July 29th, 2024.

[Current Behavior]

  • All writes to a key create a versioned backup of the previous data.

[New Behavior]

  • Only the first write to each key in each hour creates a versioned backup of the previous data. All successive writes to a key within the same hour permanently overwrite the previous data.

    You will continue to have permanent access to the latest version.
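The hourly rule can be sketched with a small toy model. This is only an illustration, not the service's implementation: the `VersionedKey` class and `hour_bucket` helper are hypothetical names, and the model assumes "each hour" means a UTC clock hour, which the announcement does not spell out.

```python
from datetime import datetime, timezone


def hour_bucket(ts: datetime) -> datetime:
    """Truncate a timestamp to the start of its clock hour (assumed bucketing)."""
    return ts.replace(minute=0, second=0, microsecond=0)


class VersionedKey:
    """Toy model of a single key under the new rule (not the real service)."""

    def __init__(self) -> None:
        self.current = None           # latest value, always retrievable
        self.backups = []             # previous values kept as versioned backups
        self._last_write_hour = None  # clock hour of the most recent write

    def write(self, value, ts: datetime) -> bool:
        """Apply a write; return True if the previous data was kept as a backup."""
        bucket = hour_bucket(ts)
        first_in_hour = bucket != self._last_write_hour
        kept = first_in_hour and self.current is not None
        if kept:
            self.backups.append(self.current)
        self.current = value
        self._last_write_hour = bucket
        return kept


def at(hour: int, minute: int) -> datetime:
    """Helper for building example timestamps on launch day."""
    return datetime(2024, 7, 29, hour, minute, tzinfo=timezone.utc)


key = VersionedKey()
writes = [("a", at(2, 10)), ("b", at(3, 5)), ("c", at(3, 20)), ("d", at(4, 1))]
results = [key.write(value, ts) for value, ts in writes]
print(results, key.backups, key.current)
```

In this run, "b" never becomes a backup because "c" overwrote it within the same hour, while "a" and "c" are kept and "d" remains the latest version.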

Change 2: Versions Available Longer

Launching July 29th, 2024.

[Current Behavior]

  • Non-current versions expire 30 days after they are written.

[New Behavior]

  • Non-current versions expire 30 days after they are made non-current. This is always longer than the current behavior.

    Currently, a corrupted or unintentional data write immediately expires the previous data for any key that was last updated more than 30 days ago. With this update, previous data will remain available as a versioned backup for 30 days after the write.

    In other words, previous versions of data for all keys are available for 30 days after a data corruption incident, regardless of when the previous versions were written.
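The difference between the two expiry rules comes down to which timestamp the 30-day clock starts from. A minimal sketch, with hypothetical function names and example dates chosen purely for illustration:

```python
from datetime import datetime, timedelta, timezone

RETENTION = timedelta(days=30)


def old_expiry(written_at: datetime, superseded_at: datetime) -> datetime:
    """Current behavior: the clock starts when the version is written."""
    return written_at + RETENTION


def new_expiry(written_at: datetime, superseded_at: datetime) -> datetime:
    """New behavior: the clock starts when the version becomes non-current."""
    return superseded_at + RETENTION


# Example: a key last saved on June 1 is overwritten by a bad write on July 30.
written = datetime(2024, 6, 1, tzinfo=timezone.utc)
bad_write = datetime(2024, 7, 30, tzinfo=timezone.utc)

# Under the old rule the previous data is already expired when the bad write lands.
print(old_expiry(written, bad_write) < bad_write)
# Under the new rule it stays available for 30 days after the bad write.
print(new_expiry(written, bad_write))
```

Because a version can only become non-current at or after the time it was written, the new expiry in this model is never earlier than the old one.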

Change 3: Data Stores Snapshots

Launching July 29th, 2024.

[Current Behavior]

  • n/a

[New Behavior]

  • Through a new Open Cloud API, you will be able to take a snapshot of all the Data Stores in an experience once per day, per universe. After the snapshot, the next write to every key in the universe will create a versioned backup of the previous data, regardless of the time of the last write. All data current at the time of the snapshot is guaranteed to be available as a versioned backup for at least 30 days.

    This endpoint should be used before publishing any experience update which changes your data storage logic. It guarantees that you have the most recent data available from the previous version of your experience.

    For example, without a snapshot, if an update published at 3:30 causes data corruption, corrupted writes will overwrite any data written between 3:00 and 3:30. With a snapshot taken at 3:29, the corrupted data will not overwrite anything written before 3:29, preserving the latest data for all keys written between 3:00 and 3:29.
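The 3:00 to 3:30 scenario can be replayed with a toy simulation of a single key. The `simulate` function is a hypothetical model, not the Open Cloud endpoint. It assumes clock-hour buckets and that a snapshot forces a backup on the first write that lands strictly after it.

```python
from datetime import datetime, timezone


def clock_hour(ts: datetime) -> datetime:
    """Truncate a timestamp to the start of its clock hour (assumed bucketing)."""
    return ts.replace(minute=0, second=0, microsecond=0)


def simulate(writes, snapshot_at=None):
    """Apply (timestamp, value) writes in order; return (current, backups)."""
    current, backups, last_write = None, [], None
    for ts, value in writes:
        first_in_hour = last_write is None or clock_hour(ts) != clock_hour(last_write)
        # The first write after a snapshot backs up the previous data regardless of the hour.
        after_snapshot = (
            snapshot_at is not None
            and last_write is not None
            and last_write <= snapshot_at < ts
        )
        if current is not None and (first_in_hour or after_snapshot):
            backups.append(current)
        current, last_write = value, ts
    return current, backups


def at(hour: int, minute: int) -> datetime:
    """Helper for building example timestamps."""
    return datetime(2024, 7, 29, hour, minute, tzinfo=timezone.utc)


writes = [(at(2, 40), "v1"), (at(3, 10), "v2"), (at(3, 20), "v3"), (at(3, 30), "corrupt")]

without_snapshot = simulate(writes)
with_snapshot = simulate(writes, snapshot_at=at(3, 29))
print(without_snapshot)
print(with_snapshot)
```

Without the snapshot, the corrupted write at 3:30 replaces "v3" with no backup. With the snapshot at 3:29, "v3" is kept as a backup. In both runs "v2" is gone, since it was overwritten within the same hour before the snapshot was taken.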

Change 4: Get Version by Timestamp

Launching late 2024.

[Current Behavior]

  • To retrieve a versioned backup, a combination of ListVersionsAsync and GetVersionAsync must be used.

[New Behavior]

  • To retrieve a versioned backup, a new, simpler GetVersionAtTimeAsync method can be used.
    This feature makes it easier to use Data Stores versions for data rollback to a particular time.
    For example, if a user complains of data issues and it is known that the data was “correct” at 4:00pm four days ago, you can use this method to get data from that specific time. We will continue to fully support ListVersionsAsync and GetVersionAsync.
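Conceptually, a lookup by time picks the newest version created at or before the requested moment. The sketch below illustrates that selection logic only. It is not the signature or behavior of GetVersionAtTimeAsync, which this post does not specify; `version_at` and the sample data are hypothetical.

```python
from bisect import bisect_right
from datetime import datetime, timezone


def version_at(versions, when):
    """Return the value that was current at `when`.

    `versions` is a list of (created_at, value) pairs sorted by created_at.
    Returns None if no version existed yet at that time.
    """
    times = [created_at for created_at, _ in versions]
    index = bisect_right(times, when)  # count of versions created at or before `when`
    return versions[index - 1][1] if index else None


versions = [
    (datetime(2024, 8, 1, 10, 0, tzinfo=timezone.utc), "a"),
    (datetime(2024, 8, 3, 12, 0, tzinfo=timezone.utc), "b"),
    (datetime(2024, 8, 5, 9, 0, tzinfo=timezone.utc), "c"),
]

# Data known to be correct at 4:00pm on August 4 resolves to the version written on August 3.
print(version_at(versions, datetime(2024, 8, 4, 16, 0, tzinfo=timezone.utc)))
```

With the existing methods, the equivalent workflow is to list versions around the target time and then fetch the chosen one by its version ID.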

If you have any clarifying questions, we’re happy to answer them here! We’re also looking for volunteers to get early access to these changes. If you’d like to enroll an experience or a specific Data Store, please send me a DM. This is a great way to test and understand these changes before they go live globally.

See you soon!
The Data Stores team

111 Likes

Totally fine with these changes. Always wondered how people were able to essentially make unlimited versions with no repercussions LOL

18 Likes

okay master, sounds good and just updated my studio

4 Likes

Hi
Less frequent but longer lasting versions make a ton of sense.
I struggle to understand the snapshots… Does it keep a copy of all data so you can always access it later?

So if I understand correctly snapshots are a tool to be used before you push an update that could cause corruption? Just to be safe you do a snapshot before pushing the update so you can revert all player data later?

5 Likes

Wait so, how can one figure out when the last version was made…

uhh whatever

maybe I have it wrong in my mind

This would be an issue for session locking, but I think I’d just be using that additional meta thing one can supply…

which can only be a small amount though

 

Data Store Snapshots sounds cool though.

3 Likes

Honestly, I’m surprised that storing 30 days of backups for every DataStore write stayed a feature this long. It sounds wildly expensive to run. The new behavior of buffering the backups to once an hour makes sense.

9 Likes

I didn’t even know about versioning, but I’m surprised at how much it has improved with this single update. Keep up these amazing updates, they’re really great! :happy2:

2 Likes

I have never read “datastores team” before, this smells like there will be many more great updates for datastores, this will be great!

2 Likes

Taking a snapshot tells Data Stores to make a versioned backup for all your data on the next write to each key. These versions are available to you for 30 days after the next write takes place.

Yep!

You can also take a snapshot before changing a live configuration in your experience, for example. You don’t have to tie it to a place update.

6 Likes

as someone who managed to botch an update, ruining the game’s economy, and had to speedrun a rollback with 0 knowledge of how to do a rollback, I approve of this feature :+1:

3 Likes

GetAsync will always return the latest data, including any metadata and session locking info from your most recent write.

We’ve made these changes with existing Data Stores libraries in mind. If you have a use case which doesn’t seem compatible with these changes, please let us know! We’re happy to help find a solution.

3 Likes

Glad to hear this sounds useful to you! If you have ideas for other tools that would improve your rollback workflows, please open a feature request and let us know.

5 Likes

This is a little bad for developers, since they can’t store unlimited data anymore, but there is a valid and good reason for it.
I approve this :+1:

3 Likes

Awesome work, looking forward to the new API!

2 Likes

This is the issue with Roblox blindly taking a % instead of it being pay for usage. All of these features have costs that at the end of the day developers are paying. It’s money Roblox is choosing to put towards infrastructure instead of DevEx. If they chose to optimize their infrastructure and let developers pay for usage instead of a fixed %, those savings could be passed on to developers but they have no incentive to do it when nobody has a problem with them taking a %. Nobody asked for this feature but every developer still has to pay for it because Roblox management said so

4 Likes

The same really goes for how many players are in your game at once. Developers with tens of thousands of average active players pay the same % as a small developer with ten average active players, even though the difference in cost for those servers is astronomical.

What if developers don’t have the budget for a “pay for usage” model?

Was there a reason to limit versions to once per hour, versus just having a cap on total versions with first in, first out and the 30 day expiry?

I can see this negatively affecting any games that used it as an effective save history and restore system.

Other cloud service providers offer free tiers so people can easily start developing on their platform. Then, when a project becomes successful, both the developer and the service provider benefit. This makes it free to start developing, and if you only want small games with a few players, you only pay when your game becomes successful.

https://azure.microsoft.com/en-us/pricing/free-services/

1 Like