Data Stores Batch Processor is now Open Source

Hello Creators!

Managing data at scale is one of the greatest challenges of building a successful experience. As your game grows, so does the need to efficiently manage your data, whether that be migrating between data schemas or removing obsolete entries.

Today, we’re excited to allow you to do exactly that by releasing the Data Stores Batch Processor CLI as a fully open-source tool!

This is a powerful, flexible, command-line tool built to orchestrate large-scale, custom operations on your Data Stores. It’s designed to run on your local machine or a server, leveraging the power of Open Cloud APIs for LuauExecutionSessionTasks, Memory Stores, and Data Stores to perform work, separate from your live game servers.

Note that the batch processor’s processing speed is subject to Data Stores rate limits and LuauExecutionSessionTask limits. Depending on the operation, expect an initial processing speed of up to 500 keys (or data stores) per minute!

Unlocking Your Data

The Batch Processor CLI is all about flexibility. You provide the custom Luau logic, and the tool handles running it across thousands or even millions of items. This unlocks critical capabilities for every large-scale developer:

  • Bulk Data Deletion: Remove obsolete data to manage your storage footprint and comply with upcoming limits. You can run custom logic to selectively delete data based on any criteria you define. If you need to clear an entire data store, we recommend you use the Open Cloud Data Store Deletion API instead.
  • Complex Schema Migrations: Seamlessly migrate user data to new, optimized storage patterns. The tool is vital for creators wishing to move away from patterns like DataStore2 or the “Berezaa Method” to a new schema.
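To make the “you provide the custom logic” idea concrete, here is a minimal sketch of the kind of per-entry decision callback a batch run might execute. The real tool takes custom Luau scripts; this Python version is for illustration only, and the entry layout, threshold, and function name are assumptions, not the tool’s actual API.

```python
# Hypothetical sketch of per-key logic a batch processor run might apply.
# The actual tool executes Luau; Python is used here only to show the shape
# of a "decide what to do with each entry" callback.

STALE_AFTER_SECONDS = 180 * 24 * 60 * 60  # assumption: ~180 days untouched counts as obsolete

def plan_action(entry: dict, now: float) -> str:
    """Return "delete", "migrate", or "keep" for one data store entry.

    `entry` is assumed to look like:
        {"key": str, "updated": epoch_seconds, "value": dict}
    """
    if now - entry["updated"] > STALE_AFTER_SECONDS:
        return "delete"  # bulk deletion criterion: stale entries
    # Example migration criterion: legacy values lack a version field,
    # so they still use the old storage pattern.
    if "schemaVersion" not in entry["value"]:
        return "migrate"
    return "keep"
```

A run would then apply `plan_action` to every key the processor enumerates, batching the resulting deletes and migrations.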

Discover your own use cases for the tool as well!

By the Community, For the Community

We’re open-sourcing this tool because we want to give you the power to solve your unique data challenges, and adapt the tool to your own needs. By providing the full source code, we are inviting you to:

  • Contribute: Fork the repository, make improvements, and submit a pull request to benefit the entire Roblox community.
  • Customize: The full codebase is available for you to modify and tailor to your experience’s unique needs.
  • Learn: See a production-grade example of how to build a powerful tool on top of Open Cloud APIs.

As excited as we are to give you this tool, we are just as excited to see what you are able to achieve with it next!

Get Started Today

Without further ado, here it is:

The documentation is available in the README of the repository. Please read all of the documentation before testing the batch processor, and ensure that you understand how to use the tool and its associated risks.

Check it out, start processing, and let us know what you think and if you have any questions!
Thanks, The Data Stores Team

103 Likes

This topic was automatically opened after 10 minutes.

I’m so happy that this tool is finally out. It has been incredibly powerful for scaling data analysis and modification, which has been a massive help for all the issues we’ve had in the past with data storage due to poor design decisions at the time. Super excited to see what other creators do with this product!

13 Likes

Right now, many of the other Roblox OSS Luau repositories are read-only, one-way mirrors with an internal source of truth. Are there any plans for a two-way system, such as Google’s Copybara, that allows developers to contribute back to these repositories?

2 Likes

If I have 1 million keys, I will have to keep it running for approx. 33 hours 20 minutes. Are there any plans to develop a bulk API to speed this up further?
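For readers wanting to check the estimate: at the announced upper bound of roughly 500 keys per minute, the arithmetic works out as follows.

```python
keys = 1_000_000
rate_per_min = 500  # upper-bound throughput cited in the announcement

minutes = keys / rate_per_min           # 2000.0 minutes total
hours, rem_min = divmod(minutes, 60)    # 33 hours, 20 minutes
print(f"{int(hours)} h {int(rem_min)} min")  # → 33 h 20 min
```

Real runs may be slower, since the 500/min figure is an upper bound subject to Data Stores and LuauExecutionSessionTask limits.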

2 Likes

Thank you for the question!

Our upcoming Data Stores experience-level access limits (see “Future Experience Limits” in the documentation) will help the Batch Processor run more smoothly. The official limits are coming in April '26, but we are exploring a public opt-in opportunity before then. Please stay tuned for more information.

3 Likes

Let me flag this to the relevant team and get back to you!

1 Like

This addition goes in the direction of Roblox giving us more control over how we manage data.

Do you plan to extend this concept to other services as well, such as letting us define custom ways to send user data back to the client or how memory is allocated, or, even better, letting us manage performance and therefore raise the triangle limit a model can have?

2 Likes

Enabling and bringing more control to Creators is a central mission for Roblox. However, across different products, controls will be provided in different ways and on different timelines.

At this time, we do not have further Open Source initiatives to share beyond Data Stores. For services, additional observability and tooling solutions will be provided as part of Extended Services.

Please look for future announcements around client performance improvements.

1 Like

Are there any plans to raise data store limits? I am going to be producing systems that require:

Large amounts of data to be stored, written, updated, sorted, and read frequently.

Usually this isn’t a concern for developers because Roblox’s data store service already provides more than enough for your average scripter who is simply going to read and write data when a player leaves/joins.

But for some games, Roblox’s built in data store service isn’t enough.

3 Likes

Our limits are set to allow for large worlds on Roblox. For usage beyond those limits, we will be offering Extended Services for Data Stores storage and access, coming in early next year - here are the details on pricing, and more details on our latest update here.

baited into thinking this was for atomic batch writes, instead got ai slop :skull:

While it is fine that it was AI-generated, has this been thoroughly tested to ensure that it at least works? I know it says to use at your own risk, but have you ensured there’s a baseline of quality that can be expected?

Lastly, what is the rationale for doing bulk schema migrations? In our games we generally just change the scope if we need to migrate: you import from the old one and transform the data for each player that joins, which is arguably less risky.
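The lazy “migrate on join” pattern described here can be sketched as follows. This is Python with made-up store names and a made-up schema, purely for illustration; real code would read and write through DataStoreService in Luau.

```python
# Hypothetical sketch of lazy migration: read from the old store, transform,
# and write to the new store the first time a player shows up. The dicts
# stand in for data stores; the schema and names are invented for this example.

OLD_STORE: dict = {"player_1": {"coins": 50}}  # stand-in for the legacy data store
NEW_STORE: dict = {}                            # stand-in for the new data store

def load_player(key: str) -> dict:
    if key in NEW_STORE:                        # already migrated: fast path
        return NEW_STORE[key]
    old = OLD_STORE.get(key, {})
    migrated = {"schemaVersion": 2, "currency": {"coins": old.get("coins", 0)}}
    NEW_STORE[key] = migrated                   # write-through, so migration runs once per key
    return migrated
```

The trade-off versus a bulk migration is that keys belonging to players who never return are never migrated, which is exactly the gap a batch processor closes.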

Edit: It truly amuses me how people who have not read the code beyond a surface level are flabbergasted at the possibility of it being AI-generated. All the signs are there, but I will eat my hat if I am wrong, which I hope to be.

1 Like

Where’s your source for this supposedly being AI? There’s literally NO mention of it in either the post or the repo.

I believe AI, at least for the next 50 years, won’t have the capability to make stuff like this. If it were possible to do right now, even slightly or with some effort, Roblox would fire its whole team of programmers.

And um there’s no mention of this being created by AI…

…so, are you going to give snippets of the code that you think are AI generated?

1 Like

Stop evading the question. Show us sources or you’re just lying for the sake of it. Roblox isn’t so lacking in engineering resources that they need to offload such a complex project onto AI.

3 Likes

:skull: :sob:


Glad this is open source, although the docs are definitely rewritten by AI; I just hope the code isn’t.

You can’t trust those so-called “AI detectors” to actually be accurate. They are very notorious for either pumping out false positives or intentionally lying/exaggerating the results to look better.

2 Likes

I mean no disrespect, but there are Unicode characters in the docs that don’t exist on the keyboard and that ChatGPT outputs every time you run something through it. They didn’t write it, and I certainly don’t think they copied it from somewhere on the internet when they could have used a dash or something else. On top of the sentence structure not following MLA or ELA formats, it’s 100% affected by AI.

It’s blatantly obvious, and people should check the docs for consistency issues. The docs may say one thing while the code does another.

This still isn’t a good enough reason to make AI accusations; there is plenty of software that will automatically convert what you are typing to use characters like the em-dash, which is commonly used as some catch-all “Ha! This is AI!” tell. Hell, you can even observe it when writing a post on the devforum: things like a double dash will convert to an em-dash, and three consecutive dots convert to a single ellipsis character.

I highly doubt they hand-wrote such a lengthy markdown document, and they definitely would have used software that almost certainly has these “convenience” auto-conversions in it.