Datastores: Ability to simulate errors (outages and rate limits)

As a Roblox developer, it is difficult to simulate things like outages and rate limits with DataStoreService. This makes testing code that interacts with datastores harder than it should be, which is a problem because testing datastore code under stressful conditions is absolutely critical.

Currently, if a developer wants to simulate errors with DataStoreService, they have to write their own “mock” datastore APIs, which then either call the actual datastore APIs or return a simulated error. As outlined in another feature request I made regarding datastores, maintaining a “mock” datastore library is extremely difficult and puts an unnecessary burden on developers.

DataStoreService should allow a developer to specify that any requests to it should randomly error. The mock datastore library by @buildthomas allows its users to set:

  • The rate at which simulated errors will be thrown (0% of the time to 100% of the time)
  • The amount of time an API call takes (random between a range of seconds the developer specifies)
  • Budget limits
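To make those three settings concrete, here is a minimal sketch of a fault-injecting store. It is written in Python purely to show the idea and is not the API of the mock library or of DataStoreService; `FlakyStore`, `SimulatedDataStoreError`, and every parameter name are hypothetical. It also assumes that running out of budget raises immediately, which is a simplification.

```python
import random
import time


class SimulatedDataStoreError(Exception):
    """Raised in place of a real datastore failure (hypothetical name)."""


class FlakyStore:
    """Dict-backed store that injects errors, latency, and a request budget."""

    def __init__(self, error_rate=0.0, latency_range=(0.0, 0.0), budget=None,
                 rng=None, sleep=time.sleep):
        if not 0.0 <= error_rate <= 1.0:
            raise ValueError("error_rate must be between 0 and 1")
        self.error_rate = error_rate        # 0.0 = never fail, 1.0 = always fail
        self.latency_range = latency_range  # (min_seconds, max_seconds)
        self.budget = budget                # remaining requests; None = unlimited
        self._rng = rng or random.Random()  # injectable so tests can seed it
        self._sleep = sleep                 # injectable so tests can skip waiting
        self._data = {}

    def _before_request(self):
        # Simulate the time an API call takes.
        low, high = self.latency_range
        self._sleep(self._rng.uniform(low, high))
        # Simulate running out of request budget (assumed to fail immediately).
        if self.budget is not None:
            if self.budget <= 0:
                raise SimulatedDataStoreError("request budget exhausted")
            self.budget -= 1
        # Simulate a random outage at the configured rate.
        if self._rng.random() < self.error_rate:
            raise SimulatedDataStoreError("simulated outage")

    def get(self, key):
        self._before_request()
        return self._data.get(key)

    def set(self, key, value):
        self._before_request()
        self._data[key] = value
```

Passing a seeded `random.Random` and a no-op `sleep` keeps a test run repeatable and fast, while an unseeded generator with real delays is closer to the random behavior requested here.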

By allowing developers to simulate & configure errors and other restrictions like this, testing datastore code becomes MUCH easier. When tied in with the ability to use datastores “locally” instead of reading & writing to actual datastores, developers would have a powerful testing tool at their disposal for testing their datastore code.

Yeah, I really hate that just to test your datastore code you have to reserve a separate copy of the database, which wastes too much room. I think built-in mock datastores that persist from session to session would be a good solution.

This type of tool actually existed before as “Network Simulator” (formerly “Diabolical Mode”), which was removed in 2019. Tools like these, including the one proposed, fall under the category of Chaos Engineering. Speaking as a professional software engineer, these tools are useful but not a replacement for automated tests. Automated tests are much faster and can consistently test error cases, while the proposed tool might not fit every case. Nexus Data Store has a block of tests dedicated to network errors that can consistently test every expected error case within a second. Ideally, you should have access to both tools, but automated tests should be your priority over relying on chaos engineering.

The problem: automated testing is extremely difficult on Roblox because of the complete lack of support for it. Things like roblox-CLI were never released to developers.

Well, separate feature request then. ¯\_(ツ)_/¯

Automated tests currently need to run either in a Studio session or on a live game server. The same would be true of any sort of chaos engineering. You wouldn’t do your chaos engineering in an automated environment because of how unpredictable it is. Might as well do the first one or both in Studio for now.

You also cannot do an automated test if you have no way of making DataStoreService “fail”. In order to have these sorts of tests, we need the feature requested in this post.

You can if you mock out the DataStoreService with only the bits you need. My mocks of the service are typically <20 lines of only what I need and test a very specific failure mode. Chaos engineering of sorts can help you find what to test, and more importantly, practice identifying and resolving issues in live games. Automated tests with known, simple error cases should be the focus to prevent regressions though.
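As a rough illustration of what I mean by a small, targeted mock, here is a sketch in Python (chosen only to show the shape of the idea, not Roblox code). `FailsThenSucceeds` and `load_with_retry` are hypothetical names; the retry helper stands in for whatever code is actually under test.

```python
class FailsThenSucceeds:
    """Mock store whose get() raises a fixed number of times, then returns a value."""

    def __init__(self, failures, value):
        self.failures_left = failures
        self.value = value
        self.calls = 0

    def get(self, key):
        self.calls += 1
        if self.failures_left > 0:
            self.failures_left -= 1
            raise RuntimeError("simulated datastore failure")
        return self.value


def load_with_retry(store, key, attempts=3):
    """Hypothetical code under test: call get() up to `attempts` times."""
    last_error = None
    for _ in range(attempts):
        try:
            return store.get(key)
        except RuntimeError as err:
            last_error = err
    raise last_error


def test_recovers_after_two_failures():
    store = FailsThenSucceeds(failures=2, value={"coins": 5})
    assert load_with_retry(store, "player_1") == {"coins": 5}
    assert store.calls == 3


def test_gives_up_after_three_failures():
    store = FailsThenSucceeds(failures=5, value=None)
    try:
        load_with_retry(store, "player_1")
    except RuntimeError:
        assert store.calls == 3
    else:
        raise AssertionError("expected the retry helper to give up")
```

Each test pins down exactly one failure mode and produces the same result on every run, which is what makes it useful for catching regressions.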

This becomes a hassle to do (see my other post, linked above) when you have a data system that:

  • Implements session locking
  • Implements data schema migration
  • Deals with multiple keys and multiple datastores (e.g. a community-made level sort page)

“Mocking” this requires you to essentially “mock” the entirety of DataStoreService, which is problematic.
