Bulk DataStore Deletion Methods

At the moment, it takes a long time to iterate through DataStores to delete personal data when responding to a Right of Erasure request. This is because of the rate limits on DataStore API calls.

This is an issue if you have lots of data written for one player, such as stored logs. Sometimes I end up having to leave my computer on for over 4 hours while a player’s data is deleted.

It would help if Roblox could speed up data store requests, for example by allowing all writes to a specific data store & scope to be dropped via one API call.

If Roblox were able to implement features to speed up RoE requests, it would make the whole Roblox game maintenance experience much less tedious.

10 Likes

The limit for remove should be about 60/minute in Studio. If you say you spend 4 hours deleting keys for a single player, does that mean you’re storing 60 x 60 x 4 = over 14,000 keys per player in your datastores? Why can’t your system keep logs of a player together in one key?
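
The arithmetic behind that estimate, as a rough sketch. It assumes a flat budget of 60 removes per minute (the figure quoted above) and one key per request; the function name is made up for illustration.

```python
def keys_deleted(hours: float, removes_per_minute: int = 60) -> int:
    """Upper bound on keys removed in `hours` at a fixed per-minute budget."""
    return int(hours * 60 * removes_per_minute)

print(keys_deleted(4))  # 14400
```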

You probably want to think about using dedicated logging infrastructure or making a feature request for log collection & analysis as a platform feature. This doesn’t seem like a proper use case for datastores. (You generally don’t use durable storage for logs, most infrastructures have a lifetime on them because they may contain sensitive info, and index them for easier searching through. Logs also rarely tend to stay relevant past the few-week threshold.)

I assume you’re hitting similar frustration on the read side if you ever need to inspect player logs and you have no way of easily searching through them if they’re stored in datastores.

That’s not to say there are no use cases for bulk delete. I just don’t think this is a good use case for datastores in general.

2 Likes

The logs hold time- and financially critical data (server joins/leaves & Robux transactions), which I use to investigate (rare) cases where players contact me to report a failed transaction or lost goods.

There used to be reports of inconsistencies with retrieved datastore data when players joined/quit games quickly, mainly related to in-game item cloning. I don’t think I understood what the exact issue was (it was perhaps more likely due to trading systems not saving both players’ profiles in parallel safely). [This is why they aren’t stored under one key].

However, I was concerned the cause of these issues may be due to data stores not always giving the latest saved value. Using an algorithmically unique key means I’m guaranteed to never overwrite data if table updates arrive in the wrong order. The vast majority of times a player contacts me about transaction errors, it turns out that they’re lying or the HCI elements of the game need improving (to show more clearly how to use what’s been purchased); analysing the logs quickly tells the story.

[Image: diagram of what I believed the item-cloning bug was caused by]
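
A toy illustration of the unique-key reasoning, written in Python rather than Luau so it runs anywhere. Plain dictionaries stand in for the data store, and the player ID, key format and payloads are hypothetical.

```python
# Two saves for the same player reach the store in the wrong order:
# each entry is (timestamp, payload), and the later save arrives first.
arrivals = [(1002, "coins=50"), (1001, "coins=80")]

single_key = {}   # one key per player: the last write to arrive wins
unique_keys = {}  # one key per save: nothing is ever overwritten

for timestamp, payload in arrivals:
    single_key["player_123"] = payload
    unique_keys[f"player_123_{timestamp}"] = payload

print(single_key)   # only the older 1001 save is left
print(unique_keys)  # both saves survive and can be ordered by timestamp
```

The trade-off is the one this thread is about: every save becomes another key that has to be removed individually later.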

Usually I will only be checking at most the top 20 logs (ordered datastore), so reading isn’t a problem.
(I use an unorthodox method of storing the timestamp as the value, and the timestamp concatenated with a short log message as the key to do this).
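
A minimal sketch of that key scheme, again in Python with a dictionary standing in for the ordered data store. The `_` separator, the function names and the sample messages are assumptions for illustration, not the actual format used in the game.

```python
import time

ordered_store = {}  # stand-in for an ordered data store: key -> integer value

def write_log(message, timestamp=None):
    ts = int(time.time()) if timestamp is None else timestamp
    # The key carries the message; the value is the field the store sorts on.
    ordered_store[f"{ts}_{message}"] = ts

def latest_logs(limit=20):
    # Sort by value, newest first, and return only the keys.
    newest = sorted(ordered_store.items(), key=lambda kv: kv[1], reverse=True)
    return [key for key, _ in newest[:limit]]

write_log("join", 1700000000)
write_log("purchase:sword", 1700000042)
write_log("leave", 1700000100)
print(latest_logs(2))  # ['1700000100_leave', '1700000042_purchase:sword']
```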

To be honest, the logs don’t usually take too long to delete, as there are relatively few of them. I used to store player profiles in the famous “Bereza” overkill style, and it’s these legacy profile saves that take ages to delete, as deleting each profile save version requires deleting two keys. When players have play times going into days, with autosaves having occurred once a minute, that is where the bulk of the deletion time comes from. (Although overall, I’ve probably only had 3-10 cases of RoE where it has taken over an hour.)
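
To put rough numbers on that, here is a back-of-the-envelope estimate. It assumes one autosave per minute and two keys per save (as described above) plus the roughly 60 removes per minute mentioned earlier in the thread; the function name is hypothetical.

```python
def deletion_minutes(playtime_hours, keys_per_save=2,
                     saves_per_minute=1, removes_per_minute=60):
    """Estimate how long it takes to remove every legacy save for one player."""
    saves = playtime_hours * 60 * saves_per_minute
    keys = saves * keys_per_save
    return keys / removes_per_minute

print(deletion_minutes(48))   # 96.0  (two days of play time)
print(deletion_minutes(120))  # 240.0 (five days of play time, i.e. 4 hours of deleting)
```

Under these assumptions, deletion takes two minutes for every hour the player spent in game.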

I agree external infrastructure is a better solution for storing logs. However, I fundamentally don’t agree with exporting data outside of Roblox, as it creates what I see as unanswered questions with regard to Data Protection/GDPR, primarily: “Am I a Data User for/of Roblox or an independent/external Data Controller?” If I’m exclusively using Roblox-internal data storage then I think the argument for being a Data User is clear; however, if external stores are used I think there is more of an argument for being a Data Controller.
Being a Data Controller comes with legal compliance obligations, some of which I don’t think would be possible (e.g. providing a contact email/address to make requests to, requesting personal information to identify who is making a Subject Access Request, or providing the address and URL of the Information Commissioner’s Office would be a violation of Roblox ToS).

3 Likes