~40 Data store 502 errors / day

Over the past week I’ve consistently had at least one 502 data store error a day, and today at least 40. (Yes, I track them. Yes, I counted.)
I believe this is part of the cause of data loss among my player base, which considerably hurts my stats when players come back a day after the FTUE and realise none of their progress saved.

Game

1 Like

Hey, I’ll ping someone to look at this. Keep in mind that it’s infeasible to provide 100% up-time. We usually strive for 2.5-4 nines (99.5% to 99.99%) depending on the service.

What kind of data loss is happening? You should be able to prevent data loss in your own logic: never overwrite saved data with default data when the data could not be loaded, and implement frequent auto-saves and limited retries to avoid impact from intermittent failures like this.
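To make the "do not overwrite with defaults" point concrete, here is a minimal sketch in Python rather than Luau. Everything in it is hypothetical and for illustration only: `InMemoryStore`, `StoreUnavailable`, `PlayerSession`, `load_session` and `save_session` are made-up names, not part of DataStoreService or ProfileStore. The idea is simply to remember whether the load succeeded and to refuse to save when it did not.

```python
from dataclasses import dataclass, field

# Hypothetical default profile for a brand-new player.
DEFAULT_DATA = {"coins": 0, "level": 1}


class StoreUnavailable(Exception):
    """Stand-in for a transient 5xx error from the data store."""


class InMemoryStore:
    """Toy store used only to make the sketch runnable."""

    def __init__(self):
        self._rows = {}
        self.fail_reads = False  # flip to simulate a failing load

    def get(self, key):
        if self.fail_reads:
            raise StoreUnavailable("502")
        return self._rows.get(key)

    def set(self, key, value):
        self._rows[key] = value


@dataclass
class PlayerSession:
    player_id: str
    data: dict = field(default_factory=lambda: dict(DEFAULT_DATA))
    loaded: bool = False  # True only after a successful read


def load_session(store, player_id):
    session = PlayerSession(player_id)
    try:
        saved = store.get(player_id)
    except StoreUnavailable:
        # Load failed: the player can still play on defaults,
        # but the session stays marked as not loaded.
        return session
    if saved is not None:
        session.data = dict(saved)
    session.loaded = True
    return session


def save_session(store, session):
    # Never write defaults over real data after a failed load.
    if not session.loaded:
        return False
    store.set(session.player_id, dict(session.data))
    return True
```

A real game would probably also retry the load or kick the player rather than let them play on unsaved defaults; that choice is a design decision this sketch does not make.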

1 Like

I’m using ProfileStore, which seems to retry, but by the looks of it, and with the frequency of the issues (especially today), it may have given up. (Not too sure :man_shrugging:)

There’s the possibility of it being something stupid on my end, but the frequency of the issues isn’t very favourable either.

And to add some more context, it seems like this only ever affects data saving, not loading.

Hi, thanks for the report.

Could you share the link to your Data Stores Observability page? I don’t see errors from the game you linked.

Is this issue still ongoing? Also, please confirm that the errors are from live servers (not Roblox Studio).

Yes, this issue is ongoing and has only become more prevalent.

However, this dashboard doesn’t seem to report most of the errors, which I noticed myself.

This was also already reported almost 2 weeks ago and we have had no answer since. I can confidently say this has been happening for AT LEAST a month.

I had over 500 occurrences in my game in the last 7 days.

You can find them in the “Error Report” category if you filter for “DataStoreService”.

EDIT: I looked at the Data Store Dashboard and got the same result as Jakey: there are no reports on the dashboard, despite having over 500 in the error reports over the same period.

1 Like

We are not promising literally 100.00% availability, and I don’t think any free web service on the face of this planet does. You should expect to handle transient 5xx errors at times (e.g. gracefully catch errors or retry responsibly). This is the nature of distributed systems.

If you constantly see a significant % of failures compared to your successes (e.g. more than 0.1%), our team will definitely take a look.
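As a rough illustration of that comparison, here is a small Python sketch. The 0.1% threshold comes from the post above; the function names and the request totals in the usage note are hypothetical, and it measures failures against all requests (successes plus failures), which is nearly identical to failures against successes at these magnitudes.

```python
def failure_rate(successes: int, failures: int) -> float:
    """Fraction of all requests that failed (0.0 when there were no requests)."""
    total = successes + failures
    if total == 0:
        return 0.0
    return failures / total


def exceeds_threshold(successes: int, failures: int, threshold: float = 0.001) -> bool:
    """True when the failure rate is above the threshold (default 0.1%)."""
    return failure_rate(successes, failures) > threshold
```

For example, 500 failures in a week would be 0.05% of a hypothetical 1,000,000 requests (under the threshold), but 0.5% of a hypothetical 100,000 requests (over it). The raw error count alone does not tell you which case applies.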

1 Like

Hello, thanks for your answer. I understand 100% availability is impossible, but a good chunk of these errors are “AccessForbidden” errors, which I doubt are caused by any unavailability.

However, the number has been increasing over the last couple of months (2-3); they did not happen as often before that… something degraded.

I would like to see how it compares with the successes, but the errors are suppressed and not properly reported to the datastore dashboard for comparison.

1 Like

Yeah, I get AccessForbidden as well. I also got 9007 (Service instances unavailable) earlier, and again about 8 minutes ago.

Hi @Hooksmith, just wondering and making sure: is there going to be a fix for the problems dispatching the errors to the datastore dashboard? This is a HUGE flaw and a VERY important fix, so that we can tell whether or not there is a degradation in Roblox datastores.

Thanks

Sorry but I’m not sure I’m following. I understand it’s a bit inconvenient to have some of these errors show up on your dashboard if they are not something you can address yourself, but it’s better to show these than not, correct?

Again, 100% reliability is not something we offer, so you should expect to see a tiny amount of errors against datastores.

The expectation is that you cover these errors with retries with a reasonable back-off pattern.
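One common way to do this is exponential back-off with jitter. The following is a minimal sketch in Python rather than Luau, under stated assumptions: `TransientError` and `retry_with_backoff` are hypothetical names, the delay values are arbitrary, and this is not how DataStoreService or ProfileStore retry internally.

```python
import random
import time


class TransientError(Exception):
    """Stand-in for a retryable failure such as a 502."""


def retry_with_backoff(operation, max_attempts=5, base_delay=1.0,
                       max_delay=30.0, sleep=time.sleep, rng=random.random):
    """Call operation() until it succeeds or max_attempts is reached.

    The wait before retry k is a random fraction of
    min(max_delay, base_delay * 2**(k - 1)), so retries spread out
    instead of hammering the service in lockstep.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except TransientError:
            if attempt == max_attempts:
                raise  # give up and let the caller decide what to do
            cap = min(max_delay, base_delay * 2 ** (attempt - 1))
            sleep(cap * rng())
```

The `sleep` and `rng` parameters exist only so the behaviour can be checked without real waiting. The limit on attempts matters as much as the back-off: unlimited retries during an outage just add load and can exhaust request budgets.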

The issue is that, by the looks of it, very few of the errors actually make it to the dashboard. I get a considerable number per day, but the dashboard doesn’t report them (or reports very few of them).

1 Like

Can we file a separate bug report for that? I missed that in your earlier message, and it’s likely all my colleagues did as well. Let’s file a separate bug report for each kind of issue if possible; it would help a lot.

Please add sufficient detail to the report.

4 Likes

Hello! Yeah, sorry, as Jakey said, not all datastore errors reported in the “Error Report” can be found in the Datastore Dashboard (in Monitoring). @jake_4543, since I can’t find one opened by you, I’ll go ahead and open a new one. If you already did, just lmk and I’ll close mine.

As for this topic, looking back at those graphs today, we can see that the datastore error surge we reported before has returned (or appears to have returned) to a normal state since approximately June 22nd for my game (4 weeks preview shown below).

1 Like