Ice7 here from team Royale High. We’ve experienced an increase in data loss reports over the last few days, likely in proportion to our increased traffic. It’s a small percentage of our total traffic, but it is STILL occurring. I have been testing different scenarios, and today I managed to replicate three instances where I made developer product purchases, left the game, and my in-game currency did not save. (We force a save to both the data store and our backup, via HttpService, after every developer product purchase.)
It should also be noted that we are logging all data store errors, and none of the reports correspond to logged DS errors. I am able to go into a server and visually see whether it’s even close to hitting the data store cap, and it is not. I am also logging those failures externally to Google Analytics (as well as our own server), and there are no errors to report there either.
I have spoken with devs of other top games over the last few weeks, and they have reported similar issues. Obviously, isolating these types of issues at scale is not easy, simply because everyone’s implementation is different and the problem could very easily be the fault of the developer.
But here is what I found today:
I went into about 60-70 servers to see if I could replicate this. What I found was that HTTP requests weren’t going out. I compared the value of os.time() in those servers to the actual UTC timestamp, and it was returning wrong values. I tested making purchases and leaving, and found that my data did not save in the servers where os.time() was out of sync (in one case it was behind by four minutes). So three things seemed to happen here: 1. os.time() was either behind or ahead, 2. the HTTP request didn’t go out, and 3. the data store request did not go out. I strongly suspect it is no coincidence that the three servers in which my data did not save were the same three servers where os.time() was returning an incorrect result (values in the past or in the future compared to the actual UTC timestamp).
We would really appreciate any insight and help the engineering team can give us, and are more than willing to have the engineering team look at our code. We have had other developers look it over, and they were unable to find anything that explained these strange issues.
Ice7/Ironclaw33/callmehbob/LaunceLotHandsome & the rest of the Royale High team.
Hello. I’m currently looking into this issue. I have a running theory, but would like some extra information.
Could you replicate this a few more times? When you do, please press Ctrl-Shift-F3 for the network debug menu. I’d like you to collect a list of the server IP addresses and datacenter IDs (listed to the right of the IP address) of affected servers, and DM that to me.
Not sure about http data. However, you shouldn’t rely upon os.time() being directly in sync. I don’t think this was ever guaranteed, although I’m not sure where it’s documented.
In any event, don’t rely on os.time() to invalidate save data. You can perform a two-way sync with an HTTP server, which might give you a better idea of what time it is.
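As a rough illustration of the two-way sync idea, here is a minimal sketch that estimates how far the local clock is from a remote one using a single request/response round trip (a heavily simplified version of what NTP does). The function name `estimateOffset`, the endpoint URL, and the assumption that the endpoint returns UTC epoch seconds as plain text are all hypothetical, not anything from the posts above.

```lua
-- Estimate the offset between the local clock and a remote clock.
-- sentAt / receivedAt: local os.time() values taken just before the request
--   was sent and just after the response arrived.
-- remoteTime: UTC epoch seconds reported by the remote server (hypothetical endpoint).
local function estimateOffset(sentAt, remoteTime, receivedAt)
	local roundTrip = receivedAt - sentAt
	-- Assumption: the remote timestamp was taken halfway through the round trip.
	local localMidpoint = sentAt + roundTrip / 2
	return remoteTime - localMidpoint, roundTrip
end

-- Hypothetical usage inside a Roblox server script (URL is a placeholder):
-- local HttpService = game:GetService("HttpService")
-- local sentAt = os.time()
-- local body = HttpService:GetAsync("https://example.com/utc-now")
-- local receivedAt = os.time()
-- local offset = estimateOffset(sentAt, tonumber(body), receivedAt)
-- local correctedNow = os.time() + offset

-- Standalone example: a local clock that is four minutes behind the remote one.
local offset, roundTrip = estimateOffset(1000, 1241, 1002)
print("offset (s):", offset, "round trip (s):", roundTrip)
```

Because os.time() only has one-second resolution, an estimate like this is only good to within a second or so, and a long round trip makes it less trustworthy. It may make sense to discard samples with a large round trip and retry.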
Thanks Quenty, makes sense. I do agree that the documentation should state that os.time() values are not guaranteed to be in sync across servers. What I just find bizarre is that the servers in which os.time() was out of sync were the same servers where I saw HTTP requests not going out or being very delayed. (We have our own rate limiter in place to prevent too many HTTP requests from going out.)
I know it seems hard to explain any sort of correlation between out-of-sync os.time() values and delayed HTTP requests on the same servers. I just think it’s a very strange coincidence.
In any case, I have removed the overwrites to our database that were based on the value of os.time(). I really appreciate the insight.
os.time being out of sync makes sense. Roblox’s servers aren’t all in the same time zone, and even then there could be a couple of seconds’ difference between them.
Are the results of os.time(os.date("!*t")) out of sync between servers? If so that’s more cause for alarm because UTC should be synced through servers as much as possible.
All makes sense, although I think the wiki really should be updated to make this fact clear. At the moment all it says is “os.time() Returns how many seconds have elapsed since the UNIX epoch (1 January 1970, 00:00:00), under UTC time.”
And it still doesn’t explain the fact that I’ve seen cases of HTTP Requests delayed by several seconds, even when it’s not even close to hitting the threshold. We will continue to monitor this over the next few days to see where we lie. Thanks everyone for your input.
os.time() is supposed to be UTC, IIRC. I’m pretty sure that os.time(os.date("!*t")) actually returns local time.
Basically, don’t rely upon Roblox’s servers having correct time set as if it’s synced from any centralized location. You either sync your timezone yourself, or find another way to solve the problem.
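One possible "other way", sketched here purely as an illustration: decide which save is newer by comparing a per-player version counter instead of comparing os.time() values from different servers. The `version` and `coins` field names and both helper functions are hypothetical and are not taken from Royale High's code.

```lua
-- Returns true when the incoming save should replace the stored one.
-- Uses a monotonically increasing version counter rather than timestamps,
-- so clock skew between servers cannot cause a newer save to be rejected.
local function shouldOverwrite(stored, incoming)
	if stored == nil then
		return true -- nothing saved yet
	end
	return (incoming.version or 0) > (stored.version or 0)
end

-- Builds the next save from fresh data, bumping the version past the stored one.
local function nextSave(stored, newData)
	local copy = {}
	for key, value in pairs(newData) do
		copy[key] = value
	end
	copy.version = ((stored and stored.version) or 0) + 1
	return copy
end

-- Example: a purchase changes the coin balance.
local stored = { version = 3, coins = 10 }
local incoming = nextSave(stored, { coins = 50 })
print(incoming.version, shouldOverwrite(stored, incoming))
```

The counter only stays meaningful if it is read and bumped in the same atomic step as the write, so a sketch like this would need to run inside whatever read-modify-write mechanism the data store offers.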
In vanilla Lua at least it’s the system time, which isn’t necessarily UTC. Roblox could very well have changed that for their implementation, though I don’t know why they would.
os.time(os.date("!*t")) should return the system time adjusted to be UTC, though I’m not close enough to a PC to test it.
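For anyone who wants to check this, here is a small sketch that prints the values being discussed side by side. In standard Lua, os.time() given a table interprets the fields as local time, so feeding it the UTC fields from os.date("!*t") yields a value shifted by the machine's UTC offset, and the two only match on a machine whose time zone is UTC. Whether Roblox's implementation behaves the same way is an assumption that would need to be verified in a live server.

```lua
local now = os.time()

-- Broken-down time tables for the same instant.
local utcFields = os.date("!*t", now)   -- fields expressed in UTC
local localFields = os.date("*t", now)  -- fields expressed in local time

-- os.time(table) treats the fields as local time in standard Lua.
local fromUtcFields = os.time(utcFields)
local fromLocalFields = os.time(localFields)

print("os.time()               ", now)
print("os.time(os.date('*t'))  ", fromLocalFields)
print("os.time(os.date('!*t')) ", fromUtcFields)
print("difference in seconds   ", fromUtcFields - now)
```

A nonzero difference on the last line reflects the machine's time zone setting (and possibly daylight saving handling), not clock drift, so it should not be read as evidence that a server's clock is wrong.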