Ping skyrockets on player join

Recently, players in my game have started to experience significant ping spikes whenever someone joins. It isn’t uncommon to see ping triple when a single player joins. I’ve spent a lot of time debugging this and I think the cause might be related to a recent update regarding loading.

I work on Apocalypse Rising 2 and we store a huge number of items, chunked parts of our map, scripts, images, sounds, and so on. We have to keep the majority of it in ReplicatedStorage and ReplicatedFirst since the client needs almost all of those instances at one point or another. Quick benchmarks in an isolated test put client memory at ~1GB when the instances are in ReplicatedStorage and ReplicatedFirst.

My isolated test for this involved no running scripts, just the instances in their desired storage services:

Testing steps were as follows:

  1. publish the game with no running scripts
  2. play the game and wait for ping to stabilize
  3. have another user join the game
  4. observe ping levels

We tested under the following conditions:

  1. Storage as is (see picture above)
  2. Storage with everything in ReplicatedFirst
  3. Storage with everything in ReplicatedStorage
  4. Storage with everything in Workspace
  5. Storage with everything in ServerStorage

Tests 1, 2, and 3 all gave us the same results. Ping is stable in the 60ms range, and when the second player joins the game, ping spikes into the 300-500ms range and then drops back down to the 60ms range.

Test 4 crashed clients when they tried to join the testing place.

Test 5 had no increases in ping (presumably because there was nothing to replicate).

All tests were reproducible 100% of the time, and every test case was run at least 3 times. I can privately supply a .rbxl file to work with on request; the contents of the file are not for public use.

EDIT: public repro Ping skyrockets on player join


This breaks my game. It is totally unacceptable for me to expect 500ms of ping for extended periods of time just because a player joins. In actual game tests on our development build we’ve had serious performance issues with server-side tasks, and servers with busy player traffic become totally unplayable.

I don’t have anything other than timing to suggest that this problem is linked to this update. We did not experience ping spikes like this before the update went live, and both the development and production versions of our game now suffer from really bad ping spikes when players join.

It looks like this is very visible with this example (240,000 parts under ReplicatedStorage), but I don’t have my own benchmark for how this behaved previously. I’d definitely expect an impact regardless with this much in ReplicatedStorage though.

Yeah, in AR2 we’ve always been able to observe a small bump in ping when players join, but it’s never been anywhere near this bad. In our bug report channels and from our staff we’ve seen a pretty big increase in ping spike complaints around the time the Roblox load order update went out.

Could it be due to the sheer amount of loading required, with all the assets having to be fully loaded in and causing errors during these loading times?

Yes, that does seem to be the current theory. The bug is that this wasn’t a problem before, but something changed that I believe wasn’t on my end, and now my game gets massive ping spikes.

I don’t know. Recently Roblox has been putting out a lot of updates that broke my Studio a lot of times, so maybe they messed up somewhere in their main client, causing ping to go up for no reason.

The feature on this thread needed to be disabled shortly before the holiday weekend due to an issue discovered with Team Create:

Does the issue still repro in game clients / in Studio?

I’m still able to repro this issue in my own testing place and in the one linked here: Ping skyrockets on player join

I tried performing joins from Studio with this change enabled and disabled.

With the change enabled there is an increase in the ping spike. However, the current state on production clients and Studio for the general public is with the change disabled, and there are still ping spikes of hundreds or thousands of ms even with the change disabled, so the core of the problem is not the change I linked.

The change that I linked is intended to reduce the overall impact on ping when joins happen. If you capture a server microprofile, you will see that the ping is caused by an elongated frame, which in turn is triggered by processing many instances on join. The change I posted earlier is part of a larger initiative aimed at reducing that time, so this general problem is on our radar. Reducing this time on join for games with many instances is a multi-faceted problem, and it is going to take a significant amount of engineering effort (time) to solve.

Okay, maybe something else then? There is for sure a noticeable impact on user ping for everyone in the server when somebody joins. We did 2 tests on the 24th and 25th of August and one of our admins streamed the test. I looked back at the stream footage and recorded pings on the test dates.

August 24 started @ 10:30pm EST

  • 8 joins with no major ping jump (variance of ~30)
  • 2 joins with major ping jumps (100 -> 340, 100 -> 400)

August 25 started @ 10pm EST

  • 3 joins with no major ping jump (variance of ~30)
  • 7 joins with major ping jumps (base 100 ping, average +100, worst case +200)

Between those two days we had really different results, and we’ve had a spike in ping and lag complaints in our Discord starting on the 25th. I can confirm there was no change in how we measured ping or in game code between the two tests. Testing on both days occurred with full servers (20 players) and we had people instantly fill empty server spots. Test chat logs in our Discord say there were people queued to join for the duration of the test.
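For anyone wanting to compare test days the same way, here is a minimal sketch of how the join samples above could be tallied. It assumes a "major" jump means 100ms or more above the baseline (the threshold is an assumption, as are the names `JoinSample` and `summarize`), and the eight minor joins on August 24 are filled in with an illustrative +30ms jump since only the approximate variance was recorded.

```python
from dataclasses import dataclass

# Assumption: a jump of 100 ms or more over baseline counts as "major".
MAJOR_JUMP_MS = 100


@dataclass(frozen=True)
class JoinSample:
    """Ping observed around a single player join (hypothetical record type)."""
    base_ms: float  # ping just before the join
    peak_ms: float  # highest ping seen right after the join

    @property
    def jump_ms(self) -> float:
        return self.peak_ms - self.base_ms


def summarize(samples):
    """Count joins, major jumps, and the worst jump in a list of samples."""
    major = [s for s in samples if s.jump_ms >= MAJOR_JUMP_MS]
    return {
        "joins": len(samples),
        "major": len(major),
        "major_share": len(major) / len(samples) if samples else 0.0,
        "worst_jump_ms": max((s.jump_ms for s in samples), default=0.0),
    }


# August 24: the two major jumps are the recorded values (100 -> 340, 100 -> 400).
# The eight minor joins use an illustrative +30 ms, not measured values.
aug_24 = [JoinSample(100, 340), JoinSample(100, 400)] + [JoinSample(100, 130)] * 8

summary = summarize(aug_24)
print(summary)
```

Running the same tally over each test day makes it easier to see whether the share of major jumps actually changed, rather than relying on how the stream felt.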

I reviewed the settings that were pushed on Aug 24 and Aug 25. The only setting that seemed possibly related to joins was later disabled for clients on Sept 03 and for Studio on Sept 05. If you perform the same test you performed on Aug 24/25 today, do you get the same results?

Sorry for the delay, I’ve been trying to organize a test for this but I’ve had some delays. I should be able to get test results this weekend.

We had a staff-only test on Sept 5th that didn’t match the testing conditions I’ve outlined in this post, but we still had ping spikes in the same range: an average of about +100 ping for all clients when somebody joined. The staff ran a test where 8 people coordinated a join at the same time, and ping spiked to ~500 for the people who stayed in the server. The server was not full at any point, and nobody was ever queued to join during the test.

Ran another test tonight (9/12 @ 9pm EST) and got the same results as the test on the 24th. Same test conditions as the tests on the 24th and 25th.

Not sure if this is 100% correct, but server locations will occasionally change from day to day. The farther away the server is from the player, the more ping spikes.

One day the server might be in Europe, the next it might be in the USA, the next it might be in Asia. It’s pretty random. Most of the servers in a location near the players might be taken up on some days, so they will have to use another server farther away.

This is all based on speculation, I have no idea how server location actually works lol

To make sure I’m following: having the same results as on the 24th means that the slowdown is as low as you have ever measured it to be, and that the current state doesn’t seem to be a regression from where we were before?

Correct, everything seems to be working a lot better. Ran another test yesterday and replicated the same positive results: very minimal (~30ms) ping jumps on player joins, no major jumps (100ms or higher). We ran a few servers this time and didn’t find any continuous cases of extreme ping jumps on player join.