High Ping Spikes on Player Joins

Reproduction Steps
By starting a Local Server and having having a player join you can see large data spikes whenever a player joins. I’ve tested this on an empty baseplate, but it seems to be much worse on large terrain maps. That and on a public server.

My game contains a large terrain map. Which can be found here: https://www.roblox.com/games/5357899168/Dinosaur-Arcade-BETA

I’ve also provided an uncopylocked link to just the terrain map with a majority of the scripts deleted here: https://www.roblox.com/games/8923038824/Bug-Report-Player-Join-Ping-Spikes

Expected Behavior
It seems a lot of data is getting sent to the server on player joins, and it causes it to freeze a bit. The opposite is expected.

Actual Behavior
The ping and data sent increases when a player joins, and exponentially if multiple join at the same time. This sometimes causes the server to crash (which has been resulting in some data loss).

On a public server when a player joins:
https://i.gyazo.com/e7a8f3c4862cfb86ebbba04a72e49ac1.mp4

Same conclusion happens on an empty baseplate but less problematic:
https://gyazo.com/d69f15d1b66b247d7ac9f21c0a5f0f7a.mp4

This is with the same map with the scripts deleted:
https://i.gyazo.com/8dda1aed4246ece23d23e0991380dc75.mp4

This has me believe that it has something to do with large terrain maps. I’ve concluded that it does usually happens on player joins but I’m not sure if it is on my end. If there is a reason this is happening that someone could figure out it’d make my day because I’ve been stuck on this for 2 weeks.

The only other forum post I can find with this same problem was here: Major lag spikes? (DDoS?) - #4 by ChickenEngineer

Workaround

Issue Area: Engine
Issue Type: Performance
Impact: Very High
Frequency: Constantly
Date First Experienced: 2022-02-11 00:02:00 (-05:00)
Date Last Experienced: 2022-02-24 00:02:00 (-05:00)

7 Likes

By starting a Local Server and having having a player join you can see large data spikes whenever a player joins.

The default networking model currently in Roblox is for the server to transmit pretty much the entire level on join. In practice this initial data is much larger and we also opt to send it more quickly to help reduce the amount of load time on the client. One option you can consider if you would like to reduce this is to enable game content streaming
https://developer.roblox.com/en-us/articles/content-streaming
When that mode is enabled, the server will send only a portion of the world initially, and stream the rest incrementally as the user plays. The same amount of data may ultimately get sent, but this will spread that sending over a longer period.

If you are curious to see what is being transmitted when a player joins, you can enable Print Join Size Breakdown in studio network settings NetworkSettings.PrintJoinSizeBreakdown that will output text information about what is being sent at join to the new player.

There is some server processing overhead associated with sending this data, that is likely contributing to the observed ping times when a player joins. As a corollary, if you enable streaming and reduce how much data is sent initially, you will likely also see a reduction in the impact of a join on ping time.

If you capture a server microprofile around the time of join you may gain further insights into what is happening MicroProfiler

This sometimes causes the server to crash (which has been resulting in some data loss).

Please confirm which place is crashing (I’m assuming this place https://www.roblox.com/games/5357899168/Dinosaur-Arcade-BETA). Based on our records, that experience has been crashing due to running out of server memory. Try checking memory usage in the developer console for further indications of what might be going on.

7 Likes

I currently already have StreamingEnabled Enabled with a target StreamingTargetRadius of only 400.

Yes for the crashing it was for Dinosaur Arcade. I did notice high server memory but only after I’ve noticed these high pings were players reporting any kind of crashing. I do have a memory leak though, but I’ve noticed these spikes affecting clients on fresh public servers right after a shutdown.

I’ll look into the microprofiler and see the breakdown in a bit to get a better picture but thank you for the response!

Total Transmitted Join Bytes (Instance + Terrain) : 1669739 (1492175 + 177564)

Is this output showing what is sent to the new player before it is streamed in? On the right I currently have SteamingEnabled set to true.

Is this output showing what is sent to the new player before it is streamed in?

Yes, this is a breakdown of the data that cannot be loaded over time via streaming, the data that needs to be sent immediately on join.

WRT the lag, the first join that a server handles may take additional time, we cache some data between each join to reduce the workload, but the first join has no cache to work from. For severs handling live users this is typically not a problem (those servers normally handle many users over their lifetime, and by definition there is no user in the server before the first player joins to observe that high ping), but in studio most of the time when you test you are starting a new test server, so you “feel” that slowness each time.

I tried looking into the micro profiler when having another player join and most of it is from RunJob/Render. A majority of my models are stored in ServerStorage, and I have StreamingEnabled enabled. Though I could see how testing this all in studio can affect this as everything is local so I’ll try to look into it more on live servers.

I’ve tested it on a live server with my experience Dinosaur Arcade and with each player join I’m experiencing the high data/ping. Even after the initial first player joins it continues to spike with each additional player join. I’ll try to test it on a live server with only the terrain map and no other instances. I just have no idea what it could be. I’ll also try to backtrack my updates from two weekends ago when this all started.

This problem seemed to go away on it’s own for about a month, but now it’s back. It recently started happening a week ago. Without me having changed anything.

I’m at a lost. Still looking into a cause.

I just feel that it’s streaming related. Happens only when a player joins. We’re in the process of making a smaller map to compensate.

If anyone has any information on this it would be helpful. The microprofiler just shows a long “Replicator ProcessPacketts”

1 Like

Having the same issue as well, except Streaming Enabled is not on. The only difference is that it’s a very small map (Around 500x500 studs I’d estimate). Very high ping rates, that irrationally increase whenever a player joins, and decrease whenever a player leaves.

Also experiencing exactly the same issue across multiple Experiences. The issue manifests itself as chronically long load times (> 2 mins) for consecutive connections to the same game. The games typically manage to display their loading screen and then the delay kicks in and everything appears to pause.

During this period, Performance Stats will show a ping of 0 ms. After ~2 mins, the experience will load but will show unbelievable ping times >100,000 ms, 2 examples below shown:


I typically experience a ping of 150 to 200ms here in NZ. The above is just insane. Once the experience has loaded, things start to settle down after approx 1 or 2 minutes but the lag during this time is extreme to say the least.

A snippet of an extreme event is seen below, this was for the 3rd consecutive connection to the game Boom (issue is not limited to that experience but seems to be platform wide):
2022-04-17T10:37:37.754Z,2.754323,1020,6 [FLog::Output] Connecting to 128.116.86.85:55874
2022-04-17T10:37:37.754Z,2.754323,1020,7 [FLog::Network] Connecting to server, IP(inet_addr): 1431729280 Port: 55874
2022-04-17T10:37:38.353Z,3.353337,0360,6 [FLog::Output] Connection accepted from 128.116.86.85|55874
2022-04-17T10:37:38.354Z,3.354337,0360,7 [FLog::Network] serverId: 128.116.86.85|55874
2022-04-17T10:37:38.355Z,3.355336,0360,7 [FLog::Network] Replicator created for player 128.116.86.85|55874
2022-04-17T10:37:38.355Z,3.355336,0360,7 [FLog::Network] Replicator created: 1F6C31E0
2022-04-17T10:37:38.781Z,3.781558,04bc,7 [FLog::Network] Time taken to initialize schema = 8.451100 ms
2022-04-17T10:37:38.781Z,3.781558,04bc,7 [FLog::Network] Assigned peer id = 1523
2022-04-17T10:37:38.781Z,3.781558,04bc,7 [FLog::Network] Join snapshot timer: 0.215100
2022-04-17T10:37:39.640Z,4.640810,0494,12 [DFLog::HttpTraceError] HttpResponse(#32 235AF768) time:252.3ms (net:252.2ms callback:0.0ms timeInRetryQueue:0.0ms)status:403 Forbidden bodySize:50 url:“https://api.roblox.com/assets/4651779470/versions” ip:128.116.117.3 external:0 numberOfTimesRetried:0
2022-04-17T10:37:44.403Z,9.403332,066c,6 [FLog::Warning] Warning: Infinite yield possible on ‘StarterPlayer:WaitForChild(“StarterPlayerScripts”)’
2022-04-17T10:37:44.403Z,9.403332,066c,6 [FLog::Output] Info: Stack Begin
2022-04-17T10:37:44.403Z,9.403332,066c,6 [FLog::Output] Info: Script ‘CoreGui.RobloxGui.Modules.Settings.SettingsHub’, Line 789 - function getOverridesPlayerScripts
2022-04-17T10:37:44.403Z,9.403332,066c,6 [FLog::Output] Info: Script ‘CoreGui.RobloxGui.Modules.Settings.SettingsHub’, Line 850
2022-04-17T10:37:44.403Z,9.403332,066c,6 [FLog::Output] Info: Stack End
2022-04-17T10:42:58.168Z,323.168365,0eb4,6 [FLog::Graphics] Future is bright shadows
2022-04-17T10:42:58.170Z,323.170380,0eb4,6 [FLog::Graphics] SolidTextures::initSolidTextures()
2022-04-17T10:42:58.170Z,323.170380,0eb4,6 [FLog::Graphics] Terrain TextureArray 512x512, arraySize: 36, mip: 10
2022-04-17T10:42:58.661Z,323.661255,0494,12 [DFLog::HttpTraceError] HttpResponse(#234 355A1E78) time:544.4ms (net:544.3ms callback:0.0ms timeInRetryQueue:0.0ms)status:403 Forbidden bodySize:50 url:“https://api.roblox.com/assets/4651779470/versions” ip:128.116.117.3 external:0 numberOfTimesRetried:0
2022-04-17T10:42:59.006Z,324.006226,1908,6 [FLog::Output] Character Replication Bootstrapper
2022-04-17T10:42:59.010Z,324.010223,1908,6 [FLog::Output] Server RobloxGitHash: 7a1df4513df04e3ecb8ff9fa7f0406545c2cb0d7
2022-04-17T10:42:59.010Z,324.010223,1908,6 [FLog::Output] Server Prefix: 0.522.0.5220281_20220416T151914Z_RCC_c3081
2022-04-17T10:42:59.144Z,324.144287,0468,6 [FLog::Error] Error: Failed to load sound rbxassetid://1561067566: Unable to download sound data
2022-04-17T10:42:59.144Z,324.144287,0468,6 [FLog::Error] Error: Failed to load sound rbxassetid://1561067566: Unable to download sound data
2022-04-17T10:43:34.176Z,359.176849,0468,6 [FLog::Warning] Warning: Infinite yield possible on ‘Workspace.EmilyGaming9081:WaitForChild(“LowerTorso”)’
2022-04-17T10:43:34.176Z,359.176849,0468,6 [FLog::Output] Info: Stack Begin
2022-04-17T10:43:34.177Z,359.177856,0468,6 [FLog::Output] Info: Script ‘Workspace.EmilyGaming9081.Animate’, Line 39
2022-04-17T10:43:34.177Z,359.177856,0468,6 [FLog::Output] Info: Stack End
2022-04-17T10:43:37.103Z,362.103088,1b8c,6 [FLog::Output] Character Replication Ready!
2022-04-17T10:44:08.340Z,393.340698,0360,6 [FLog::Error] Error: Players.EmilyGaming9081.PlayerGui.MainUI.Main:4948: attempt to index nil with ‘Humanoid’
Stack Begin
Script ‘Players.EmilyGaming9081.PlayerGui.MainUI.Main’, Line 4948
Stack End

I am slowly collecting logs from multiple Experience connection attempts to provide diagnostics if required. I’ve heard of the issue affecting multiple people in multiple countries, across multiple Experiences. However it does not always manifest itself as an issue, so not easily reproducible.

1 Like

Mind blowing example connecting to Poly Battle with Perf Stats showing a 400,000ms ping. Trace route proves it not my connection causing the issue:
1 <1 ms <1 ms <1 ms 192.168.nnn.nnn
2 3 ms 6 ms 5 ms 100.nnn.nnn.nnn
3 19 ms 15 ms 18 ms 10.nnn.nnn.nnn
4 * * * Request timed out.
5 13 ms 15 ms 13 ms 10.64.131.94
6 14 ms 11 ms 14 ms 10.64.131.96
7 15 ms 15 ms 14 ms 10.64.132.55
8 15 ms 13 ms 13 ms 172.20.64.6
9 45 ms 48 ms 47 ms unknown.telstraglobal.net [210.176.152.234]
10 42 ms 43 ms 43 ms unknown.telstraglobal.net [210.176.152.233]
11 42 ms 45 ms 42 ms i-91.sydp-core04.telstraglobal.net [202.84.222.85]
12 166 ms 167 ms 168 ms i-91.sydp-core04.telstraglobal.net [202.84.222.85]
13 168 ms 167 ms 168 ms i-20802.eqnx-core02.telstraglobal.net [202.84.141.25]
14 165 ms 163 ms 163 ms i-92.eqnx03.telstraglobal.net [202.84.247.17]
15 165 ms 164 ms 164 ms sjo-b21-link.ip.twelve99.net [213.248.75.146]
16 163 ms 164 ms 188 ms las-b23-link.ip.twelve99.net [62.115.116.41]
17 194 ms 196 ms 209 ms roblox-ic342576-las-b21.ip.twelve99-cust.net [62.115.172.27]
18 * * * Request timed out.
19 * * * Request timed out.
20 192 ms 195 ms 195 ms 128.116.86.66

From the logs, connection attempt started @ 2022-04-17T11:47:53.024Z, game sort of loaded by 2022-04-17T11:51:34.529Z

That says infrastructure issue to me.

First off, sorry for the long delay in communication. We had to shift our attention to other work and dropped the ball on this report.

Are you still experiencing the high ping spikes on player join? Do you have any additional logs, screenshots, or info that relate to the issue. I will start looking at server-side join metrics we have for the experiences mentioned before on this thread. Thanks!

The player join ping spikes no longer happen on my game. Only information I had on it was what I posted before on this post. Large amounts of data still gets sent but I was told it’s just data sent to the client when a player joins from streaming.

Only issue I’m facing right now are memory leaks. Which I’ve seen confirmed by another developer it’s an issue with streaming in general, and is being looked into.

1 Like

Thanks for getting back to me! Yeah upon joining we need to send clients all the data that isn’t eligible for streaming. You can inspect what is sent and if possible reorganize your content to reduce the upfront data cost/size. For example, if you have a lot of Parts in ReplicatedStorage you can move them to Workspace so they can be streamed in gradually.

Yes, we are looking into the memory leaks. Please stay tuned on this other thread for updates on that Server network/raknet Memory Leak? - #13 by PlumJar, For now I will consider this bug report resolved, but please reach out if you encounter any other issues.