10,000+ physical NPCs running at ~60 FPS on a low-end device

Feel free to give me feedback if you want.
Information:

Running on an iPhone 8 Plus, which is considered a low-end device nowadays
I still haven't optimized memory (which includes textures)
Also no throttling, LOD, or parallelism

Since I don't have a good laptop/PC or even a high-end phone, any FPS drops I see are most likely due to rendering, so people with a decent GPU should mostly be able to run it smoothly.
Try out the game yourself if you want: Game optimization - Roblox

TESTED WITH 10,000 NPCs/zombies

  • Recv: 100 - 120 KB/s (I can’t push this further)
  • Ping: 130 - 200 ms (same with this)
  • CPU: 15 - 27 ms
  • GPU: 17 - 30 ms
  • Memory: ~800 MB (pretty good though)
  • FPS: 55 - 60 fps (60 when I don't look at all the zombies at once)

Recorded on my phone because I think it is more convenient. Sorry for the low quality
(I compressed the video down to 10 MB).

Kinda insane right?


Incredible, this is exactly what I'm trying to achieve: running 600+ enemies without SENT and RECV shooting up… If it's not a secret, how did you make it?

CFramed NPCs are the best method for performance, but they also increase Recv.
Luckily, they give you more control over replication: you just need to stop the server from replicating the NPCs to the client.
Then make a RemoteEvent, and have the server send the client COMPRESSED DATA using the buffer library.

Example:
We will limit the position range for the NPCs. The smaller the limit, the smaller the data being sent.
Since I need mid-to-large maps that aren't tall, I only need limits of about:

  • X: signed 10 bits ( -512 studs to +511 studs )
  • Z: signed 10 bits ( -512 studs to +511 studs )
  • Y: unsigned 9 bits ( 0 studs to +511 studs )

local bitWrite = function(count) -- count: how many NPCs will be written

	local package = buffer.create(math.ceil(count * 3.625))

	local offset = 0

	-- Return the buffer, plus a callback that appends one CFrame's position to it.
	
	return package, function(cf: CFrame)
		
		local x = math.clamp(cf.X // 1, -512, 511) + 512	-- // Axis - X - 10 bits
		local y = math.clamp(cf.Y // 1,    0, 511)			-- // Axis - Y - 9 bits
		local z = math.clamp(cf.Z // 1, -512, 511) + 512	-- // Axis - Z - 10 bits

		buffer.writebits(package, offset, 10, x) 	offset += 10
		buffer.writebits(package, offset, 9, y) 	offset += 9
		buffer.writebits(package, offset, 10, z) 	offset += 10

	end

end

local bitRead = function(package, count)
	local result  = {}
	local offset = 0
	
	for i = 1, count  do
		local x = buffer.readbits(package, offset, 10) - 512 	offset += 10
		local y = buffer.readbits(package, offset, 9)			offset += 9
		local z = buffer.readbits(package, offset, 10) - 512 	offset += 10
		
		table.insert(result, CFrame.new(x, y, z))
	end
	
	return result
end
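
To show how these two functions could be wired into the RemoteEvent step described above, here is a rough sketch, not my exact code. It assumes a RemoteEvent named "NpcUpdate" and a ModuleScript named "NpcCodec" (both hypothetical names) in ReplicatedStorage, where NpcCodec returns a table holding the bitWrite and bitRead functions from this post. The npcCFrames table stands in for whatever server-side simulation you already have.

```lua
-- SERVER (Script): pack every NPC position and send it on a fixed interval.
local ReplicatedStorage = game:GetService("ReplicatedStorage")
local NpcCodec = require(ReplicatedStorage:WaitForChild("NpcCodec")) -- hypothetical module
local npcUpdate = ReplicatedStorage:WaitForChild("NpcUpdate")        -- hypothetical RemoteEvent

local SEND_INTERVAL = 0.2 -- seconds between updates

-- Hypothetical state: one CFrame per NPC, kept up to date by your own simulation.
local npcCFrames: { CFrame } = {}

while true do
	local count = #npcCFrames
	if count > 0 then
		local package, write = NpcCodec.bitWrite(count)
		for _, cf in npcCFrames do
			write(cf)
		end
		-- The count is sent alongside the buffer so the client knows how many entries to read.
		npcUpdate:FireAllClients(package, count)
	end
	task.wait(SEND_INTERVAL)
end
```

```lua
-- CLIENT (LocalScript): decode the buffer back into one target CFrame per NPC.
local ReplicatedStorage = game:GetService("ReplicatedStorage")
local NpcCodec = require(ReplicatedStorage:WaitForChild("NpcCodec")) -- hypothetical module
local npcUpdate = ReplicatedStorage:WaitForChild("NpcUpdate")        -- hypothetical RemoteEvent

npcUpdate.OnClientEvent:Connect(function(package: buffer, count: number)
	local targets = NpcCodec.bitRead(package, count)
	-- targets[i] is the latest position for the i-th NPC; move your local models toward it.
end)
```

Both sides have to agree on NPC order, since the packed data carries no IDs, only positions.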

After this: updating 1 NPC = 10 bits + 10 bits + 9 bits = 29 bits = 3.625 bytes

So 1000 NPCs = 3,625 bytes ≈ 3.54 KB (at 1,024 bytes per KB)
We will send the data every 0.2 seconds (optional)
→ 1000 NPCs: ~17.7 KB/s Recv
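
If you want to try other bit widths or send rates, the arithmetic above can be checked with a small helper. This is just a sketch; the function names are made up for illustration, and it only counts the packed position payload (any overhead added by the RemoteEvent itself is not included).

```lua
-- Bits per NPC with the layout used in this post: X (10) + Y (9) + Z (10).
local BITS_PER_NPC = 10 + 9 + 10

-- Size of one packed update in bytes, rounded up to a whole byte.
local function bytesPerPacket(npcCount)
	return math.ceil(npcCount * BITS_PER_NPC / 8)
end

-- Payload bandwidth in bytes per second for a given number of sends per second.
local function bytesPerSecond(npcCount, sendsPerSecond)
	return bytesPerPacket(npcCount) * sendsPerSecond
end

print(bytesPerPacket(1000))           -- 3625 bytes per update
print(bytesPerSecond(1000, 5) / 1024) -- about 17.7 KB/s when sending every 0.2 s
```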

The client then needs to create the 1000 NPCs locally and move them according to what the server sends. On their own, the NPCs would move at 5 FPS (kinda ugly, because data only arrives every 0.2 s), so we can use lerp to smooth the movement (I lerp these NPCs at 40 Hz to reduce CPU usage).
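
A minimal sketch of that lerp step could look like the following. It is an illustration under assumptions, not my actual code: npcParts, fromCFrames, toCFrames and onTargets are hypothetical names, each NPC is treated as a single BasePart, and the list order is assumed to match the order the server packed the positions in.

```lua
-- CLIENT (LocalScript): smooth 5 Hz network updates by lerping at 40 Hz.
local RunService = game:GetService("RunService")

local SEND_INTERVAL = 0.2 -- must match the server's send interval
local LERP_STEP = 1 / 40  -- only touch the NPCs 40 times per second

-- Hypothetical client-side state, indexed in the same order as the server's list.
local npcParts: { BasePart } = {}
local fromCFrames: { CFrame } = {}
local toCFrames: { CFrame } = {}

local sinceUpdate = 0 -- time since the last network update
local sinceStep = 0   -- time since the last lerp step

-- Call this with the decoded targets whenever a new update arrives.
local function onTargets(targets: { CFrame })
	for i, target in targets do
		local part = npcParts[i]
		if part then
			fromCFrames[i] = part.CFrame -- start from wherever the NPC currently is
			toCFrames[i] = target
		end
	end
	sinceUpdate = 0
end

RunService.Heartbeat:Connect(function(dt: number)
	sinceUpdate += dt
	sinceStep += dt
	if sinceStep < LERP_STEP then
		return -- skip this frame to keep the update rate at about 40 Hz
	end
	sinceStep = 0

	-- 0 right after an update, 1 once a full send interval has passed.
	local alpha = math.min(sinceUpdate / SEND_INTERVAL, 1)
	for i, part in npcParts do
		local from, to = fromCFrames[i], toCFrames[i]
		if from and to then
			part.CFrame = from:Lerp(to, alpha)
		end
	end
end)
```

Clamping alpha at 1 means an NPC simply waits at its last known position if an update arrives late, instead of overshooting.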

These are my bandwidth optimizations. I recommend researching the more insane network optimizations out there and tweaking them yourself, depending on the type of NPC data being sent (TD enemies, real-time strategy NPCs, zombie NPCs, etc.).


This will help me so much. I'm using the Packet framework atm, but even at 200 NPCs its RECV is skyrocketing. Thank you!!!


This is crazy and cool! Being able to actually do such things that require a lot of device-usage on lower-end devices is great! We are finally getting to a point where what’s considered low-end nowadays is good enough!
