TD game entity system, help on optimization

So I was trying to make a client-rendered entity system for my TD game, since my current one uses align on the client, which isn’t ideal for performance. I’m not really a good programmer, so this code isn’t the best:

Current system stats with only the position calculation (not updating any model CFrame): 21~26 ms on average for 20k enemies

Current system stats with moving models (updating the CFrame): 36~45 ms on average with only 2k enemies

I hope my calculation part isn’t too bad, but once I try to update the enemy models’ CFrame my performance gets much worse, and nothing I tried changed anything about it /:

I already sacrificed a lot of readability to get even the smallest (micro) optimizations I could.
Those micro-optimizations had a really great effect on the calculation part, which allowed me to double my performance, but I am unable to optimize the CFrame updating part at all, so help would be really useful.

NOTE: this is also not really good since I’m mainly only calculating the position, with no rotation,
and I’m using native Luau.

--!native
local replicatedStorage = game:GetService("ReplicatedStorage")
local replicatedFirst = game:GetService("ReplicatedFirst")

local modules = replicatedStorage.Modules
local clientModules = replicatedFirst.ClientModules

local enemyContainer = require(modules.EnemyContainer)
local playerSettings = require(clientModules.PlayerSettings)

local enemyHandler = {}
local frameCount = 0

function enemyHandler:CalculatePos(enemy)
	local nodesFolder = workspace.Nodes
	local currentTime = workspace:GetServerTimeNow()

	local moveFrom, moveTo, before, spawnTime = enemy.Before, enemy.MoveTo, enemy.Before, enemy.SpawnTime
	local enemyTimes = enemy.nodeTimes

	while enemyTimes[moveTo] < currentTime - spawnTime do
		before, moveTo = moveTo, moveTo + 1
	end

	local startPos, endPos = nodesFolder[before].Position, nodesFolder[moveTo].Position
	local timeToFinish, timeLeft = enemyTimes[moveTo] - enemyTimes[before], enemyTimes[moveTo] - (currentTime - spawnTime)
	local alpha = (timeToFinish - timeLeft) / timeToFinish

	enemy.Before, enemy.MoveTo = before, moveTo
	return CFrame.new(startPos:Lerp(endPos, alpha)) * enemy.RootPart.CFrame.Rotation
end

function enemyHandler.UpdatePos()
	frameCount = frameCount + 1
	if frameCount <= playerSettings.RefreshRate then
		return
	end
	frameCount = 0

	local enemyTable = {}
	local posTable = {}

	for id, enemy in pairs(enemyContainer) do
		local rootPart = enemy.RootPart
		local timeElapsed = workspace:GetServerTimeNow() - enemy.SpawnTime
		if timeElapsed >= enemy.FinishTime then
			if rootPart.Parent then
				rootPart.Parent:Destroy()
			end	
			table.remove(enemyContainer, id)
		else
			local cframe = enemyHandler:CalculatePos(enemy)
			table.insert(enemyTable, enemy.RootPart)
			table.insert(posTable, cframe)
		end
	end
	workspace:BulkMoveTo(enemyTable, posTable)
end

return enemyHandler

Specs:
CPU: Ryzen 5 5600H
GPU: RTX 3050
RAM: 16 GB

There are a few minor optimizations you could be doing here.

First and foremost, any place where you do anything that looks like this:

local variable = 1
variable = variable + 1

should be changed to:

local variable = 1
variable += 1

This compound assignment is a newer feature introduced in Luau. It is shorter, and it may save a tiny amount of work, though for a plain local variable the difference is likely too small to measure. An example of the lines I mean specifically is where you do frameCount = frameCount + 1; you can change this to frameCount += 1 and it will work identically.

The same applies to the line where you do before, moveTo = moveTo, moveTo + 1. Instead, you could separate that statement into two lines, like this:

before = moveTo
moveTo += 1

This lets you use the compound assignment there as well.

Something else to keep in mind is that a variable that is only used once is usually unnecessary. This means that you don’t need to create the “cframe” variable inside of your UpdatePos loop.

Currently, it looks like this:

local cframe = enemyHandler:CalculatePos(enemy)
table.insert(enemyTable, enemy.RootPart)
table.insert(posTable, cframe)

Instead, it could look like this:

table.insert(enemyTable, enemy.RootPart)
table.insert(posTable, enemyHandler:CalculatePos(enemy))

This way, you are not creating an extra variable. The difference is tiny and may not even be measurable, so treat it as a micro-optimization at best.

On top of that, you have a variable that is not being used: “moveFrom”. I am unsure whether you will actually need it, but if you don’t, make sure you remove it.

Lastly, there is one more change that you could make, which may or may not improve performance; I’m leaving it to you to try out and see the difference. Luau has something called generalized iteration, where you don’t have to use “pairs”, “ipairs” or “next” anymore and can instead iterate the table directly. What I mean by this is that rather than doing:

for id, enemy in pairs(enemyContainer) do

you should try doing:

for id, enemy in enemyContainer do

This way you don’t have to choose between pairs and ipairs yourself, and it might perform about the same or slightly better than picking the method yourself; measure it to be sure.

There are a few other places where you create a variable that is only used once, such as the alpha variable. However, it may be best to keep those, since otherwise the code would be completely unreadable. The “cframe” variable could be removed without sacrificing readability.

Here’s an updated version of your code with all of my micro-optimization changes applied in case you’re confused:

--!native
local replicatedStorage = game:GetService("ReplicatedStorage")
local replicatedFirst = game:GetService("ReplicatedFirst")

local modules = replicatedStorage.Modules
local clientModules = replicatedFirst.ClientModules

local enemyContainer = require(modules.EnemyContainer)
local playerSettings = require(clientModules.PlayerSettings)

local enemyHandler = {}
local frameCount = 0

function enemyHandler:CalculatePos(enemy)
	local nodesFolder = workspace.Nodes
	local currentTime = workspace:GetServerTimeNow()

	-- Remove moveFrom if it is not needed?
	local moveFrom, moveTo, before, spawnTime = enemy.Before, enemy.MoveTo, enemy.Before, enemy.SpawnTime
	local enemyTimes = enemy.nodeTimes

	while enemyTimes[moveTo] < currentTime - spawnTime do
		before = moveTo
		moveTo += 1
	end

	local startPos, endPos = nodesFolder[before].Position, nodesFolder[moveTo].Position
	local timeToFinish, timeLeft = enemyTimes[moveTo] - enemyTimes[before], enemyTimes[moveTo] - (currentTime - spawnTime)
	local alpha = (timeToFinish - timeLeft) / timeToFinish

	enemy.Before, enemy.MoveTo = before, moveTo
	return CFrame.new(startPos:Lerp(endPos, alpha)) * enemy.RootPart.CFrame.Rotation
end

function enemyHandler.UpdatePos()
	frameCount += 1
	if frameCount <= playerSettings.RefreshRate then
		return
	end
	frameCount = 0

	local enemyTable = {}
	local posTable = {}

	for id, enemy in enemyContainer do
		local rootPart = enemy.RootPart
		local timeElapsed = workspace:GetServerTimeNow() - enemy.SpawnTime
		if timeElapsed >= enemy.FinishTime then
			if rootPart.Parent then
				rootPart.Parent:Destroy()
			end	
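			-- Note: table.remove shifts later entries down, so removing during iteration can skip the next enemy.
			table.remove(enemyContainer, id)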
		else
			table.insert(enemyTable, enemy.RootPart)
			table.insert(posTable, enemyHandler:CalculatePos(enemy))
		end
	end
	workspace:BulkMoveTo(enemyTable, posTable)
end

return enemyHandler
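
One thing worth checking in the loop above is the removal of finished enemies while the loop is still running. Below is a minimal sketch of one common way to handle that, assuming enemyContainer is an array-style table; removeFinished is a hypothetical helper, not part of the original code. It is written in plain Lua syntax so it also runs outside Roblox.

```lua
-- Hypothetical helper (not from the original post): removes finished enemies
-- from an array-style list without skipping any entries.
local function removeFinished(enemies, now)
	local removed = 0
	-- Walk backwards so table.remove only shifts entries that were already visited.
	for i = #enemies, 1, -1 do
		local enemy = enemies[i]
		if now - enemy.SpawnTime >= enemy.FinishTime then
			table.remove(enemies, i)
			removed = removed + 1
		end
	end
	return removed
end

-- Example data: two enemies that have finished by time 10, one that has not.
local enemies = {
	{ SpawnTime = 0, FinishTime = 5 },
	{ SpawnTime = 0, FinishTime = 5 },
	{ SpawnTime = 0, FinishTime = 50 },
}
print(removeFinished(enemies, 10), #enemies)
```

Iterating from the end means each removal only moves entries the loop has already looked at, so two finished enemies in a row are both handled.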

If you’ve got any questions, don’t hesitate to ask them!

Wow, thanks! I also completely forgot you could do += and didn’t realise at all that I had added pointless variables.
I’m sure this could improve some stuff, but I’ve got a question: my main drop in performance is when changing the CFrame of an enemy. Is it actually possible to update the CFrame/pivot in a better, more optimised way? (No idea if PivotTo is actually better; probably not, since I heard BulkMoveTo is C++ or something like that.) For now, nothing I tried really changes anything on that side. Is it impossible due to how Roblox works, or?

Also, for the readability part, I don’t really care if it is mostly “unreadable”; I usually still understand it. I’ll try the suggestions that make it “unreadable”, I’m already used to being really messy irl so…

Also, moveFrom is actually useless; I forgot to remove it from an older version. Thanks for pointing this out too.

Also, I suppose startPos and endPos are “pointless variables” too, even though removing them would hurt readability even more, but like I said earlier, I only care about performance.

Just tried your version and I legit get 40~50 ms instead of 36~45 ms now. No idea how this is the case, but ig okay.

This is probably just general testing deviation, given the overlap; if other programs are running on your computer at the time of testing, it may be slower or faster. Those micro-optimizations shouldn’t realistically provide any benefit unless you’re looping millions of times.

How are you computing this ms value anyways?
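
For reference, here is a minimal sketch of one way such a value could be measured, using os.clock around the section of interest. The names timeMs and fakeCalculate are hypothetical; fakeCalculate is only a stand-in for the real position math, which is not reproduced here.

```lua
-- Hypothetical helper: runs fn once and returns the elapsed time in milliseconds.
local function timeMs(fn, ...)
	local start = os.clock()
	fn(...)
	return (os.clock() - start) * 1000
end

-- Stand-in workload; replace with the section you actually want to time
-- (for example the position loop, or the bulk move on its own).
local function fakeCalculate(count)
	local sum = 0
	for i = 1, count do
		sum = sum + i * 0.5
	end
	return sum
end

local calcMs = timeMs(fakeCalculate, 20000)
print(string.format("calculation: %.3f ms", calcMs))
```

Timing each section separately, and averaging over many frames, makes it easier to tell a real change from run-to-run noise.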


Anyways, the tips and suggestions I can give are:

  • Firstly, if you can avoid having 2k enemies in your game, that’d probably be the best route. Roblox is undoubtedly a pretty mediocre game engine, so it’ll probably be rough given that Roblox doesn’t utilize resources to their fullest potential.
  • Secondly, you should probably measure the cost of each operation by timing sections individually (like the for loop, enemyHandler:CalculatePos() and workspace:BulkMoveTo()). This will give you a better understanding of where the bottlenecks are.
  • As far as optimizations go, I’d recommend multi-threading for situations like this. Separate the enemies into 4 or so tables, each handled by its own RunService connection. In that case you’d only have 1/4 of the ms caused by each method (in theory at least). Note that task.spawn and coroutines on their own do not run in parallel; for actual multithreading you need to utilize actors.
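
The "separate the enemies into 4 or so tables" step could be sketched like this. splitIntoGroups is a hypothetical helper, and the group count of 4 is just the number suggested above; handing each group to its own actor is left out, since that part depends on how the game is set up.

```lua
-- Hypothetical helper: splits an array into `groupCount` roughly equal groups
-- by dealing items out round-robin.
local function splitIntoGroups(list, groupCount)
	local groups = {}
	for g = 1, groupCount do
		groups[g] = {}
	end
	for index, item in ipairs(list) do
		local g = (index - 1) % groupCount + 1
		table.insert(groups[g], item)
	end
	return groups
end

-- Example: 10 enemy ids dealt into 4 groups.
local ids = {}
for i = 1, 10 do
	ids[i] = i
end
local groups = splitIntoGroups(ids, 4)
for g, group in ipairs(groups) do
	print(g, #group)
end
```

Round-robin keeps the groups within one item of each other in size, so no single worker ends up with most of the enemies.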

If I might also add, I’d recommend not calculating the position by comparing the time the enemy was spawned with how long it will be until it gets to the finish. Rather, just store the monster’s position and current node, since you’ll end up needing the enemy’s position so often: determining if towers are in range requires the position, split paths will be a struggle, status effects on enemies will be problematic, etc.

The issue isn’t the position calculation, like I said; it’s simply changing the CFrame that causes the massive amount of lag. Also, this is for a “Tower Defense” game and I would like to at least be able to update the position every frame without having desync issues. Most games would do some normal lerping with math.clamp, but this causes issues if the user freezes: the enemy can end up desynced (late) because it was clamped to the waypoint it was going to. Using time mostly fixes that, though I know I could still have done the calculation with the “lerping without time” version. Multi-threading isn’t a bad option and I thought about it multiple times, like with task.spawn and coroutines; I just never used it.
I also thought (still on multi-threading) about using :ConnectParallel, but it doesn’t allow stuff like changing the CFrame of an object.
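
To make the time-based approach concrete, here is a stand-alone sketch of the same idea as CalculatePos, using plain numbers instead of Vector3 positions so it runs without Roblox types. positionAt and the sample data are hypothetical. The alpha here, (elapsed - nodeTimes[before]) / (nodeTimes[moveTo] - nodeTimes[before]), is algebraically the same as (timeToFinish - timeLeft) / timeToFinish in the original code.

```lua
-- Hypothetical stand-alone version of the time-based lookup.
-- nodeTimes[i] is the elapsed time at which the enemy reaches node i.
local function positionAt(nodePositions, nodeTimes, elapsed)
	local last = #nodeTimes
	if elapsed <= nodeTimes[1] then
		return nodePositions[1]
	end
	if elapsed >= nodeTimes[last] then
		return nodePositions[last]
	end
	-- Advance to the first node that has not been reached yet.
	local moveTo = 2
	while nodeTimes[moveTo] < elapsed do
		moveTo = moveTo + 1
	end
	local before = moveTo - 1
	local alpha = (elapsed - nodeTimes[before]) / (nodeTimes[moveTo] - nodeTimes[before])
	return nodePositions[before] + (nodePositions[moveTo] - nodePositions[before]) * alpha
end

local positions = { 0, 10, 30 }
local times = { 0, 2, 6 }
print(positionAt(positions, times, 1)) -- halfway along the first segment
print(positionAt(positions, times, 4)) -- halfway along the second segment
```

Because the result depends only on the elapsed time, a client that froze for a while gets the correct position on its next update instead of resuming from where it left off.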

Also, I can easily hold 60 fps (cannot go higher due to the Roblox fps lock; I know I can use an unlocker, but meh) with 15k enemies when doing only the calculation; just commenting out workspace:BulkMoveTo() removes the entire lag.
This means the lag mostly comes from literally changing the CFrame itself, and I do not think I can optimise it further.
Those micro-optimisations are actually really efficient with large amounts of enemies like that.

Try adding the “pairs” that I removed back and make sure it isn’t that which is causing the issue.

As for the movement optimization, it is possible to optimize NPCs enough to have 600+ of them without any lag, maybe even 2000. I saw an entire post detailing the process of doing it, but here’s the basics to it:

First and foremost, you want to store each NPC’s position on the server, alongside direction perhaps? You want to update the server value when moving an NPC.

The client gains access to this server table via a remote event, replicates the actual models into the workspace and moves them accordingly.

To reduce latency between the server and client, use Vector3int16 for the position on the client. It obviously comes with positional limitations, but in return it takes up less data and is faster to transfer to the client that way.
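
As a rough illustration of the positional limitation, each Vector3int16 component is a signed 16-bit integer (-32768 to 32767). A sketch of packing a coordinate into that range could look like the following; quantize, dequantize and the scale factor of 10 are assumptions for illustration and are not taken from the linked post.

```lua
-- Hypothetical helpers: pack a coordinate into the signed 16-bit range that
-- Vector3int16 components use, with a chosen scale factor.
local INT16_MIN, INT16_MAX = -32768, 32767

local function quantize(value, scale)
	local scaled = math.floor(value * scale + 0.5) -- round to the nearest integer
	return math.max(INT16_MIN, math.min(INT16_MAX, scaled)) -- clamp to int16
end

local function dequantize(packed, scale)
	return packed / scale
end

local scale = 10 -- assumption: 0.1 stud precision
local packed = quantize(123.46, scale)
print(packed, dequantize(packed, scale))
```

A larger scale gives finer precision but shrinks the usable map size, since the packed value still has to fit in 16 bits.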

There’s a bunch of tuning that can be made to a system like this but I recommend you check out the post that I read regarding the topic.

Here’s a link to that: How we reduced bandwidth usage by 60x in Astro Force (Roblox RTS)

I do not really care about the direction on the server, and I have a system that barely even needs to update the position on the server. Besides, my wifi is so garbage (I’m just making this game for fun, so I can play my own TD game) that I cannot even afford the option of sending those. Also, from what I read, this just makes them move with interpolation? Isn’t that still updating the CFrame or stuff? Plus, their test seems to be done on a way lower number of enemies;
I easily get 60 fps with 600 enemies.

One of my friends tried that and was able to reduce it by 88.3x, but he says that method kinda sucks, and he made a way better version that doesn’t require sending any “Vector3int16” data at all; he can handle 4k enemies at 40~50 fps.

Well if that’s the case, it’s best if you just try different ways of doing that CFrame math, and monitoring how much performance each way takes.

That’s how I’d usually do it.
Perhaps do so by using the Benchmark plugin by boatbomber?

No, it’s not from calculating the CFrame; it is simply assigning the CFrame to a part that causes the lag. I don’t know if you can actually optimise that, but I suppose you can, since my friend has better performance by far.

And I have no idea what that plugin is.

Well, there is no direct way of making the CFrame assigning part take less time. What you could try is using instance:PivotTo(cFrame); I cannot guarantee that this would improve performance, but it’s worth a try.

Also, you could always ask your friend how they did it.

It is better but way more unstable; it can have random FPS drop spikes of like 1/2 seconds.

Well, in that case, I think you’re going to have to stick to setting the CFrame directly… Not much else you can do here as far as I know.

Yeah, but I don’t really get it: I see friends get better performance with worse specs than mine by legit just doing object.CFrame = cframe, and I’m confused how they get better performance with that even when I do the same.

CFrame is very fast. If it is causing lag, something else is going wrong.
Maybe you should check your studio settings; lower them a bit.


Nope… (I do get slightly faster results with BulkMoveTo.) Also, don’t mind the parallel thing, I was just trying something.

… Now hold on a sec. Those models look like they have a TON of tris. May just be compression though.

They are just Roblox R6 rigs, and doing that with normal parts doesn’t change anything.

Yea, but it looks like you have thousands of them, colliding into one another. That’s why you lag, lol.

yeah since there are meant to be 1k enemies on that test, but even when they don’t collide into each other