TD game entity system, help on optimization

MohhayScripts · September 22, 2023, 5:25pm

There are a few minor optimizations you could be doing here.

First and foremost, anywhere you have something that looks like:

local variable = 1
variable = variable + 1

should be changed to:

local variable = 1
variable += 1

This uses the compound assignment operators that Luau adds on top of Lua. Luau evaluates the left-hand side only once, which is where any saving comes from; for a plain local variable the gain is likely negligible, so treat it as a micro-optimization at best. An example of the lines I mean specifically is where you do frameCount = frameCount + 1: you can change this to frameCount += 1 and it will work identically.

The same applies to the line where you do before, moveTo = moveTo, moveTo + 1. Instead, separate that statement into two lines, like this:

before = moveTo
moveTo += 1

This lets you use the same compound assignment there as well.

Something else you should keep in mind is that creating a variable that will only be used once is pointless, and will only take up runtime. This means that you should not be creating the “cframe” variable inside of your UpdatePos loop.

Currently, it looks like this:

local cframe = enemyHandler:CalculatePos(enemy)
table.insert(enemyTable, enemy.RootPart)
table.insert(posTable, cframe)

Instead, it should look like this:

table.insert(enemyTable, enemy.RootPart)
table.insert(posTable, enemyHandler:CalculatePos(enemy))

This way, you will not be creating an unnecessary variable that takes up memory (even though the difference is tiny). Technically, it also takes time to create that variable, so this counts as a micro-optimization.

To add on top of that, you have a variable that is not being used, that variable being “moveFrom”. I am unsure whether or not you will actually need this but, if you don’t, make sure you remove it for extra performance.

Lastly, there is one more change that you could make, which may or may not improve performance; I’m leaving it to you to try out and see the difference. Luau has something called generalized iteration, where you don’t have to use “pairs”, “ipairs” or “next” anymore and can instead pass the table directly to the loop. What I mean by this is that rather than doing:

for id, enemy in pairs(enemyContainer) do

you should try doing:

for id, enemy in enemyContainer do

This visits the consecutive array elements in order and then any remaining keys, so you no longer need to choose between pairs and ipairs yourself. It might even outperform picking the method yourself, as the functionality is written in C++ (which is way faster than Lua and Luau).

There are a few other places where you are creating a variable for no reason as well, such as the alpha variable. However, it may be best to keep those, since otherwise the code will be completely unreadable. The “cframe” variable was different: removing it does not really sacrifice readability for performance.

Here’s an updated version of your code with all of my micro-optimization changes applied in case you’re confused:

--!native
local replicatedStorage = game:GetService("ReplicatedStorage")
local replicatedFirst = game:GetService("ReplicatedFirst")

local modules = replicatedStorage.Modules
local clientModules = replicatedFirst.ClientModules

local enemyContainer = require(modules.EnemyContainer)
local playerSettings = require(clientModules.PlayerSettings)

local enemyHandler = {}
local frameCount = 0

function enemyHandler:CalculatePos(enemy)
	local nodesFolder = workspace.Nodes
	local currentTime = workspace:GetServerTimeNow()

	-- Remove moveFrom if it is not needed?
	local moveFrom, moveTo, before, spawnTime = enemy.Before, enemy.MoveTo, enemy.Before, enemy.SpawnTime
	local enemyTimes = enemy.nodeTimes

	while enemyTimes[moveTo] < currentTime - spawnTime do
		before = moveTo
		moveTo += 1
	end

	local startPos, endPos = nodesFolder[before].Position, nodesFolder[moveTo].Position
	local timeToFinish, timeLeft = enemyTimes[moveTo] - enemyTimes[before], enemyTimes[moveTo] - (currentTime - spawnTime)
	local alpha = (timeToFinish - timeLeft) / timeToFinish

	enemy.Before, enemy.MoveTo = before, moveTo
	return CFrame.new(startPos:Lerp(endPos, alpha)) * enemy.RootPart.CFrame.Rotation
end

function enemyHandler.UpdatePos()
	frameCount += 1
	if frameCount <= playerSettings.RefreshRate then
		return
	end
	frameCount = 0

	local enemyTable = {}
	local posTable = {}

	for id, enemy in enemyContainer do
		local rootPart = enemy.RootPart
		local timeElapsed = workspace:GetServerTimeNow() - enemy.SpawnTime
		if timeElapsed >= enemy.FinishTime then
			if rootPart.Parent then
				rootPart.Parent:Destroy()
			end	
			table.remove(enemyContainer, id)
		else
			table.insert(enemyTable, enemy.RootPart)
			table.insert(posTable, enemyHandler:CalculatePos(enemy))
		end
	end
	workspace:BulkMoveTo(enemyTable, posTable)
end
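
One thing to watch in the loop above: if enemyContainer is an array, calling table.remove while iterating forward shifts the later entries down by one, so the entry right after a removed one can be skipped for that pass. A minimal sketch of one common way around this is to walk the array backwards. The container contents and the `finished` field here are made up for illustration; they are not part of the original module.

```lua
-- Hypothetical array-like container; the field names are assumptions.
local enemies = {
	{ name = "a", finished = true },
	{ name = "b", finished = true },
	{ name = "c", finished = false },
}

-- Walk from the last index to the first, so that table.remove only
-- shifts entries that have already been visited.
local function removeFinished(list)
	for i = #list, 1, -1 do
		if list[i].finished then
			table.remove(list, i)
		end
	end
end

removeFinished(enemies)
print(#enemies, enemies[1].name)
```

If the container is keyed by id rather than being an array, this does not apply as written, and the removal would need to be done by key instead.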

If you’ve got any questions, don’t hesitate to ask them!

M0N0RAIL_2 · September 22, 2023, 5:53pm

Wow, thanks! I completely forgot you could do +=, and I didn’t realise at all that I had added pointless variables.
I’m sure this could improve some things, but I’ve got a question. My main performance hit is when changing the CFrame of an enemy. Is it actually possible to update the CFrame/pivot in a better, more optimised way? (I have no idea if PivotTo is actually better; probably not, since I heard BulkMoveTo is C++ or something like that.) So far, nothing I’ve tried really changes anything on that side. Is it just impossible due to how Roblox works?

Also, for the readability part, I don’t really care if the code is mostly “unreadable”; I usually still understand it. I’ll try the suggestions that make it “unreadable”. I’m already used to being really messy IRL, so…

Also, moveFrom is actually useless; I forgot to remove it from an older version. Thanks for pointing that out too.

I suppose startPos and endPos are also “pointless variables”, even though removing them would hurt readability even more, but like I said earlier, I only care about performance.

M0N0RAIL_2 · September 22, 2023, 6:58pm

I just tried your version and I now get 40~50 ms instead of 36~45 ms. No idea why that is the case, but okay, I guess.

kingerman88 · September 22, 2023, 7:41pm

This is probably just normal testing deviation, given the overlap; depending on what other programs are running on your computer at the time of testing, it may be slower or faster. Those micro-optimizations shouldn’t realistically provide any benefit unless you’re looping millions of times.

How are you computing this ms value anyways?

Anyways, the tips and suggestions I can give are:

Firstly, if you can avoid having 2k enemies in your game, that’d probably be the best route. Roblox is undoubtedly a pretty mediocre game engine, so it’ll probably be rough given that it doesn’t utilize resources to their fullest potential.
Secondly, you should probably measure the cost of each operation by timing sections individually (like the for loop, enemyHandler:CalculatePos() and workspace:BulkMoveTo()). This will give you a better understanding of where the bottlenecks are.
As far as optimizations go, I’d recommend multi-threading for situations like this. Separate the enemies into 4 or so tables, each updated from its own RunService connection. In that case each one would only account for about 1/4 of the ms (in theory at least). Note that task.spawn and coroutines do not run in parallel on their own; to actually spread the work across threads you need Actors (Parallel Luau).
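
The per-section timing suggested above could be sketched roughly like this. It is only a rough illustration: `timeSection` and `busyWork` are hypothetical names, and `busyWork` is a stand-in for whichever part of the update you want to measure.

```lua
-- Runs fn once and returns the elapsed time in milliseconds,
-- followed by fn's first return value.
local function timeSection(fn, ...)
	local start = os.clock()
	local result = fn(...)
	return (os.clock() - start) * 1000, result
end

-- Stand-in workload (hypothetical); in the game this would be the
-- per-enemy loop, CalculatePos, or the BulkMoveTo call.
local function busyWork(n)
	local total = 0
	for i = 1, n do
		total = total + i
	end
	return total
end

local ms, total = timeSection(busyWork, 100000)
print(string.format("busyWork: %.3f ms", ms))
```

Timing the loop, the position calculation and the BulkMoveTo call separately over many frames, and averaging the results, should give a clearer picture than a single combined number. In Studio, debug.profilebegin and debug.profileend can also label sections for the MicroProfiler.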

If I might also add, I’d recommend not calculating the position by comparing the time the enemy was spawned with how long it will take to reach the finish. Rather, just store the monster’s position and current node, as otherwise you’ll end up recalculating the enemy’s position over and over. Determining if towers are in range requires position, split paths will be a struggle, status effects on enemies will be problematic, etc.

M0N0RAIL_2 · September 22, 2023, 8:39pm

The issue isn’t the calculation of the position, like I said; it’s simply changing the CFrame that causes the massive amount of lag. Also, this is for a “Tower Defense” game, and I would like to at least be able to update the position every frame without having desync issues. Most games would do normal lerping with math.clamp, but that causes issues if the user freezes: the enemy can end up desynced (late) because it was clamped to the waypoint it was heading to. Using time mostly fixes that, though I know I could still have done the calculation with the “lerping without time” version. Multi-threading isn’t a bad option, and I’ve thought about it multiple times (with task.spawn and coroutines); I just never used it.
I also thought (still on multi-threading) about using :ConnectParallel, but it doesn’t allow things like changing the CFrame of an object.

Also, I can easily hold 60 fps (I cannot go higher due to the Roblox fps lock; I know I can use an unlocker, but meh) with 15k enemies when only the calculation runs. Just commenting out the workspace:BulkMoveTo() call removes the entire lag.
This means the lag mostly comes from literally changing the CFrame itself, and I do not think I can optimise that any further.
Those micro-optimisations are actually really effective with a large number of enemies like that.
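
For reference, the time-based approach described above can be sketched without any Roblox APIs. This is only an illustration, not the code from the thread: `segmentAt` is a hypothetical helper, and it assumes `nodeTimes[i]` holds the time in seconds (since spawn) at which the enemy reaches node i, with at least two nodes and `startIndex >= 2`.

```lua
-- Returns the segment the enemy is on and how far along it is,
-- derived purely from the elapsed time since it spawned.
local function segmentAt(nodeTimes, elapsed, startIndex)
	-- Default to the final node if every arrival time has passed.
	local moveTo = #nodeTimes
	for i = startIndex, #nodeTimes do
		if nodeTimes[i] >= elapsed then
			moveTo = i
			break
		end
	end
	local before = moveTo - 1
	local span = nodeTimes[moveTo] - nodeTimes[before]
	local alpha = (elapsed - nodeTimes[before]) / span
	-- Clamp so an enemy past the last node stays at the end point.
	alpha = math.max(0, math.min(1, alpha))
	return before, moveTo, alpha
end

local before, moveTo, alpha = segmentAt({ 0, 2, 6 }, 4, 2)
print(before, moveTo, alpha)
```

Because the result depends only on the elapsed time, a client that froze for a moment lands on the correct segment on its next update instead of resuming from a stale position.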

MohhayScripts · September 23, 2023, 2:13am

Try adding the “pairs” that I removed back and make sure it isn’t that which is causing the issue.

As for the movement optimization, it is possible to optimize NPCs enough to have 600+ of them without any lag, maybe even 2000. I saw an entire post detailing the process of doing it, but here’s the basics to it:

First and foremost, you want to store each NPC’s position on the server, perhaps alongside its direction. You want to update the server value when moving an NPC.

With this server table, the client will gain access to it via remote event, and will replicate the actual models into the workspace and move them accordingly.

To reduce latency between the server and client, use Vector3int16 for the position sent to the client. It obviously comes with positional limitations, but in return it takes up less data and is faster to transfer to the client.
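
To make the trade-off concrete, here is a minimal sketch of quantizing one coordinate into the signed 16-bit range that Vector3int16 can hold. The `SCALE` value and the function names are assumptions chosen for illustration; they are not taken from the linked post.

```lua
-- Hypothetical scale: 10 steps per stud gives 0.1 stud precision and
-- a usable range of roughly -3276.8 to 3276.7 studs per axis.
local SCALE = 10
local INT16_MIN, INT16_MAX = -32768, 32767

-- Rounds to the nearest step and clamps to the int16 range.
local function quantize(value)
	local q = math.floor(value * SCALE + 0.5)
	return math.max(INT16_MIN, math.min(INT16_MAX, q))
end

-- Converts a quantized step count back to studs.
local function dequantize(q)
	return q / SCALE
end

local packed = quantize(12.34)
local restored = dequantize(packed)
print(packed, restored)
```

On Roblox, the three quantized components would then go into Vector3int16.new(x, y, z) before being sent. A larger scale buys precision at the cost of a smaller map, which is the positional limitation mentioned above.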

There’s a bunch of tuning that can be made to a system like this but I recommend you check out the post that I read regarding the topic.

Here’s a link to that: How we reduced bandwidth usage by 60x in Astro Force (Roblox RTS)

M0N0RAIL_2 · September 23, 2023, 8:28am

I do not really care about the direction on the server, and my system barely even needs to update the position on the server. Besides, my wifi is so garbage (I’m just making this game for fun, so I can play my own TD game) that I cannot even afford the option of sending those. Also, from what I read, this just makes them move with interpolation? Isn’t that still updating the CFrame? Plus, their test seems to have been done with a much lower number of enemies;
I easily get 60 fps with 600 enemies.

One of my friends tried that and was able to reduce it by 88.3x, but he says that method kinda sucks. He made a way better version that doesn’t require sending any “Vector3int16” data at all, and he can handle 4k enemies at 40~50 fps.

MohhayScripts · September 23, 2023, 1:07pm

Well, if that’s the case, it’s best to just try different ways of doing that CFrame math and monitor how much performance each one costs.

That’s how I’d usually do it.
Perhaps do so by using the Benchmark plugin by boatbomber?

M0N0RAIL_2 · September 23, 2023, 2:26pm

No, it’s not from calculating the CFrame; it is simply assigning the CFrame to a part that causes the lag. I don’t know if you can actually optimise that, but I suppose you can, since my friend gets far better performance.

And I have no idea what that plugin is.

MohhayScripts · September 24, 2023, 10:10pm

Well, there is no direct way of making the CFrame assignment itself take less time. What you could try is using instance:PivotTo(cFrame); I cannot guarantee that this would improve performance, but it’s worth a try.

Also, you could always ask your friend how they did it.

M0N0RAIL_2 · September 24, 2023, 11:15pm

It is better but way more unstable: it can have random fps drop spikes of like 1/2 seconds.

MohhayScripts · September 24, 2023, 11:15pm

Well, in that case, I think you’re going to have to stick to setting the CFrame directly… Not much else you can do here as far as I know.

M0N0RAIL_2 · September 24, 2023, 11:25pm

Yeah, but I don’t really get it. I see friends get better performance with worse specs than mine by just doing object.CFrame = cframe, and I’m confused about how they get better performance when I do the same.

ExercitusMortem · September 25, 2023, 12:08am

CFrame is very fast. If it is causing lag, something else is going wrong.
Maybe you should check your studio settings; lower them a bit.

M0N0RAIL_2 · September 25, 2023, 12:18am

Nope… (I do get slightly faster results with BulkMoveTo.) Also, don’t mind the parallel thing; I was just trying something.

ExercitusMortem · September 25, 2023, 12:20am

… Now hold on a sec. Those models look like they have a TON of tris. May just be compression though.

M0N0RAIL_2 · September 25, 2023, 12:28am

They are just Roblox R6 rigs, and doing that with normal parts doesn’t change anything.

ExercitusMortem · September 25, 2023, 12:31am

Yea, but it looks like you have thousands of them, colliding into one another. That’s why you lag, lol

M0N0RAIL_2 · September 25, 2023, 12:31am

Yeah, since there are meant to be 1k enemies in that test, but it happens even when they don’t collide with each other.

ExercitusMortem · September 25, 2023, 12:32am

OK, but you’re spawning 1K parts close together. They still have events that fire, thousands of times.