The "Art" of Micro-Optimizing

Or as I call it, a “good” way to spend my time that should be spent doing homework.

Let’s start with some basic things.

- table creation is slow. Super slow. Have been yelled at by two of my friends over my habits of creating unnecessary tables before.
- numbers are usually the fastest.
- you should think smarter and not harder.
- function calls aren't always needed and are extra slow if they're Lua functions.
- locals are **NOT** the be-all end-all solution for slow code. See my comments below.
- The client should be optimized first.
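To illustrate the first point, here is a minimal sketch in plain Lua. Both functions compute the same total, but the second creates one scratch table instead of one per iteration. The names `SumPairsAllocating` and `SumPairsReusing` are made up for this example, and how much time this saves depends on the runtime, so measure before relying on it.

```lua
-- Hypothetical example: same result, different number of table allocations.
local function SumPairsAllocating(Count)
	local Total = 0
	for Index = 1, Count do
		local Pair = { Index, Index * 2 } -- a new table every iteration
		Total = Total + Pair[1] + Pair[2]
	end
	return Total
end

local function SumPairsReusing(Count)
	local Total = 0
	local Pair = {} -- created once, overwritten in place
	for Index = 1, Count do
		Pair[1], Pair[2] = Index, Index * 2
		Total = Total + Pair[1] + Pair[2]
	end
	return Total
end

print(SumPairsAllocating(10), SumPairsReusing(10)) --> 165 165
```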

Most importantly, DO NOT OVER OPTIMIZE.

I micro-optimize because I like to squeeze out every little bit of performance that I can, and because you are writing for a computer, not a person. You can learn to read optimized code; a computer can’t learn to run slower code faster.

Table Iteration

From slowest to fastest, the ways to go through a table are ipairs, then pairs and next, then a numeric for loop. Numeric for loops are the fastest, but like ipairs, they only work on arrays.

local Array = { "String", "Number", "Boolean", "Table" } -- A numeric for loop will work on this.
local Dictionary = { String = "", Number = 12345, Boolean = true, Table = { } } -- but not on this.

Using ipairs on anything but LuaJIT is just dumb. It’s super, super slow. The numeric loop mentioned above does everything ipairs does, only faster (that holds in LuaJIT too, but the difference is significantly smaller). If you care at all about performance, please do not use it.
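For reference, here is a rough sketch of the three loop styles side by side, written in plain Lua. The function names are made up for this example. All three return the same sum on an array; which one is actually fastest depends on the runtime, so it is worth timing them yourself.

```lua
local Array = { 1, 2, 3, 4 }

-- Numeric for loop: indexes the array directly, arrays only.
local function SumNumeric(List)
	local Sum = 0
	for Index = 1, #List do
		Sum = Sum + List[Index]
	end
	return Sum
end

-- ipairs: arrays only, stops at the first nil.
local function SumIpairs(List)
	local Sum = 0
	for _, Value in ipairs(List) do
		Sum = Sum + Value
	end
	return Sum
end

-- next: also works on dictionaries, but the order is not guaranteed.
local function SumNext(List)
	local Sum = 0
	for _, Value in next, List do
		Sum = Sum + Value
	end
	return Sum
end

print(SumNumeric(Array), SumIpairs(Array), SumNext(Array)) --> 10 10 10
```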

Using Locals

Locals are faster than globals, but they do use up more memory. You can localize method (:) functions by doing the following:

local GetChildren = game.GetChildren
local MakeJoints = Instance.new("Model").MakeJoints

This is what I did for @boatbomber in his “dual render” scope system. I localized the functions that were used most often (or that were used in a loop), which were GetChildren, GetDescendants, IsA, ToOrientation, and IsDescendantOf. This is risky, as I’m pretty sure I’ve had scripts break on me because of this, so do it with caution. Some Roblox functions also should be localized, such as tick, wait, typeof, and I think I’m forgetting one. Of course, you shouldn’t do this if you’re only running them once, like for example:

local tick = tick
math.randomseed(tick() % 1 * 1E7) -- Don't do this, that's not necessary.
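A small sketch of what localizing a : function changes at the call site, using a plain Lua table as a stand-in for a Roblox Instance (the `Counter` object and its `Increment` method are invented for this example). Once the method is stored in a local, the object has to be passed explicitly as the first argument.

```lua
-- Stand-in object with a method, playing the role of a Roblox Instance.
local Counter = { Value = 0 }

function Counter:Increment(Amount)
	self.Value = self.Value + Amount
	return self.Value
end

-- Look the method up once instead of on every call.
local Increment = Counter.Increment

for _ = 1, 5 do
	Increment(Counter, 2) -- equivalent to Counter:Increment(2)
end

print(Counter.Value) --> 10
```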

Reducing Function Calls

I have a big issue with calling functions unnecessarily. Once again, I did this in boatbomber’s demo. He had a few functions that didn’t even need to be made into functions, so I turned them into single calls.

local function RenderVersion(obj)
	local Descendants = GetDescendants(obj)
	for i=1, #Descendants do
		local c = Descendants[i]
		if IsA(c, "Script") or IsA(c, "Sound") or IsA(c, "ManualWeld") or IsA(c, "BasePart") then
			c:Destroy()
		end
	end
	return obj
end

local function RenderHumanoid(Model, Parent, MainModel)
	local ModelParts = GetChildren(Model)
	for i = 1, #ModelParts do
		local Part = ModelParts[i]
		if not IsA(Part, "Script") then
			-- Temporarily force Archivable so the part can be cloned.
			local a = Part.Archivable
			Part.Archivable = true
			local RenderClone = Part:Clone()
			Part.Archivable = a

			-- Local to this iteration, so each connection disconnects itself and not a later one.
			local PartUpdater
			if IsA(Part, "MeshPart") or IsA(Part, "Part") then
				PartUpdater = Heartbeat:Connect(function()
					if Part then
						RenderClone.CFrame = Part.CFrame
					else
						RenderClone:Destroy()
						PartUpdater:Disconnect()
					end
				end)
			elseif IsA(Part, "Accoutrement") then
				PartUpdater = Heartbeat:Connect(function()
					if Part then
						RenderClone.Handle.CFrame = Part.Handle.CFrame
					else
						RenderClone:Destroy()
						PartUpdater:Disconnect()
					end
				end)
			elseif IsA(Part, "Script") then
				-- Unreachable: scripts are already filtered out above.
				RenderClone:Destroy()
			end
			RenderClone.Parent = Parent
		end
	end
end

local d = math.deg
local function inFOV (p0, p1)
    local x1, y1, z1 = ToOrientation(p0)
    local cf = cf(p0.Position, p1.Position)
    local x2, y2, z2 = ToOrientation(cf)
    return v3(d(x1-x2), d(y1-y2), d(z1-z2))
end

-- Some local math.abs function I forgot.

I replaced these functions with a do end block where they were originally called. You can see how I did it if you download the file, since it’d take up too much space to put here. My favorite examples of overdoing function calls would be how I do absolute values and clamped numbers.

If I want to get the absolute value of X, I simply do X >= 0 and X or 0 - X. If I want to clamp X between Y and Z (assuming Y <= Z), I will do X < Y and Y or X > Z and Z or X. Yes, it is more work, but you can do a simple gsub operation to replace it.

Code = Code:gsub("math%.clamp%(([^,%)]+), ([^,%)]+), ([^,%)]+)%)", function(X1, X2, X3)
	-- Plain text substitution: each argument gets repeated, so this only suits simple arguments without side effects.
	return ("(%s < %s and %s or %s > %s and %s or %s)"):format(X1, X2, X2, X1, X3, X3, X1)
end)
-- Sorry if this is buggy or just not working, I'm horrible at string manipulation.
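As a sanity check, here is a sketch that compares an inline absolute value and a compact inline clamp (`X < Min and Min or X > Max and Max or X`) against the standard library, including the boundary cases where X equals a bound. The expressions are wrapped in functions (`InlineAbs`, `InlineClamp`, names invented here) only so they can be tested; the point of the trick is to paste the expression in place. The clamp assumes Min <= Max, and standard Lua has no math.clamp, so math.max and math.min stand in as the reference.

```lua
local function InlineAbs(X)
	return X >= 0 and X or 0 - X
end

-- Assumes Min <= Max.
local function InlineClamp(X, Min, Max)
	return X < Min and Min or X > Max and Max or X
end

-- Compare against the library versions over a range that crosses both bounds.
local Min, Max = -2, 3
for X = -5, 5 do
	assert(InlineAbs(X) == math.abs(X))
	assert(InlineClamp(X, Min, Max) == math.max(Min, math.min(Max, X)))
end

print(InlineAbs(-4), InlineClamp(7, Min, Max), InlineClamp(-2, Min, Max)) --> 4 3 -2
```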

While I’m on the topic of strings, let’s go over one of my biggest issues I have.

string.len

Yeah. This has its own section. I HATE seeing this. This isn’t okay. string.len, local len = string.len, and even :len() are all SLOW. STOP USING THEM, PLEASE. You can seriously just use #, same as tables, it’s even shorter to do so! See for yourself!
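A quick sketch showing that all three spellings return the same length, so switching to # changes nothing but the amount of work done to get there:

```lua
local Text = "Hello, world!"

local LengthOperator = #Text            -- length operator
local LengthFunction = string.len(Text) -- library function
local LengthMethod = Text:len()         -- method call through the string metatable

print(LengthOperator, LengthFunction, LengthMethod) --> 13 13 13
```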

math.pow

Just like string.len, I hate seeing this. I don’t see anyone who writes code for Roblox itself using it; it usually only comes from code off of GitHub. It’s super duper slow, and isn’t recommended.
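The usual replacement is the ^ operator, or plain multiplication for small fixed powers. A minimal sketch follows; math.pow itself is deliberately not called here, because newer versions of standard Lua no longer ship it by default.

```lua
local Base = 3

local WithOperator = Base ^ 2    -- exponent operator, no function call
local WithMultiply = Base * Base -- plain multiplication for a fixed power

assert(WithOperator == WithMultiply)
```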

This is only just scratching the surface of what you can do, but it’s a starter. Games like Phantom Forces use these little tricks to perform as well as they can. I know for a fact there are plenty of people who are much, much better at this than I am, and there will be a fair share of people who think this is a stupid waste of time. This is just how I think and how I see things should be.

Here’s a super, super useful post by @Dekkonot.

78 Likes

Oops. Put this in the wrong category again.

@buildthomas can you move it to community tutorials for me?

2 Likes

>micro-optimizing
>not unrolling for loops
:pensive:

17 Likes

What in the world is that?
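(For anyone else wondering: unrolling a loop means doing several elements' worth of work per iteration, so the loop's own counter and condition run fewer times. A rough sketch in plain Lua, with the function name `SumUnrolled` made up for this example; whether it helps at all depends on the runtime.)

```lua
local function SumUnrolled(List)
	local Sum = 0
	local Length = #List
	local Index = 1

	-- Main body: four elements per iteration.
	while Index + 3 <= Length do
		Sum = Sum + List[Index] + List[Index + 1] + List[Index + 2] + List[Index + 3]
		Index = Index + 4
	end

	-- Remainder: whatever is left when the length is not a multiple of four.
	while Index <= Length do
		Sum = Sum + List[Index]
		Index = Index + 1
	end

	return Sum
end

print(SumUnrolled({ 1, 2, 3, 4, 5, 6, 7 })) --> 28
```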

5 Likes

This is a really good and unique learning resource. Homing in on caching keywords or locals, I have a love-hate relationship with it.

What do you think about things like:

local math = math
local print = print --// just an example, i would never

(a good SH answer by eLunate)

I’ve never taken much time to teach myself optimization tricks, but I probably should. Thanks for this post!

1 Like

I used to do that all the time. I was called out by the other developer I worked with because it was actually slowing down the game with how much I did it. Overdoing locals is actually quite bad, so I stopped doing it completely.

This is just with one local, now imagine that with like everything I could use in a stereotypical script. Much larger hit to performance.
image

It eventually got to this point and worse. I stopped after being called out, thankfully.

In fact, even doing locals for some constants can be slower. Take for example this:

local Functions = { }
math.randomseed(tick() % 1 * 1E7)
local random = math.random

local DEGREE_CONVERSION = 57.295779513082

Functions["Local Constant"] = function()
	return DEGREE_CONVERSION * (random() * random() * random() * random())
end

Functions["No Local Constant"] = function()
	return 57.295779513082 * (random() * random() * random() * random())
end

The bottom one is about 2% faster than the top one.

4 Likes

I have the opposite view to this. Micro optimizations have such a small impact on the actual performance of the game that it’s rarely worth going to the extent of implementing them. If you’re playing the game, would you really notice the difference between something taking 10 ms and 11 ms?

Imo, it’s much more worthwhile to go for readability. If you stop developing for a few months and want to come back to it, you’d be much more likely to succeed if the code is readable. Plus it helps in things like bug fixes because you can more quickly find the cause.

That’s my personal opinion for it but I think there’s a point to be made there

19 Likes

There’s nothing stopping you from writing readable and optimized code. Of course, that would be a chore and a half to do.

Here’s an example bit of code.

local random = math.random
math.randomseed(tick() % 1 * 1E7)

-- credit to movsb
local function DynamicArray(min, max, size, layers)
	local arr = {}
	for i = 1, size do
		if layers > 0 then
			local rd = random(1000000)
			if rd % 2 == 0 then
				arr[#arr + 1] = random(min, max)
			else
				local newSize = math.floor(size / layers)
				if newSize <= 0 then newSize = 100 end
				arr[#arr + 1] = DynamicArray(min, max, newSize, layers - 1)
			end
		else
			arr[#arr + 1] = random(min, max)
		end
	end
	return arr
end

local function ArrayAddition1()
	local Array = DynamicArray(1, 2000, 100, 0)
	local ArraySum = 0
	for Index = 1, #Array do
		local Value = Array[Index]
		if type(Value) == "number" then
			ArraySum = ArraySum + Value
		end
	end
	return ArraySum
end

local function ArrayAddition2()
	local Array = DynamicArray(1, 2000, 100, 0)
	local ArraySum = 0
	for _, Value in next, Array do
		if type(Value) == "number" then
			ArraySum = ArraySum + Value
		end
	end
	return ArraySum
end

They’re both plenty readable, for sure; one is just faster than the other. I have yet to encounter unreadable fast code that isn’t intentionally so. I try my best to write as readably as I possibly can, and I can still read the code I write months later. I will agree with you on the clamp function, although that can be fixed up if you use good variable names or even if you just test the code in your command bar.

1 Like

Just a note: it’s almost never worth localizing vanilla Lua globals. I used to be in the same boat as you, and do it all the time, but after running some tests I decided it wasn’t worth it. Especially with non-library globals, it’s just not a big enough difference to matter.

The length operator (#) is actually the preferred method to get the length of a string anyways, but the difference between # and string.len is exactly one table index (string -> string __len vs string -> string __index -> len) so it is again not important.

Micro-optimization should be avoided in code as a general rule. If the issue you’re running into is that you have too many table indexes or function calls, that isn’t an issue with their performance, that’s an issue with your code and design. By sacrificing readability for a few milliseconds of performance you’re creating a lot more work for yourself with very little reward.

14 Likes

I agree with this, no reason to optimize at this level unless you are already facing performance issues. (unless of course you just enjoy this type of thing)

3 Likes

between # and string.len is exactly one table index (string -> string __len vs string -> string __index -> len)

So that’s why it’s so slow.

And as for vanilla Lua globals, there are like a select few that actually can benefit from this (type and unpack), but the rest are plenty fast.

1 Like

What did you use to test that speed?

I both enjoy it and sometimes have to do it in collaborative work because of other people’s incompetence.

1 Like

unpack in general is overused in code so I would disagree with that on principle, but type is about as fast as you can get in Lua without it being a raw data type.

@Validark’s speed tester. I have it in a module, I can open source it really quick.

local Functions = {}

Functions["FunctionOne"] = function()
end

Functions["FunctionTwo"] = function()
end

require(4185109675).new(1, "Title", Functions)

Updated to the new module.
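If you want something similar outside of Roblox, a rough stand-in can be written with os.clock. This is only a sketch and not the API of the module above: the `Benchmark` function and its arguments are invented for this example, and it takes the same kind of table of named functions.

```lua
-- Hypothetical harness: runs each named function Iterations times
-- and records the elapsed CPU time in seconds.
local function Benchmark(Functions, Iterations)
	local Results = {}
	for Name, Function in pairs(Functions) do
		local Start = os.clock()
		for _ = 1, Iterations do
			Function()
		end
		Results[Name] = os.clock() - Start
	end
	return Results
end

local Functions = {}

Functions["Length Operator"] = function()
	return #"example"
end

Functions["string.len"] = function()
	return string.len("example")
end

local Results = Benchmark(Functions, 100000)
for Name, Seconds in pairs(Results) do
	print(("%s: %.4f s"):format(Name, Seconds))
end
```

Run it a few times and compare averages; single runs are noisy, and small differences can vanish or flip between runtimes.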

2 Likes

Completely agreed. The readability thing for me more has to do with the idea of reducing function calls - I find myself creating a number of extra ones because it can be the difference between a hard to read piece of code and one clearly laid out.

For example, I usually like to write one main function which’d call other functions to do the actual work for whatever script I’m writing. Then I’d have something like:

function runMinigame()
	local players = game.Players:GetPlayers()
	addToMap(players)
	runGame()

	local winners = getWinners()
	awardPoints(winners)
end

Where you can see, without actually going into the code, what is being done in the function. Then each of the helper functions would implement their one task and everything would be much more clear to read. I think, despite the loss in micro optimization, it’d be worth doing something like that

1 Like

Ah, yeah, good point. If you make a function just for a single use, it’s wasteful and you should just convert it to not be one.

1 Like

But like I said - it’s worth having the function call because it makes the entire thing more readable. You can do something like adding comments to the code, but reducing a several hundred line function into many small functions would almost always be worth it - even though it introduces the extra function calls

time to go review all my screwy and overcomplicated code

this will be fun

I don’t mind helping you if you want. You literally described me when you talked about your code.

1 Like