The Basics Of Basic Optimization


#1

This tutorial covers basic optimizations you can make to your code without making it hard to read. This is NOT advanced-level optimization, which is a topic I can’t easily cover. I will leave that one to someone like AxisAngle.

So, let us start with two basic questions: 1. “What is optimization?” and 2. “Why optimize?”. To optimize is defined as to “make the best or most effective use of”. Simply put, it means making code run faster. That leads into the second question, and the answer is pretty much in the definition: to make code run faster. Not everyone has a high-end quad-core i7 CPU and an NVIDIA GTX GPU, and in those cases a little bit of optimizing can make things run a lot smoother.

1. Use Local Variables
One of the simplest things you can do is add 6 characters to the beginning of your variable declarations: “local ”. When you put local in front of a variable name, you declare it as a local variable, meaning it can only be used in the given scope (see example). A local is faster to access than a global, because a global has to be looked up by name in the “global” table, which also holds a lot of other things, including native Lua variables and ROBLOX functions and userdata.

So, how do I use local variables? Here is an example using global variables:

This = 1 --Declares a global value of This
for i = 1, 5 do
	This = This * i --References the global value This, multiplies it by i
			--i can only be referenced in this scope because it is a local value
end

Here is an example of local values:

local This = 1 --Declares a local value of This
for i = 1, 5 do
	This = This * i --References the local value This and multiplies it by i
end
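
To make the "given scope" part concrete, here is a small sketch in plain Lua (no ROBLOX APIs needed). The names Result and Total are made up for illustration:

```lua
local Result
do
	local Total = 0 --Only visible inside this do ... end block
	for i = 1, 5 do
		Total = Total + i
	end
	Result = Total --Copy the value out before the scope ends
end

print(Result) --> 15
print(Total) --> nil (out here, Total refers to a global that was never set)
```

The second print is the part to notice: once the block ends, the local is gone, so the name falls back to a global lookup.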

Is that it? Actually, no. You can localize your functions too! How about some code to explain how.

--So let’s say you have this:
function This()
	--Stuff
end
--Let’s rewrite it so it sets This as a global value:
This = function()
	--Stuff
end
--See where this is going?
local This = function()
	--Stuff
end




2. Don’t Re-Do The Same Math When Not Needed
This isn’t the best subtitle, so an analogy should help. Let’s say you need to do some math that starts with sin(0). You enter it into your calculator and see it is 0. Then you need to add a number to it. To save time, you would just reuse the last value rather than re-type sin(0). Not the best example, so here is some code to help.
Here is what happens if you repeat the same operation to make a part rise at a constant rate.

for i = 1, 50 do
	Part.CFrame = Part.CFrame * CFrame.new(0,1,0)
	wait()
end

Now let’s write down what it does:
For 50 times…
1. Get the Part’s CFrame
2. Find the global value for CFrame
3. Create the new userdata for CFrame.new(0,1,0)
4. Multiply the Part’s CFrame from step 1 by the CFrame from step 3
5. Set the Part’s CFrame to the result from step 4
6. Wait for 1/30th of a second

So, how can you optimize this?

local UpCFrame = CFrame.new(0,1,0)
for i = 1, 50 do
	Part.CFrame = Part.CFrame * UpCFrame
	wait()
end

Now what does it do?
A. Find the global value for CFrame
B. Create the new userdata for CFrame.new(0,1,0)
For 50 times…
1. Get the Part’s CFrame
2. Multiply the Part’s CFrame from step 1 by the CFrame from step B
3. Set the Part’s CFrame to the result from step 2
4. Wait for 1/30th of a second

And just like that, you removed 2 (really 3) steps that were being done 50 times.




3. Reduce Indexing Of Tables
A while ago, AxisAngle talked about how he could cast thousands of rays per second with no performance hit, and this is a massive helper for that. There is a more complex explanation about that involving MetaTables, but let’s keep it simple.

for i = 1, 10 do
	game.Workspace:FindPartOnRay(Ray.new(Vector3.new(0,0,0),Vector3.new(0,i,0)))
end

This may seem simple, but it is a lot more complex once you see all the steps it requires.
For 10 times...
1. Call __index (a MetaTable function) in game to get Workspace. Calls like this are fast, but they add up and become slow if they are made a lot.
2. Call __index in Workspace to get FindPartOnRay. Now to get the Ray.
3. Call __index for new in the global value Ray. Now to get the Vector3s.
4. Call __index for new in the global value Vector3.
5. Call the function returned from 4 to create a Vector3 with the coordinates of 0,0,0.
6. Call __index for new in the global value Vector3.
7. Call the function returned from 6 to create a Vector3 with the coordinates of 0,i,0.
8. Pass the Vector3s from 5 and 7 into the function returned in step 3 to get the Ray.
9. Pass the Ray from step 8 into FindPartOnRay.

Long list, right? Now to use local values to lower the amount of times everything is indexed.

local Workspace = game.Workspace 
local FindPartOnRay = Workspace.FindPartOnRay --This is how you index a function without calling it.
local Raynew,Vector3new = Ray.new,Vector3.new
local Center = Vector3new(0,0,0)
for i = 1, 10 do
	--For a function indexed by :, you have to pass the table it is part of first before the rest of the arguments.
	FindPartOnRay(Workspace,Raynew(Center,Vector3new(0,i,0)))
end

A. Call __index in game to get Workspace.
B. Call __index in Workspace to get FindPartOnRay.
C. Call __index for new in the global value Ray.
D. Call __index for new in the global value Vector3.
E. Call the function returned from D to create the Vector3 for 0,0,0.
For 10 times…
1. Call the function stored in D to create a Vector3 with the coordinates of 0,i,0.
2. Take the first vector from E and the vector from 1, and create the Ray with the function stored in C.
3. Pass the table the FindPartOnRay function is from (Workspace, so A) and the Ray from 2 into the function stored in B.
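
If you want to see the __index calls being saved, here is a rough sketch in plain Lua. It does not use real ROBLOX objects: FakeWorkspace, FindPart and Lookups are made-up names, and the proxy table only stands in for a userdata so the lookups can be counted.

```lua
local Lookups = 0
local Methods = {
	FindPart = function(self, Name) return Name end,
}
--Every read of a missing key on FakeWorkspace goes through __index
local FakeWorkspace = setmetatable({}, {
	__index = function(_, Key)
		Lookups = Lookups + 1
		return Methods[Key]
	end,
})

--Indexing inside the loop: __index runs on every pass
for i = 1, 10 do
	FakeWorkspace:FindPart("Part" .. i)
end
local LookupsInLoop = Lookups

--Indexing once and storing the function: __index runs a single time
Lookups = 0
local FindPart = FakeWorkspace.FindPart
for i = 1, 10 do
	FindPart(FakeWorkspace, "Part" .. i)
end
local LookupsWhenStored = Lookups

print(LookupsInLoop, LookupsWhenStored) --> 10 and 1
```

This only counts lookups, it does not measure time, so treat it as an illustration of the step lists above rather than a benchmark.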




How Do I Test The Speed Of My Code?
This may seem complex at first, but it is simple: store the tick() at the start, run the code a given number of times in a for loop, then subtract the starting tick() from the new tick().

local function TestFunction()
	--Insert the code to test
end
local Start = tick() --Start time in seconds
for i = 1, 1000 do --For 1,000 times...
	TestFunction() --Call TestFunction
end
print(tick()-Start) --Elapsed time in seconds
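
The same idea can be wrapped in a helper so two versions of some code can be compared side by side. This is only a sketch: TimeIt, WithGlobalLookup and WithLocalLookup are made-up names, and it uses os.clock() from standard Lua in place of tick() so it also runs outside ROBLOX.

```lua
local function TimeIt(Runs, Func)
	local Start = os.clock()
	for i = 1, Runs do
		Func()
	end
	return os.clock() - Start --Elapsed CPU time in seconds
end

local function WithGlobalLookup()
	local Sum = 0
	for i = 1, 1000 do
		Sum = Sum + math.abs(-i) --Looks up math, then abs, on every pass
	end
	return Sum
end

local abs = math.abs
local function WithLocalLookup()
	local Sum = 0
	for i = 1, 1000 do
		Sum = Sum + abs(-i) --Uses the stored function
	end
	return Sum
end

print("global:", TimeIt(1000, WithGlobalLookup))
print("local: ", TimeIt(1000, WithLocalLookup))
```

The numbers will differ between machines and runs, so run it a few times before drawing any conclusion.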





Doing these should make your code run faster. For example, when I applied the 3rd optimization tip to the CameraScript, it went from 3%-5% in script performance to 0.5%-2%, and that is on an Intel Core i7 4700MQ @ 3.25 GHz with TurboBoost. Optimization. Use it. Think about the people using toasters.

Bonus: As with all of my tutorials, an image of the full tutorial is included for the public.


#2

Nice tutorial! Thanks for the public link, this is going on the Roblox Helpers tutorials channel.

Minor qualm: workspace is a global constant, so typing out game.Workspace is unnecessary.

random stuff that might be worth mentioning:

  • x*x is faster than x^2
  • x*x*x is faster than x^3
  • x*x*x*x*x*x*x*x*x*x is faster than x^10, you get the idea
  • Lua uses constant folding; e.g. (2/3) is evaluated ahead of time so it’s just as fast as 0.6666666666666667
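
If you would rather measure the x*x claim yourself than take it on faith, a rough sketch like this works in plain Lua. Runs and the variable names are arbitrary, and os.clock() is used so it runs outside ROBLOX too:

```lua
local Runs = 1000000
local x = 1.5

local Start = os.clock()
local SumMultiply = 0
for i = 1, Runs do
	SumMultiply = SumMultiply + x * x
end
local MultiplyTime = os.clock() - Start

Start = os.clock()
local SumPower = 0
for i = 1, Runs do
	SumPower = SumPower + x ^ 2
end
local PowerTime = os.clock() - Start

print("x*x:", MultiplyTime)
print("x^2:", PowerTime)
```

Both loops compute the same sum, so any difference in the printed times comes from the operator. Results depend on the Lua version and machine.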

#3

Why does it seem like everyone hates using game.Workspace? It works, and has consistency with game.Lighting, game.ReplicatedStorage, etc.

Source for this? I always thought numbers would be faster.


#4

Because ROBLOX created workspace for a reason, and you should use it.

As for consistency, any other programming language has you import stuff you need at the top of files. This is extremely similar to defining variables for each service you use at the top of ROBLOX scripts. workspace is more consistent for us because our code is:

local runService = game:GetService("RunService")

runService.RenderStepped:connect(function()

end)

instead of

game["Run Service"].RenderStepped:connect(function()

end)

#5

print(string.len("local")) == 5 :stuck_out_tongue:

"Local" should be "local"

I love the tutorial actually learned some stuff so thanks :slight_smile:


#6

print(string.len("local ")) == 6
There is a space.

Google Docs is NOT fun to type code into. I thought I missed one of those...


#7

You forgot:

0. No Premature Optimization

The parts of your code that are actually slow enough to warrant optimization are often not the parts that you would have expected. Not to mention that most code doesn't actually need to be fast, so you shouldn't sacrifice your code's readability for nonsense like Reducing Table Indices unless you really need the extra speedup.

Test the speed of your code first, then optimize if need be.


#8

Shouldn't this only be really used for step 3 (and 2 somewhat)? I don't see how using local in front of your functions and variables makes code unreadable.


#9

You should use locals for all of your stuff... but that's not really "optimization", it's just proper Lua coding style. It avoids name collisions on nested variables, and if / when there are refactoring tools in studio, you would be able to do a semantic rename on your functions / variables.


#10

If someone made a plugin to do the Raynew = Ray.new, etc. in a script automatically, I'd give them a big hug!

It'd need to add/remove the local RayNew = Ray.new / etc. at the start dynamically based on what's used in the script


#11

I could do this, but I need to factor in cases of someone having localization done already, because this would be redundant:

local Raynew = Ray.new
local ray = Raynew

Tip if anyone else does this type of plugin, check for things like " CFrame.new(" or "(CFrame.new(", as someone could easily have "CustomTableCFrame.new()" register for "CFrame.new("
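
One possible way to do that check is Lua's frontier pattern, %f[%w_], which only matches where an identifier starts. This is just a sketch, and CountCalls is a made-up helper name:

```lua
local function CountCalls(Source, Name)
	local Escaped = Name:gsub("%p", "%%%0") --Escape the dot so it matches literally
	local Count = 0
	for _ in Source:gmatch("%f[%w_]" .. Escaped .. "%(") do
		Count = Count + 1
	end
	return Count
end

local Source = [[
local a = CFrame.new(0,1,0)
local b = CustomTableCFrame.new()
local c = (CFrame.new(1,2,3))
]]

print(CountCalls(Source, "CFrame.new")) --> 2
```

It still counts something like Foo.CFrame.new( and matches inside strings and comments, so a real plugin would need more than this.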


#12

If they already had the localization for Ray.new then Ray.new wouldn't be anywhere in the code and thus would not need the plugin to localize it


#13

???
But the original code would have been this:

local ray = Ray.new

#14

Oh right DERP,

Could check for = Ray.new or similar expressions, idk, this is why I'm counting on someone else doing the work for me :slight_smile:


#15

I have qualms about these kinds of "optimizations".

The best way to make your work go faster is to do less work, not make each step 1% faster.


I recently was working on a graphical effect (the blue beam in this place). The gist of how it works is computing intersections between a box and all of the parts in the place.

It was dropping frames and still taking about 10 seconds to finish computing the animation (to only play back at about 5fps). I did lots of these "optimizations":

  • I localized library functions like math.abs
  • I inlined functions and made local variables for constants like Vector3.new(.2, .2, .2)
  • I replaced tables with tuples of locals to avoid index operations
  • I pooled objects to avoid creating or changing too many

Guess what? These "optimizations" made zero impact on performance.

Why? Because the script does not spend any meaningful amount of time doing things like looking up variables and doing Vector3 constructions! If your script runs for 0.1 seconds, chances are the amount of time spent doing things like looking up keys in tables or globals is only a few milliseconds. Sure, a local variable lookup might be 100 times faster than a global variable lookup, but you're eliminating 1ms from a 100ms script.

So what did I do? How do I optimize? It's pretty obvious that "compute the intersection of this thin box with all of the parts" is wasting a lot of work.

Instead of using a list of all of the parts, I split it into something like an octree. Blam -- I didn't need to coerce all of my code into being less natural to write or read, I didn't need to sacrifice any of the conveniences that Lua is supposed to give you.

It was instantly 100x faster -- I can compute non-stop, in half the time, for an effect with 10 times as many frames.


TL;DR This is the wrong way to optimize. Do less work (using better data structures and algorithms). Don't care about the speed of looking up your variable, because that's not where you're spending your time. Those things will only matter if your script is stuck running non-stop for seconds. If you're doing that in a ROBLOX game, you have other problems.

These are things with smaller "complexity" in the language of CS.

Examples include:

  • linear search → binary search
  • linear search for best → heap
  • searching for nearest → quadtree / octree / dictionary
  • memoizing expensive (esp. recursive) functions
  • linked list ⇄ array (usually linked list → array)
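
As a sketch of the first item in that list, here is linear search next to binary search on a sorted array, with both reporting how many elements they looked at. The function names and the test data are made up for illustration:

```lua
local function LinearSearch(List, Target)
	for Index = 1, #List do
		if List[Index] == Target then
			return Index, Index --Position, number of elements checked
		end
	end
	return nil, #List
end

local function BinarySearch(List, Target) --List must be sorted ascending
	local Low, High, Checked = 1, #List, 0
	while Low <= High do
		local Middle = math.floor((Low + High) / 2)
		Checked = Checked + 1
		if List[Middle] == Target then
			return Middle, Checked
		elseif List[Middle] < Target then
			Low = Middle + 1 --Target can only be in the upper half
		else
			High = Middle - 1 --Target can only be in the lower half
		end
	end
	return nil, Checked
end

local Sorted = {}
for i = 1, 1000 do
	Sorted[i] = i * 2
end

print(LinearSearch(Sorted, 2000)) --Last element: 1000 checks
print(BinarySearch(Sorted, 2000)) --Same element: 10 checks
```

No variable was localized and no index was cached here. The saving comes entirely from looking at fewer elements.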

local function this() end is valid syntax, you should probably use that if you only want to do a normal function definition but make it local.
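
One practical difference between the two forms shows up with recursion. A small sketch in plain Lua, with made-up function names:

```lua
--"local function" puts the name in scope inside its own body, so recursion works
local function Factorial(n)
	if n <= 1 then
		return 1
	end
	return n * Factorial(n - 1)
end

--With "local Name = function", the name is not in scope inside the body yet,
--so declare the local first and assign the function afterwards
local Countdown
Countdown = function(n)
	if n == 0 then
		return "done"
	end
	return Countdown(n - 1)
end

print(Factorial(5)) --> 120
print(Countdown(3)) --> done
```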


#16

BTW: Unless you're really up for a challenge, don't try to do heaps in Lua. Chances are you'll have a really hard time getting the heap to be faster than a linear search for practical purposes. For example, every time I've implemented pathfinding on Roblox I've tried to implement heaps for the open-list and I've never gotten it to be faster than simpler solutions like bucketing the elements or even just linear search. If possible I'd recommend trying to keep a running max / min first if that's an option (in the case of only adding or rarely removing elements) if you are considering a heap for something.
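
A rough sketch of the running min idea, assuming items are tables with a Cost field. OpenList, Add and Remove are made-up names here, not taken from any real pathfinder:

```lua
local OpenList = { Items = {}, Best = nil }

local function Add(List, Item)
	List.Items[#List.Items + 1] = Item
	if List.Best == nil or Item.Cost < List.Best.Cost then
		List.Best = Item --Keep the minimum up to date as items arrive
	end
end

local function Remove(List, Item)
	for Index, Other in ipairs(List.Items) do
		if Other == Item then
			table.remove(List.Items, Index)
			break
		end
	end
	if List.Best == Item then
		--Only removing the current best forces a full rescan
		List.Best = nil
		for _, Other in ipairs(List.Items) do
			if List.Best == nil or Other.Cost < List.Best.Cost then
				List.Best = Other
			end
		end
	end
end

local A, B, C = { Cost = 5 }, { Cost = 2 }, { Cost = 9 }
Add(OpenList, A)
Add(OpenList, B)
Add(OpenList, C)
print(OpenList.Best.Cost) --> 2
Remove(OpenList, B)
print(OpenList.Best.Cost) --> 5
```

Reading the best item is a single field access. The cost moves to removals, which is why this only pays off when removals are rare.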

Also I'd like to add intrusive linked lists are awesome. That's the seriously underused data structure in the Lua code I've seen on Roblox if anything.
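
For anyone who hasn't seen one, "intrusive" means the objects themselves carry the Next/Prev links, so there are no separate node tables. A minimal sketch, with Enemies, PushBack and Remove as made-up names:

```lua
local Enemies = { First = nil, Last = nil }

local function PushBack(List, Object)
	Object.Prev, Object.Next = List.Last, nil
	if List.Last then
		List.Last.Next = Object
	else
		List.First = Object
	end
	List.Last = Object
end

local function Remove(List, Object) --No searching: the object knows its neighbours
	if Object.Prev then
		Object.Prev.Next = Object.Next
	else
		List.First = Object.Next
	end
	if Object.Next then
		Object.Next.Prev = Object.Prev
	else
		List.Last = Object.Prev
	end
	Object.Prev, Object.Next = nil, nil
end

local A, B, C = { Name = "A" }, { Name = "B" }, { Name = "C" }
PushBack(Enemies, A)
PushBack(Enemies, B)
PushBack(Enemies, C)
Remove(Enemies, B)

local Names = {}
local Current = Enemies.First
while Current do
	Names[#Names + 1] = Current.Name
	Current = Current.Next
end
print(table.concat(Names, ", ")) --> A, C
```

The trade-off is that an object can only sit in one such list at a time, unless you use differently named link fields per list.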


#17

Also learning some super basic computer science techniques can lead to coding practices that improve performance too. I'm not sure if I'd call it optimizations, but I guess changing the code to be more efficient could be an optimization.

For instance, if a certain behavior is different based on some static state, then longer code can be used to increase efficiency a bit:

local state = "idk" -- Constant; non-changing state
function DoSomething()
   if (state == "idk") then
      -- Foo
   else
      -- Bar
   end
end

local state = "idk"
local DoSomething
if (state == "idk") then
   DoSomething = function()
      -- Foo
   end
else
   DoSomething = function()
      -- Bar
   end
end

As you can see, the first snippet of code is shorter, but it forces a branch (the 'if' statement) every time it's executed. This is inefficient because we know that the condition being checked is a constant state and thus will be evaluated the same direction each time. In the second snippet, we write the function in two different ways based on the constant. Now the function will still perform differently based on the constant, but without any branching. (I believe some CPUs are smart enough to predict repetitive branch directions or whatever though, but that's a whole other topic.)


#18

That title could use some optimising.


#19

If you're for some reason doing the state check extremely often and you're sure that's causing performance issues, you'd be better off inlining the branch to avoid a function call entirely (unless you have very many states) and moving the state check "higher up" (thus having more duplicate code in each branch). Equality string comparisons are very fast in Lua.

Branch prediction (which really pretty much every processor has) probably has less of a positive influence on execution time of Lua scripts though, since those predictors cannot use the Lua-internal PC whatsoever.


#20

There has been a reason I haven't done a tutorial like this in a while; it always seems like I just fail at them and miss something massive.

I think it is a per case scenario. For my audio visualizer, localizing them made it go from 1.2%-1.8% in script performance to 0.1%-0.2%.