I thought sqrt was more expensive than squaring?

I’ve been trying to optimize some NPCs I am making which require doing a lot of distance checks between players and other NPCs.

I always thought magnitude checks were relatively expensive because of the square root involved, and that a common way to speed them up was to skip the root and compare the squared distance against the square of the threshold (e.g. to check whether an NPC is less than 50 studs away from a player, you would check whether the squared length of the position difference is less than 50^2).
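
A rough sketch of that squared-distance comparison, for illustration only. Positions are plain tables here so it runs outside Roblox; `isWithinRange` and the `x`/`y`/`z` field names are made up for this example and are not part of any Roblox API.

```lua
-- Hypothetical helper: positions are plain tables with x, y, z fields
-- (stand-ins for Vector3 so the sketch runs outside Roblox).
local function isWithinRange(a, b, range)
	local dx, dy, dz = a.x - b.x, a.y - b.y, a.z - b.z
	-- Compare the squared distance with the squared range; no square root is taken.
	return dx * dx + dy * dy + dz * dz < range * range
end

local npc = { x = 0, y = 0, z = 0 }
local player = { x = 30, y = 0, z = 40 } -- exactly 50 studs from the NPC

print(isWithinRange(npc, player, 50)) -- false: 2500 is not less than 2500
print(isWithinRange(npc, player, 51)) -- true: 2500 is less than 2601
```

Because both sides of the comparison are non-negative, comparing the squares gives the same answer as comparing the actual distances.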

However, I did some benchmarking in Studio where I compared the min, average and max times to compute 10,000,000 square roots against 10,000,000 squares, and found that taking the square root was actually faster: on average it took about 65% of the time that squaring took.

Is there some secret optimisation that Roblox has done which has made this a lot faster?

Soo, not sure exactly what you’re asking, but for distance checks just do:

(onePosition - anotherPosition).Magnitude

No squares or roots needed

I’m not asking how to do a distance check. I’m talking about optimisation regarding distance checks. And the relative cost between computing square roots vs squaring.

Sorry, I don’t have much else to say.
Roots are faster than squaring, and distance checks are:

(onePosition - anotherPosition).Magnitude

Can you share what you’re using to test? Both x*x+y*y+z*z and x^2+y^2+z^2 are consistently faster than .Magnitude for me

Squaring is a basic arithmetic operation, just some multiplication, x*x.

Square root probably relies on an algorithm, which approximates the square root.

Thus it should follow that square root is more expensive. Are you sure your tests are accurate?

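-- Benchmark: 10 rounds of 1,000,000 multiplications vs. 1,000,000 square roots
local val = 0
local multTime = 0 -- total seconds spent squaring
local sqrtTime = 0 -- total seconds spent square-rooting
for _ = 1, 10 do
	-- Time squaring by multiplication
	local clock_start = os.clock()
	for i = 1, 1000000 do
		val = i*i
	end
	local timeTaken = os.clock() - clock_start
	multTime += timeTaken
	-- Time math.sqrt over the same range
	clock_start = os.clock()
	for i = 1, 1000000 do
		val = math.sqrt(i)
	end
	timeTaken = os.clock() - clock_start
	sqrtTime += timeTaken
end
print('mult time taken: ' .. multTime)
print('sqrt time taken: ' .. sqrtTime)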

mult time taken: 0.039
sqrt time taken: 0.088

Different FPU designs might yield faster results for square-rooting doubles (IEEE 754 requires a correctly rounded square-root operation, which most modern CPUs provide as a native instruction).

However, if the magnitude you’re comparing is constant, then skipping the square-root is definitely going to be faster. Non-constant comparisons may differ depending on platform, so just choose the most convenient in those cases.

If it isn’t too work intensive, I’d be interested to see a basic, vector-based method added to your comparison (as was posted above), which is what I assume most would default to.

Hey so I found the problem,

I was using math.pow(number, 2) to square instead of just number*number. It turns out math.pow is significantly slower, though I’m not sure why.

When I switched to number*number, it was indeed faster

Lua also has the ^ operator, which can do everything math.pow can but as a VM instruction instead of a library call. Squaring and square-rooting with it is consistently faster because of that.
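
A small sketch of the operator forms mentioned here, for comparison. It only checks that the operator and library forms agree on the result; it does not measure speed, and relative timings will depend on the runtime. The variable names are made up for illustration.

```lua
local n = 9

-- Squaring: plain multiplication vs. the ^ operator.
local squaredMul = n * n
local squaredPow = n ^ 2

-- Square root: library call vs. the ^ operator with exponent 0.5.
local rootLib = math.sqrt(n)
local rootPow = n ^ 0.5

print(squaredMul, squaredPow)
print(rootLib, rootPow)
```

To compare timings, either form could be dropped into the loop bodies of the benchmark posted earlier in the thread.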
