Native Luau Vector3 Beta

Xan_TheDragon · April 21, 2021, 6:41pm

I’ll be honest, I always treated Vector3 like a stack-allocated value. It caught me off guard when I read they were heap-allocated! This is unquestionably a change for the better and I look forward to its complete implementation. Are there any other “value types” (I put quotes for this exact reason) that are still heap-allocated which will receive this change?

MrSprinkleToes · April 21, 2021, 6:54pm

It’s awesome to see steps being taken toward optimizing things like this. Are there any plans to improve GUI performance? (Stuff like not redrawing everything when a property is updated.)

I’ve found UI performance to be limiting me many times recently, and I’d really like to see some work done on it.

metatablecatmaid · April 21, 2021, 7:03pm

I’m curious why Roblox datatype objects have their properties locked. Wouldn’t it be more efficient to have them unlocked so they can be reused without having to create an entirely new Vector3?

We’ve already seen this is possible with RaycastParams.

WheretIB · April 21, 2021, 7:06pm

We do have plans to improve Vector2/CFrame performance as well as other Roblox types in the future, but those will use a different implementation strategy.

Luaction · April 21, 2021, 7:14pm

I hope the same optimization will be done to other value types like CFrames, UDim2 and Vector2.
The performance gains are quite significant:

local cl = os.clock()
local vec3 = Vector3.new()
for i = 1, 10^7 do
	vec3+=Vector3.new(i,i,i)
end
print("Vector3 addition: ", os.clock()-cl)


local cl = os.clock()
local vec2 = Vector2.new()
for i = 1, 10^7 do
	math.random(0,i) -- extra call that the other loops don't make; it is included in the timing
	vec2+=Vector2.new(i,i)
end
print("Vector2 addition: "..os.clock()-cl)


local cl = os.clock()
local cf = CFrame.new()
for i = 1, 10^7 do
	cf+=Vector3.new(i,i,i)
end
print("CFrame/Vector3 addition: "..os.clock()-cl)


local cl = os.clock()
local cf = CFrame.new()
for i = 1, 10^7 do
	cf*=CFrame.new(i,i,i)
end
print("CFrame Multiplication: "..os.clock()-cl)

The resulting benchmarks with the old Vector3s:

Vector3 addition:  1.6341197000002 
Vector2 addition: 1.8228185999906  
CFrame/Vector3 addition: 1.6889015000197 
CFrame Multiplication: 1.8462797999964

The same with the new Vector3s:

Vector3 addition:  0.26172300000326
Vector2 addition: 1.8144201000105
CFrame/Vector3 addition: 1.0896642000007
CFrame Multiplication: 1.8471370000043

Vector2 math is now significantly slower than the native Vector3 math.
CFrame math is quite a bit faster as long as it involves Vector3s, but is otherwise just as fast/slow as before.
I hope that the same or similar optimizations will be done to CFrames especially.

sjr04 · April 21, 2021, 7:19pm

This is actually a good post.

local p = Instance.new("Part")
p.Position = Vector3.new(math.random(1, 10), math.random(1, 10), math.random(1, 10))
print(rawequal(p.Position, p.Position))

Since Vector3 is now a native type I get true in the console, since __index is not returning a new Vector3 each time (old behavior). This could (and I think Roblox should) allow for mutability of Vector3, since part.Position.[X/Y/Z] = num would actually do something now (with the old behavior it would seem like it does nothing, since it would just edit the new Vector3 since it isn’t cached).

tnavarts · April 21, 2021, 7:21pm

CFrames can’t have the exact same thing done unfortunately: Vector3s are small enough that we can afford to fit them inside of the same storage as the other value types, but CFrames are significantly larger (4x as large as the next largest type), so they would bloat up the size of every other value type if they used the exact same approach that Vector3 does.
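
A rough back-of-the-envelope sketch of the size argument, in plain Lua. The figures are assumptions for illustration and are not taken from the post: 4-byte floats, three components for a Vector3, and twelve for a CFrame (a position plus a 3x3 rotation matrix).

```lua
-- Assumed sizes, for illustration only.
local FLOAT_BYTES = 4
local VECTOR3_FLOATS = 3  -- X, Y, Z
local CFRAME_FLOATS = 12  -- position (3) + 3x3 rotation matrix (9), assumed layout

local vector3Bytes = VECTOR3_FLOATS * FLOAT_BYTES
local cframeBytes = CFRAME_FLOATS * FLOAT_BYTES

print(string.format("Vector3: %d bytes, CFrame: %d bytes, ratio: %dx",
	vector3Bytes, cframeBytes, cframeBytes / vector3Bytes))
```

Under those assumptions the ratio matches the "4x" figure: widening every value slot to hold a CFrame inline would roughly quadruple the payload reserved for each number, boolean, and Vector3 as well.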

wynnrar · April 21, 2021, 7:38pm

How does this affect garbage collection? Since they are no longer traditional Userdata, do they behave like strings where new Vector3s created during runtime are subject to GC but ones loaded in as constants kinda just sit? Are they even loaded as constants?

Halalaluyafail3 · April 21, 2021, 8:11pm

Vector3, being a native type, gets stored like a number: it isn’t a separate object but is stored as part of the value.

local a
a = 1 -- 1 is stored in a
a = 2 -- 2 is stored in a
-- the numbers aren't separate objects that are pointed to by a
-- they are contained in a

If Vector3 were to become mutable, then storing the Vector3 inside of the value wouldn’t be possible, it would need to be a different object which is referenced by the value. Essentially, if Vector3 was mutable then there would be little if any benefit to making it a special type distinct from userdata.

local a = Vector3.new(1,2,3)
local b = a

-- with Vector3 being a value type, b holds the same data as a

-- with Vector3 being mutable, then b references the same object as a
-- b can't reference a, because of lifetimes of variables not being dependent
-- upon references like normal objects

Vector3s being a native type like this means that they aren’t cached, in the same way that numbers aren’t cached: two equal values hold the same data, but they are not references to the same object.

Caching and mutability are also a very weird combination: if vectors were mutable but cached, would updating one vector update all vectors with the same contents?

local v1 = Vector3.new(1,2,3)
local v2 = Vector3.new(1,2,3)
local v3 = v1
print(rawequal(v1,v2)) -- assume true, because of caching
print(rawequal(v1,v3)) -- always true
v1.X = 2
print(rawequal(v1,v2)) -- would this be true?
print(rawequal(v1,v3)) -- always true

rawequal(v1,v2) returning true for a stateful object implies that updating v1 will be observed by v2.

A modification to a Vector3 updating an Instance is also very weird (why does Vector3 need to manipulate state of other objects??).
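
The caching-plus-mutability concern above can be reproduced in plain Lua with ordinary tables. This is only a sketch: `cachedVector` and its string-keyed cache are hypothetical stand-ins, not how Roblox caches anything.

```lua
-- Hypothetical interning cache: one shared table per distinct (x, y, z).
local cache = {}

local function cachedVector(x, y, z)
	local key = x .. "," .. y .. "," .. z
	local v = cache[key]
	if not v then
		v = { X = x, Y = y, Z = z }
		cache[key] = v
	end
	return v
end

local v1 = cachedVector(1, 2, 3)
local v2 = cachedVector(1, 2, 3)
print(rawequal(v1, v2)) -- true: both names refer to the same cached table

v1.X = 2
print(v2.X) -- 2: the write through v1 is visible through v2
```

After the write, the cache entry for "1,2,3" holds a vector whose contents are (2, 2, 3), which is the kind of inconsistency that makes the combination hard to reason about.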

sanjay2003 · April 21, 2021, 8:43pm

Finally. I can’t tell you the number of times I have been disappointed by userdata being returned instead of the actual type. This will make life much easier. Been a fan of the changes recently.

This is so underrated. This is incredibly useful.

Halalaluyafail3 · April 21, 2021, 8:47pm

Vector3 with NaN component(s) being used as a table key is very weird with this change.

local t = {}
local v = Vector3.new(0/0,0/0,0/0)
t[v] = 1

print(t[v]) --> nil
print(next(t)) --> -nan(ind), -nan(ind), -nan(ind) 1
for _ in next,t do end -- nothing (Luau optimizations, I presume)
for _ in next,t,nil do end -- invalid key to 'next'
print((next(t,(next(t))))) -- invalid key to 'next'
while next(t) do t[next(t)] = nil end -- infinite loop

Vector3s with NaN components should either error when using them as the key in assignments (like when using NaN as a key in assignments), or there should be a special case for NaN components (for compatibility).
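
For comparison, this is how a plain NaN number behaves as a table key in standard Lua, which is the precedent the suggestion above refers to. It is a sketch using numbers only, since Vector3 is not available outside Roblox.

```lua
local nan = 0/0
print(nan == nan) -- false: NaN never compares equal to itself

local t = {}
local ok, err = pcall(function()
	t[nan] = 1 -- assigning with a NaN key raises an error
end)
print(ok, err)

print(t[nan]) -- nil: reading with a NaN key is allowed and finds nothing
```

Because the assignment is rejected up front, a NaN key can never end up inside the table, so `next` never has to resume iteration from a key it cannot look up.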

zeuxcg · April 21, 2021, 9:13pm

We have a fix for this in the pipeline, which indeed generates an error when you use it. I think it ships next week.

iamtryingtofindname · April 21, 2021, 9:23pm

I believe they plan on doing what they did here on Vector2s (as they are very similar) if all goes well here. Vector2s still currently use the old system.

zeuxcg · April 21, 2021, 10:07pm

For part.Position.X = 1 to do what you want it to do, it needs to translate to a mutation of the Position property. There’s no way for us to implement this - part.Position isn’t a reference to the part’s internal memory, it’s a computed property. Immutability of Vector3 is good for both reference types and value types, but for slightly different reasons. Languages like C# that have value types (structs) strongly recommend immutable structs for much the same reason: in C#, when you say foo.Bar.Y = 5 and Bar is a struct, you aren’t mutating a field of foo - you are mutating a copy of that field.
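
A minimal plain-Lua sketch of the "computed property" point. `newPart`, `getPosition`, and `setPosition` are hypothetical names standing in for a part and its Position property, and the real engine does not use tables like this.

```lua
-- Hypothetical stand-in for a part whose Position is a computed property.
local function newPart()
	local x, y, z = 0, 0, 0 -- internal state, not exposed directly
	local part = {}

	-- Reading builds a fresh value from the internal state.
	function part.getPosition()
		return { X = x, Y = y, Z = z }
	end

	-- Writing the whole value is the only way to change the state.
	function part.setPosition(v)
		x, y, z = v.X, v.Y, v.Z
	end

	return part
end

local part = newPart()

local pos = part.getPosition()
pos.X = 1 -- mutates the copy only
print(part.getPosition().X) -- 0

part.setPosition({ X = 1, Y = 0, Z = 0 }) -- whole-value assignment
print(part.getPosition().X) -- 1
```

Even with a mutable vector, the write to `pos.X` never reaches the part, which is why the property has to be assigned as a whole.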

Fluffmiceter · April 21, 2021, 11:19pm

Very very excited for this to roll out! Don’t have anything meaningful to contribute, but my game is very math heavy and will definitely benefit performance-wise in pretty much every area of code. My most intensive algorithms are all written in x y z variables, which is obviously very bad for readability.

I really hope CFrames can get similar treatment soon, too. If that happens, things like welds and character rigs will become significantly faster to update, making a lot of complex animation structures way more viable performance wise. I do a lot of CFrame operations with my Rthro IK system, which takes a substantial amount of time, so def looking forward to native CFrames or something like that.

zeuxcg · April 21, 2021, 11:31pm

If you have performance-heavy code that uses vectors/cframes, please consider submitting it to us. We will use it for science.

(that is to say, our performance improvements are usually motivated by specific examples, so it would help us understand the remaining deficiencies wrt cframe math if we had a non-synthetic example of code that is performance intensive)

tbradm · April 21, 2021, 11:41pm

Is caching a method of a Vector3 and using it on an unrelated Vector3 affected by this?

For example, I think there may be a couple scripts in my game that do something like this:

local Dot = Vector3.new().Dot
-- ...
local a = Dot(b, c) -- where b is not the same vector from where Dot was extracted

Will this still behave as it did previously? Is this even good practice? Does this even boost performance like it used to?

I am just wondering if I need to purge my code of this. Thanks.

zeuxcg · April 21, 2021, 11:45pm

This should still work. Performance-wise it should be more or less equivalent to b:Dot(c).

Hexcede · April 21, 2021, 11:45pm

For some reason I have the beta feature, but enabling it has had no effect. type still returns userdata, and I see no performance improvements, not even within a few percent or so. Is this not enabled yet, or do I need to do something to get it to work? I’m a little confused.

Turns out I had to restart Studio. I’m enrolled in the beta channel, so the flag was already enabled, but it only took effect after startup.

But, this is absolutely perfect for a few issues I had with Vectors!

This is possible!! Hype moment, I’ve never been so excited over a data structure that holds three numbers.

One question: will the following be possible with this change? I am hoping this will be the case, as this is a pretty big reason the performance of some of my code is so slow. I reference cells by their position, and not being able to map cells to positions in this way means I have to implement a cache system, which is fairly slow.

local someTable = {
    [Vector3.new(1, 2, 3)] = 123
}
print(someTable[Vector3.new(1, 2, 3)]) -- I would love if this printed 123 and not nil

One problem I’m assuming this will fix was the memory usage and based on this announcement, that sounds like much less of a problem. Another was that vector operations were generally just extremely slow, especially accessing the X, Y and Z components of vectors. Again, sounds like this isn’t much of a problem anymore!

Overall this fixes a lot of the issues I had with vectors, and, I’m super excited to get testing and see what exactly this does to performance.

I do have one concern though, and, that’s that this could end up breaking my code sandbox if type ends up behaving differently for some other types. My code sandbox relies on a limited amount of types existing and if a new type comes about that I don’t take into account and I have no sufficient fallback this could allow for some really nasty sandbox escapes. The concern for me is if some type comes about that could be used to manipulate the sandbox, and then having no way to push a patch out to anyone that uses it. The sandbox is still in its early stages so I can certainly work around this in the future, but, yeah, that’s just one concern I have. (I could for example explicitly disallow use of unknown types by default so that the sandbox user would have to update to get support for it)

Brian1KB · April 22, 2021, 12:27am

There appears to be some particularly poor performance when you generate a table with Vector3 keys that have “similar” / “close” values. My presumption is that a large number of hash collisions are occurring. Here is an example:

local positions = {} -- table keyed by Vector3 position

local start = os.clock()
for _ = 1, 250_000 do
	-- an upper bound of 5000 keeps the random keys close together
	local key = Vector3.new(math.random(1, 5000), math.random(1, 5000), math.random(1, 5000))
	positions[key] = {
		Identifier = 0,
		VisibleFaces = {},
	}
end
local finish = os.clock() - start
print("Finish " .. tostring(finish))

If the upper bound of the random is 5000, this takes even longer than the script exhaustion timeout period. If the upper bound of the random is 50000, then the loop instead takes just 1.4 seconds. If you multiply the vector by 100, it takes even less time (just 0.5 seconds), presumably due to fewer hash collisions.

I may be off on my conclusion, but I think this is going to be a real issue for people.
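
One possible workaround, sketched under the assumption that the coordinates are non-negative integers below a known bound: pack the three components into a single number and use that as the table key. The `packKey` and `unpackKey` helpers and the `BASE` constant are hypothetical names, and whether this avoids the slowdown described above has not been measured here.

```lua
-- BASE must exceed the largest coordinate; 8192^3 = 2^39 stays exact in a double.
local BASE = 8192

-- Combine three small non-negative integers into one number.
local function packKey(x, y, z)
	return (x * BASE + y) * BASE + z
end

-- Recover the three components from a packed key.
local function unpackKey(key)
	local z = key % BASE
	local rest = (key - z) / BASE
	local y = rest % BASE
	local x = (rest - y) / BASE
	return x, y, z
end

local positions = {}
positions[packKey(12, 3400, 5000)] = { Identifier = 0, VisibleFaces = {} }

print(positions[packKey(12, 3400, 5000)] ~= nil) -- true
print(unpackKey(packKey(12, 3400, 5000)))
```

The trade-off is that the key no longer carries the vector itself, so code that iterates the table has to call `unpackKey` to get the position back.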