This happens because division for both Vector types was implemented as a multiplication by (1/x) to avoid unnecessary divides. On modern hardware with vector instructions, this is probably counterproductive.
The Vector3 variant is marked as inline so thanks to fast-math the optimizer manages to undo the mistake, but the Vector2 variant is not marked as inline, so fast-math can’t reach across the function boundary and you’re still left with an imprecise result.
To answer the question: The code has always been the same but the behavior probably changed at some point thanks to the optimizer interpreting things differently.
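For anyone curious why multiplying by (1/x) is less precise than dividing by x, here is a minimal sketch. It assumes IEEE 754 single-precision floats with no fast-math; `Vec2`, `divide_direct`, `divide_by_reciprocal`, and `demo` are hypothetical names made up for illustration, not the engine's actual code.

```cpp
#include <cassert>
#include <cstdio>

// Hypothetical stand-in for a 2D vector type; not the engine's Vector2.
struct Vec2 {
    float x, y;
};

// Divide each component directly: one rounding step per component.
Vec2 divide_direct(Vec2 v, float s) {
    return {v.x / s, v.y / s};
}

// Multiply by a precomputed reciprocal: the reciprocal is rounded once,
// then each product is rounded again, so the result can be off in the last bit.
Vec2 divide_by_reciprocal(Vec2 v, float s) {
    float inv = 1.0f / s;
    return {v.x * inv, v.y * inv};
}

void demo() {
    // volatile keeps the compiler from folding the divisor at compile time.
    volatile float s = 3.0f;
    Vec2 v{5.0f, 9.0f};
    Vec2 a = divide_direct(v, s);
    Vec2 b = divide_by_reciprocal(v, s);
    std::printf("direct:     %.9g %.9g\n", a.x, a.y);
    std::printf("reciprocal: %.9g %.9g\n", b.x, b.y);
    assert(a.y == b.y);  // 9 / 3: both approaches agree
    assert(a.x != b.x);  // 5 / 3: the two approaches differ in the last bit
}
```

Calling `demo()` prints both results side by side. The difference is tiny, but it is enough to make an exact equality comparison fail.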
This is so very interesting! Thank you for looking into this!
I don’t know what the process is for submitting a change to the Roblox source code. Is this something that seems worth fixing, or should I accept the behavior and move on?
It’s worth fixing (in that it’s indicative of optimization failures). However, you should probably also move on for now: the problem is in the low-level Vector2 type used throughout the engine, so it won’t be possible to just throw in a simple fix for it. Some due diligence to confirm that changing it doesn’t break anything will be required.
BTW, note that we don’t implement fast-math. Breaking IEEE math is bad when you’re supporting multiple platforms and need consistent semantics. Imagine if we tried to support server-authoritative replay when clients and servers always disagreed about how floats work…
Oh ok, this is very interesting. Yeah this makes sense, and I agree that we should want repeatable and consistent math.
Tangentially, I have implemented server-authoritative physics, animation, etc., for a game we’ve been working on, and, unfortunately, servers and clients always disagree about how subnormal floats work.
Slightly off-topic, but is there a reason Vector2 doesn’t use the native vector type? Is there just not a net performance gain or is it something technical?
Yeah, the Physics system is messing up flush-to-zero mode because of a threading problem. Agreed, it’s a big predictability hazard.
It’s inconsistent even on the same machine because flush-to-zero is set on a per-thread basis - Physics will run on some thread and set the FTZ mode, your script runs on that thread and gets flush-to-zero behavior, and then your script switches to a different thread the next frame and now the behavior is different.
We’ll fix this, probably by just forcing flush-to-zero everywhere.
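To illustrate what the two behaviors look like, here is a rough sketch that emulates flush-to-zero in software. The real mode is a per-thread CPU setting, as described above, and this sketch does not touch it. `flush_subnormal` and `half_of_smallest_normal` are hypothetical helpers made up for illustration, and the sketch assumes the program itself runs with the default IEEE gradual-underflow behavior.

```cpp
#include <cassert>
#include <cfloat>
#include <cmath>

// Hypothetical helper: emulates flush-to-zero by replacing any
// subnormal value with a zero of the same sign.
float flush_subnormal(float v) {
    return std::fpclassify(v) == FP_SUBNORMAL ? std::copysign(0.0f, v) : v;
}

// Halve the smallest normal float. With gradual underflow (FTZ off),
// the result is a nonzero subnormal value.
float half_of_smallest_normal() {
    volatile float tiny = FLT_MIN;  // volatile: force a run-time multiply
    return tiny * 0.5f;
}
```

A thread without FTZ sees `half_of_smallest_normal()` as a small nonzero value, while a thread with FTZ would see what `flush_subnormal` returns for it, which is zero. A script that hops between such threads can therefore get different answers from the same arithmetic.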
Sorry for going slightly off-topic, but I am deeply interested in how the Roblox engine works at the system level.
I’ve been learning languages like C++ and Rust, and it’s always small, simple details like this in how software is written that boggle my mind.
The crazy things that programmers sometimes do to squeeze a few extra CPU cycles or instructions out of a program are wild.
May I ask, will there eventually be a staff post or place where we can ask questions or explore these things?
The answer to that one would be “not on Roblox”. Details of the C++ implementation of the engine are usually a bit out of scope of the discussion here.
At least not directly… there is a fair bit of popular open source community tooling for Roblox written in lower level languages (as I’m sure you’re aware), so that can be one place to satisfy your curiosity.