Subtle behavior change in __eq

Warning: you need to understand metatables and, possibly, IEEE 754 arithmetic to understand this post. If you don’t, just skip it :wink:

Background

In Lua 5.1, the semantics of the equality checks (== and ~=) are such that when an object that’s being compared has an __eq metamethod, the metamethod is only called when the object isn’t compared with itself. In other words, a == b works approximately as rawequal(a, b) or getmetatable(a).__eq(a, b). This means that __eq isn’t called when you compare an object with itself.
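The approximation above can be written out as a plain function. This is only a sketch of the rule as described here, not the actual VM code: the name lua51_equals is made up for illustration, and it ignores the extra Lua 5.1 requirement that both operands share the same __eq metamethod.

```lua
-- Hypothetical helper modelling the Lua 5.1 rule described above.
local function lua51_equals(a, b)
    -- Identical references (or equal primitives) short-circuit:
    -- __eq is never consulted in this case.
    if rawequal(a, b) then
        return true
    end
    local mt = getmetatable(a)
    -- getmetatable may return a non-table for locked metatables, so check the type.
    local eq = type(mt) == "table" and mt.__eq
    if eq then
        return not not eq(a, b) -- the result of __eq is always coerced to a boolean
    end
    return false
end

local calls = 0
local mt = {
    __eq = function(x, y)
        calls = calls + 1
        return x.id == y.id
    end,
}
local p = setmetatable({ id = 1 }, mt)
local q = setmetatable({ id = 1 }, mt)

print(lua51_equals(p, q), calls) -- true 1 (different objects, __eq is called)
print(lua51_equals(p, p), calls) -- true 1 (same object, __eq is skipped)
```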

This was true of the Roblox Lua implementation as well, but it was accidentally changed for one particular case when Luau was released in 2019: built-in userdata comparison. For Vector3, CoordinateFrame and similar objects, v == v was in fact calling the built-in __eq metamethod.

This is significant because, since 2019, games have started comparing Vector3 objects to themselves to check for NaN values (v ~= v is true if any component of the vector is NaN). You may think this is a hack, but it actually makes sense: the following code has always been a correct way to check whether any component of a Vector3 is NaN:

a.X ~= a.X or a.Y ~= a.Y or a.Z ~= a.Z

If you’re confused by this, it’s alright - NaN semantics can be confusing! But very smart people in 1985 decided that this is a good idea, and pretty much all computers and programming languages today implement these semantics for floating-point computation.
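For readers who have not seen it before, the self-comparison trick looks like this on plain numbers. The helper name isNaN is made up for this sketch.

```lua
local nan = 0 / 0 -- IEEE 754: dividing zero by zero produces NaN

print(nan == nan) -- false
print(nan ~= nan) -- true

-- Hypothetical helper: NaN is the only number that is not equal to itself.
local function isNaN(x)
    return x ~= x
end

print(isNaN(nan), isNaN(1.5), isNaN(math.huge)) -- true false false
```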

This brings us to:

What changed?

As of today, we’re making this behavior official and consistent. Instead of only calling __eq for self comparisons in rare cases, we’re now always calling __eq even if the object is compared with itself.

Why is it a good idea?

This change removes the difference in semantics between builtin types and custom table-based types. This means that if you were to implement a Vector5 type by using a table with a metatable, you’re going to get the same behavior wrt equality comparison as Vector3. This makes builtin types less special, which is great.
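As a rough sketch of what such a type could look like: Vector5 here is entirely hypothetical, stored as a five-element array with a component-wise __eq. The result of v ~= v depends on which semantics the runtime implements, so that line is printed without a stated result.

```lua
-- Hypothetical table-based Vector5 type, for illustration only.
local Vector5 = {}
Vector5.__index = Vector5

function Vector5.new(a, b, c, d, e)
    return setmetatable({ a, b, c, d, e }, Vector5)
end

-- Component-wise equality. Any NaN component makes the result false,
-- because NaN == NaN is false.
function Vector5.__eq(lhs, rhs)
    for i = 1, 5 do
        if lhs[i] ~= rhs[i] then
            return false
        end
    end
    return true
end

local v = Vector5.new(1, 2, 0 / 0, 4, 5)
local w = Vector5.new(1, 2, 3, 4, 5)

-- With the behavior described in this post, this calls __eq and detects the NaN;
-- under Lua 5.1 semantics the same expression skips __eq.
print(v ~= v)

-- Calling the metamethod directly gives the same answer on either runtime.
print(Vector5.__eq(v, v), Vector5.__eq(w, w)) -- false true
```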

It also keeps the existing games working. We initially wanted to revert the behavior back to be in line with Lua 5.1, but existing games had code that wasn’t compatible with this change. And the existing situation was very inconsistent, so we had to change it one way or the other - this way should be less disruptive than alternatives while still resulting in a consistent comparison behavior.

It also just makes sense. For example, Lua 5.1 won’t call __eq when you ask it v == v, but will call __le when you ask if v <= v. It makes more sense to treat all types and operators uniformly without magical behavior.
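A pair of counters makes the asymmetry visible. This is an illustrative sketch with a made-up metatable; the count reported for __eq depends on which semantics the runtime implements.

```lua
local eqCalls, leCalls = 0, 0

local mt = {
    __eq = function(a, b)
        eqCalls = eqCalls + 1
        return rawequal(a, b)
    end,
    __le = function(a, b)
        leCalls = leCalls + 1
        return true
    end,
}

local v = setmetatable({}, mt)

local le = v <= v -- always goes through __le
local eq = v == v -- goes through __eq only with the new behavior

print(le, eq)  -- true true
print(leCalls) -- 1
print(eqCalls) -- 0 under Lua 5.1 semantics; 1 is expected after this change
```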

This change is effective as of April 9th.

Will my game still work?

Yes!
… almost definitely.

We’ve had some reports of games that may be affected by this because they implemented the __eq metamethod incorrectly, but the problem happened to never occur before.

For example, consider this code:

getmetatable(t).__eq = function (a, b)
    return a == b
end

This code isn’t really necessary, but is it correct? The answer is no! The == operator inside the __eq implementation will just call __eq again, resulting in a stack overflow. The correct implementation here is return rawequal(a, b).
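Spelled out in full, the rawequal version might look like the sketch below. The metatable and the two objects are made up for the example; the results are the same before and after this change.

```lua
local mt = {}

mt.__eq = function(a, b)
    -- rawequal compares identity without invoking __eq, so there is no recursion.
    return rawequal(a, b)
end

local t = setmetatable({}, mt)
local u = setmetatable({}, mt)

print(t == t) -- true
print(t == u) -- false
```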

However, the code above might never have been called, which could happen if you only ever compared objects of this type to themselves. In that case the error was hidden in plain sight, and now it’s visible: before this change we wouldn’t call __eq when you compared an object of this type to itself (t == t), and now we will.

Please let us know if there are other unintended cases like this; we currently believe that no correct code is affected. Needless to say, if your game never implements __eq, it is not affected by this at all.


This release was quite a huge “oof” to me where I never tested with __eq in my core library for custom objects, and this suddenly broke like half my projects because it was attempting to call a nil function when doing a == a. Took 2 hours for me to update and re-deploy everything that broke. Probably my most important project, Nexus VR Character Model, was one of the projects that broke and suddenly hit dozens of games that use it (it auto-updates; servers newer than ~4:00 PM EST are not broken anymore).


I’m guessing there is still the limitation that __eq only works if the metatables are the same? Does Luau have the freedom to change this at some point, for example if I want to compare a theoretical Vector4 and Vector5, each with some __eq implementation?


We do have the freedom to relax the __eq behavior for different types, but this has both performance and compatibility implications for correct code. This update should not break correct code; my understanding of your issue is that you had a prior bug in the __eq implementation that was never visible until this update. If we changed __eq behavior for mismatching types, on the other hand, there would be all sorts of other questions about correct code, especially “is it okay to compare an object with __eq to nil?”

The use case for comparing values of different types is well understood though, it’s just not clear if this can or will happen yet. Definitely not in the plan for now.


My problem was a bug in my code; it just took until now to manifest. For using it on different types, I can see why it could cause problems, and how the specification could be ambiguous for things like comparisons to nil.


Nice to hear that one of the weird and inconsistent aspects of Lua is officially fixed. I encountered this issue in an old project where I had to create a 300 character if statement (had to check each X, Y and Z component of a custom class for NaN).


Will this change be documented in Compatibility - Luau or somewhere else?

If this subtle behavior change was feasible, what other subtle behavior changes are feasible? A subtle behavior change of future Lua versions is erroring in an error handler recalling the error handler, would this be feasible?


Will this change be documented in Compatibility - Luau or somewhere else?

Maybe, although we don’t have an easy place to just put it in there. FWIW we’re working on a general set of improvements to the documentation site and availability, eventually we should have a full manual hopefully. We’ll figure out what to do in the meantime, it’s good to at least document our general thought process for compatibility on that page so we’ll do that.

Noteworthy is that this change happened because of a combination of an existing long-standing bug/incompatibility with userdata (the aforementioned Luau 2019 rollout caused this, we just never noticed until a few weeks ago) and a feature request (Call __eq even when tables are rawequal); things like this happen, although rarely - we don’t have a rigid playbook in situations like this and adapt to the circumstances.

If this subtle behavior change was feasible, what other subtle behavior changes are feasible? A subtle behavior change of future Lua versions is erroring in an error handler recalling the error handler, would this be feasible?

As a rule of thumb, while we try to preserve compatibility, we also try to move the platform and the language forward, so we try to find a balance. For example, :: vs as was both a theoretical (“this might break games”) and a practical (“oops, we actually know for a fact as breaks games”) decision. When there’s an issue that would be great to correct and we think - or know - the impact will be minimal, we’re likely to try. For example, this update [to our knowledge] only breaks code that was incorrect to begin with, which is worse than not breaking anything (these updates usually don’t go out with a dedicated announcement :D) but much better than breaking correct code.

So - if you have feature/change requests, never hesitate to bring them up. Some things are likely to be harmless, and if useful can be done. Some things are likely to be harmful; some are in the middle.

The specific question you ask I don’t know the answer to off the top of my head; I remember the details of error handling were rather complex to resolve when we were adding first-class yieldable pcall/xpcall support so this area might be full of landmines, but also might be reasonably simple and if it makes semantics cleaner, we could do that.


Would it be a bad idea to only call __eq on equal values if the code has a == a semantics? That’s pretty much the only use developers have adopted so far and would break less code overall.

Alternatively there could be an __isnan metamethod that returns a boolean for use-cases like Vector5; This would still need to check the metatable for referentially equal values, but could be more performant than calling __eq for comparison-heavy code.

NaN isn’t very intuitive to begin with so I’ll support whatever has the best performance and implementation simplicity/consistency on average.


Generally I only use __eq if I need to or could benefit from overloading equality comparators (and at that point, I usually do all equality operators and arithmetic operators when possible).

Coming from a lot of time in C#, this is very familiar territory and this new behavior makes its usage orders of magnitude more consistent with what I’m used to.

It puts manually defining __eq in the same strata as

public static bool operator ==(MyObject left, MyObject right) {
    // Usually I just direct to a manual override of .Equals here, 
    // commonly one implementing IEquatable<T>, but that's C# jargon more than Lua
}

And makes using rawequal akin to object.ReferenceEquals(); (at least, close enough to that, dropping a few semantics of it all).

Overall speaking, I’m very happy with this change.

I think habits from C# will translate over right away.

getmetatable(t).__eq = function (left, right)
    if rawequal(left, right) then return true end
    if rawequal(left, nil) then return false end
    if rawequal(right, nil) then return false end
    -- Actual comparison code here.
end

Didn’t even know this existed, so an update will definitely push me to use it now haha

Thanks for clearing it up. Looking at it from the OOP perspective, it makes sense that __eq should be called even if a and b are the same reference. The purpose of __eq is to identify that in the first place! I feel like if you have __eq defined, then you’re overriding any checks that determine whether or not the reference is the same, and __eq is completely responsible for determining if a and b are equal; so, Lua shouldn’t keep __eq from firing just because a and b are the same reference.


What can I do if my game stops working due to this change? How soon should I be affected?


As said in the post:

  • This change has already been live for a few days
  • If you don’t implement the __eq metamethod anywhere, you don’t need to worry about this

This has been documented on Compatibility - Luau along with a couple of others; it’s likely that there are a few other differences that we’re missing there, so this will be updated in the future to hopefully be more comprehensive (as will the rest of the page, as we keep discovering changes in later Lua versions that weren’t listed in their release notes and documenting them).
