Release Notes for 435

Some context on BulkMoveTo.

The release notes description is actually a bit inaccurate as the feature went through a few changes as I developed it and I forgot to update the release notes description after updating the API to the final version.

As for how it works, the enum argument gives you full control over the behavior:

  • FireAllEvents => The move will behave exactly the same as if you had just looped over the parts and assigned the .CFrame property on each of them, though with slightly better performance because the reflection layer only has to be crossed once in total, instead of once for each property set.

  • FireCFrameChanged => The move behaves almost the same as with FireAllEvents. The only difference is that you won’t get Position / Orientation / Rotation changed events. The CFrame changed event you do get is sufficient for replication / undo history etc. to work. This variant will have much better performance than the above in Studio, where multiple parts of Studio are listening for every property change being made on an instance, since four times fewer property changes have to be pushed through those handlers.

  • FireNoEvents => This lets you move parts purely physically. They will move to the new location, but only their physics information will be updated; no events will be fired. Since replication and change history are based on property changed events, changes made with this mode will not be replicated, and will not be part of the change history. This lets you move parts around to speculatively test for collisions / raycast hits much more cheaply than you could before.
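
As a rough illustration of the call shape described above, here is a minimal sketch. The helper name shiftParts and its parameters are hypothetical; the only engine call is BulkMoveTo, which takes a list of parts, a matching list of CFrames, and the mode enum. The world and mode are passed in as arguments so the helper itself does not assume any globals.

```lua
-- Hypothetical helper: shift every part by the same offset using one bulk call.
-- `world` is expected to be a WorldRoot such as workspace, and `mode` an
-- Enum.BulkMoveMode item such as Enum.BulkMoveMode.FireCFrameChanged.
local function shiftParts(world, parts, offset, mode)
	local cframes = {}
	for i, part in ipairs(parts) do
		-- Build the target CFrame for each part, in the same order as `parts`.
		cframes[i] = part.CFrame * offset
	end
	-- A single call moves the whole list, so the reflection layer is crossed once.
	world:BulkMoveTo(parts, cframes, mode)
	return cframes
end

-- In Studio this might be called as (assumes `selectedParts` is a list of BaseParts):
-- shiftParts(workspace, selectedParts, CFrame.new(0, 5, 0), Enum.BulkMoveMode.FireCFrameChanged)
```

Swapping the last argument for a different mode changes which events fire without touching the rest of the code.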

The API is introduced so that the Lua Draggers Beta can much more efficiently move the parts in the selection around when dragging. Dragging large models, thanks to this API, should be almost 5x more performant than with the legacy draggers.

30 Likes

This brings up some interesting ideas to me, and I’m curious since you’re directly working with this atm… The performance benefit you can get by suppressing changed events sounds super useful to be able to control as a dev, and is the strongest point of my terrain generation system because I artificially suppress these events for the most part (I use some trickery by grouping instances together under one parent under nil and setting properties while the instance group is in nil, then parenting the whole thing in one go). My terrain system demonstrates this well because I can generate a massive grid of several thousand chunks in only about six seconds, give or take (in comparison to roughly 15 seconds to a minute or two traditionally, depending on the method used to balance server strain and how well fps is balanced client-side), and with no noticeable fps reduction even on my old 2011 ThinkPad, which makes me quite happy.

Would there be any potential consideration for an extension of this that could accept a list of Instances and a map of properties for example? My assumption is that in that way this would be more expensive than BulkMoveTo due to the need to process the property strings and validate that an instance has said property, but, “baking” the info into an API instance could potentially work pretty well without causing much annoyance in terms of the actual API.

Would something like this example be in any way feasible?

local classNames = {"Instance", "BasePart"} -- Only deal with these inherited classes
local arrayOfProperties = {"Name", "Archivable", "Parent", "Color"} -- Only deal with these properties
local propertySetter = SomeApiHolderObject:GetBulkSetter(classNames, arrayOfProperties) -- "Bake" it into a single setter API. Regular Instances don't have a Color property, so they could just be ignored by the setter, probably with a slight reduction in performance.

propertySetter:Apply({someModel1, someModel2, somePart}, {"ExampleName", false, workspace, Color3.new(0.3, 0.72, 0.27)}, BulkSetMode.FireAllEvents) -- Apply some properties to some instances in the same order they were specified

The one issue I can see with some event suppression is that certain mechanics might rely on property changes internally, which could cause issues, for example with Parent. So BulkSetMode could have a FireSafeEvents value, for example, which would fire the fewest events it is “allowed to.”

But, this is just a loose idea I came up with on the spot, and I’m not confident that I know enough about the engine, so I kind of have the feeling that there will be complications with this I haven’t thought of. I am kind of curious on whether or not something like this could potentially be achieved in a way that’d be worthwhile to develop.

5 Likes

I considered something like that, but decided it’s too much complexity for too marginal a benefit.

You might want to take a look back at your terrain generator code and measure how much of the slowdown was the .CFrame set vs any other sets. Moving parts is what benefits by far the most from this, due to the four connected properties in Position, Rotation, CFrame, and Orientation.

4 Likes

Why does Lua introduce anything? These are the same people that have local<const>foo as syntax for constants in 5.4.

With regards to bitwise operators, I’m content with having functions for these things (since it comes at such a cost to have the actual operators) but it’s a little annoying that bit32 only works on 32 bits. That’s one of the tangible benefits for proper integers and operators.

As an example, I’ve been writing a module for reading/writing a bitstream and doubles have a mantissa that’s 52 bits. As a result, I can’t use bit32 to extract bytes from it without some work, which is added complexity I didn’t want (the part of the code in question is here if anyone is curious).

Whether or not this is a common enough problem that it warrants solving is debatable though.

I think it’s now local foo <const> :smiley:

Yeah to encode that you’d need to split it into two 32-bit values, but this doesn’t seem too bad. Similarly to the comment above, you could use bit32.extract - it’s a neat way to make shift+and sequences more legible.

1 Like

That is not much better!

I could switch to using bit32.extract, though in this case it wouldn’t help all that much since it’s just a right shift and an AND for most of the operations here, which are obvious enough. Either way, it’s more of an annoyance than an actual hindrance. It’s just a modulo and a (floor) division operation to get two integers that are 32 bits or less anyway.
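
For anyone following along, the modulo / floor-division split mentioned above might look roughly like this. It is only a sketch using plain arithmetic; the function names are made up for illustration, and it assumes a non-negative whole number below 2^53 so that a double can represent it exactly.

```lua
local TWO_32 = 2 ^ 32

-- Split a non-negative whole number below 2^53 into high and low 32-bit halves.
local function splitHalves(n)
	local high = math.floor(n / TWO_32) -- everything above bit 31
	local low = n % TWO_32              -- the lowest 32 bits
	return high, low
end

-- Read byte `index` (0 = least significant) from a value below 2^32,
-- i.e. the arithmetic equivalent of a right shift followed by an AND with 0xFF.
local function byteAt(half, index)
	return math.floor(half / 2 ^ (8 * index)) % 256
end

local mantissa = 2 ^ 52 - 1 -- all 52 mantissa bits set
local high, low = splitHalves(mantissa)
```

Each half fits in 32 bits, so in Luau either one could also be handed to the bit32 functions afterwards.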

These compound operators look really cool!

However, will we ever see JavaScript-style assignment chaining?

local x = y = ...

If I had to guess I’d say that it’s unlikely this will happen, but not impossible. It is likely easy to set up a little transpiler plugin to do this if you’d want to do it yourself:

local x = y = 5
a = b = c = 7

would become:

local y = 5
local x = y
c = 7
b = c
a = b

1 Like

In Lua, assignments are statements, not expressions. We intentionally preserved this for compound assignments, as we believe that this makes the syntax less error prone (and harder to abuse).

Spot the bug:

let a = 2
if (a = 1) { console.log("oh no"); }

So - no.

10 Likes

Will UICorner support altering the radius of individual corners? Judging by the DevHub, it only accepts a UDim, so I assume not. I can imagine it being extremely helpful if individual corners could be altered though - I do so in my UI via the use of 9-slices.

3 Likes

It looks like that’s a no at the moment, but I’d love to see maybe an IndividualCorners property that’d enable properties such as TopLeftCornerRadius, TopRightCornerRadius, BottomLeftCornerRadius, and BottomRightCornerRadius instead of CornerRadius, sort of like how constraints and some other instances hide properties unless you enable them to avoid clutter.

2 Likes

It would have to be more than just a little bit useful though, as that would have performance implications for all GUIs using UICorner. The corners are implemented at the shader level, and any additional branching shaders have to do has a real cost.

4 Likes

Would a function that could apply multiple properties at the same time help with performance? I’m creating my own terrain generator and a huge amount of time is spent setting the Size and CFrame of wedge parts.

This is just setting the Size and CFrame; the CFrame and size vector are already created in the neon green label to the left of “triangle”.

Setting the size shouldn’t cost that much. Are you setting the size after parenting the parts to the world? If so, then change your code to set the size (and every other prop including CFrame) before parenting and the setting of size should be a lot more performant.

A terrain generator isn’t a case that such an API would help with, because you’re generating fresh instances, and assigning properties of instances which haven’t been parented to the world yet is about as cheap as it can get.

There will always be some cost, since putting all those parts into the world requires all the physics information to be set up for them.

So if I’m reusing wedge instances instead of creating/destroying, would it be better to parent to nil, set size & cframe then reparent to the world?

No, because then the engine will have to re-do all of the work it did parenting it to the world and setting up physics / rendering data for it.

You could still create a prototype wedge part, and :Clone() from it before ever parenting it to the world, there’s nothing stopping you from cloning from something that isn’t in the world yet.
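
A minimal sketch of that ordering, assuming a prototype wedge that has never been parented: clone it, set every property, and only assign Parent at the very end. The helper name spawnWedge is hypothetical.

```lua
-- Hypothetical helper: clone an unparented prototype, configure the clone,
-- and only then put it into the world.
local function spawnWedge(prototype, size, cframe, parent)
	local wedge = prototype:Clone()
	-- These sets happen while the clone is not in the world yet,
	-- which is about as cheap as property assignment gets.
	wedge.Size = size
	wedge.CFrame = cframe
	-- Parent last, so the engine sets up physics / rendering data only once.
	wedge.Parent = parent
	return wedge
end

-- In a game this might be used as:
-- local prototype = Instance.new("WedgePart") -- never parented
-- spawnWedge(prototype, Vector3.new(1, 4, 4), CFrame.new(0, 10, 0), workspace)
```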

This is great! Setting .CFrame is a pretty significant bottleneck in my game. Most of my parts are local and graphics-only so they don’t need any expensive updates. I plan to clear the 2 tables and reuse them every frame to reduce allocation overhead.

Would it be reasonable to request an optional offsetList parameter? Almost every use-case (including Lua Draggers Beta) is going to need an additional ToWorldSpace operation for each part anyways, and it would greatly reduce the number of CFrames that need to be allocated and garbage collected for each transformation.

1 Like

This matters way less than you think as long as you allocate your tables in a single allocation using table.create rather than growing them from an empty table. You may even make things slower if you try to dynamically change the number of things in the array and have to loop over part of it to clear it.

The CFrame garbage didn’t seem to be a bottleneck for me in my use case so I didn’t investigate something like that. I could reconsider this, you’re right that most use cases would benefit from it.

(And @tnavarts, I believe this is what you were saying.)
I believe the most performant way to clean up arrays is actually to discard the array. Sure, it’ll cause more GC, and sure, it’ll need a whole new table allocation. However, look at what has to happen in each case.

Firstly, on GC you may get a ton of CFrame collections when your table is discarded, but this will also happen if you clear it. An easy solution is to hold all of your CFrames in a “keep-alive” table to prevent them from being collected and slowly clean out that table, but this isn’t really necessary, as GC is quite speedy in the majority of cases.

Secondly, clearing out a table takes as many iterations as the table has entries. An allocation can usually just multiply a couple of numbers together and pass the result to an allocation function. So, ignoring what goes on inside the allocation itself since that can’t be controlled, that’s one operation versus hundreds or thousands of iterations to clear out the table, and thus it’s likely many times more efficient to simply discard the table and leave the rest to the garbage collector and the allocation functions.

Additionally, I prefer to avoid table.insert in as many cases as it might make sense as it’ll make an extra __len call every time which isn’t exactly the most performant in some of my cases (particularly massive data processing in my compression algorithm benchmarks) where I could be doing upwards of a hundred thousand to maybe even a million table.insert calls depending on input data.

Lastly, rather than clearing out your table and rebuilding, you could instead use two tables. One which maps keys to values and values to keys in the array, and one that is the array. This way if you need to update the CFrame of a specific object, simply lookup its array index and bam set the CFrame in the cframes table. If you want to delete the entry,

-- Note, again it'd probably be best to use table.create if you want to populate the table. If you need to populate it and it has old stuff you want to keep, table.move is the best option and you can just allocate a new table with the add length and the old length and copy the values from the old table to the newly allocated table.
local cframes = {}
local parts = {}
local indexMap = {}
local bulkMoveListLength = 0

-- In order to BulkMoveTo
workspace:BulkMoveTo(parts, cframes, Enum.BulkMoveMode.FireCFrameChanged)

-- In order to add an entry to the table
local part, cframe -- Assume these aren't nil
bulkMoveListLength += 1 -- Increment the length
local index = bulkMoveListLength -- Store it in a new variable for the index
-- Store the part and cframe in the lists
parts[index] = part
cframes[index] = cframe
-- Map the indexes to their values in the arrays
indexMap[part] = index
indexMap[cframe] = index -- Note: Even if two CFrame values are equal (cf1 == cf2), they are still unique userdatas, which is useful in this case. However, setting two parts' CFrames to the same CFrame object will not work, so it's important to copy the CFrame in that case or simply skip the cframe index in the map.

-- In order to update a part's cframe
local part, cframe -- Again, assume non nil
local index = indexMap[part] -- This will be the old index
indexMap[cframe] = index -- See note above too
cframes[index] = cframe
-- You can do a hybrid of this and the above by checking "if indexMap[part]" (so if index here for example)
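
The post above trails off before describing deletion. One common way to remove an entry from parallel arrays like these, not necessarily what the author had in mind, is a swap-remove: move the last entry into the vacated slot so the arrays stay dense. This sketch uses its own hypothetical names (addEntry, removeEntry) and maps only parts to indexes.

```lua
local parts, cframes, indexMap = {}, {}, {}
local length = 0

-- Append a part / CFrame pair and remember where it lives.
local function addEntry(part, cframe)
	length = length + 1
	parts[length], cframes[length] = part, cframe
	indexMap[part] = length
end

-- Remove a part in constant time by moving the last entry into its slot.
local function removeEntry(part)
	local index = indexMap[part]
	if not index then
		return -- not tracked, nothing to do
	end
	local lastPart = parts[length]
	parts[index], cframes[index] = lastPart, cframes[length]
	indexMap[lastPart] = index
	-- Drop the now-duplicated tail slot and the removed part's mapping.
	parts[length], cframes[length] = nil, nil
	indexMap[part] = nil
	length = length - 1
end
```

The order of entries changes on removal, which should not matter here since each part is paired with its own CFrame by index.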

Where did you hear this? According to the Lua 5.1 source, table.insert never does a call to __len. In fact, it just checks the length of the table without invoking metamethods. If your table only has an array part, this behavior is O(1).