Luau Native Code Generation Preview Update

tnavarts · May 9, 2024, 10:46pm

It is all CPU. The GPU could do it faster, but with much less flexibility: To do it on the GPU it would have to be structured something like you submitting a batch of casts to do, and getting back the results next frame rather than immediately.

Ethanthegrand14 · May 9, 2024, 10:48pm

I can see a lot of developers using something like a BatchRaycast API. There are many scenarios and games of mine where I have to cast many, many rays per frame. GPU processing would be super beneficial in that sense. Having both regular CPU and batch GPU methods would be great to see, if possible.

Fluffmiceter · May 9, 2024, 10:58pm

On the topic of SIMD, could you guys add some way for us to fully utilize SIMD as well? I’m not fully versed on SIMD, but I believe modern CPUs should be able to do much more than 3 parallel instructions.

I’ve got pretty optimized custom raycast, using cache-aligned buffers and pretty much every trick in the book to speed up BVH traversal, but the only additional performance I couldn’t previously access was via SIMD. 3-way is great, but can we get N-way SIMD depending on the device’s capability? 8-way would be awesome.

WheretIB · May 28, 2024, 2:04pm

While we do explore ideas around SIMD, I don’t think there are any APIs that are coming for that any time soon.

AxisAngle · May 28, 2024, 3:32pm

Are Dot and Cross optimizations coming soon to native Vector3?
About 50% of my Vector3 ops are dot product, and 30% are cross product, with the remaining 20% being split evenly between scalar multiplication and addition, with the rare .unit or .magnitude.

WheretIB · May 28, 2024, 4:18pm

Dot/Cross/Floor/Ceil/Magnitude/Unit for Vector3 are native in Roblox Studio when properly type annotated.

Support on servers will come in a few weeks.
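
As a minimal sketch of what "properly type annotated" might look like in practice: the functions below annotate every vector parameter as Vector3, so the compiler knows the operand types up front. The function names are hypothetical and only illustrate the annotation pattern; the thread does not say which exact code shapes get the native fast path.

```luau
--!native
--!strict

-- Hypothetical helper: projects v onto axis. Both parameters are annotated
-- as Vector3, so Dot and Unit are used on values of a known type.
local function projectOnto(v: Vector3, axis: Vector3): Vector3
	local unitAxis = axis.Unit
	return unitAxis * v:Dot(unitAxis)
end

-- Hypothetical helper: unit normal of the triangle (a, b, c) via Cross.
local function triangleNormal(a: Vector3, b: Vector3, c: Vector3): Vector3
	return (b - a):Cross(c - a).Unit
end
```

Without the annotations the same code still runs; the annotations are there only to give native code generation the type information mentioned above.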

Fluffmiceter · May 28, 2024, 7:47pm

My ray-BVH8 test is written in a way that could really leverage 8-way SIMD. If I were to construct Vector3’s and then do the addition and multiplication on said Vector3’s, would that be faster than having the math all in components?

WheretIB · May 28, 2024, 7:52pm

In this case it’s important to consider the extra time it takes to pack data into a Vector3 and extract the data back.
While it is a very fast operation, in short code examples we’ve seen that it could be faster to stay with numbers. For example, just doing Vector3.new(x, y, z).Magnitude is slower than computing the magnitude manually.

However, if the amount of Vector3 operations is higher, it can become a benefit.
Best case would be to have data stored in Vector3 form and operate on it; we’ve seen examples of such code that outperformed working with 3 individual numbers.
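
The three situations described above can be sketched roughly as follows. The function names are hypothetical, and the sketch makes no timing claims of its own; it only shows the code shapes being compared, so measure on your own workload.

```luau
--!native
--!strict

-- Staying with numbers: no Vector3 is constructed at all.
local function magnitudeFromNumbers(x: number, y: number, z: number): number
	return math.sqrt(x * x + y * y + z * z)
end

-- Packing into a Vector3 only to read one property: the construction cost
-- is paid for a single operation.
local function magnitudeViaVector(x: number, y: number, z: number): number
	return Vector3.new(x, y, z).Magnitude
end

-- Data already stored as Vector3: every element is used directly, with no
-- packing or unpacking inside the loop.
local function sumOfDots(points: { Vector3 }, dir: Vector3): number
	local total = 0
	for _, p in points do
		total += p:Dot(dir)
	end
	return total
end
```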

Fluffmiceter · May 28, 2024, 9:05pm

Yeah, I figured that creating the Vector3 would kill any gains from Vector3 SIMD, although I haven’t benchmarked it.

Unfortunately I can’t have the data pre-stored in Vector3 format because the data is encoded into a buffer. More specifically, my 8-wide BVH structure is one giant flattened buffer, with each node in the tree occupying 64 bytes (for 8 children, 8 bytes each, 6 bytes for boundaries: 2 bytes per axis – 1 for minbounds, 1 for maxbounds; 2 bytes for jump offset to child node).

In short, I need to encode the BVH in a buffer in this way to get optimal cache performance, because cache misses and memory reads were the main cost for BVH queries. I’ve really written this with future SIMD support in mind, so fingers crossed there…!
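
For readers unfamiliar with this kind of layout, here is a minimal sketch of reading and writing one child entry with the Luau buffer library. The sizes (64-byte node, 8 children, 8 bytes per child) come from the description above; the order of the fields inside each 8-byte entry, the 0-based indices, and all names are assumptions made for illustration, not the actual encoding.

```luau
--!strict

local NODE_SIZE = 64 -- bytes per node, as described above
local CHILD_SIZE = 8 -- bytes per child entry
local CHILDREN_PER_NODE = 8

-- Byte offset of a child entry; nodeIndex and childIndex are 0-based.
local function childOffset(nodeIndex: number, childIndex: number): number
	return nodeIndex * NODE_SIZE + childIndex * CHILD_SIZE
end

-- Assumed field order: minX, maxX, minY, maxY, minZ, maxZ (one quantized
-- byte each), followed by a 2-byte jump offset to the child node.
local function writeChild(
	b: buffer, nodeIndex: number, childIndex: number,
	minX: number, maxX: number, minY: number, maxY: number,
	minZ: number, maxZ: number, jump: number
)
	local o = childOffset(nodeIndex, childIndex)
	buffer.writeu8(b, o, minX)
	buffer.writeu8(b, o + 1, maxX)
	buffer.writeu8(b, o + 2, minY)
	buffer.writeu8(b, o + 3, maxY)
	buffer.writeu8(b, o + 4, minZ)
	buffer.writeu8(b, o + 5, maxZ)
	buffer.writeu16(b, o + 6, jump)
end

-- Returns minX, maxX, minY, maxY, minZ, maxZ, jump for one child entry.
local function readChild(b: buffer, nodeIndex: number, childIndex: number)
	local o = childOffset(nodeIndex, childIndex)
	return buffer.readu8(b, o), buffer.readu8(b, o + 1),
		buffer.readu8(b, o + 2), buffer.readu8(b, o + 3),
		buffer.readu8(b, o + 4), buffer.readu8(b, o + 5),
		buffer.readu16(b, o + 6)
end
```

With this shape, all 8 children of a node sit in one contiguous 64-byte span, which is the cache-friendliness argument made above.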

NGC4637 · June 2, 2024, 10:34am

this thing happened
(there is no anonymous function at line 1 in any of the scripts i’ve found)
i’m completely stumped, i have no idea why this is happening.
this only happened recently out of nowhere, it was working fine back then
i made no changes to it before and after this weird warning

: /

NGC4637 · June 8, 2024, 3:02pm

after a bit of tinkering with the script, --!optimize 0 fixed it.
ig that means the aggressive optimizations are doing something to mess it up

ok apparently it really doesn’t like explicit string types like these:

local str: {["a" | "b" | "c"]: any} = {
	a = "foo",
	b = "bar",
	c = "baz"
}

-- insert other code here

Sublivion · July 22, 2024, 6:44pm

Piercing raycasts would be amazing, I’m currently using 5 non-piercing raycasts to emulate the behaviour of a single piercing raycast!
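
One common way to emulate this today is to re-cast the same ray and exclude each instance that was already hit. The sketch below assumes that approach; the post does not say how its 5 raycasts are arranged, and piercingRaycast is a hypothetical helper name.

```luau
--!strict

-- Hypothetical helper: collects up to maxHits results along one ray,
-- nearest first, by excluding every instance that has already been hit.
local function piercingRaycast(
	origin: Vector3, direction: Vector3, maxHits: number, ignore: { Instance }?
): { RaycastResult }
	local params = RaycastParams.new()
	params.FilterType = Enum.RaycastFilterType.Exclude
	params.FilterDescendantsInstances = ignore or {}

	local hits: { RaycastResult } = {}
	for _ = 1, maxHits do
		local result = workspace:Raycast(origin, direction, params)
		if not result then
			break -- nothing further along the ray
		end
		table.insert(hits, result)
		-- Exclude the hit instance so the next cast passes through it.
		params:AddToFilter(result.Instance)
	end
	return hits
end
```

Because the whole instance is excluded after its first hit, a ray that re-enters the same concave part will not report the second intersection, and each extra hit costs one more full raycast.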

system · November 19, 2024, 6:44pm

This topic was automatically closed 120 days after the last reply. New replies are no longer allowed.