Speeding up thousands of iterations (Image API)

You probably saw that the Image and Mesh APIs came out. Writing and reading their pixels and vertices is pretty quick. But when it comes to iterating over them, it’s a different story.

It takes seconds to iterate 640 * 360 = 230,400 pixels. How can this be done faster?

-- client:
for x = 1, 640 do
    for y = 1, 360 do
        -- read pixel, write pixel
    end
end

Assuming each read/write operation takes 0.00001 seconds, this for loop would be too slow. That’s such a small amount of time, right? Well, 230,400 * 0.00001 = 2.304 seconds. Not good. This may even crash some weak devices.

I don’t think it’s possible to shorten the read/write operation, which is why I researched Parallel Lua.

Is it possible to run multiple threads in parallel from the same script?

I’m sure other people have had this same problem these past few weeks. But, nobody has been talking about it! Does that mean I’m missing something simple?

Well, from what I currently know, you could try coroutines: split the pixels into some number of segments, then run a for loop over those segments. In that loop, call the coroutine function and pass in the segment size. The coroutine function would just loop through that many pixels.

It’s a good idea, I’ll try it. I believe that it won’t be effective though.

The WritePixels and ReadPixels methods are not yielding (to my knowledge). Lua will run all the code without pausing, which means splitting the work across coroutines will not make the code run in parallel. However, if those methods are yielding, this will work. How can we tell if they are yielding?

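I have absolutely no clue. In my head the code would go:

local PixelCount = 640 * 360 -- total number of pixels in the image

local function Loop(Start, Amount)
 -- Start + Amount - 1 so that neighbouring sections do not overlap
 for i = Start, Start + Amount - 1 do
  --run code
 end
end

local Sections = 5
local SectionSize = PixelCount / Sections

for pixel = 1, PixelCount, SectionSize do
 coroutine.wrap(Loop)(pixel, SectionSize)
end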

Coroutines are not Parallel Luau.

To make another thread, you need to create an Actor instance and put a script in it.
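A minimal sketch of why coroutines alone do not help here, assuming the work inside each loop never yields. The `worker` function and `log` table are hypothetical names used only for illustration. Each `coroutine.wrap` call runs its function to completion before the next line executes, so the two "threads" never interleave:

```lua
-- Records the order in which loop iterations actually run.
local log = {}

-- Hypothetical worker: does `count` steps of non-yielding work.
local function worker(name, count)
    for i = 1, count do
        table.insert(log, name .. i)
    end
end

-- Start two coroutines, as in the coroutine suggestion earlier in the thread.
coroutine.wrap(worker)("A", 3)
coroutine.wrap(worker)("B", 3)

print(table.concat(log, " ")) -- A1 A2 A3 B1 B2 B3
```

All of A's iterations finish before B starts, which is the same order a plain sequential loop would produce.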

I found this documentation:

https://create.roblox.com/docs/scripting/multithreading

It has an example for terrain generation, which seems to match what I am doing, but in 2D instead of 3D. It seems they are creating 32 actors, while putting a clone of the script in each of them. Then, when generating, it asks a random actor to do the work.

This is messy, but it makes sense. If I find a way to implement it, and if I can find a way to share it here, then I will! I’ll do this later.

(edit): It also says this:

Have you tested it with native code generation yet?

If I recall, there might also be a function to get the image’s contents as an array, though I haven’t read the docs, so I’m unsure.

If you need to do computation for each pixel, then all you can do is multithread, but either way make sure to do it like this:

local ImageSize = Vector2.new(640, 360)
local ImageSizeX, ImageSizeY = ImageSize.X, ImageSize.Y
local PixelBuffer = table.create(ImageSizeY * ImageSizeX * 4)

local EditableImage = Instance.new("EditableImage")
EditableImage:Resize(ImageSize)

for y = 1, ImageSizeY do
    local PixelIndexY = (y-1) * ImageSizeX
    for x = 1, ImageSizeX do
        local PixelIndex = PixelIndexY + x
        local RGBAIndex = (PixelIndex - 1) * 4
        local R, G, B, A

        -- do your stuff

        PixelBuffer[RGBAIndex + 1], PixelBuffer[RGBAIndex + 2], PixelBuffer[RGBAIndex + 3], PixelBuffer[RGBAIndex + 4] = R, G, B, A
    end
end

EditableImage:WritePixels(Vector2.zero, ImageSize, PixelBuffer)

So instead of immediately writing each pixel to the image using DrawRectangle or whatever, update an array with all the pixels in it, and when you’re done, write everything at once.
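The index arithmetic in that loop can be pulled out into small helpers. This is only a sketch of the same flat-buffer layout (four numbers per pixel, rows stored one after another); `rgbaIndex` and `setPixel` are hypothetical names, not part of any Roblox API:

```lua
local WIDTH, HEIGHT = 640, 360

-- Hypothetical helper: index of the red channel for the 1-based pixel (x, y).
local function rgbaIndex(x, y)
    local pixelIndex = (y - 1) * WIDTH + x
    return (pixelIndex - 1) * 4 + 1
end

-- Hypothetical helper: store one pixel's four channels in the flat buffer.
local function setPixel(buffer, x, y, r, g, b, a)
    local i = rgbaIndex(x, y)
    buffer[i], buffer[i + 1], buffer[i + 2], buffer[i + 3] = r, g, b, a
end

local buffer = {}
setPixel(buffer, 1, 1, 1, 0, 0, 1) -- first pixel: opaque red
setPixel(buffer, 2, 1, 0, 1, 0, 1) -- second pixel: opaque green

print(rgbaIndex(1, 1), rgbaIndex(2, 1), rgbaIndex(1, 2)) -- 1 5 2561
```

The last pixel's alpha lands at index 640 * 360 * 4 = 921,600, which matches the size passed to table.create in the post above.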

I made a module to run arbitrary tasks in parallel

If you end up using my module, do not run each pixel as a separate task. While the module does group tasks together itself to reduce overhead, you’ll get the best performance if you do that yourself. A simple way to do that is to create one task per line, so 360 tasks in your case, which isn’t bad.

For Parallel Lua to be effective, each task should be a decent amount of work. With one task per line, that should be true in your case.
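As a rough sketch of the one-task-per-line idea, the work can be described as a list of row ranges into the flat pixel array. This does not use the module's actual API (which is not shown in this thread); `makeRowTasks` and the task fields are hypothetical:

```lua
local WIDTH, HEIGHT = 640, 360

-- Hypothetical task description: one entry per image row, holding the
-- first and last 1-based pixel index that the row covers.
local function makeRowTasks(width, height)
    local tasks = {}
    for y = 1, height do
        tasks[y] = {
            row = y,
            firstPixel = (y - 1) * width + 1,
            lastPixel = y * width,
        }
    end
    return tasks
end

local tasks = makeRowTasks(WIDTH, HEIGHT)
print(#tasks)                                  -- 360
print(tasks[2].firstPixel, tasks[2].lastPixel) -- 641 1280
```

Each task then covers 640 pixels, and the ranges neither overlap nor leave gaps.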

There are also other modules like mine you can find on the dev forum

I’ll check this out later! It looks promising.

Yes, but it hasn’t improved the performance much, unfortunately. I’ll use it anyways.

The reason I need to iterate over all the pixels is that I want to apply basic shaders to them. For the first test, it will just change the color of each pixel to red, without changing its alpha.

If I have multiple shaders, I would have to iterate all the pixels again. But, I think I can prevent this by using multiple parallel workers with your module. One worker handling each shader. I would make them in order, and then make them work in order.

Unsure on how to use the results though. Looks like I would have to iterate even more to get it to work… negating the performance boost Parallel Lua gives.

I would recommend you do every shader at the same time, so when calculating the color of a pixel, do the first shader, then the second, then the third, before going to the next pixel.
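A minimal sketch of that single-pass idea, assuming each shader only needs the pixel it is working on. The shader functions and the `shaders` list are hypothetical; they take and return four numbers so that no table is created per pixel:

```lua
-- Hypothetical shader: make the pixel red, keep its alpha.
local function makeRed(r, g, b, a)
    return 1, 0, 0, a
end

-- Hypothetical shader: halve the brightness, keep the alpha.
local function darken(r, g, b, a)
    return r * 0.5, g * 0.5, b * 0.5, a
end

local shaders = { makeRed, darken }

-- Run every shader, in order, on one pixel before moving on.
local function shadePixel(r, g, b, a)
    for _, shader in ipairs(shaders) do
        r, g, b, a = shader(r, g, b, a)
    end
    return r, g, b, a
end

-- Single pass over a flat RGBA buffer (four numbers per pixel).
local function shadeBuffer(buffer)
    for i = 1, #buffer, 4 do
        buffer[i], buffer[i + 1], buffer[i + 2], buffer[i + 3] =
            shadePixel(buffer[i], buffer[i + 1], buffer[i + 2], buffer[i + 3])
    end
end

local buffer = { 0.2, 0.4, 0.6, 1,   0, 1, 0, 0.25 }
shadeBuffer(buffer)
```

After the call, both pixels are half-bright red and keep their original alpha values (1 and 0.25). The pixel array is walked once no matter how many shaders are in the list.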

If you make each horizontal line a task (360 lines at your resolution), you would have to loop through the 360 results (and the tables returned by those). Alternatively, you can put a task.synchronize() call in the module containing the function that runs in parallel, which lets you do the work that has to be done in serial. I am unsure whether that approach is faster, but you wouldn’t have to loop through the results in that case.

Well this may not work for shaders such as a blur. It will interact with the surrounding pixels as well, and running shaders out of sync like this could cause artifacts.

However, in my program I reduced the time it takes to shade from 60-75 ms to 19-25 ms! Apparently making lists is really slow, even if it only has 4 numbers.

We learn something new every day.

(edit): I also modified the module to fit my style; my version also uses less memory.

(edit 2): Just realized that neither my approach nor yours would work. When I schedule work, I’m sending the pixel data in. But by the time it’s shaded, that data would have changed. Is it possible to rewrite the parameters of work that has already been scheduled? I don’t think so…

Having the surrounding pixels affect other pixels will definitely make things a lot more complicated. To effectively run things in parallel, they need to be separate from each other. You might need to take a much different approach than what my module offers
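One common way to keep a neighbor-dependent shader such as a blur independent per pixel is to read only from an unmodified copy of the image and write into a separate output buffer. This is a general technique, not something the module above is known to provide. A minimal sketch on a single-channel row, with the hypothetical name `blurRow`:

```lua
-- 3-tap box blur over one row of single-channel values.
-- Reads only from `source` and writes only to `output`, so the result for
-- each pixel does not depend on the order in which pixels are processed.
local function blurRow(source)
    local output = {}
    local n = #source
    for x = 1, n do
        local left = source[math.max(x - 1, 1)]  -- clamp at the left edge
        local right = source[math.min(x + 1, n)] -- clamp at the right edge
        output[x] = (left + source[x] + right) / 3
    end
    return output
end

local source = { 0, 0, 3, 0, 0 }
local blurred = blurRow(source)
```

Here `blurred` holds 0, 1, 1, 1, 0 and `source` is left untouched. Blurring in place would instead feed already-blurred values into later pixels, which is the kind of artifact described above. The cost is a second buffer the size of the image.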