Is it possible to optimize this voxel greedy mesher to not take hours?

I’ve been messing around with heightmaps and I made a builder for these heightmaps that uses 1x1x1 parts. The heightmap is sort of large, but I want to keep the detail and size.

I built my own greedy mesher using spatial queries, but it took too long and it was quite buggy. I tried a few modules and plugins and none of them worked, and I found this one tutorial that actually helped, but the problem is it just takes an absurd amount of time to work.

I’d be fine if it was too slow for real time, but I could just copy the result when it finished to use pre-optimized, but at the pace it runs at, it would probably take hours if not days at least.

-- in my case, the image is 512x512 
local sizes = {}

-- initialize part sizes and tables
for x = 1, (image.Size.X + image.Size.Y) do
	sizes[x] = {}

	for y = 1, (image.Size.X + image.Size.Y) do
		sizes[x][y] = {}

		for z = 1, (image.Size.X + image.Size.Y) do
			
			sizes[x][y][z] = {["X"] = 1, ["Y"] = 1, ["Z"] = 1}
		end
	end
end

-- check and combine on the z axis
for x = 1, (image.Size.X + image.Size.Y) do
	for y = 1, (image.Size.X + image.Size.Y) do
		for z = 2, (image.Size.X + image.Size.Y) do	

			sizes[x][y][z].Z += sizes[x][y][z - 1].Z
			sizes[x][y][z - 1] = nil
		end
	end
end

-- check and combine on the x axis
for x = 2, (image.Size.X + image.Size.Y) do
	for y = 1, (image.Size.X + image.Size.Y) do
		for z, size in pairs(sizes[x][y]) do

			if sizes[x - 1][y][z] == nil then continue end
			if size.Z ~= sizes[x - 1][y][z].Z then continue end

			size.X += sizes[x - 1][y][z].X
			sizes[x - 1][y][z] = nil
		end
	end
end

-- checking and combining the Y axis
for x = 1, (image.Size.X + image.Size.Y) do
	for y = 2, (image.Size.X + image.Size.Y) do
		for z = 1, (image.Size.X + image.Size.Y) do

			if sizes[x][y][z] == nil then continue end
			if sizes[x][y - 1][z] == nil then continue end

			sizes[x][y][z].Y += sizes[x][y - 1][z].Y
			sizes[x][y - 1][z] = nil
		end
	end
end

-- generate combined parts
for x, column in pairs(sizes) do
	for y, row in pairs(column) do
		for z, size in pairs(row) do
			local part = Instance.new("Part")
			part.Position = Vector3.new(x - size.X / 2, y - size.Y / 2, z - size.Z / 2)
			part.Size = Vector3.new(size.X - 0.1,size.Y - 0.1,size.Z - 0.1)
			part.TopSurface = Enum.SurfaceType.Studs
			part.Anchored = true
			part.Parent = game.Workspace
		end
	end
end

Is it possible to significantly optimize this? Or is the use of tables like this just too slow, even with micro-optimizations, no matter what?