Are 2D/3D arrays faster than 1D arrays in Lua?

sevenpolygons · August 29, 2020, 3:55am

In almost every Stack Overflow thread I’ve seen, the results seem to be the same: in C++, Java, C#, etc., 1D arrays are significantly faster to iterate through/access than higher-dimension arrays. However, in my tests, the results didn’t show 1D arrays consistently beating out 2D arrays.

I wouldn’t be surprised if there’s an issue with my benchmark, so if I’m doing this wrong I’d appreciate a correction. Back to the question though: Are 2D/3D arrays faster than 1D arrays in Lua?

sevenpolygons · August 29, 2020, 3:57am

Every time I try to include my code I get a “drafts online” error, so here’s a screenshot:

https://gyazo.com/bf54b9b5b8092de2fd586d30031d96b9

plasma_node · August 29, 2020, 4:12am

What were your results? By nature, something that contains more data should generally take a longer time to iterate through.

sevenpolygons · August 29, 2020, 4:20am

After 100 iterations I got an average of ~7.75ms for the 2D array and ~9ms for the 1D array. They should be iterating through the same number of entries

sevenpolygons · August 29, 2020, 5:43am

Perhaps @zeuxcg could provide some insight? If anything, the 2D array should be at a disadvantage since I’m not using table.create() when initializing it, right? Still, the 2D array beats out the 1D array.

nicemike40 · August 29, 2020, 4:35pm

(Was gonna post this last night but the devforum was messed up)

You’re only measuring creation of pretty small tables. I ran some more extensive tests on 10,000x10,000 tables with random elements. It was (kinda) interesting:

Lua test

code

local w, h = 10000, 10000

local startTimer, endTimer;
do
    local start;
    function startTimer()
        start = os.clock();
    end
    function endTimer(name)
        print(string.format("%s took\t%.1f ms", name, 1000*(os.clock() - start)));
    end
end

-- init

wait(3)

startTimer()
local oneDim = table.create(w * h, nil)
endTimer("oneDim alloc")

wait()

startTimer()
for row = 1, h do
    for col = 1, w do
        oneDim[(row-1)*w + col] = math.random()*10
    end
end
endTimer("oneDim init")

wait()

startTimer()
local twoDim = table.create(h, nil)
for row = 1, h do
    twoDim[row] = table.create(w, nil)
end
endTimer("twoDim alloc")

wait()

startTimer()
for row = 1, h do
    for col = 1, w do
        twoDim[row][col] = math.random() * 10;
    end
end
endTimer("twoDim init")

wait()

-- test

local x;

-- one dimensional
startTimer()
for row = 1, h do
    for col = 1, w do
        x = oneDim[(row-1)*w + col]
    end
end
endTimer("oneDim access");

wait()

startTimer()
for i = 1, w*h do
    x = oneDim[i]
end
endTimer("oneDim linear access");

wait()

startTimer()
for row = 1, h do
    for col = 1, w do
        oneDim[(row-1)*w + col] = x
    end
end
endTimer("oneDim mutate");

wait()

-- 2 dimensional
startTimer()
for row = 1, h do
    for col = 1, w do
        x = twoDim[row][col]
    end
end
endTimer("twoDim access")

wait()

startTimer()
for row = 1, h do
    for col = 1, w do
        twoDim[row][col] = x
    end
end
endTimer("twoDim mutate")

Results, luau

	1D (ms)	2D (ms)
alloc	615	599
init	5652	5294
access	1386	1145
lin. access	675	N/A
mutate	1425	1187

I ran the same thing with Lua for Windows, with the modifications that I replaced the table.creates with just {}, and made h = 1000 instead of 10000 due to memory issues.

Results, standard luac/lua:

Do not compare these results directly to luau’s – the size of the tables are different

	1D (ms)	2D (ms)
alloc	0*	1*
init	1252	1124
access	301	346
lin. access	201	N/A
mutate	311	356

*not actually allocating anything

C test

For fun I ran the same thing in c (no optimizations):

code

// forgive the syntax colors
#include <time.h>
#include <stdio.h>
#include <stdlib.h>

const long w = 10000;
const long h = 10000;

double start;

double getTime() {
    return (double)clock() / CLOCKS_PER_SEC;
}

void startTimer() {
    start = getTime();
}

void endTimer(const char* name) {
    printf("%s took\t%.3f ms\n", name, 1000.0 * (getTime() - start));
    fflush(stdout);
}

int main() {
    startTimer();
    double* oneDim = (double*)malloc(w * h * sizeof(double));
    if (oneDim == NULL) {
        return errno;
    }
    endTimer("oneDim alloc");

    startTimer();
    for (long row = 0; row < h; row++) {
        for (long col = 0; col < w; col++) {
            oneDim[row * w + col] = (double)(rand() % 10000) / 1000;
        }
    }
    endTimer("oneDim init");

    startTimer();
    double** twoDim = (double**)malloc(h * sizeof(double*));
    for (long row = 0; row < h; row++) {
        twoDim[row] = (double*)malloc(w * sizeof(double));
    }
    endTimer("twoDim alloc");

    startTimer();
    for (long row = 0; row < h; row++) {
        for (long col = 0; col < w; col++) {
            twoDim[row][col] = (double)(rand() % 10000) / 1000;
        }
    }
    endTimer("twoDim init");

    double x;

    startTimer();
    for (long row = 0; row < h; row++) {
        for (long col = 0; col < w; col++) {
            x = oneDim[row * w + col];
        }
    }
    endTimer("oneDim access");

    startTimer();
    for (size_t i = 0; i < w*h; i++) {
        x = oneDim[i];
    }
    endTimer("oneDim linear access");

    startTimer();
    for (long row = 0; row < h; row++) {
        for (long col = 0; col < w; col++) {
            oneDim[row * w + col] = x;
        }
    }
    endTimer("oneDim mutate");

    startTimer();
    for (long row = 0; row < h; row++) {
        for (long col = 0; col < w; col++) {
            x = twoDim[row][col];
        }
    }
    endTimer("twoDim access");

    startTimer();
    for (long row = 0; row < h; row++) {
        for (long col = 0; col < w; col++) {
            twoDim[row][col] = x;
        }
    }
    endTimer("twoDim mutate");

    // cleanup
    free(oneDim);
    for (long row = 0; row < h; row++) {
        free(twoDim[row]);
    }
    free(twoDim);
}

Results, C:

	1D (ms)	2D (ms)
alloc	1	35
init	2689	2574
access	187	216
lin. access	222	N/A
mutate	197	234

Takeaways

IDK.This wasn’t scientific and these languages are wildly different and benchmarking properly is hard to do. Plus my code’s probably incorrect somewhere. But:

Luau is slower than C? (groundbreaking)

Allocation was slow for both. table.create lies in its docs, and does actually initialize the elements even if the second element is nil. A quick test confirms this:

size	`table.create` time
100	0.0 ms
1000	0.0 ms
10000	0.0 ms
100000	0.6 ms
1000000	5.9 ms
10000000	58.2 ms
Would be nice to have straight allocation without initialization, but what can ya do.

It’s basically a tie for speed. Use whichever one’s easier. If you can utilize the linear access capabilities of the 1D array, use that.
It doesn’t seem like cache locallity or anything actually matters here much.
The 2D array is (slightly) faster for a lot of things in lua. That could be that the extra cost of a multiplication and add to index into the 1D table correctly is more expensive than the extra indirection in the 2D. (again though, it’s pretty unnoticeable).

The 1D array is faster for most things in C. That makes more sense to me.

There may still be other benefits to using the 1D array – e.g. it is probably easier on the garbage collector to track…? I have no idea if that’s true, though. Ask whoever wrote the engine