Checking for a mixed table?

Objective: Write a function to check if a table is a mixed table. A mixed table is just a table that acts as both an array and a dictionary.

The central question is actually this: Can you guarantee that next(tbl, #tbl) will always return the first non-sequential key in the table?


For context:

  • An array is a list of values. In Lua, the values don’t really matter, but the index is always sequential and numerical, starting at 1:
    {2, 4, 45, 2, 1}

  • A dictionary (sometimes called a map or hash map) uses non-sequential and typically non-numeric keys. For instance:
    {points = 32, kills = 10, deaths = 3}

  • And thus, a mixed table might look like this:
    {3, 56, 2, points = 43, kills = 10}


My attempt is this: Write three helper functions (IsEmpty, IsArray, and IsDictionary), each of which returns true only if the table is exclusively that kind (i.e. IsArray will not return true if the table is also a dictionary). Then, simply check if the table is both a dictionary and an array. Boom. Done.

Here’s what I have so far:

local function IsEmpty(tbl)
	return (next(tbl) == nil)
end

-- This is the potential game-stopper:
local function IsArray(tbl)
	return ((#tbl ~= 0) and (next(tbl, #tbl) == nil))
end

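local function IsDictionary(tbl)
	-- Non-empty, but with nothing at index 1
	return ((#tbl == 0) and (next(tbl) ~= nil))
end

local function IsMixed(tbl)
	-- The helpers are mutually exclusive, so "both" here means:
	-- not empty, not a pure array, and not a pure dictionary
	return not (IsEmpty(tbl) or IsArray(tbl) or IsDictionary(tbl))
end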

Set your eyes on the IsArray function. This is the crux of everything here. That function is relying on the assumed fact that next will always start with the sequential array indices (starting at 1), and continue sequentially before hitting any other keys. IF THAT IS TRUE, then this will work. However, I have no idea how to prove it. It seems to work from the little testing I’ve done, but I’m not sure.

Can anyone verify this? Or point out any flaws in my logic so far?


If the answer is “no, you cannot guarantee this behavior,” then I also ask: Is there any other way to check for a mixed table?


Wouldn’t

local function IsMixed(tbl)
	local hasarray
	local hasdictionary
	
	for i,v in pairs(tbl) do
		if typeof(i) ~= "number" then
			hasdictionary = true
		else
			hasarray = true
		end
	end
	
	return (hasarray and hasdictionary) ~= nil
end

work?

I’m not really understanding the question: are you asking whether what you have always works, or do you want something that does work?

Yes, I’m asking if what I have actually works. It seems to work, but I don’t know the full details regarding how next works.


Doing

local testtbl = {}

testtbl[2] = "hi"

-- This is the potential game-stopper:
local function IsArray(tbl)
	return ((#tbl ~= 0) and (next(tbl, #tbl) == nil))
end

print(IsArray(testtbl))

prints false, even though it should be an array?

(The post below might explain that.)


Take a look through the source for 5.1. You can search on this index page for “next”.

https://www.lua.org/source/5.1/idx.html

I’m making an educated guess here - and please correct me if I’m wrong! - but this looks like it might be what you’re looking for (screenshots because I can’t get Discourse to highlight properly):

[image]

used in

[image]


Interestingly, though, look at what happens when you pass #tbl as the second argument:

local tbl1 = {
	a = "a",
	"1",
}

local tbl2 = {
	a = "a",
	[1] = "1",
}
print(next(tbl1, #tbl1)) -- a a
print(next(tbl2, #tbl2)) -- nil

According to the source, it looks like next always goes through the array portion first, but I don’t think it’s guaranteed that sequential numerical indexes will be stored in the array portion. To my understanding, the array portion is just an internal optimization, and sometimes, such as in the above example, numerical indexes end up in the dictionary/hash portion.


No, because the Lua reference manual states that the order in which next enumerates indices is not specified, even for numeric indices (which is why ipairs exists to iterate over arrays). The actual behavior depends on the implementation, and you cannot rely on it in the default Lua implementations either:

local t = {[1] = 1, [2] = 2, [3] = 3, [4] = 4}

print(#t) -- 4
print(next(t, #t)) -- 1, 1

Note that the table initialization syntax used here forces all elements to be in the hash part of the table, causing #t and next to behave differently than in this case:

local t = {1, 2, 3, 4}

print(#t) -- 4
print(next(t, #t)) -- nil
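
Because both #t and the order of next depend on how the table was built, a check that treats the two tables above identically has to avoid next(t, #t) altogether. One minimal sketch, assuming a helper named is_sequence (a hypothetical name, not something from this thread or the Lua standard library), counts the keys with pairs and then confirms that every index from 1 to that count is present:

```lua
-- Hypothetical helper: true when the keys of t are exactly 1..n for some n >= 1.
-- It never looks at #t or at the order in which pairs visits the keys.
local function is_sequence(t)
    local count = 0
    for _ in pairs(t) do
        count = count + 1
    end
    -- If 1..count are all present, they account for every key in the table.
    for i = 1, count do
        if rawget(t, i) == nil then
            return false
        end
    end
    return count > 0
end

print(is_sequence({[1] = 1, [2] = 2, [3] = 3, [4] = 4})) -- true
print(is_sequence({1, 2, 3, 4}))                         -- true
print(is_sequence({1, 2, points = 3}))                   -- false
```

The cost is one full traversal plus one lookup per key, which is the price of not depending on the table's internal layout.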

To add to that, something simple like this can work:

local function is_mixed(t)
    local array_count = 0
    local total_count = 0
    for _ in ipairs(t) do array_count = array_count + 1 end
    for _ in pairs(t) do total_count = total_count + 1 end
    return array_count > 0 and total_count > array_count
end

I had always thought it was smart enough to not do that. Great overall answer though! I guess counting the array is the only foolproof method then unfortunately. Was really hoping to avoid that.


The most efficient way is probably this.

local function isMixed(tab)
    local array = #tab ~= 0
    if array then
        for i in next, tab do
            if type(i) ~= 'number' then
                return true
            end
        end
    end
    return false
end

This depends, though, because it requires the table to be a “true array” (one that has an element at index 1).
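
To make that caveat concrete, here is a hedged variant that measures the array part with ipairs instead of # and treats any key outside 1..n (strings, fractional numbers, numbers past a gap) as a dictionary key. The name is_mixed_strict and the decision to count sparse numeric keys as dictionary keys are assumptions of this sketch, not something the thread settled on:

```lua
-- Hypothetical variant: "array part" means the run 1..n found by ipairs.
local function is_mixed_strict(tab)
    local n = 0
    for _ in ipairs(tab) do
        n = n + 1
    end
    if n == 0 then
        return false -- nothing at index 1, so not treated as an array at all
    end
    for k in next, tab do
        -- Any key that is not an integer in 1..n counts as a dictionary key.
        if type(k) ~= "number" or k < 1 or k > n or k % 1 ~= 0 then
            return true
        end
    end
    return false
end

print(is_mixed_strict({3, 56, 2, points = 43, kills = 10})) -- true
print(is_mixed_strict({2, 4, 45, 2, 1}))                    -- false
print(is_mixed_strict({points = 32, kills = 10}))           -- false
print(is_mixed_strict({1, 2, [2.5] = "x"}))                 -- true
```

Like the version above, it returns as soon as it meets the first dictionary key, but it pays for an extra ipairs pass up front.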

Just for fun: by avoiding reallocations you can get a range of other “odd” scenarios:

local t = {1}

t[8] = 8
t[7] = 7
t[6] = 6
t[5] = 5
t[8] = nil
t[7] = nil
t[6] = nil
t[4] = 4
t[3] = 3
t[2] = 2

for k, v in pairs(t) do
    print(k, v)
end
print()
for k, v in ipairs(t) do
    print(k, v)
end

will yield the following (because elements 2–5 stay in the hash part of the table):

1	1
4	4
5	5
2	2
3	3

1	1
2	2
3	3
4	4
5	5

A single reallocation will “fix” that, though.
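
Since the pairs order above is an artifact of the table's internal layout, code that needs the numeric keys in order should not take that order from pairs. A minimal sketch, assuming a helper called sorted_number_keys (a hypothetical name), collects the numeric keys and sorts them explicitly:

```lua
-- Hypothetical helper: returns the numeric keys of t in ascending order,
-- regardless of the order in which pairs happens to visit them.
local function sorted_number_keys(t)
    local keys = {}
    for k in pairs(t) do
        if type(k) == "number" then
            keys[#keys + 1] = k
        end
    end
    table.sort(keys)
    return keys
end

-- A shortened version of the table built above.
local t = {1}
t[8] = 8
t[5] = 5
t[8] = nil
t[4] = 4
t[3] = 3
t[2] = 2

print(table.concat(sorted_number_keys(t), " ")) -- 1 2 3 4 5
```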
