Functional Shenanigans: map, filter, partition, reduce - two ways

RegularTetragon · November 10, 2018, 8:12am

In this tutorial I’m going to talk about what some of the fundamental functions for operating over lists are, how to implement them in lua, and some interesting use cases for you to work out on your own. This is a fairly advanced tutorial, and you should know something about functions as first class objects, and lua’s generic for loop.

There are many amazing features of lua, but one of my personal favorites are higher order functions. A higher order function is a function whose input is another function. Many languages have this, and this is used all the time in Roblox’s event system. An example is:

SomeEvent:Connect(
    function(SomeInput)
        print(SomeInput," was receieved!")
    end
)

In this case, the function doing the operating is “Connect”, whose argument is another function being operated upon. This should be old news for all of you more experienced developers, but this is a great introduction into the dark side of computer programming: Functional.

A functional programming language is one where:

Everything is ideally immutable
Functions operate over other functions
Linked Lists reign supreme

Now, I could wax poetic about functional programming for days (and I often have!) but I want to focus on something a little more pertinent to Roblox. Operating over sequences (a.k.a. Arrays, Vectors, Linked Lists) is traditionally done with a for loop which has several pros and cons:
Pros:

They are loops, an imperative way for solving iteration
They are fast, especially when it is necessary to embed them
They allow you to modify your structure in place

Cons:

Loops are fundamentally stateful which makes them harder to understand, especially when embedded (i.e. 3 layer nested loops each with their own iterators)
Loops are only faster in single threaded contexts (note this includes Roblox for the time being), when introducing multiple threads you must perform checks that are unnecessary if your data is guaranteed to never change.
They allow you to mutate data which can be less predictable

Now, what could be the alternative to for loops? You could use while loops of course but then you end up with the same problems mutability creates. One answer is higher order functions. We’ll be talking about functions which take the form
f(sequence, function operation())
That do nothing to the original list, and return something which is driven only by the sequence, and the operation.
The following functions are the primary operations on sequences:

map(sequence, transformation)
filter(sequence, predicate)
reduce(sequence, operator)
partition(sequence, predicate)

Let’s begin with map. Note, I will be writing these in an imperative fashion in Lua, however it is entirely possible to write these functions in a functional style within Lua. I will show how one accomplishes this later in the tutorial towards the bottom. That being said, these will be the most efficient implementations in terms of eager evaluation.

Let’s start with map, whose first argument is a sequence, and whose second argument is a transformation function. The first argument is straightforward, it’s an array such as {1,2,3,4}. The second argument is a function which takes a single element from the list, and returns a single element for the new list. That’s it. Let’s see how you’d implement this:

function map(sequence, transformation)
    local newlist = { }
    for i, v in pairs(sequence) do
        newlist[i]=transformation(v)
    end
    return newlist
end
--example usages:
--map every player to their team
map (game.Players:GetPlayers(), function(player) return player.Team end)
--map a list of numbers to their square
map ({1,2,3,4,5,6,7,8}, function(x) return x^2 end)
--pair together parts with their color
map (workspace.Model:GetChildren(), function(part) return {part, part.BrickColor} end)

Now let’s talk about filter, which takes a sequence, again straightforward, and a predicate function. The predicate is just a boolean expression wrapped into a function. filter tests every element in the sequence against the predicate, and if the predicate returns true, then that element will be in the sequence filter returns. Ideally order is preserved so I will be using ipairs instead of pairs.

function filter(sequence, predicate)
    local newlist = { }
    for i, v in ipairs(sequence) do
        if predicate(v) then
            table.insert(newlist, v)
       end
    end
end
--Example use cases:
--Get all the players who are on teams Red and Blue
filter(
    game.Players:GetPlayers(),
    function(player) return player.Team == RED or player.Team == BLUE end
)
--Get all players who are spawned
filter(game.Players:GetPlayers(), function(player) return player.Character ~= nil end)
--Get all the green parts in Workspace
filter(workspace:GetDescendants(), function(part) return part:IsA("BasePart") and part.BrickColor = GREEN end)
--Get all the odd numbers in the list
filter({1,2,3,4,5,6,7,8}, function(x) return x%2==1 end)

Skipping down on the list, let’s talk partition. partition takes a sequence and a predicate, and returns two lists, one where all the elements satisfied the predicate, and one where all the elements failed the predicate. Its definition is similar to filter:

function partition(sequence, predicate)
    local left = { }
    local right = { }
    for i, v in ipairs(sequence) do
        if (predicate(v)) then
            table.insert(left, v)
        else
            table.insert(right,v)
        end
    end
    return left, right
end
--Example use case:
--Get all the players with scores above 10, and all the players with scores below ten:
highscore, lowscore = partition(
    game.Players:GetPlayers(), 
    function(player)
        return player.leaderstats.Points.Value > 10
    end
)

Now for the final one listed, reduce. Reduce takes a sequence, and a binary operator. It rolls up your sequence into whatever that binary operator returns. The operator must be able to handle a nil value as its first argument.

function reduce(sequence, operator)
    if #sequence == 0 then
        return nil
    end
    local out=nil
    for i=1,#sequence do
        out=operator(out, sequence[i])
    end
    return out
end

--Examples:
--Product of a sequence
local product = reduce({4,6,1,5,2}, function(accum, val) return (accum or 1) * val end)
--Total of red team's score
local score = reduce(
    game.Teams.Red:GetPlayers(),
    function(accumulator, player)
        return (accumulator or 0) + player.leaderstats.Score.Value
    end
)

End of Tutorial

Beginning of Discussion
So that’s a complete definition of our major functions, but we can do better. I promised you a realm free of variables, and I can deliver. As it happens, all of these are pure functions whose inputs only depend on its outputs, assuming the user doesn’t try and mutate anything in the operator like a maniac. A good place to start would be then to implement any of these functions with respect to a one of the others. Let’s look at our candidates:

map: can go from a list of a size n, to a list of a size n
filter: can go from a list of size n, to a list of size <= n
partition: can go from a list of size n, to two lists of size <= n
reduce: can go from a list of size n, to a single value

Well, it initially looks like we’re stumped based solely on what these things return, but let’s see how we can stretch words. map, filter, and partition are fairly strict in what they create, however you may notice that reduce reduces to a single value. What if that single value is a list? Well, let’s try it, here’s an implementation of map that uses only function parameters. No variables, honest!

function map(sequence, transformation)
    return reduce(sequence, function(newlist, element) return {unpack(newlist or {}),transformation(element)}
end

You may notice the egregious abuse of unpack there to immutably append an element to the accumulator. Please be assured that this is a perfectly acceptable thing to do if you were working with Linked Lists, as this operation is O(1) for them. Unfortunately we aren’t, so I had to work within the language I’m given if I don’t want to implement a Linked List data type for this tutorial alone. Reasons like this are why I implement map, a pure function, imperatively above. The point here however is not efficiency, but theory.
Let’s take a look at filter:

function filter(sequence, predicate)
    return reduce(
        sequence,
        function(newlist, element)
            return predicate(element) and {unpack(newlist or {}), element} or newlist
        end
    )
end

And then partition is fairly straightforward to implement in this manner, now that we have proved that filter can be implemented with reduce it follows

function partition(sequence, predicate)
    return filter(sequence, predicate), filter(sequence, function(x)not predicate(x) end)
end

But now we come to reduce, how do you implement this without looping over the list with variables? Well, for that we must use everyone’s favorite: recursion! Keep in mind that this is far removed from the goal of efficiency, and exists only to prove that you don’t need for loops to iterate over a list:

function rest(sequence)
    return (function(a,...) return {...} end)(unpack(sequence))
end

function first(sequence)
    return sequence[1]
end

function rightreduce(sequence, operator)
    if #sequence < 0 then
        return nil
    else
        return operator(reduce(rest(sequence)), first(sequence))
    end
end

Please note, that this is the “right associative” form of reduce, as it will start reducing from the end of the list and work its way up. Also note the unforgivable but entirely valid abuse of “…” in the definition of the rest function. Despite these minor shortcomings I have now formally defined reduce in a way that doesn’t use variables, and by extension I have defined map, filter, and partition as such. Now, this is a partially insufficient definition, because reduce in this case is right associative this will cause our other functions to return sequences in the wrong order! How can we fix this horrific transgression!?

function reverse(list)
    if #list == 1 then return {first(list)}
    return {unpack(reverse(rest(list))), first(list)}
end

function reduce(sequence, operator)
    return rightreduce(reverse(sequence), operator)
end

Well that settles that.
I wrote this discussion portion at 3 AM, so if I made any mistakes feel free to point them out and laugh at me. Sorry if this post got a little too long, I’ve been really into functional programming recently and wanted to share.

Autterfly · November 10, 2018, 4:13pm

While functional style is cool and has good use cases, Lua is probably not a good one for it. At least, not Roblox Lua. Granted that it does work, you might end up with less than readable code and there will be performance drops (which vary depending on what you do).

Looking specifically at performance, Roblox Lua gives up a crucial tail call optimization which makes tail calls a bits less efficient, not to mention calls in Lua are probably already one of the slowest operations. “Slow” here is relative. Sure, it’s a few ms, but I heavily discourage heavy functional use within performance dependent code. Every time function blah() is encountered, Lua has to allocate and make a new Closure instance of the function, and every time a {blah...} is encountered, Lua has to allocate and make a new table.

Generally, Lua is not compile-time optimized enough for this style of code to make it into production being used everywhere without some sort issue. I’d stick to imperative/procedural for most things.

RegularTetragon · November 10, 2018, 4:21pm

Like I said the pure functional implementation is solely to show that it can be done, not that it’s a good idea in Roblox specifically. Map, filter, reduce, and partition when implemented imperitively are hardly less efficient than a respective for loop, and are much easier to reason about because they don’t change the original data structure. Performance and clarity are often inversely proportional, and sometimes it is important to choose one above the other, this is for that second case. Also having worked with functional code and having used it in my game, it’s far easier to read, and much more terse than equivalent imperitive code. Shared mutable state is the root of all evil.

EmeraldSlash · November 10, 2018, 9:48pm

Also having worked with functional code and having used it in my game, it’s far easier to read, and much more terse than equivalent imperitive code.

This is probably subjective because to understand your “high-level” code you need to understand the “low-level” code (in this context high-level is the use cases and low-level is the code like the four functions you based this topic around). It looks pretty at the high level but at the low-level reading and understanding any of those four functions is pretty hard for someone who is reading it with no background related to the code (me).

Anyway, this is pretty interesting. I want to try out doing some exclusively functional programming someday but I won’t be doing it in Lua hah.

FieryEvent · November 10, 2018, 10:16pm

That looks very interesting, I may look into it one day, I didn’t know functional programming existed until now, thanks.

I may have found one bug here, you’re indexing i here but it doesn’t seems to be defined

for i=1, #i

RegularTetragon · November 10, 2018, 11:13pm

Oh yeah, thanks! don’t write tutorials at 2 AM!

RegularTetragon · November 29, 2018, 8:39am

You should take a look at Haskell! It’s a beautiful language that has a syntax that more properly supports this kind of stuff, plus it’s lazy evaluated which lets it optimize your code to an insane degree. When it compiles it’ll completely reorder your code, automatically parallelize things, throw out things which are never used, automatically store old results and retrieve them instead of computing them again, etc.

Like if you want to use my map here to get a list of powers of 2 it’s really bulky

map({1,2,3,4,5,6,7,8,9,10}, function (x) return 2^x end)
but in Haskell it's

map (2^) [1..10]

and because Haskell is lazily evaluated it doesn’t actually compute that until it needs to be shown, which lets you construct infinite lists
i.e.

map (2^) [1..]

Which is a perfectly valid list, just don’t go asking for the last element.

EmeraldSlash · November 29, 2018, 8:11pm

So that infinite list part would return powers of 2 infinitely? Nice.

And I’ll check it out, thanks for the suggestion

TheCraneStyle · February 21, 2020, 5:30pm

I like what you’ve done here, I know I may be a cop out, but I use a lib called lume. It comes with very optimized lua map/reduce/filtering ect. =) game on my friends