# Iteration, Recursion, and Tail-call Optimization in Elixir Jeff Kreeftmeijer on

Elixir uses recursion for looping over data structures like lists. Although most of it’s hidden away through higher-level abstractions on the `Enum` module, in some cases writing a recursive function yourself can be more performant. In this episode of Elixir Alchemy, we’ll try to find some of these cases.

Along the way, we’ll learn about body-recursive and tail-recursive functions. Although the latter is often touted as being faster, we’ll also look at cases where that’s not necessarily true. Let’s dive right in!

## Enumeration

Elixir provides functions on the `Enum` module to enumerate over collections.

 ```1 2 3 4 5``` ```def sum_numbers1(list) do list |> Enum.filter(&is_number/1) |> Enum.reduce(0, fn n, sum -> n + sum end) end ```

This example takes a list and returns the sum of all numbers in that list. Because there might be non-numeric items in the input list, it uses `Enum.filter/2` to select only the items that `is_number/1` returns true on. After that, the remaining values are added together through `Enum.reduce/3`.

 `1` ```sum_numbers1([1, 2, "three", 4, %{}]) # => 7 ```

While this solution works, it iterates over the whole list twice to get to the result. To make this more performant, we might instead choose to write a recursive function that can do this in one go.

## Recursion

A recursive function can do both tasks (removing the non-numeric items and adding the rest together) in one run. It will call itself for every item in the list, and use pattern matching to decide what to do with each item.

 ```1 2 3 4 5 6 7 8 9 10 11 12 13 14``` ```# 1 def sum_numbers2([head | tail]) when is_number(head) do sum_numbers2(tail) + head end # 2 def sum_numbers2([_head | tail]) do sum_numbers2(tail) end # 3 def sum_numbers2([]) do 0 end ```

Instead of using the functions provided by the `Enum` module, this solution calls itself until the list is empty. It does so using three function heads:

1. When the first item in the list (the head) is a number, we’ll call `sum_numbers2/1` with the rest of list (anything but the head, called the tail). This call returns a number we’ll add our current head to.
2. When the head is not a number, we’ll drop it by calling the `sum_numbers2/1` function with just the tail.
3. When the list is empty, we’ll return 0.
 `1` ```sum_numbers2([1, 2, "three", 4, %{}]) # => 7 ```

When calling this function with `[1, 2, "three", 4, %{}]`, the `sum_numbers2/1` function is called repeatedly in the following order to get to the result:

• `sum_numbers2([1, 2, "three", 4, %{}])`
• `sum_numbers2([2, "three", 4, %{}]) + 1`
• `sum_numbers2(["three", 4, %{}]) + 1 + 2`
• `sum_numbers2([4, %{}]) + 1 + 2`
• `sum_numbers2([%{}]) + 1 + 2 + 4`
• `sum_numbers2([]) + 1 + 2 + 4`
• `0 + 1 + 2 + 4`

This list shows the how the function is simplified until we get to a result, which shows the function being called six times. Once for each item, plus once more for the empty list at the end.

Because the function calls itself for each iteration, each item in the list causes a stack frame to be added to the call stack. That’s because each item warrants a call to a function, which needs to be executed to claim its result. Adding a stack frame to the call stack requires some memory to be allocated, and can cause some overhead.

## Tail recursion and tail-call optimization

To keep the memory footprint to a minimum, some languages—like Erlang and thus Elixir—implement tail-call optimization. With a small rewrite of our code, we can prevent the stack frame being added and that memory allocated.

 ```1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19``` ```# 1 def sum_numbers3(list) do do_sum_numbers3(list, 0) end # 2 defp do_sum_numbers3([head | tail], sum) when is_number(head) do do_sum_numbers3(tail, sum + head) end # 3 defp do_sum_numbers3([_head | tail], sum) do do_sum_numbers3(tail, sum) end # 4 defp do_sum_numbers3([], sum) do sum end ```

This example is yet another implementation of the function from before. However, this example is tail-recursive, meaning it doesn’t need to await a call to itself before continuing. It does so by keeping an accumulator (`sum`) that keeps the current value instead of stacking function calls.

1. The `sum_numbers3/1` function is the public function, which calls the private `do_sum_numbers3/2` function with the passed list and an accumulator of `0` to start with.
2. Much like the previous example, `do_sum_numbers3/2`’s first function head matches lists with a numeric head. It calls itself with the list’s tail, and the head added to the accumulator.
3. When the head is not a number, the third function head drops it by calling itself with the tail and the current accumulator, without changing anything.
4. Finally, the exit clause returns the accumulator.

By calling itself at the end of the function and passing the accumulator with all calculated values, the Erlang VM can execute a trick called tail-call optimization, which allows it to circumvent adding new stack frames. Instead, it uses the current frame by jumping back up to the beginning of the function.

Tail call optimization allows functions calling themselves without running into stack problems.

 `1` ```def sleep, do: sleep ```

This example implements a function that continually calls itself. Without tail-call optimization, this function would produce a stack error.

## Which is faster?

We’ve written three implementations of the same function. The first used `Enum.filter/2` and `Enum.reduce/3` to filter and sum the list, and iterated over the list twice. To fix that, we used a recursive function that did it in one go, which we further improved using tail-call optimization.

 ```1 2 3 4 5 6 7 8 9 10 11 12 13``` ```letters = Enum.map(?a..?z, fn x -> <> end) numbers = Enum.to_list(0..10_000) list = Enum.shuffle(letters ++ numbers) Benchee.run( %{ "Enum.filter/2 and Enum.reduce/3" => fn -> Recursion.sum_numbers1(list) end, "Body-recursive" => fn -> Recursion.sum_numbers2(list) end, "Tail-recursive" => fn -> Recursion.sum_numbers3(list) end }, time: 10, memory_time: 2 ) ```

To benchmark our solutions, we’ll use Benchee. We’ll prepare a shuffled list of strings and numbers for each function to sum for ten seconds.

 ```1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16``` ```Name ips average deviation median 99th % Tail-recursive 107.35 K 9.32 μs ±21.39% 9 μs 14 μs Body-recursive 55.65 K 17.97 μs ±662.77% 14 μs 34 μs Enum.filter/2 and Enum.reduce/3 20.86 K 47.94 μs ±51.46% 46 μs 85 μs Comparison: Tail-recursive 107.35 K Body-recursive 55.65 K - 1.93x slower Enum.filter/2 and Enum.reduce/3 20.86 K - 5.15x slower Memory usage statistics: Name Memory usage Tail-recursive 0 B Body-recursive 0 B Enum.filter/2 and Enum.reduce/3 16064 B ```

In this test on this machine, we see that the body-recursive version is almost three times as fast as the original implementation that used `Enum.filter/2` and `Enum.reduce/3`. The tail-recursive version was almost twice as fast ad the body-recursive one.

## Is tail-call recursion always faster?

While the results of that benchmark look quite convincing, tail-recursion isn’t always faster than body recursion. In fact, that’s one of the 7 myths of Erlang performance.

While tail-recursive calls are usually faster for list reductions—like the example we’ve seen before—body-recursive functions can be faster in some situations. That’s often true when mapping over a list, where the function takes a list and returns a list without reducing it to a single value.

 ```1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35``` ```def double_numbers1(list) do list |> Enum.filter(&is_number/1) |> Enum.map(fn n -> n * 2 end) end def double_numbers2([]) do [] end def double_numbers2([head | tail]) when is_number(head) do [head * 2 | double_numbers2(tail)] end def double_numbers2([_head | tail]) do double_numbers2(tail) end def double_numbers3(list) do list |> do_double_numbers3([]) |> Enum.reverse() end defp do_double_numbers3([], acc) do acc end defp do_double_numbers3([head | tail], acc) when is_number(head) do do_double_numbers3(tail, [head * 2 | acc]) end defp do_double_numbers3([_head | tail], acc) do do_double_numbers3(tail, acc) end ```

This example is a lot like the example we’ve seen before, but instead of adding all numbers together and reducing them to a single number, the `double_numbers/1` function doubles all numbers and returns them in a list.

Like before, we have three implementations. The first uses `Enum.filter/2` and `Enum.map/2` to iterate over the list twice, the second is body-recursive and the last is tail-recursive. Before returning, the tail-recursive function needs to reverse the list, because the values get prepended to the accumulator.

Check out the example project for a comparison of more possible implementations, like list comprehensions and a tail-recursive version that doesn’t reverse the list.

 ```1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16``` ```Name ips average deviation median 99th % body-recursive 36.86 K 27.13 μs ±39.82% 26 μs 47 μs tail-recursive 27.46 K 36.42 μs ±1176.74% 27 μs 80 μs Enum.filter/2 and Enum.map/2 12.62 K 79.25 μs ±194.81% 62 μs 186 μs Comparison: body-recursive 36.86 K tail-recursive 27.46 K - 1.34x slower Enum.filter/2 and Enum.map/2 12.62 K - 2.92x slower Memory usage statistics: Name Memory usage body-recursive 15.64 KB tail-recursive 20.09 KB - 1.28x memory usage Enum.filter/2 and Enum.map/2 31.33 KB - 2.00x memory usage ```

When running a benchmark for each of these implementations, we can see that the body-recursive version is fastest in this case. Having to reverse the list in the tail-recursive version negates the improved performance tail-call optimization adds in this specific case.

## As always, it depends

This concludes our look into recursion in Elixir. As a rule of thumb; tail-recursive functions are faster if they don’t need to reverse the result before returning it. That’s because that requires another iteration over the whole list. Tail-recursive functions are usually faster at reducing lists, like our first example.

Which is the best approach depends on the situation. For mission-critical loops, writing a benchmark to measure which solution is faster is usually your best bet, as it’s not immediately obvious which will perform better.

To learn more about recursion, also be sure to read this overview of the available methods, and when to use which. If you liked this article, don’t forget to subscribe to the mailing list if you’d like to read more Elixir Alchemy!

# 10 latest Elixir articles

Go back # Elixir Alchemy

A true alchemist is never done exploring. And neither are we. Sign up for our Elixir Alchemy email series and receive deep insights about Elixir, Phoenix and other developments.