Evaluating Elixir Functions - Part 1

Spiral staircase

When I first read about generators in JavaScript, I assumed, “magic.” Reading the spec is interesting but to get to the guts of how they are implemented, I would have to dig into some C/C++ code. When I wanted to learn about how Elixir implements Streams, I found it very easy to get an understanding how one can lazily iterate over a range.

Why did I find it so easy? This is, in part, due to the functional world in which we operate. You change state by calling a function, not by making an assignment. No longer are we thinking in steps but rather in a sequence of operations. This greatly frees our mind to focus on the domain of the problem rather than focusing too much on organizing and taming the domain.

Although the docs provide a good example why composable lazy streams are efficient for large data sets, I figured we could start with a basic example.

stripes = Stream.cycle([:red, :white, :blue])
stripes |> Stream.take(6_000_000) |> Enum.to_list # [:red, :white, :blue, :red, :white, :blue, ...]

Looking at the above code, you may come up with a few ways one can build up a list like this. If you take it at the source code, you will learn pretty quickly how it works. In this article, I will introduce some broader functional programming concepts as well as dive into how one iteration of a stream is implemented.

Stream.unfold/2

Stream.unfold/2 is a cool utility function that is based on a functional idea called anamorphism — a sequence generated by applying a defined function to the previous result until a terminating condition is received. In fact, the term unfold is related to a standard operator in functional programming called fold, a family of higher order functions that is the basis for all functional programming. In the most familiar sense, this can also be replaced by the word reduce. Stated as defined, it is a function that analyzes a recursive data structure by recombining each analyzed part into a return value. Here is a paper providing explanations and proofs of the power of fold.

With this background, we can simply state Stream.unfold/2 analyzes a recursive data structure and generates successive values until told not to, returning an Enumerable type. unfold/2 also is the basis for other methods in the Stream module like iterate/2, interval/1 and cycle/1. Let’s look at the implementation in parts (as of Elixir 1.8.1) by generating a Fibonacci sequence.

Side note — if you are curious about these functions, iex is one of the most useful tools in our belt when looking for what a specific function does. If you have Elixir installed on your machine, type iex on the command line, and you can find all sorts of documentation. For example, by typing h Enum., hit tab and see which methods exist on the Enum module. Once you want to dig a little further, you can see what the docstring outputs with h Enum.dedup. Another fun note is to just hit tab when you enter the iex shell, and you can explore all of the Kernel methods that exist. Alright, back to the code.

Setup of a Stream with a fibonacci sequence

# generates a fibonacci sequence
# the details of this aren't super helpful. Just here for some setup
Stream.unfold({0, 1}, fn
  {a, b} -> {a, {b, a + b}} # return a tuple, with `a` being returned to the consumer and {b, a + b} is the state for the next iteration
end)
  1. From unfold/2 below, we return an anonymous function after passing in an initial accumulator and a function. We will get back to the anonymous function later but take note of the value placeholders &1 and &2. As we will see, this will eventually be our new accumulator and a function to generate each successive value.
  @spec unfold(acc, (acc -> {element, acc} | nil)) :: Enumerable.t()
  def unfold(next_acc, next_fun) do
    &do_unfold(next_acc, next_fun, &1, &2)
  end

This anonymous function can be rewritten as

fn one, two ->
  do_unfold(next_acc, next_fun, one, two)
end
  1. Jumping into do_unfold/4, you can see our end state is nil, so we stop execution with a :done instruction. If I told you these instructions have no special meaning outside of your program other than to inform state for the next operation, then I would be lying. In fact, no longer are we limited to just returning an accumulator. In Elixir, we can return :cont, :halt and :suspend. End state instructions include :done, :halted and :suspended and are utilized in Elixir. This is important knowledge for other scenarios unfold/2 can cover as noted at the end.
  defp do_unfold(next_acc, next_fun, {:cont, acc}, fun) do
    case next_fun.(next_acc) do
      nil -> {:done, acc}
      {v, next_acc} -> do_unfold(next_acc, next_fun, fun.(v, acc), fun)
    end
  end
  1. Also notice we recursively apply do_unfold/4 above. This is where things get interesting. If we were to follow the path of generating a Fibonacci sequence to this point, we should remember that we haven’t actually executed anything. Computations are only performed when you call a method from the Enum module. Streams are efficient because we delay executing work until the last possible step, side stepping multiple iterations over a list as found in similar Enum implementations. In this case, we will call Enum.to_list/1.
Stream.unfold({0, 1}, fn
  {a, b} -> {a, {b, a + b}}
end)
|> Enum.to_list()

Execution of a Stream with the Enum module

  1. Now onto how streams actually execute. It isn’t magic but some good ol’ functional recursion.

Let’s evaluate the anonymous function returned from Stream.unfold/2 that will be passed to Enum.to_list/1. The Stream documentation states it like so.

  Note the functions in this module are guaranteed to return enumerables.
  Since enumerables can have different shapes (structs, anonymous functions,
  and so on), the functions in this module may return any of those shapes
  and this may change at any time. For example, a function that today
  returns an anonymous function may return a struct in future releases.

An anonymous function is technically an enumerable. After we perform the initial operation, we get a result that we can pass to to_list/1.

result = #Function<64.66371359/2 in Stream.unfold/2>

enum = Enum.to_list(result)

So we got back a function in Stream.unfold/2 with an arity of 2. Only when we actually call Enum.to_list/1 do we calculate the next Fibonacci number from the previous number. As we noted before, fold is a general functional programming term and reduce is the basis for all this work. Coincidentally, by tracking to_list(stream), we eventually come across a method reduce. You can see the stream’s final resting place below.

Let’s work our way backwards starting at the end of the Enum.to_list/1 chain with Enumerable.reduce/3. Our anonymous function returned earlier from Stream.unfold/2 is function in this context.

defimpl Enumerable, for: Function do
  ...

  def reduce(function, acc, fun) when is_function(function, 2), do: function.(acc, fun) # fun = `fn x, acc -> {:cont, build_acc_function.(x, acc)}

  ...
end

What is fun? Let’s jump back to what called Enumerable.reduce/3. fun is a function that returns {:cont, build_acc_function.(x, acc)} when called. Since functions in Elixir are first class citizens, it is simply passed through via function.(acc, fun) above.

defp reduce_enumerable(enumerable, acc \\ [], build_acc_function \\ &[&1 | &2]) do
  Enumerable.reduce(enumerable, {:cont, acc}, fn x, acc -> {:cont, build_acc_function.(x, acc)} end) |> elem(1)
end
  • Note I changed the 3rd argument to build_acc_function and added default values to reduce confusion and show what initial state this method is populated with.

When do we actually call build_acc_function from the reduce method above? Let’s go back to do_unfold/4.

defp do_unfold(next_acc, next_fun, {:cont, acc}, fun) do
  case next_fun.(next_acc) do
    nil -> {:done, acc}
    {v, next_acc} -> do_unfold(next_acc, next_fun, fun.(v, acc), fun)
  end
end

In this scope, the 3rd and 4th arguments were the value placeholders &1 and &2 passed from Enumerable.reduce/3. {:cont, acc} is &1, the accumulator that is holding our state, while fun (&2) is our passed in function from reduce_enumerable/3 and returns {:cont, build_acc_function.(x, acc)} with build_acc_function being &[&1 | &2]. This is our general way of adding new items to the list with v being the new item.

fun = &[&1 | &2]
fun.(1, [2])
# [1, 2]

If we are lucky, we add the new value to the accumulator and return {:cont, new_acc}, calling do_unfold/4 again. This is how the list is built up from Stream.unfold/2. To state another way, by passing our initial anonymous do_unfold/4 function to Enum.to_list/1, we eventually just jump back to the Stream module and and recursively execute do_unfold/3 until we return nil, our :done state. In this case, we should use Enum.take(10) or else it will infinitely recurse. Cool, right?

Stream.unfold({0, 1}, fn
  {a, b} -> {a, {b, a + b}}
end)
|> Stream.map(&IO.inspect(&1))
|> Enum.take(10)

We only generate the first 10 results we need.

  1. Some alternative cases are provided via instructions if the function you provide returns :suspend or :halt.
  defp do_unfold(next_acc, next_fun, {:suspend, acc}, fun) do
    {:suspended, acc, &do_unfold(next_acc, next_fun, &1, fun)}
  end

  defp do_unfold(_next_acc, _next_fun, {:halt, acc}, _fun) do
    {:halted, acc}
  end

Recap

So understanding Elixir streams isn’t hard; it just needs a bit of function logic. Of course, the inner workings may change, but in the meantime, I find it useful to understand these patterns and recognize that it isn’t anything I can’t learn.

To recap, fold (aka reduce) is the building block of many higher order functions in functional programming. Stream.unfold/2 is an interesting function that allows us to solve problems that require generating sequential values based on logic we give it. We learned about streams and how they work — via functions and recursion. Surprised, right? However, Stream.unfold/2 is implemented differently than Stream.map/2. In follow up posts, I will dig into these differences, as well as Enum.sort/2 and Enum.sort_by, both of which bring lots of useful knowledge. Exciting Elixir posts ahead so keep an eye out.

DockYard is a digital product agency offering exceptional strategy, design, full stack engineering, web app development, custom software, Ember, Elixir, and Phoenix services, consulting, and training.

Newsletter

Stay in the Know

Get the latest news and insights on Elixir, Phoenix, machine learning, product strategy, and more—delivered straight to your inbox.

Narwin holding a press release sheet while opening the DockYard brand kit box