elixir

# Algebraic Data Types in Elixir

Gints Dreimanis on

Elixir is a dynamically-typed language. Types in Elixir are checked when a program runs, not when it compiles. If they don’t match up, an exception is thrown.

In statically-typed languages, types are checked during compile time. This can help us write code that is correct, understandable, and refactorable.

But it also introduces a certain focus on types as the foundation for your application. One interesting concept is to use types to model your business domain. In languages like Haskell, F#, and OCaml, this is usually done with algebraic data types (ADTs) — they build compound data types by aggregating types with product (AND) and sum (OR) types.

With the help of Dialyzer, a static analysis tool, you can use ADTs to constrain the number of your application's allowed states. This decreases the chance that errors will slip in.

In this article, we'll cover Dialyzer, ADTs, and how you can solve the problem of illegal states by using them together.

Let's get going!

## Type Declarations in Elixir with Dialyzer

In Elixir (and other BEAM languages), checking type specifications is usually done with Dialyzer.

Dialyzer is different from the type systems of Haskell, OCaml, or even TypeScript. Instead of you proving to the compiler that your code is correct, Dialyzer needs to prove to you that your code is not correct.

This makes Dialyzer rather lax in requirements. If there is a way for the types to work, Dialyzer will assume you know what you are doing, and the types will, indeed, work.

But it’s still useful to reason about types and catch the occasional mistake.

## Quick Intro to Dialyzer in Elixir

You can use Dialyzer in Elixir by adding a type spec for your functions via @spec.

elixir
#          (1)      (2)           (3)
@spec plus_one(integer) :: integer

def plus_one(x), do: x + 1
end

Here, we just write the name of the function (1) with its type as an argument (2), then the return type of the function (3).

You can find a list of basic types to use in Elixir's documentation.

You can also create your own type aliases via @type. To do that, you need to provide the alias' name (1) and its type (2).

elixir
#         (1)        (2)
@type counter :: integer

If you use the Elixir language server, that’s all you need to do: your plugin/extension will notify you about any specs that Dialyzer doesn’t like. Otherwise, you need to run the mix dialyzer task to check your types.

You can find more about the use of Dialyzer on Elixir's website and in Elixir's typespec docs.

Now that we can type spec our Elixir code, let’s dig into algebraic data types.

## Algebraic Data Types

While the name sounds scary (ooh, algebra 👻), algebraic data types are relatively simple.

This section will focus on the two main parts of ADTs: product (AND) and sum (OR) types.

### Product Types

Product types are all around us. A product type is just a type with two or more fields that each hold a data type. You can also think of them as AND types.

For example, a tuple is a product type:

elixir
@type tuple(a, b) :: {a, b}

It takes two data types — a and b — and returns a type that holds both of these data types.

In Elixir, we also use product types with named fields — structs.

elixir
defmodule Person do
defstruct first_name: "Gints", last_name: "Dreimanis"
end

Let's look at the type in this example:

elixir
@type t() :: %__MODULE__{
first_name: String.t(),
last_name: String.t()
}

Generally, you can think of types as being sets of possible values. For example, a boolean has two possible values: :true and :false. A type for a traffic light color has one of three possible values: :green, :yellow, and :red.

If we have two types with sizes a and b, then the product type of those types will contain a * b values — the product of the sizes of those types. If you make a product type of the boolean and traffic light types — e.g., put a boolean and traffic light in a tuple — you'll have 2 * 3 = 6 possible values.

elixir
{:green, :true}
{:yellow, :true}
{:red, :true}
{:green, :false}
{:yellow, :false}
{:red, :false}

### Sum Types

A historically less common kind of type is sum types.

In contrast to a product type, a sum type gives you one of two (or more) options. You can also think of them as OR types.

We use these in Elixir as well. For example, result tuples are sum types.

elixir
@type result(a, b) :: {:error, a} | {:ok, b}

In a result tuple, we can have an error of type a or success of type b.

Or, for example, we can have a sum type for optional values.

elixir
@type optional(a) :: :error | {:ok, a}

But it can also just be used for making lists of alternatives.

elixir
@type direction :: :north | :west | :south | :east

If you list two types in a sum type, the resulting type can pick a type from one set of values or the other. Because of this, its size is generally the sum of the sizes of those types.

The above might not always be true when using Dialyzer: you can put together two sets that overlap. For the statement to be true, they need to be tagged with the set they come from — the result type we defined above is a prime example.

That’s what people usually mean by algebraic data types.

### There's More to Algebraic Data Types

By putting together sums and products, we have something akin to the algebra that we know and love from our school days: multiplication, sums, and variables.

Of course, there is more to algebraic types than just this: there’s also recursion, exponentials, etc. If you want to delve deeper into the subject, check out how they look in languages like Haskell.

## How Algebraic Data Types Help in Domain Modeling

We Elixir programmers usually don’t think in terms of sum types. The primary tool for modeling a domain in Elixir is the struct, a product type with named fields.

While that's good enough for most things, sometimes it is beneficial to use sum types as well.

Let’s look at an example.

### Custom Kanban Board

Let’s imagine we need to create a representation of a customized Kanban board issue.

Our issues can be in one of these states:

• Searching for assignee: in this case, it should have neither an assignee nor a reviewer
• Not started yet: in this and the following state, it should have an assignee but not a reviewer
• In progress
• In review: in this and the following state, it should have an assignee and a reviewer
• Done

All issues also have names and descriptions.

At first, one might be tempted to use a simple struct.

elixir
defmodule Issue do
defstruct name: "",
description: "",
state: :searching_for_assignee
assignee: nil,
reviewer: nil
end

But as we saw in the section on product types, a simple product type has a lot of possible values, some of which might not match our requirements.

For example, we can create an issue that searches for an assignee but still has an assignee and a reviewer.

sh
iex(1)> %Issue{name: "wrong issue", description: "not good at all", state: :searching_for_assignee, assignee: "Jorge Luis Borges", reviewer: "Gabriel García Márquez"}
%Issue{
assignee: "Jorge Luis Borges",
description: "not good at all",
name: "wrong issue",
reviewer: "Gabriel García Márquez",
state: :searching_for_assignee
}

While it is possible to avoid doing that generally, it is simpler to make it impossible. We have a saying for this: “make illegal states unrepresentable”.

To do that, we need to create a sum type that covers all the states that we want to allow. It will enable us to cut out some of the wrong states by substituting products of values with sums of values.

First, let’s combine the state, assignee, and reviewer fields into one field: state.

elixir
defstruct name: "",
description: "",
state: :searching_for_assignee

After that, let’s define a sum type for state that will contain our specified options.

Let’s look at them again. Our issue can be in one of these states:

• Searching for assignee
• Not started yet but with an assignee
• In progress and with an assignee
• In review with an assignee and reviewer
• Done, with the assignee and reviewer left for the historical record

It’s quite easy to define a type that reads almost like this:

elixir
@type state ::
:searching_for_assignee
| {:not_started, String.t()}
| {:in_progress, String.t()}
| {:in_review, String.t(), String.t()}
| {:done, String.t(), String.t()}

So that it's simpler to understand, we can create aliases for the assignee and reviewer.

elixir
@type assignee :: String.t()
@type reviewer :: String.t()

Now, the type looks exactly like our list of rules.

elixir
@type state ::
:searching_for_assignee
| {:not_started, assignee}
| {:in_progress, assignee}
| {:in_review, assignee, reviewer}
| {:done, assignee, reviewer}

All that's left is to create a type specification for the module struct (Issue) by using our state type.

elixir
@type t() :: %__MODULE__{
name: String.t(),
description: String.t(),
state: state
}

Here’s the complete module code:

elixir
defmodule Issue do
defstruct name: "",
description: "",
state: :searching_for_assignee

@type assignee :: String.t()
@type reviewer :: String.t()
@type state ::
:searching_for_assignee
| {:not_started, assignee}
| {:in_progress, assignee}
| {:in_review, assignee, reviewer}
| {:done, assignee, reviewer}

@type t() :: %__MODULE__{
name: String.t(),
description: String.t(),
state: state
}
end

Now we can test whether this type specification can stop us from making a logic error.

We’ll create a function that adds a reviewer to the issue, but we’ll slip a bug inside: it will not change the state of the issue. We’ll also add a type spec.

elixir
@spec add_assignee(Issue.t(), assignee) :: Issue.t()
def add_assignee(%{state: :searching_for_assignee} = issue, assignee_name) do
%{issue | state: {:searching_for_assignee, assignee_name}}
end

Dialyzer will correctly return a type error here:

sh
lib/issue.ex:21:invalid_contract
The @spec for the function does not match the success typing of the function.

Function:

Success typing:
@spec add_assignee(%{:state => :searching_for_assignee, _ => _}, _) :: %{
:state => {:searching_for_assignee, _},
_ => _
}

It’s a bit cryptic, but it basically means that Issue.add_assignee didn’t compile, and we should look into it! 🙃

As you can see, algebraic data types have saved us from making an error. Turns out, they’re not really a scary monster but a friend.

## Benefits of Algebraic Data Types for Your Elixir App

Adopting algebraic data types for your Elixir applications is a two-step decision process.

The first step is to choose to use Dialyzer and typespecs. Dialyzer gives most of the benefits that any language with static typing would:

• It's easier to catch errors you've made in code.
• Types give extra information about code: what it does and what values it operates. This is helpful when trying to understand the code.
• Once you have written your code, types can ensure the code still does the same thing (similar to tests) so it is easier to refactor.

Once you've adopted Dialyzer for your codebase, thinking in terms of algebraic data types should come naturally, and give a few benefits:

• As we have seen in the example, sum types in particular let you cut down the possible states and make illegal states unrepresentable.
• Having AND and OR in your vocabulary helps you build compound types in a way that's intuitive and understandable even to non-developers (domain experts).

Of course, ADTs are just one tool for software correctness — definitely not a panacea. But in general, ADTs are a helpful concept for anyone working with an Elixir codebase that uses Dialyzer.

If your codebase doesn't use Dialyzer, your first goal should be to introduce it, which is a much larger undertaking than changing how you do types when writing type specifications. That undertaking is unfortunately out of the scope of this article.

In this post, we explored how to check type specifications with Dialyzer, before focusing on the two main parts of ADTs: product (AND) and sum (OR) types.

We then turned to the use of ADTs in domain modeling and finally touched on the benefits of ADTs.

If you want to explore algebraic data types, the best option is to get accustomed to a statically-typed functional language such as Haskell, OCaml, or F#.

Each of those languages has excellent materials for using types as the foundation of your code. The resource that would vibe the best with Elixir developers is probably the aforementioned Domain Modeling Made Functional since it also covers domain-driven design, which can confuse new Phoenix developers.

Elixir also has a library that emulates algebraic data types called Algae, which we used in one of our previous articles — Functional Programming in Elixir with Witchcraft. In general, Algae provides better tools than Dialyzer to build up a tower of abstractions from simple building blocks. Brave readers might find it rewarding to experiment with.

Until next time, happy coding!

P.S. If you'd like to read Elixir Alchemy posts as soon as they get off the press, subscribe to our Elixir Alchemy newsletter and never miss a single post!

# Gints Dreimanis

Our guest author Gints is a writer and editor at Serokell Blog. He's excited about making computer science and math concepts accessible to all.

Become our next author!