
# Understanding Associations in Elixir's Ecto

Andrew Scorpil

Data modeling in Ecto takes a bit of getting used to, especially for developers who have mostly worked with traditional "heavy" ORMs.

For many novice Ecto users, association-related operations are the first stumbling block. Ecto provides several functions for establishing and modifying associations between records, each tailored to a particular use case. Judging from the number of questions about cast_assoc, put_assoc and build_assoc on Stack Overflow and in other online communities, choosing the right one is often challenging, especially if the user is not yet accustomed to the technical terminology in Ecto's official documentation.

The goal of this post is to give a short but definitive answer to such questions in a few of the most common (and simplest) scenarios.

## Traditional "Heavy" ORMs vs. Ecto

Traditional ORMs take on the massively complicated task of "masking" data-related operations, giving developers the illusion of working with language-native data containers. To achieve that, ORMs often perform complex transformations behind-the-scenes, which sometimes leads to suboptimal database performance. In this sense, ORMs propose a tradeoff between the convenience of language-native syntax and the precision of hand-crafted SQL queries.

Ecto essentially offers the same tradeoff but leans much closer to the side of hand-crafted SQL queries. The core conceptual difference is that Ecto does not try to abstract away database operations; instead, it provides an Elixir syntax for crafting the SQL queries themselves. Ecto relies on the developer to format and validate data so that it conforms to the database schema, craft queries that use indexes efficiently, associate records with each other, and perform other tasks that an ORM would try to automate. The result is a somewhat steeper learning curve but also significantly more flexibility. For a more in-depth comparison between ActiveRecord and Ecto, check out this excellent ActiveRecord vs. Ecto post.

## Direct Casting of Association IDs: cast

Working with associations doesn't always have to be complex. In a situation where you have the target ID, Ecto lets you treat the relation column as a normal database field.

To give a concrete example, let's assume we work with two models, Post and Comment, where multiple comments can refer to a single post. In that case, your models would look something like this.

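    defmodule Blog.Post do
      use Ecto.Schema

      schema "post" do
        field :title, :string
        field :body, :string
        # Virtual link used for preloading; no column is added to "post".
        has_many :comments, Blog.Comment
      end
    end

    defmodule Blog.Comment do
      use Ecto.Schema

      schema "comment" do
        # Backed by the post_id column of the "comment" table.
        belongs_to :post, Blog.Post
        field :body, :string
      end
    end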

These models reflect the following table schemas in the database:

• post: id, title, body
• comment: id, post_id, body

Each table contains a primary key field, id, by default. The has_many field on Post does not refer to a database column; it only exists to tell Ecto that comments can be preloaded for a post through the comment's belongs_to field. The belongs_to field, on the other hand, refers to an existing column in the table. By default, the name of this column differs from the name in Ecto's model: the database column has _id at the end (post_id in this example).

Ecto lets you modify these kinds of association fields the same way you would modify any other field. In Ecto, filtering external parameters and converting them to the types of a schema's fields is called "casting". If you need to create a new comment for a particular post, you don't really need any of the association-specific functions; you can just cast the value of the foreign key:

    # params is a map such as %{"post_id" => 7, "body" => "Great story!"}
    comment
    |> cast(params, [:post_id, :body])

This is the simplest and most straightforward method of creating an association between two tables.
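To make the direct-casting approach concrete, here is a minimal sketch that runs without a database. It assumes Ecto is fetched through Mix.install, and the changeset/2 helper on Blog.Comment is a hypothetical name chosen for illustration, not something the post prescribes.

```elixir
# Assumption: Ecto is pulled in with Mix.install; no Repo is needed here.
Mix.install([{:ecto, "~> 3.10"}])

defmodule Blog.Post do
  use Ecto.Schema

  schema "post" do
    field :title, :string
    field :body, :string
  end
end

defmodule Blog.Comment do
  use Ecto.Schema
  import Ecto.Changeset

  schema "comment" do
    belongs_to :post, Blog.Post
    field :body, :string
  end

  # Hypothetical helper: the foreign key is cast like any other field.
  def changeset(comment, params) do
    comment
    |> cast(params, [:post_id, :body])
    |> validate_required([:post_id, :body])
  end
end

params = %{"post_id" => "7", "body" => "Great story!"}
changeset = Blog.Comment.changeset(%Blog.Comment{}, params)

# The string "7" is converted to the integer type of the post_id field.
IO.inspect(changeset.changes)
IO.inspect(changeset.valid?)
```

In a real application you would probably also pipe the changeset through foreign_key_constraint(:post_id), so that a reference to a nonexistent post surfaces as a changeset error rather than a raised database exception.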

## Casting Associations: cast_assoc

It is useful to think about cast_assoc as a special version of cast that works on associations. However, casting associations can be much more complex than casting normal fields. A changeset built with cast normally translates more or less directly into a single SQL query, while one built with cast_assoc might result in multiple INSERT, UPDATE or DELETE queries. Let's assume the database tables from the previous example contain the following content:

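##### Post

| id | title      | body                |
|----|------------|---------------------|
| 7  | A story... | Once upon a time... |

##### Comment

| id | post_id | body                   |
|----|---------|------------------------|
| 10 | 7       | Great story!           |
| 11 | 7       | What happened next?    |
| 12 | 7       | Thanks for the article |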

Since the Post model contains a has_many association to Comment, it's trivial to preload all comments on a particular post:

    Post
    |> Repo.get!(id)
    |> Repo.preload(:comments)

The shape of the returned data will be as follows:

    %Post{
      id: 7,
      title: "A story...",
      body: "Once upon a time...",
      comments: [
        %Comment{id: 10, post_id: 7, body: "Great story!"},
        %Comment{id: 11, post_id: 7, body: "What happened next?"},
        %Comment{id: 12, post_id: 7, body: "Thanks for the article"}
      ]
    }

A single cast_assoc call on :comments replaces the association as a whole. In effect, the values you pass to cast_assoc are the ones that future preload calls will return. This does not necessarily mean that all database rows are replaced. Ecto compares the before and after states and does the minimal amount of work required to reach the desired state. To illustrate that, consider the following changeset:

    # cast_assoc expects plain maps (as they arrive from a form or API), not structs.
    params = %{comments: [
      %{id: 11, body: "What happened next?"},
      %{id: 12, body: "Thank you for the post"},
      %{body: "Interesting"}
    ]}

    post
    |> cast(params, [])
    |> cast_assoc(:comments)

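Assuming the has_many :comments association is declared with on_replace: :delete (with the default, :raise, Ecto raises an error instead of removing a record), passing this changeset to Repo.update results in three calls to the database: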

• DELETE the comment with an id of 10
• UPDATE the comment with the id of 12 and set body to "Thank you for the post"
• INSERT a comment with a body "Interesting" and assign it a new id.

The row with the ID of 11 is left unchanged because it matches the preloaded values. An important thing to note here is that Ecto will not preload data on its own, so to make use of cast_assoc, you need to remember to call preload beforehand. However, you are not restricted to preloading a complete association. cast_assoc works just as well when you preload only a subset of records by passing a query, for example Repo.preload(post, comments: query). This is very useful for limiting the impact of cast_assoc to a subset of the associated records.
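The comparison that cast_assoc performs can be observed without a database by inspecting the nested changesets. The sketch below makes several assumptions: Ecto is fetched with Mix.install, the post is built by hand and marked as loaded with Ecto.put_meta as a stand-in for Repo.get! plus Repo.preload, and the association is declared with on_replace: :delete.

```elixir
# Assumption: Ecto is pulled in with Mix.install; no Repo is needed here.
Mix.install([{:ecto, "~> 3.10"}])

defmodule Blog.Comment do
  use Ecto.Schema
  import Ecto.Changeset

  schema "comment" do
    belongs_to :post, Blog.Post
    field :body, :string
  end

  # cast_assoc/3 calls changeset/2 on the associated schema by default.
  def changeset(comment, params), do: cast(comment, params, [:body])
end

defmodule Blog.Post do
  use Ecto.Schema

  schema "post" do
    field :title, :string
    field :body, :string
    # Assumption: on_replace: :delete allows omitted comments to be removed.
    has_many :comments, Blog.Comment, on_replace: :delete
  end
end

alias Blog.{Post, Comment}
import Ecto.Changeset

# Hypothetical helper: marks a hand-built struct as if it came from the database.
loaded = fn struct -> Ecto.put_meta(struct, state: :loaded) end

post =
  loaded.(%Post{
    id: 7,
    title: "A story...",
    body: "Once upon a time...",
    comments: [
      loaded.(%Comment{id: 10, post_id: 7, body: "Great story!"}),
      loaded.(%Comment{id: 11, post_id: 7, body: "What happened next?"}),
      loaded.(%Comment{id: 12, post_id: 7, body: "Thanks for the article"})
    ]
  })

params = %{comments: [
  %{id: 11, body: "What happened next?"},
  %{id: 12, body: "Thank you for the post"},
  %{body: "Interesting"}
]}

changeset =
  post
  |> cast(params, [])
  |> cast_assoc(:comments)

# Each nested changeset carries the action Ecto plans for that comment.
for child <- changeset.changes.comments do
  IO.inspect({child.action, child.data.id, child.changes})
end
```

Printing the actions this way is a cheap check, before anything touches the database, that a given set of parameters will insert, update, and remove the records you expect.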

## Defining Associations: put_assoc

At first glance, put_assoc is in many ways similar to cast_assoc: it also works on a whole association and requires you to preload the records to be updated. On closer examination, however, it turns out to be almost the opposite in the way you use it. The crucial distinction is that put_assoc is designed to update the association "references", not the data. That is, you would typically use put_assoc when you want to connect a record to one or more records that already exist in the database.

put_assoc can be used to associate a new comment with an existing post, similar to what we did in the "direct casting" section, but without using the post_id field directly:

    post = Repo.get!(Post, 7)
    # ...
    comment
    |> cast(params, [:body])
    |> put_assoc(:post, post)

This makes your code a little bit cleaner in cases with complex primary keys because Ecto does all the bookkeeping for you.
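The following sketch shows what such a changeset holds before it reaches the database. It assumes Ecto is fetched with Mix.install, and it builds the post by hand and marks it as loaded with Ecto.put_meta as a stand-in for Repo.get!(Post, 7); without that step, Ecto would treat the post as a new record to insert.

```elixir
# Assumption: Ecto is pulled in with Mix.install; no Repo is needed here.
Mix.install([{:ecto, "~> 3.10"}])

defmodule Blog.Post do
  use Ecto.Schema

  schema "post" do
    field :title, :string
    field :body, :string
  end
end

defmodule Blog.Comment do
  use Ecto.Schema

  schema "comment" do
    belongs_to :post, Blog.Post
    field :body, :string
  end
end

alias Blog.{Post, Comment}
import Ecto.Changeset

# Stand-in for Repo.get!(Post, 7): a struct flagged as already persisted.
post =
  %Post{id: 7, title: "A story...", body: "Once upon a time..."}
  |> Ecto.put_meta(state: :loaded)

changeset =
  %Comment{}
  |> cast(%{"body" => "Great story!"}, [:body])
  |> put_assoc(:post, post)

# The association is stored as a nested changeset wrapping the existing post.
IO.inspect(changeset.changes.post.action)
IO.inspect(changeset.changes.post.data.id)

# post_id is not part of the changes; Ecto fills it in when the comment is inserted.
IO.inspect(Map.has_key?(changeset.changes, :post_id))
```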

## Building Related Records: build_assoc

build_assoc is a convenience function that allows you to create related records through an association on an existing record. To continue our post/comment example, here is another way to create a new comment:

    post = Repo.get!(Post, 7)
    # ...
    comment_params = %{body: "Great story!"}

    # Builds a Comment struct with post_id taken from the post.
    comment = Ecto.build_assoc(post, :comments, comment_params)
    # %Comment{post_id: 7, body: "Great story!", ...}

The power of build_assoc is in its expressiveness: the code above clearly shows you that the comment belongs to the post.

Unlike the functions discussed above, build_assoc does not operate on a changeset. It takes a struct and returns a new, unsaved struct with the foreign key already set, which you can then wrap in a changeset and insert. This means you would only ever use build_assoc when you want to create a new record.
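As a minimal sketch of that flow, the example below builds a comment through the association and then wraps it in a changeset. It assumes Ecto is fetched with Mix.install and uses a hand-built post as a stand-in for Repo.get!(Post, 7); the final Repo.insert call is left out because no database is involved.

```elixir
# Assumption: Ecto is pulled in with Mix.install; no Repo is needed here.
Mix.install([{:ecto, "~> 3.10"}])

defmodule Blog.Comment do
  use Ecto.Schema

  schema "comment" do
    belongs_to :post, Blog.Post
    field :body, :string
  end
end

defmodule Blog.Post do
  use Ecto.Schema

  schema "post" do
    field :title, :string
    field :body, :string
    has_many :comments, Blog.Comment
  end
end

alias Blog.Post

# Stand-in for Repo.get!(Post, 7).
post = %Post{id: 7, title: "A story...", body: "Once upon a time..."}

# build_assoc returns a plain struct with the foreign key already filled in.
comment = Ecto.build_assoc(post, :comments, %{body: "Great story!"})
IO.inspect(comment.post_id)

# Further edits go through a changeset, which would then be passed to Repo.insert.
changeset = Ecto.Changeset.change(comment, body: "Great story, thanks!")
IO.inspect(changeset.changes)
```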

## Wrap Up

While this post doesn't begin to cover the variety of use-cases you might encounter in production applications, I hope it gives you a strong foundation from which to begin searching for an answer. And if you need a refresher in the future, here is a simple flowchart that will remind you of the discussed use cases:

Ecto's association functions are relatively thin abstractions over field references in a database. Understanding how each of those functions works at the database level is crucial to becoming an expert Elixir/Phoenix developer. Fortunately, Ecto is built in such a way that each function is relatively small, deterministic, and has a single purpose. After you have mastered the basics, you can expect fewer "gotchas" than with traditional ORMs (or at least that was the case in my experience). Happy coding!
