This post was updated on 9 August 2023 with a code walkthrough for a sample app.
Data modeling in Ecto takes a bit of getting used to, especially for developers who have mostly been working with traditional "heavy" ORMs.
For many novice Ecto users, association-related operations become the first stumbling block. Ecto provides multiple functions for establishing and modifying associations between records, each tailored to a particular use case. Judging from the number of questions about cast_assoc, put_assoc and build_assoc on StackOverflow and other online communities, choosing the right one can often be challenging, especially if the user is not yet accustomed to the technical terminology in Ecto's official documentation.
The goal of this post is to give a short but definitive answer to such questions in a few of the most common (and simplest) scenarios.
Traditional "Heavy" ORMs VS Ecto
Traditional ORMs take on the massively complicated task of "masking" data-related operations, giving developers the illusion of working with language-native data containers. To achieve that, ORMs often perform complex transformations behind the scenes, which sometimes leads to suboptimal database performance. In this sense, ORMs propose a tradeoff between the convenience of language-native syntax and the precision of hand-crafted SQL queries.
Ecto essentially provides the same tradeoff but leans much closer to the side of hand-crafted SQL queries. The core conceptual difference is that Ecto does not intend to abstract away database operations. Instead, it provides an Elixir syntax for crafting the SQL queries themselves. Ecto relies on the developer to format and validate the data to conform with the database schema, craft queries that use indexes efficiently, associate records with each other, and perform other tasks that an ORM would try to automate. The result is a somewhat steeper learning curve, but also significantly increased flexibility. For a more in-depth comparison between ActiveRecord and Ecto, check out this excellent ActiveRecord vs. Ecto post.
Moving forward, we'll create a simple Elixir blog app which we'll use throughout the article.
Creating an Example Elixir App
In your terminal, create a new Elixir app like so:
Then open up mix.exs and add the following dependencies to enable us to work with Ecto:
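A minimal sketch of that dependency list, assuming the project was generated as blog with a supervision tree (for example via mix new blog --sup). The version constraints are assumptions borrowed from Ecto's getting started guide rather than values from this post, so adjust them to whatever is current:

```elixir
# mix.exs (excerpt): only the deps/0 function is shown
defp deps do
  [
    # Ecto's SQL layer: repos, migrations, and query execution
    {:ecto_sql, "~> 3.0"},
    # PostgreSQL driver used by Ecto.Adapters.Postgres
    {:postgrex, ">= 0.0.0"}
  ]
end
```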
Setting Up the Repo
One thing to note is that Ecto has adapters for PostgreSQL, MySQL and SQL Server by default. Here, we are adding the PostgreSQL adapter (we assume you already have PostgreSQL installed).
Next, run mix deps.get to install the newly added dependencies. Then we need to create a repo which forms the point of contact between the app and the database:
Go ahead and edit the automatically created database configuration:
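A sketch of what that configuration might look like, assuming the repo was generated with mix ecto.gen.repo -r Blog.Repo. The database name and credentials below are placeholders, not values from this post:

```elixir
# config/config.exs
import Config

# Connection settings for the repo; replace the placeholder
# credentials with the ones for your local PostgreSQL instance.
config :blog, Blog.Repo,
  database: "blog_repo",
  username: "postgres",
  password: "postgres",
  hostname: "localhost"
```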
To ensure the Ecto process is started when the app starts, we need to add the repo to the app's supervisor like so:
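A sketch of the application module, assuming the default layout that mix new generates with the --sup flag and a repo module named Blog.Repo:

```elixir
# lib/blog/application.ex
defmodule Blog.Application do
  use Application

  @impl true
  def start(_type, _args) do
    children = [
      # Start the Ecto repo so its connection pool is available to the app
      Blog.Repo
    ]

    opts = [strategy: :one_for_one, name: Blog.Supervisor]
    Supervisor.start_link(children, opts)
  end
end
```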
Next, we add the Blog repo to config.exs:
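Assuming the OTP app is named :blog and the repo module is Blog.Repo, the entry would look roughly like this. It tells the mix ecto.* tasks which repos to operate on:

```elixir
# config/config.exs (in addition to the connection settings)
import Config

# Repos that mix ecto.create, mix ecto.migrate, etc. should manage
config :blog, ecto_repos: [Blog.Repo]
```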
And finally, run mix ecto.create to create the database and finalize the setup process. Next, let's create a migration for a Post model that we'll use in the following steps.
Adding Ecto Migrations
Run the commands below to create the Post and associated Comment migrations:
This will generate two time-stamped migrations in priv/repo/migrations which you should edit as shown below, starting with the Post:
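A sketch of the Post migration, assuming it was generated with mix ecto.gen.migration create_posts (the migration name is an assumption). The columns follow the posts table used in the examples later in this post; the :text type for body is a guess:

```elixir
# priv/repo/migrations/<timestamp>_create_posts.exs
defmodule Blog.Repo.Migrations.CreatePosts do
  use Ecto.Migration

  def change do
    # An auto-incrementing id primary key is added by default
    create table(:posts) do
      add :title, :string
      add :body, :text
    end
  end
end
```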
Also edit the Comment migration as follows:
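A matching sketch for comments, assuming a migration generated with mix ecto.gen.migration create_comments (again, an assumed name). The index on post_id is an addition that is commonly recommended for foreign keys, not something this post prescribes:

```elixir
# priv/repo/migrations/<timestamp>_create_comments.exs
defmodule Blog.Repo.Migrations.CreateComments do
  use Ecto.Migration

  def change do
    create table(:comments) do
      add :body, :text
      # Foreign key column pointing at posts.id
      add :post_id, references(:posts)
    end

    # Speeds up lookups of all comments for a given post
    create index(:comments, [:post_id])
  end
end
```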
With that done, run the migrations with mix ecto.migrate. Finally, to use our app we need a structured way to query the database using Ecto schemas.
Adding Ecto Schemas
Although it's possible to use schema-less queries, let's avoid having to manually construct queries every time we need to use them.
We'll add the first schema for posts. Create a new file at lib/blog/post.ex with the following contents:
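A minimal sketch of the Post schema, with fields taken from the posts table used later in this post. The on_replace: :delete option is an assumption: it is what allows cast_assoc to delete comments that are missing from the new data, and without it Ecto raises when an existing comment would be replaced:

```elixir
# lib/blog/post.ex
defmodule Blog.Post do
  use Ecto.Schema

  schema "posts" do
    field :title, :string
    field :body, :string

    # Virtual association: there is no comments column on the posts table
    has_many :comments, Blog.Comment, on_replace: :delete
  end
end
```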
We won't go into the details of how to build schemas for now. If you need to dig into the topic further, check out the Ecto schema docs.
Next, create another schema to handle comments at lib/blog/comment.ex and edit it as below:
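A minimal sketch of the Comment schema under the same assumptions (a body column and a post_id foreign key):

```elixir
# lib/blog/comment.ex
defmodule Blog.Comment do
  use Ecto.Schema

  schema "comments" do
    field :body, :string

    # Defines both the :post association and the :post_id field
    belongs_to :post, Blog.Post
  end
end
```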
And with that, the app is ready. Let's now use it to dive straight into learning more on Ecto associations.
Direct Casting of Association IDs: cast
Working with associations doesn't always have to be complex. In a situation where you have the target ID, Ecto lets you treat the relation column as a normal database field.
To give a concrete example, let's assume we work with two models, Post and Comment, where multiple comments can refer to a single post. In that case, your models would look something like this.
These models reflect the following table schemas in the database:
Each table contains a primary key field id by default. The has_many field on Post does not refer to a database field; it only exists to hint to Ecto that it's possible to preload comments for a post using the comment's belongs_to field.
The belongs_to field, on the other hand, refers to an existing field in a table schema. By default, the name of this field in a table is different from the name in Ecto's model: the database field has _id at the end (so the post association is stored in a post_id column).
Ecto lets you modify these kinds of association fields the same way you would modify any other field. In Ecto, changing the value of a primitive field is called "casting". If you need to create a new comment for a particular post, you don't really need any of the association-specific functions. You can just cast the post's primary key into the comment's post_id field:
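A small sketch of this approach, written as a standalone script (run it with elixir, outside the Mix project) so it works without a database. The schemas are trimmed-down copies of the app's, and the changeset/2 function is a hypothetical helper rather than code from this post:

```elixir
Mix.install([{:ecto, "~> 3.10"}])

defmodule Blog.Post do
  use Ecto.Schema

  # Trimmed-down copy: the has_many side is not needed for this demo
  schema "posts" do
    field :title, :string
    field :body, :string
  end
end

defmodule Blog.Comment do
  use Ecto.Schema
  import Ecto.Changeset

  schema "comments" do
    field :body, :string
    belongs_to :post, Blog.Post
  end

  # Hypothetical helper: post_id is cast like any other field
  def changeset(comment, params) do
    comment
    |> cast(params, [:body, :post_id])
    |> validate_required([:body, :post_id])
  end
end

# Params as they might arrive from a form: the post id is a string
changeset =
  Blog.Comment.changeset(%Blog.Comment{}, %{"body" => "Great story!", "post_id" => "7"})

# post_id has been cast to the integer 7 alongside body
IO.inspect(changeset.changes)
```

In the app itself, passing this changeset to Blog.Repo.insert/1 would write the comment with its post_id already set.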
This is the simplest and most straightforward method of creating an association between two tables.
Casting Associations: cast_assoc
It is useful to think about cast_assoc as a special version of cast that works on associations. However, casting associations can be much more complex than casting normal fields. The cast call normally translates more or less directly into a single SQL query, while cast_assoc might result in multiple INSERT, UPDATE or DELETE queries. Let's assume the database tables from the previous example contain the following content:
Post:
id | title | body |
---|---|---|
7 | A story... | Once upon a time... |
Comment:
id | post_id | body |
---|---|---|
10 | 7 | Great story! |
11 | 7 | What happened next? |
12 | 7 | Thanks for the article |
Since the Post model contains a has_many association to Comment, it's trivial to preload all comments on a particular post:
The shape of the returned data will be as follows:
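A sketch of the preload call and the resulting shape, assuming the Blog.Post, Blog.Comment and Blog.Repo modules from the sample app. Blog.Posts and its function are hypothetical names used only for illustration:

```elixir
# lib/blog/posts.ex (hypothetical helper module)
defmodule Blog.Posts do
  alias Blog.{Post, Repo}

  # Fetches the post, then loads its comments with a second query
  def get_post_with_comments!(id) do
    Post
    |> Repo.get!(id)
    |> Repo.preload(:comments)
  end
end

# Shape of Blog.Posts.get_post_with_comments!(7) for the sample rows above
# (metadata omitted; comment order is not guaranteed without an order_by):
#
# %Blog.Post{
#   id: 7,
#   title: "A story...",
#   body: "Once upon a time...",
#   comments: [
#     %Blog.Comment{id: 10, post_id: 7, body: "Great story!"},
#     %Blog.Comment{id: 11, post_id: 7, body: "What happened next?"},
#     %Blog.Comment{id: 12, post_id: 7, body: "Thanks for the article"}
#   ]
# }
```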
A single cast_assoc call on :comments will replace the association as a whole. In effect, this means that the values you pass to cast_assoc will be returned in future preload calls. This does not necessarily mean that all database rows are replaced. Ecto compares the before and after states and does the minimal amount of work required to reach the desired state. To illustrate that, consider the following changeset:
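A sketch of such a changeset as a standalone script, so the planned actions can be inspected without a database. It makes several assumptions: the post is built in memory as a stand-in for a preloaded record, the has_many uses on_replace: :delete (Ecto raises on replaced records by default), and Comment uses a plain post_id field instead of belongs_to to keep the script free of forward references:

```elixir
Mix.install([{:ecto, "~> 3.10"}])

defmodule Blog.Comment do
  use Ecto.Schema
  import Ecto.Changeset

  schema "comments" do
    field :body, :string
    field :post_id, :id
  end

  # cast_assoc calls changeset/2 on the related schema by default
  def changeset(comment, params), do: cast(comment, params, [:body])
end

defmodule Blog.Post do
  use Ecto.Schema

  schema "posts" do
    field :title, :string
    field :body, :string
    has_many :comments, Blog.Comment, on_replace: :delete
  end
end

# In-memory stand-in for a post fetched with its comments preloaded
post = %Blog.Post{
  id: 7,
  title: "A story...",
  body: "Once upon a time...",
  comments: [
    %Blog.Comment{id: 10, post_id: 7, body: "Great story!"},
    %Blog.Comment{id: 11, post_id: 7, body: "What happened next?"},
    %Blog.Comment{id: 12, post_id: 7, body: "Thanks for the article"}
  ]
}

# Comment 10 is omitted, 11 is unchanged, 12 is edited, one comment is new
params = %{
  comments: [
    %{id: 11, body: "What happened next?"},
    %{id: 12, body: "Thank you for the post"},
    %{body: "Interesting"}
  ]
}

changeset =
  post
  |> Ecto.Changeset.cast(params, [])
  |> Ecto.Changeset.cast_assoc(:comments)

# Map each comment id to the action Ecto plans for it
actions = Map.new(changeset.changes.comments, fn c -> {c.data.id, c.action} end)
IO.inspect(actions)
```

The :replace action is how the changeset marks a record that the repo will delete under on_replace: :delete. Comment 11 carries an :update action with no changes, so no query is issued for it when the changeset is passed to Blog.Repo.update/1.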
Executing this changeset results in three calls to the database:
- DELETE the comment with an id of 10
- UPDATE the comment with the id of 12 and set body to "Thank you for the post"
- INSERT a comment with a body "Interesting" and assign it a new id.
The row with the ID of 11 was left unchanged because it matches the preloaded values. An important thing to note here is that Ecto will not preload data on its own, so to make use of cast_assoc, you need to remember to call preload beforehand. However, you are not restricted to preloading a complete association. cast_assoc will work just as well when you preload only a subset of records by passing a query, as in Repo.preload(post, comments: query). This feature is very useful for limiting the impact of cast_assoc to a subset of associated records.
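A sketch of a partial preload, assuming the app's Blog.Post, Blog.Comment and Blog.Repo modules. Blog.PartialPreload and the choice to filter by comment id are hypothetical; any Ecto query over comments would work the same way:

```elixir
# lib/blog/partial_preload.ex (hypothetical helper module)
defmodule Blog.PartialPreload do
  import Ecto.Query, only: [from: 2]
  alias Blog.{Comment, Post, Repo}

  # Loads the post with only the listed comments, so a later cast_assoc
  # call compares against (and can only replace) that subset.
  def get_post_with_comments!(post_id, comment_ids) do
    subset = from c in Comment, where: c.id in ^comment_ids

    Post
    |> Repo.get!(post_id)
    |> Repo.preload(comments: subset)
  end
end
```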
Defining Associations: put_assoc
At first glance, put_assoc is in many ways similar to cast_assoc: it also works on a whole association and requires you to preload records to be updated. However, upon closer examination, it turns out to be almost opposite in the way you use it. The crucial distinction is that put_assoc is designed to update the association "references", not the data. That is, you would typically use put_assoc when you want to connect a record to one or more records that already exist in the database.
put_assoc can be used to associate a new comment with an existing post, similar to what we did in the "direct casting" section, but without using the post_id field directly:
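A standalone sketch of that idea. The post is built in memory and marked as loaded with Ecto.put_meta/2 to stand in for a record fetched from the database; the schemas are trimmed-down copies of the app's, and the comment body is made up for illustration:

```elixir
Mix.install([{:ecto, "~> 3.10"}])

defmodule Blog.Post do
  use Ecto.Schema

  schema "posts" do
    field :title, :string
    field :body, :string
  end
end

defmodule Blog.Comment do
  use Ecto.Schema

  schema "comments" do
    field :body, :string
    belongs_to :post, Blog.Post
  end
end

# Stand-in for Blog.Repo.get!(Blog.Post, 7): an existing, loaded post
post = Ecto.put_meta(%Blog.Post{id: 7, title: "A story..."}, state: :loaded)

changeset =
  %Blog.Comment{}
  |> Ecto.Changeset.cast(%{body: "Looking forward to part two"}, [:body])
  # Attach the whole post struct instead of setting post_id by hand
  |> Ecto.Changeset.put_assoc(:post, post)

post_change = Ecto.Changeset.get_change(changeset, :post)
IO.inspect({post_change.data.id, post_change.action})
```

Note that post_id does not appear in changeset.changes at this point. Ecto copies the key from the associated post when the comment is inserted through the repo.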
This makes your code a little bit cleaner in cases with complex primary fields because Ecto does all the bookkeeping for you.
Building Related Records: build_assoc
build_assoc is a convenience function that allows you to create related records through an association on an existing record. To continue our post/comment example, here is another way to create a new comment:
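A standalone sketch, again with an in-memory post standing in for one loaded from the database. Comment uses a plain post_id field instead of belongs_to here only to keep the script free of forward references:

```elixir
Mix.install([{:ecto, "~> 3.10"}])

defmodule Blog.Comment do
  use Ecto.Schema

  schema "comments" do
    field :body, :string
    field :post_id, :id
  end
end

defmodule Blog.Post do
  use Ecto.Schema

  schema "posts" do
    field :title, :string
    field :body, :string
    has_many :comments, Blog.Comment
  end
end

# Stand-in for a post fetched with Blog.Repo.get!/2
post = %Blog.Post{id: 7, title: "A story..."}

# Builds a %Blog.Comment{} struct with post_id copied from the post
comment = Ecto.build_assoc(post, :comments, body: "Interesting")
IO.inspect({comment.post_id, comment.body})
```

In the app, the resulting struct can be handed to Blog.Repo.insert/1 directly or wrapped in a changeset first if validations are needed.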
The power of build_assoc is in its expressiveness: the code above clearly shows you that the comment belongs to the post. Unlike the functions discussed above, build_assoc does not operate on a changeset. It takes an existing struct and returns a new related struct with the foreign key already set, which you can insert directly or pass to a changeset. This means you would only ever use build_assoc when you want to create a new record.
Wrap Up
While this post doesn't begin to cover the variety of use-cases you might encounter in production applications, I hope it gives you a strong foundation from which to begin searching for an answer. And if you need a refresher in the future, here is a simple flowchart that will remind you of the discussed use cases:
Ecto's association functions are relatively thin abstractions over field references in a database. Understanding how each of those functions works at the database level is crucial to becoming an expert Elixir/Phoenix developer. Fortunately, Ecto is built in such a way that each function is relatively small, deterministic, and has a single purpose. After you have mastered the basics, you can expect fewer "gotchas" compared to traditional ORMs (or at least in my experience that was the case). Happy coding!