This is the second part of the "ActiveRecord vs. Ecto" series, in which Batman and Batgirl fight over querying databases and we compare apples and oranges.
After looking into database schemas and migrations in ActiveRecord vs. Ecto part one, this post covers how both ActiveRecord and Ecto enable developers to query the database, and how both ActiveRecord and Ecto compare when dealing with the same requirements. Along the way, we'll also find out Batgirl's 1989-2011 identity.
Seed data
Let's get started! Based on the database structure defined in the first post of this series, assume the users
and the invoices
tables have the following data stored in them:
users
id | full_name | created_at* | updated_at | |
---|---|---|---|---|
1 | Bette Kane | bette@kane.test | 2018-01-01 10:01:00 | 2018-01-01 10:01:00 |
2 | Barbara Gordon | barbara@gordon.test | 2018-01-02 10:02:00 | 2018-01-02 10:02:00 |
3 | Cassandra Cain | cassandra@cain.test | 2018-01-03 10:03:00 | 2018-01-03 10:03:00 |
4 | Stephanie Brown | stephanie@brown.test | 2018-01-04 10:04:00 | 2018-01-04 10:04:00 |
* ActiveRecord's created_at
field is named inserted_at
in Ecto by default.
invoices
id | user_id | payment_method | paid_at | created_at* | updated_at |
---|---|---|---|---|---|
1 | 1 | Credit Card | 2018-02-01 08:00:00 | 2018-01-02 08:00:00 | 2018-01-02 08:00:00 |
2 | 2 | Paypal | 2018-02-01 08:00:00 | 2018-01-03 08:00:00 | 2018-01-03 08:00:00 |
3 | 3 | 2018-01-04 08:00:00 | 2018-01-04 08:00:00 | ||
4 | 4 | 2018-01-05 08:00:00 | 2018-01-05 08:00:00 |
* ActiveRecord's created_at
field is named inserted_at
in Ecto by default.
Queries performed through this post assume that the data above is stored in the database, so keep this information in mind while reading it.
Find item using its primary key
Let's start with getting a record from the database using its primary key.
ActiveRecord
Ecto
Comparison
Both cases are quite similar. ActiveRecord relies on the find
class method of the User
model class. It means that every ActiveRecord child class has its own find
method in it.
Ecto uses a different approach, relying on the Repository concept as a mediator between the mapping layer and the domain. When using Ecto, the User
module has no knowledge about how to find itself. Such responsibility is present in the Repo
module, which is able to map it to the underneath datastore, which in our case is Postgres.
When comparing the SQL query itself, we can spot a few differences:
- ActiveRecord loads all the fields (
users.*
), while Ecto loads only the fields listed in theschema
definition. - ActiveRecord includes a
LIMIT 1
to the query, while Ecto doesn't.
Fetching all items
Let's go a step further and load all users from the database.
ActiveRecord
Ecto
Comparison
It follows the exact same pattern as the previous section. ActiveRecord uses the all
class method and Ecto relies on the repository pattern to load the records.
There are again some differences in the SQL queries:
- The same as the previous section, ActiveRecord loads all the fields (
users.*
), while Ecto loads only the fields listed in theschema
definition. - ActiveRecord also defines a
LIMIT 11
, while Ecto simply loads everything. This limit comes from theinspect
method used on the console (#L599).
Querying with conditions
It's very unlikely that we need to fetch all the records from a table. A common need is the use of conditions, to filter out the data returned.
Let's use that example to list all the invoices
which are still to be paid (WHERE paid_at IS NULL
).
ActiveRecord
Ecto
Comparison
In both examples, the where
keyword is used, which is a connection to the SQL WHERE
clause. Although the generated SQL queries are quite similar, the way how both tools get there have some important differences.
ActiveRecord transforms the paid_at: nil
argument to the paid_at IS NULL
SQL statement automatically. In order to get to the same output using Ecto, developers need to be more explicit about their intent, by calling the is_nil()
.
Another difference to be highlighted is the "pure" behaviour of the function where
in Ecto. When calling the where
function alone, it doesn't not interact with the database. The return of the where
function is a Ecto.Query
struct:
The database is only touched when the Repo.all()
function is called, passing the Ecto.Query
struct as argument. This approach allows query composition in Ecto, which is the subject of the next section.
Query composition
One of the most powerful aspects of database queries is composition. It describing a query in a way that contains more than a single condition.
If you are building raw SQL queries, it means you'll probably use some kind of concatenation. Imagine you have two conditions:
not_paid = 'paid_at IS NOT NULL'
paid_with_paypal = 'payment_method = "Paypal"'
In order to combine those two conditions using raw SQL, means you'll have to concatenate them using something similar to:
SELECT * FROM invoices WHERE #{not_paid} AND #{paid_with_paypal}
Luckily both ActiveRecord and Ecto have a solution for that.
ActiveRecord
Ecto
Comparison
Both queries are answering the same question: "Which invoices were paid and used Paypal?".
As already expected, ActiveRecord offers a more succinct way of composing the query (for that example), while Ecto requires developers to spend a bit more on writing the query. As usual, Batgirl (the Orphan, mute one with the Cassandra Cain identity) or Activerecord is not as verbose.
Don't be fooled by the verbosity and apparent complexity of the Ecto query shown above. In a real world environment, that query would be rewritten to look more like:
Seeing from that angle, the combination of the "pure" aspects of the function where
, which does not perform database operations by itself, with the pipe operator, makes query composition in Ecto really clean.
Ordering
Ordering is an important aspect of a query. It enables developers to ensure that a given query result follows a specified order.
ActiveRecord
Ecto
Comparison
Adding order to a query is straight-forward in both tools.
Although the Ecto example uses a Invoice
as first parameter, the order_by
function also accepts Ecto.Query
structs, which enables the order_by
function to be used in compositions, like:
Limiting
What would be a database without limit? A disaster. Luckily, both ActiveRecord and Ecto help to limit the number of returned records.
ActiveRecord
Ecto
Comparison
Both ActiveRecord and Ecto have a way of limiting the number of records returned by a query.
Ecto's limit
works similarly to order_by
, being suitable for query compositions.
Associations
ActiveRecord and Ecto have different approaches when it comes to how associations are handled.
ActiveRecord
In ActiveRecord, you can use any association defined in a model, without having to do anything special about that, for example:
The example above shows that we can get a list of the user invoices when calling user.invoices
. When doing so, ActiveRecord automatically queried the database and loaded the invoices that are associated with the user. While this approach makes things easier, in the sense of writing less code or having to worry about extra steps, it might be a problem if you are iterating over a number of users and fetching the invoices for each user. This issue is known as the "N + 1 problem".
In ActiveRecord, the proposed fix to the "N + 1 problem" is to use the includes
method:
In this case, ActiveRecord eager-loads the invoices
association when fetching the user (as seen in the two SQL queries shown).
Ecto
As you might already have noticed, Ecto really doesn't like magic or implicitness. It requires developers to be explicit about their intents.
Let's try the same approach of using user.invoices
with Ecto:
The result is a Ecto.Association.NotLoaded
. Not so useful.
To have access to the invoices, a developer needs to let Ecto know about that, using the preload
function:
Similarly to ActiveRecord includes
, the preload with fetch the associated invoices
, which will make them available when calling user.invoices
.
Comparison
Once again, the battle between ActiveRecord and Ecto ends up with a known-point: explicitness. Both tools enable developers to easily access associations, but while ActiveRecord makes it less verbose, the result of it might have unexpected behaviours. Ecto follows the WYSIWYG kind of approach, which only does what is seen in the query defined by the developer.
Rails is well-known for using and promoting caching strategies to all the different layers of the application. One example is about using the "Russian doll" caching approach, which relies entirely on the "N + 1 problem" for its caching mechanism to perform its magic.
Validations
Most validations present in ActiveRecord are also available in Ecto. Here's a list of common validations and how both ActiveRecord and Ecto define them:
ActiveRecord | Ecto |
---|---|
validates :title, presence: true | validate_required(changeset, [:title]) |
validates :email, confirmation: true | validate_confirmation(changeset, :email) |
validates :email, format: {with: /@/ } | validate_format(changeset, :email, ~r/@/) |
validates :start, exclusion: {in: %w(a b)} | validate_exclusion(changeset, :start, ~w(a b)) |
validates :start, inclusion: {in: %w(a b)} | validate_inclusion(changeset, :start, ~w(a b)) |
validates :terms_of_service, acceptance: true | validate_acceptance(changeset, :terms_of_service) |
validates :password, length: {is: 6} | validate_length(changeset, :password, is: 6) |
validates :age, numericality: {equal_to: 1} | validate_number(changeset, :age, equal_to: 1) |
Wrap up
There you have it: the essential apples versus oranges comparison.
ActiveRecord focuses on the ease of performing database queries. The great majority of its features are concentrated on the model classes themselves, not requiring developers to have a deep understanding of the database, nor the impact of such operations. ActiveRecord does lots of things implicitly by default. Although that makes it easier to get started, it makes it harder to understand what is happening behind the scenes and it only works if you follow the "ActiveRecord way".
Ecto, on the other hand, requires explicitness which results in more verbose code. As a benefit, everything is in the spotlight, nothing behind the scenes, and you can specify your own way.
Both have their upside depending on your perspective and preference. So having compared apples and oranges, we come to the end of this BAT-tle. Almost forgot to tell you BatGirl's codename (1989 - 2001) was .... Oracle. But let's not go into that. 😉