Subscribe to
Ruby Magic
Magicians never share their secrets. But we do. Sign up for our Ruby Magic email series and receive deep insights about garbage collection, memory allocation, concurrency and much more.
ActiveRecord is Ruby on Rails’ most magical feature. We don’t usually need to worry about its inner workings, but when we do, here’s how AppSignal can help us know what’s going on under the hood.
To talk about ActiveRecord, we need to first think of frameworks, specifically about MVC frameworks. MVC stands for Model-View-Controller, and it’s a popular software design pattern for graphical and web applications.
MVC frameworks are composed of:
ActiveRecord is the model component in the Ruby in Rails framework. It introduces an abstraction layer between code and data, so we don’t have to write SQL code ourselves. Each model is mapped to one table and provides various methods to perform CRUD operations (Create, Read, Update and Delete).
Abstractions feel magical — they help us ignore details we don’t need to know and focus on the task at hand. But when things don’t work as expected, the complexity they add can make it harder to determine the root cause. AppSignal can give us a detailed breakdown of what’s really happening in Rails.
Let’s get a first look into the problem. Troubleshooting performance issues is an iterative process. Once you have your Rails application reporting to AppSignal, go to Incidents > Performance.
The Response Time graph will show the response time percentiles for each namespace. ActiveRecord events are automatically assigned to the namespace the request or background job the queries are executed in.
Next, look at the Event Group graph. It shows how much time is consumed by category. Check how much relative time is spent by active_record
. Check the usage in all your namespaces.
The graph will immediately tell us where we should focus our code-optimizing efforts.
While you’re in the performance graph dashboard, check the response time and the throughput to ensure there isn’t any higher-than-usual activity on your application.
Now that we’ve identified that the problem is data-bound let’s see if we can zoom in to determine the root cause.
Open the Improve > Slow Queries dashboard. This page shows the list of SQL queries ranked by impact on the overall time. All ActiveRecord-originated queries are shown as a sql.active_record
events.
Try clicking on the topmost query to see its details. The dashboard shows the average duration and the query text.
Scrolling below will show you the query’s response time in the last few hours and the originating action.
You might find that some of the actions have an associated incident. This means that AppSignal created a performance incident while the query was running, but it doensn’t necessarily mean that ActiveRecord is the cause of it.
Performance measurement incidents are opened when AppSignal records a new endpoint or background job.
Incidents are located on Performance > Issue List dashboard.
The incident page shows the elapsed time and number of allocations for each of the MVC components. ActiveRecord problems will exhibit long durations in the active_record
category.
The event timeline shows how the event progressed over time.
In this section, we’ll see how AppSignal can help us identify some common ActiveError issues.
Database wisdom says that we should always retrieve the columns needed for the job. For example, instead of SELECT * FROM people
we should SELECT first_name, surname, birthdate FROM people
. That’s all well and good, but how do we do it on Rails?
By default, ActiveRecord retrieves all columns.
1 2 3 | Person.all.each { # process data } |
Luckily, we have the select
method to pick and choose the required columns:
1 2 3 | Person.select(:name, :address, :birthdate).each { # process data } |
It may sound like I’m being obsessive about little details. But on wide tables, selecting all columns is just wasteful. You’ll notice that when this happens, ActiveRecord allocates big chunks of memory:
The N+1 problem happens when the application gets a set of records from the database and loops through it. This causes the application to execute N+1 queries, where N is the number of rows initially obtained. As you might imagine, this pattern scales poorly as the table grows. It’s such a damaging problem that AppSignal specifically warns you about it:
N+1 problems usually appear with associated models. Imagine we have a Person model:
1 2 3 | class Person < ApplicationRecord has_many :addresses end |
Each person can have many addresses:
1 2 3 | class Address < ApplicationRecord belongs_to :person end |
The most straightforward way of retrieving the data leads to the N+1 problem:
1 2 3 4 5 6 7 8 9 | class RelatedTablesController < ApplicationController def index Person.all.each do |person| person.addresses.each do |address| address.address end end end end |
You can see in AppSignal that the application is running a SELECT
per person:
The fix for this particular case is simple: use includes, which tells ActiveRecord to optimize the query for the needed columns:
1 2 3 4 5 6 7 8 9 | class RelatedTablesController < ApplicationController def index Person.all.includes(:addresses).each do |person| person.addresses.each do |address| address.address end end end end |
Now we have two queries instead of N+1:
1 2 3 4 5 6 7 | Processing by RelatedTablesController#index as HTML (0.2ms) SELECT sqlite_version(*) ↳ app/controllers/related_tables_controller.rb:12:in `index' Person Load (334.6ms) SELECT "people".* FROM "people" ↳ app/controllers/related_tables_controller.rb:12:in `index' Address Load (144.4ms) SELECT "addresses".* FROM "addresses" WHERE "addresses"."person_id" IN (1, 2, 3, . . .) |
This is sometimes confused with the N+1 problem, but it’s a bit different. When we query a table, we should retrieve all the data we think we’ll need to minimize read operations. There is, however, a lot of innocent-looking code that triggers redundant queries. For example, look how count
always results in a SELECT COUNT(*)
query in the following view:
1 2 3 4 5 6 7 | <ul> <% @people.each do |person| %> <li><%= person.name %></li> <% end %> </ul> <h2>Number of Persons: <%= @people.count %></h2> |
Now ActiveRecord does two queries:
1 2 3 4 5 | Rendering duplicated_table_query/index.html.erb within layouts/application (69.1ms) SELECT COUNT(*) FROM "people" WHERE "people"."name" = ? [["name", "John Waters"]] ↳ app/views/duplicated_table_query/index.html.erb:3 Person Load (14.6ms) SELECT "people".* FROM "people" WHERE "people"."name" = ? [["name", "John Waters"]] ↳ app/views/duplicated_table_query/index.html.erb:6 |
In AppSignal, the symptom you’ll notice is that there are two active_record
events on the same table:
The reality is that we don’t need two queries; we already have all the data we need in memory. In this case, the solution is to swap count
with size
:
1 2 3 4 5 6 7 | <ul> <% @people.each do |person| %> <li><%= person.name %></li> <% end %> </ul> <h2>Number of Persons: <%= @people.size %></h2> |
Now we have a single SELECT
, as it should be:
1 2 3 | Rendering duplicated_table_query/index.html.erb within layouts/application Person Load (63.2ms) SELECT "people".* FROM "people" WHERE "people"."name" = ? [["name", "Abdul Strosin"]] ↳ app/views/duplicated_table_query/index.html.erb:5 |
Another solution is to use preload to cache the data in memory.
Aggregation is used to compute a value based on a set of data. Databases are great at working with big datasets. This is what they do and what we use them for. On the other hand, using Rails to aggregate doesn’t scale since it requires getting all the records from the database, holding them in memory, and then calculating using high-level code.
We’re doing aggregation in Rails whenever we resort to Ruby functions like max
, min
, or sum
over ActiveRecord elements or other enumerables.
1 2 3 4 5 | class AggregatedColumnsController < ApplicationController def index @mean = Number.pluck(:number).sum() end end |
Fortunately, ActiveRecord models include specific methods that map to aggregation functions in the database. For example, the following query maps to SELECT SUM(number) FROM ...
, which is much faster and cheaper to run than the previous example:
1 2 3 4 5 6 7 | # controller class AggregatedColumnsController < ApplicationController def index @mean = Number.sum(:number) end end |
1 2 | Processing by AggregatedColumnsController#index as */* (2.4ms) SELECT SUM("numbers"."number") FROM "numbers" |
If you need more complex or combined aggregation functions, you may need to include a bit of raw SQL code:
1 2 | sql = "SELECT AVG(number), STDDEV(number), VAR(number) FROM ..." @results = ActiveRecord::Base.connection.execute(sql) |
SQL transactions ensure consistent and atomic updates. When we use a transaction to make a change, either every row is updated successfully, or the whole thing is rolled back. In any case, the database always remains in a consistent state.
We can bundle a batch of changes in a single transaction with ActiveRecord::Base.transaction
.
1 2 3 4 5 6 7 8 9 | class BigTransactionController < ApplicationController def index ActiveRecord::Base.transaction do (1..1000).each do Person.create(name: 'Les Claypool') end end end end |
There are a lot of legitimate cases for using large transactions. These, however, run into the risk of slowing the database down. Besides, transactions exceeding certain thresholds will result in the database rejecting them.
The first sign of transactions being too big is spending a lot of time on commit transaction
events:
Barring configuration issues on the database itself, the solution is to break down the transaction into smaller chunks.
Sometimes even though we search the code, we can’t find anything wrong with it. Then, there may be a problem in the database itself. Database engines are complex, and many things can go wrong: low memory, default settings, missing indexes, inconveniently-scheduled backup jobs. The picture can’t be complete unless we get information about the machine running the database.
If we’re running our own databases, we can install the standalone agent to capture host metrics. To learn more about how to use the standalone agent, read: Monitoring Any System with StatsD and AppSignal’s Standalone Agent.
The following signs might show something is going on with the database. Go to the Inspect > Host Metrics dashboard to see the resource usage in your server:
Diagnosing performance issues is never an easy task. Today, we’ve learned how to use Appsignal to quickly pinpoint the source of the problem.
Let’s keep learning about AppSignal and Ruby on Rails:
P.S. If you’d like to read Ruby Magic posts as soon as they get off the press, subscribe to our Ruby Magic newsletter and never miss a single post!
Our guest author Tomas spent 10 years working at IBM, where he did a bit of everything: development, service delivery, database administration, and cloud engineering. He’s now an independent consultant and a technical writer.