In the first part of this two-part series on memory leaks, we looked at how Ruby manages memory and how Garbage Collection (GC) works.
You might be able to afford powerful machines with more memory, and your app might restart often enough that your users don't notice, but memory usage matters.
Allocation and Garbage Collection aren't free. If you have a leak, you spend more and more time on Garbage Collection instead of doing what you built your app to do.
In this post, we'll look deeper into the tools you can use to discover and diagnose a memory leak.
Let's continue!
Finding Leaks in Ruby
Detecting a leak is simple enough. You can use `GC`, `ObjectSpace`, and the RSS graphs in your APM tool to watch your memory usage increase. But just knowing you have a leak is not enough to fix it. You need to know where it is coming from, and raw numbers can't tell you that.
Fortunately, the Ruby ecosystem has some great tools for attaching context to those numbers. Two of them are `memory_profiler` and `derailed_benchmarks`.
memory_profiler in Ruby
The `memory_profiler` gem offers a very simple API and a detailed (albeit a little overwhelming) report of allocated and retained memory, including the classes of the objects that are allocated, their sizes, and where they were allocated. It's straightforward to add to our leaky program and have it output a report.
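Here is a minimal sketch of what that might look like, assuming a leaky script along the lines of part one; the `an_array` name matches the discussion below, but the "ABC" strings and the loop are just an illustration, not the exact script:

```ruby
require "memory_profiler"

# an_array lives outside the report block, so everything pushed into it
# is still referenced when the report checks for retained objects.
an_array = []

report = MemoryProfiler.report do
  10_000.times { an_array << "ABC" }

  # Force a GC run before the block ends so the leaked strings
  # stand out in the retained sections of the report.
  GC.start
end

report.pretty_print
```

Calling `pretty_print` outputs the full report, with allocated and retained memory broken down by gem, file, location, and class.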
There is a lot of information here, but generally, the `allocated objects by location` and `retained objects by location` sections are the most useful when looking for leaks. These are the file locations that allocate objects, ordered by the number of allocated objects. Allocated objects are all objects allocated (created) within the `report` block. Retained objects are objects that have not been garbage collected by the end of the `report` block. We forced a GC run before the end of the block so we could see the leaked objects more clearly.
Be careful about trusting the retained object counts. They depend heavily on what portion of the leaking code is within the `report` block. For example, if we move the declaration of `an_array` into the `report` block, we might be fooled into thinking the code isn't leaky: the top of the resulting report won't show many retained objects (just the report itself).
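For example, here is a variation of the sketch above with the declaration moved inside the block (again, purely illustrative):

```ruby
report = MemoryProfiler.report do
  # an_array is now local to the block, so once the block finishes
  # nothing references the strings and they can usually be collected
  # before the retained counts are calculated.
  an_array = []
  10_000.times { an_array << "ABC" }
  GC.start
end

report.pretty_print
```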
derailed_benchmarks in Ruby
The `derailed_benchmarks` gem is a suite of very useful tools for all kinds of performance work, primarily aimed at Rails apps. For finding leaks, we want to look at `perf:mem_over_time`, `perf:objects`, and `perf:heap_diff`.
These tasks work by sending `curl` requests to a running app, so we can't add them to our little leaky program. Instead, we'll need to set up a small Rails app with an endpoint that leaks memory, then install `derailed_benchmarks` in that app.
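A minimal sketch of such an app follows, with the leaky endpoint mounted at the root path since that is what derailed hits by default. The controller name, the constant, and the "ABC" strings are assumptions chosen to line up with the reports quoted later, not the article's exact code:

```ruby
# Gemfile (excerpt)
gem "derailed_benchmarks"

# config/routes.rb
Rails.application.routes.draw do
  root "leaks#index"
end

# app/controllers/leaks_controller.rb
class LeaksController < ApplicationController
  # A constant is reachable from a root object (the class itself),
  # so nothing pushed into this array is ever garbage collected.
  LEAKED_STRINGS = []

  def index
    10_000.times { LEAKED_STRINGS << "ABC" }
    head :ok
  end
end
```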
You should now be able to boot the app with `bin/rails s` and `curl` an endpoint that leaks on each request.
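With the hypothetical route above, every `curl http://localhost:3000/` request pushes another batch of strings into the constant, so memory only ever goes up.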
We can now use `derailed_benchmarks` to see our leak in action.
perf:mem_over_time
This will show us memory use over time (similar to how we watched the memory growth of our leaky script with `watch` and `ps`). Derailed will boot the app in production mode, repeatedly hit an endpoint (`/` by default), and report the memory usage. If it never stops growing, we have a leak!
Note: Derailed will boot the Rails app in production mode to perform the tests. By default, it will also require `rails/all` first. Since we don't have a database in this app, we need to override this behavior with `DERAILED_SKIP_ACTIVE_RECORD=true`.
We can run this benchmark against different endpoints to see which one(s), if any, leak.
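Assuming the setup sketched above, the invocation would look something like `DERAILED_SKIP_ACTIVE_RECORD=true bundle exec derailed exec perf:mem_over_time`, and the `PATH_TO_HIT` environment variable can point it at a different endpoint than `/`.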
perf:objects
The `perf:objects` task uses `memory_profiler` under the hood, so the report it produces will look familiar.
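Against our sketch app, something like `DERAILED_SKIP_ACTIVE_RECORD=true bundle exec derailed exec perf:objects` produces it.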
This report can help narrow down where your leaked memory is being allocated. In our example, the report's last section, the Retained String Report, tells us exactly what our problem is: we've leaked 10,000 strings containing "ABC" from the `LeaksController` on line 3. In a non-trivial app, this report would be significantly larger and contain retained strings that you do want to retain (query caches, etc.), but this and the other 'by location' sections should help you narrow down your leak.
perf:heap_diff
The `perf:heap_diff` benchmark can help if the report from `perf:objects` is too complex to show where your leak is coming from. As the name suggests, `perf:heap_diff` produces three heap dumps and calculates the difference between them. It creates a report that includes the types of objects retained between dumps and the locations that allocated them.
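As before, `DERAILED_SKIP_ACTIVE_RECORD=true bundle exec derailed exec perf:heap_diff` runs it against the sketch app.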
You can also read Tracking a Ruby memory leak in 2021 to understand better what's going on.
The report points us exactly where we need to go for our leaky baby app. At the top of the diff, we see 999991 retained string objects allocated from the `LeaksController` on line 3.
Leaks in Real Ruby and Rails Apps
Hopefully, the examples we've used so far have never been put into real-life apps; after all, no one intends to leak memory!
In non-trivial apps, memory leaks can be much harder to track down. Retained objects are not always bad — a cache with garbage collected items would not be of much use.
There is something common to all leaks, though: somewhere, a root-level object (a class, a global, etc.) holds a reference to an object that should have been released.
One common example is a cache without a limit or an eviction policy. By definition, this will leak memory since every object put into the cache will remain forever. Over time, this cache will occupy more and more of the memory of an app, with a smaller and smaller percentage of it actually in use.
Consider the following code that fetches a high score for a game; it's similar to something I've seen in the past. The lookup is expensive, and we can easily bust the cache when the score changes, so we want to cache it.
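A hedged reconstruction of that kind of code (the class, method names, and the `Score` model are assumptions, not the original):

```ruby
class HighScores
  @scores = {} # class-level hash, reachable from a root object (the class) forever

  class << self
    # Returns the cached high score for a game/user pair,
    # computing it at most once per pair.
    def for(game_id, user_id)
      @scores[[game_id, user_id]] ||= expensive_high_score_lookup(game_id, user_id)
    end

    # Called whenever a new high score is set, so cached values never go stale.
    def bust(game_id, user_id)
      @scores.delete([game_id, user_id])
    end

    private

    def expensive_high_score_lookup(game_id, user_id)
      # Stand-in for the slow query or request described above.
      Score.where(game_id: game_id, user_id: user_id).maximum(:points)
    end
  end
end
```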
The `@scores` hash is completely unchecked. It will grow to hold every single high score for every user, which is not ideal if you have a lot of either.
In a Rails app, we would probably want to use `Rails.cache` with a sensible expiry instead (a memory leak in Redis is still a memory leak!). In a non-Rails app, we want to limit the hash size, evicting the oldest or least recently used items; `LruRedux` is a nice implementation.
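For instance, a sketch of the `Rails.cache` version, where the key layout and the 12-hour expiry are arbitrary choices and `Score` is the same hypothetical model as above:

```ruby
# In a Rails app, lean on the framework's cache and give entries an expiry.
def high_score_for(game_id, user_id)
  Rails.cache.fetch(["high_score", game_id, user_id], expires_in: 12.hours) do
    Score.where(game_id: game_id, user_id: user_id).maximum(:points)
  end
end
```

Outside Rails, an LRU cache with a hard size limit achieves a similar bound without needing an expiry.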
A more subtle version of this leak is a cache with a limit, but whose keys are of arbitrary size. If the keys themselves grow, so too will the cache. Usually, you won't hit this. But, if you're serializing objects as JSON and using that as a key, double-check that you're not serializing things that grow with usage as well — such as a list of a user's read messages.
Circular References
Circular references can be garbage collected. Garbage Collection in Ruby uses the "Mark and Sweep" algorithm. During their presentation introducing variable width allocation, Peter Zhu and Matt Valentine-House gave an excellent explanation of how this algorithm works.
Essentially, there are two phases: marking and sweeping.
- In the marking phase, the garbage collector starts at root objects (classes, globals, etc.), marks them, and then looks at the objects they reference. It then marks all of those referenced objects; referenced objects that are already marked are not looked at again. This continues until there are no more objects to look at, i.e., all reachable objects have been marked.
- The garbage collector then moves on to the sweeping phase. Any object that was not marked is cleaned up.
Therefore, an object can be cleaned up even if other objects still reference it. As long as no root object (directly or indirectly) references it, it will be collected. In this way, clusters of objects with circular references can still be garbage collected.
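A small sketch of that behavior (the `Node` class is just for illustration):

```ruby
class Node
  attr_accessor :other
end

a = Node.new
b = Node.new
a.other = b
b.other = a # a and b now reference each other

a = nil
b = nil # no root-level reference remains to either object

GC.start
# Both nodes are now eligible for collection despite their circular
# references, because the mark phase never reaches them from a root.
```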
Application Performance Monitoring: The Event Timeline and Allocated Objects Graph
As mentioned in the first part of this series, any production-level app should use some form of Application Performance Monitoring (APM).
Many options are available, including rolling your own (only recommended for larger teams). One key feature you should get from an APM is the ability to see the number of allocations an action (or background job) makes. Good APM tools will break this down, giving insight into where allocations come from — the controller, the view, etc.
This is often called something like an 'event timeline.' Bonus points if your APM allows you to write custom code that further breaks down the timeline.
Consider the following code for a Rails controller.
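Something along these lines, say, a hypothetical action that builds a JSON response in Ruby (`ReportsController` and the `Report` model are inventions for illustration):

```ruby
class ReportsController < ApplicationController
  def index
    # Builds the whole response in Ruby, allocating a pile of short-lived
    # strings and hashes along the way.
    rows = Report.limit(1_000).map do |report|
      { name: report.name.titleize, summary: report.summary.truncate(200) }
    end

    render json: rows
  end
end
```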
When reported by an APM, the 'event timeline' might look something like the following screenshot from AppSignal.
This can be instrumented so we can see which part of the code makes the allocations in the timeline. In real apps, it is probably going to be less obvious from the code 😅
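With AppSignal, for example, custom instrumentation can wrap the suspect block using `Appsignal.instrument`; here is a sketch that reuses the hypothetical controller above:

```ruby
class ReportsController < ApplicationController
  def index
    rows = nil

    # The wrapped block shows up as its own event in the timeline,
    # so its share of the allocations becomes visible.
    Appsignal.instrument("build_rows.reports") do
      rows = Report.limit(1_000).map do |report|
        { name: report.name.titleize, summary: report.summary.truncate(200) }
      end
    end

    render json: rows
  end
end
```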
Here's an example of an instrumented event timeline, again from AppSignal:
Knowing where to instrument can often be difficult. There's no substitute for really understanding your application's code, but there are some signals that can serve as 'smells'.
If your APM surfaces GC runs or allocations over time, you can look for spikes to see if they match up with certain endpoints being hit or certain running background jobs. Here's another example from AppSignal's Ruby VM magic dashboard:
By looking at allocations in this way, we can narrow down our search when looking into memory problems. This makes it much easier to use tools like `memory_profiler` and `derailed_benchmarks` efficiently.
Read about the latest additions to AppSignal's Ruby gem, like allocation and GC stats tracking.
Wrapping Up
In this post, we dived into some tools that can help find and fix memory leaks: `memory_profiler`, `derailed_benchmarks` (with its `perf:mem_over_time`, `perf:objects`, and `perf:heap_diff` tasks), and the event timeline and allocated objects graph in AppSignal.
I hope you've found this post, alongside part one, useful in diagnosing and sorting out memory leaks in your Ruby app.
Read more about the tools we used:
- memory_profiler
- derailed_benchmarks

Additional detailed reading:
- GC module documentation
- ObjectSpace module documentation
- Garbage Collection Deep Dive
- Variable Width Allocation
Happy coding!
P.S. If you'd like to read Ruby Magic posts as soon as they get off the press, subscribe to our Ruby Magic newsletter and never miss a single post!