Practical Garbage Collection Tuning in Ruby

Guest Author

We have discovered that the below post is based on an article by Nate Berkopec from 2017 called ‘Understanding Ruby GC through GC.stat’. It appears that parts of this article were plagiarised, something we were unaware of until the original author mentioned it. We run all of our articles through a plagiarism tool before publishing, but it didn’t pick this up. We give huge apologies to Nate and our readers for this inadvertent error.

It is vital that you understand how garbage collection works in Ruby to stay in complete control of your app’s performance.

In this post, we will dive into how garbage collection works in Ruby and how to customize it.

Let’s get going!

The Ruby Garbage Collector Module

The Ruby Garbage Collector module is an interface to Ruby’s mark and sweep garbage collection mechanism.

While the garbage collector runs automatically in the background when needed, the GC module lets you trigger it manually whenever required and gain insight into how garbage collection cycles are running. The module also exposes some parameters that you can alter to tune performance.

Some of the most commonly used methods of this module are GC.start, GC.enable, GC.disable, GC.stat, and GC.count.
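A minimal sketch of how a few of these methods behave in irb. The particular selection (GC.count, GC.start, GC.stat, GC.disable, GC.enable) is an illustrative assumption rather than a complete list:

```ruby
# Illustrative tour of a few GC module methods (the selection is an assumption).
before = GC.count   # total number of GC runs so far in this process
GC.start            # request a full garbage collection right now
after = GC.count

puts "GC runs before: #{before}, after: #{after}"

# GC.stat returns a Hash; passing a Symbol returns just that one value.
puts "major GCs so far: #{GC.stat(:major_gc_count)}"

# GC.disable returns true only if GC was already disabled.
was_disabled = GC.disable
puts "GC was already disabled? #{was_disabled}"

# Re-enable GC so the rest of the session behaves normally.
GC.enable
```

Leaving the GC disabled for long stretches lets memory grow unchecked, so the sketch re-enables it immediately.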

Understanding Ruby Garbage Collector Parameters

To understand how Ruby’s GC works internally, let’s look at the GC module’s metrics. Run the following command in a freshly booted irb session:

puts GC.stat

You will notice that a bunch of numbers pop up on your screen, looking something like this:

{
    :count=>12,
    :heap_allocated_pages=>49,
    :heap_sorted_length=>49,
    :heap_allocatable_pages=>0,
    :heap_available_slots=>19975,
    :heap_live_slots=>19099,
    :heap_free_slots=>876,
    :heap_final_slots=>0,
    :heap_marked_slots=>16659,
    :heap_eden_pages=>49,
    :heap_tomb_pages=>0,
    :total_allocated_pages=>49,
    :total_freed_pages=>0,
    :total_allocated_objects=>66358,
    :total_freed_objects=>47259,
    :malloc_increase_bytes=>16216,
    :malloc_increase_bytes_limit=>16777216,
    :minor_gc_count=>10,
    :major_gc_count=>2,
    :remembered_wb_unprotected_objects=>191,
    :remembered_wb_unprotected_objects_limit=>312,
    :old_objects=>16024,
    :old_objects_limit=>23556,
    :oldmalloc_increase_bytes=>158824,
    :oldmalloc_increase_bytes_limit=>16777216
}

This hash holds all the information about how garbage collection has been happening in the runtime. Let’s examine each of these numbers in detail.

Counts in Ruby Garbage Collector

We’ll begin by describing these keys:

{
    :count=>12,
    #…
    :minor_gc_count=>10,
    :major_gc_count=>2,
}

These are GC counts, and they convey pretty straightforward information. minor_gc_count and major_gc_count are the counts of each type of garbage collection run.

There are two types of garbage collections in Ruby.

Minor GC refers to a garbage collection attempt that tries to garbage collect only those objects that are new, i.e., they have survived three or fewer garbage collection cycles.

On the other hand, major GC is a garbage collection attempt that tries to garbage collect all objects, even those that have survived more than three garbage collection cycles. count is the sum of minor_gc_count and major_gc_count.

Tracking the GC count can be helpful for a few reasons. You can figure out if a particular job or process always triggers GCs and the number of times it triggers them. It might not be 100% accurate in cases like multithreaded applications, but it is a good starting point to figure out where your memory is bleeding.
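One way to track this is to compare GC.stat before and after a piece of work. The sketch below is a minimal illustration; the helper name gc_runs_during and the string-allocating workload are hypothetical stand-ins for a real job:

```ruby
# Hypothetical helper: report how many minor and major GC runs happen
# while the given block executes.
def gc_runs_during
  before = GC.stat
  yield
  after = GC.stat
  {
    minor: after[:minor_gc_count] - before[:minor_gc_count],
    major: after[:major_gc_count] - before[:major_gc_count],
    total: after[:count] - before[:count]
  }
end

runs = gc_runs_during do
  # Stand-in workload: allocate many short-lived strings.
  200_000.times { |i| "job-#{i}" }
end

puts "minor: #{runs[:minor]}, major: #{runs[:major]}, total: #{runs[:total]}"
```

The counters are process-wide, so GC runs caused by other threads during the block are included in the result. This is why the numbers are only approximate in multithreaded applications.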

Heap Numbers: Slots and Pages

Next, let’s talk about these keys, also known as heap numbers:

{
    # page numbers
    :heap_allocated_pages=>49,
    :heap_sorted_length=>49,
    :heap_allocatable_pages=>0,

    # slots
    :heap_available_slots=>19975,
    :heap_live_slots=>19099,
    :heap_free_slots=>876,
    :heap_final_slots=>0,
    :heap_marked_slots=>16659,

    # Eden and Tomb pages
    :heap_eden_pages=>49,
    :heap_tomb_pages=>0,
}

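The heap that we are talking about here is a C data structure. It contains references to all the currently live Ruby objects. The heap is divided into pages, each heap page is composed of memory slots, and each slot holds information on only one live Ruby object.

heap_allocated_pages is the number of currently allocated heap pages.

heap_sorted_length is the number of pages that can fit into the buffer holding references to all pages.

heap_allocatable_pages is the number of pages Ruby could still allocate without running another GC.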

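Now, coming to the slots.

heap_available_slots is the total number of slots in heap pages.

heap_live_slots is the number of slots holding live objects, and heap_free_slots is the number of empty slots.

heap_final_slots is the number of slots with finalizers.

heap_marked_slots is the number of slots that are marked or old.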
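In the sample output above, the live, free, and final slot counts add up to the available slot count. The sketch below checks that arithmetic on the sample numbers and then prints the same keys for the current process. The variable names are illustrative:

```ruby
# Slot counts copied from the sample GC.stat output above.
sample = {
  heap_available_slots: 19_975,
  heap_live_slots: 19_099,
  heap_free_slots: 876,
  heap_final_slots: 0
}

accounted = sample[:heap_live_slots] + sample[:heap_free_slots] + sample[:heap_final_slots]
puts "accounted for: #{accounted} of #{sample[:heap_available_slots]} slots"

# The same breakdown for the current process (values will differ per run).
current = GC.stat
%i[heap_available_slots heap_live_slots heap_free_slots heap_final_slots].each do |key|
  puts "#{key}: #{current[key]}"
end
```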

Then we have heap_tomb_pages and heap_eden_pages.

heap_tomb_pages is the count of pages that contain no live objects. Ruby can eventually release these pages back to the operating system.

On the other hand, heap_eden_pages is the count of pages that contain at least one live object, so they can’t be released back to the operating system.

Consider monitoring the metric heap_free_slots if you face memory bloat issues in your application.

A high number of free slots (more than 250,000) usually indicates that you have a handful of controller actions that allocate many objects at once and then free them. This can permanently bloat the size of your running Ruby process.
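A minimal sketch of such a check, assuming the 250,000 figure from the text as a rule-of-thumb threshold. The helper name free_slot_report and its output format are hypothetical:

```ruby
# Hypothetical helper: returns a short status line about free heap slots.
# warn_above defaults to the rule-of-thumb threshold mentioned in the text.
def free_slot_report(stat = GC.stat, warn_above = 250_000)
  free  = stat[:heap_free_slots]
  total = stat[:heap_available_slots]
  pct   = total.zero? ? 0.0 : (free * 100.0 / total).round(1)
  flag  = free > warn_above ? "possible bloat" : "ok"
  "free slots: #{free} of #{total} (#{pct}%) - #{flag}"
end

# Report for the current process.
puts free_slot_report

# Report for the sample numbers shown earlier (876 free of 19,975 slots).
puts free_slot_report({ heap_free_slots: 876, heap_available_slots: 19_975 })
```

In a long-running application, a report like this could be logged periodically so that a sudden jump in free slots can be traced back to the requests that preceded it.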

Cumulative Numbers

{
    :total_allocated_pages=>49,
    :total_freed_pages=>0,
    :total_allocated_objects=>66358,
    :total_freed_objects=>47259,
}

These numbers are cumulative or additive in nature for the entire life of the process. They are never reset by the GC and can’t technically go down. All four of these numbers are self-explanatory.
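Because these counters only ever grow, the difference between two readings shows how much work happened in between. In the sample output, 66358 allocated minus 47259 freed gives 19099, which matches heap_live_slots. The sketch below uses the same idea to count allocations made by a block; the helper name allocations_during is hypothetical:

```ruby
# Hypothetical helper: number of objects allocated while the block runs,
# measured as the change in the cumulative total_allocated_objects counter.
def allocations_during
  before = GC.stat(:total_allocated_objects)
  yield
  GC.stat(:total_allocated_objects) - before
end

allocated = allocations_during { 1_000.times { Object.new } }
puts "objects allocated: #{allocated}"

# Arithmetic on the sample output above.
sample_live = 66_358 - 47_259
puts "sample allocated minus freed: #{sample_live}"
```

The measured figure can be slightly higher than the number of objects the block creates explicitly, since the interpreter may allocate objects of its own along the way.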

Garbage Collection Thresholds

To understand these numbers, you first need to understand when GC is triggered:

{
    :malloc_increase_bytes=>16216,
    :malloc_increase_bytes_limit=>16777216,
    :remembered_wb_unprotected_objects=>191,
    :remembered_wb_unprotected_objects_limit=>312,
    :old_objects=>16024,
    :old_objects_limit=>23556,
    :oldmalloc_increase_bytes=>158824,
    :oldmalloc_increase_bytes_limit=>16777216
}

Contrary to the common assumption that GC runs happen at fixed intervals, GC runs are triggered when Ruby starts running out of memory space. A minor GC happens when Ruby runs out of free slots (heap_free_slots).

If Ruby is still low on free slots after a minor GC run, or the threshold for oldmalloc, malloc, old object count, or shady/write-barrier-unprotected count is exceeded, a major GC run is triggered. The part of GC.stat shown above reports the current values alongside these thresholds.

malloc_increase_bytes refers to the amount of memory allocated outside of the heap we have talked about so far. When the size of an object exceeds the standard size of a memory slot (say, 40 bytes), Ruby mallocs some space somewhere else just for that object. When the total extra allocated space exceeds malloc_increase_bytes_limit, a major GC is triggered.

oldmalloc_increase_bytes is the same kind of counter for old objects, with oldmalloc_increase_bytes_limit as its threshold. old_objects is a count of object slots marked as old. When that count exceeds old_objects_limit, a major GC is triggered.

remembered_wb_unprotected_objects is the total count of objects that are not protected by the write-barrier and are a part of the remembered set.

The write-barrier is an interface between the Ruby runtime and its objects, which allows the interpreter to track references to and from an object as soon as they are created.

C-extensions can make new references to objects without using the write-barrier, in which case the objects are marked shady or write-barrier unprotected. The remembered set is simply a list of old objects with at least one reference to a new object.
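To see how close a process is to each threshold, every counter can be compared with its matching limit. The sketch below is a minimal illustration using the key names from the GC.stat output above; the variable names and output format are assumptions:

```ruby
# Pairs of (counter => limit) keys taken from the GC.stat output above.
threshold_pairs = {
  malloc_increase_bytes: :malloc_increase_bytes_limit,
  oldmalloc_increase_bytes: :oldmalloc_increase_bytes_limit,
  old_objects: :old_objects_limit,
  remembered_wb_unprotected_objects: :remembered_wb_unprotected_objects_limit
}

stat = GC.stat
threshold_pairs.each do |counter, limit|
  current = stat[counter]
  maximum = stat[limit]
  # Skip keys that a given Ruby version does not report.
  next if current.nil? || maximum.nil? || maximum.zero?

  pct = (current * 100.0 / maximum).round(1)
  puts format("%-36s %12d / %-12d (%5.1f%%)", counter, current, maximum, pct)
end
```

For the sample output, old_objects at 16024 against a limit of 23556 works out to roughly 68% of the threshold.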

Customize Ruby Garbage Collection Performance

Now that you understand how Ruby GC manages your application’s memory, it is time to look at the options available to customize GC’s behavior.

Here are environment variables that you can use to moderate the performance of Ruby GC and, in turn, improve the performance of your application:

RUBY_GC_HEAP_INIT_SLOTS
RUBY_GC_HEAP_FREE_SLOTS
RUBY_GC_HEAP_GROWTH_FACTOR
RUBY_GC_HEAP_GROWTH_MAX_SLOTS
RUBY_GC_HEAP_OLDOBJECT_LIMIT_FACTOR
and other variables
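Ruby reads these variables when the interpreter boots, so they need to be set in the environment before the process starts. A minimal sketch for checking which of them the current process was started with (the variable name gc_env is illustrative):

```ruby
# Collect every RUBY_GC_* variable visible to the current process.
gc_env = ENV.select { |name, _value| name.start_with?("RUBY_GC_") }

if gc_env.empty?
  puts "No RUBY_GC_* variables set; Ruby is using its built-in defaults."
else
  gc_env.sort.each { |name, value| puts "#{name}=#{value}" }
end
```

Printing this at application boot makes it easy to confirm that a deployment picked up the intended settings.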

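Let’s talk about the important parameters here, one by one.

RUBY_GC_HEAP_INIT_SLOTS sets the initial number of slots Ruby allocates on the heap.

RUBY_GC_HEAP_FREE_SLOTS sets the minimum number of free slots that should be available after a GC run. If there are fewer, Ruby allocates more slots.

RUBY_GC_HEAP_GROWTH_FACTOR is the factor by which Ruby increases the number of heap slots each time it grows the heap.

RUBY_GC_HEAP_GROWTH_MAX_SLOTS caps the number of slots added in a single growth step, which prevents excessive allocation caused by RUBY_GC_HEAP_GROWTH_FACTOR.

RUBY_GC_HEAP_OLDOBJECT_LIMIT_FACTOR controls when a major GC runs: it is triggered once the number of old objects exceeds this factor multiplied by the number of old objects left after the last major GC.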

Fine-tuning Garbage Collection in Ruby

We discussed some common and simple ways to customize the GC module to help you improve your application’s overall performance. However, these tweaks might not work in all cases. You need to figure out the memory usage pattern of your app before deciding what to customize.

Alternatively, consider running an automated test that finds the best values of these parameters for you. Tools like TuneMyGC make it fairly straightforward to work out a good set of values for your environment variables.

Definitely look at GC parameters if your application is behaving unexpectedly. A small change here can go a long way toward bringing down your app’s memory consumption and preventing memory bloat.

I hope this article has given you a good idea of what to look out for when customizing your Ruby Garbage Collection module. For more of an introduction, check out the Introduction to Garbage Collection Part I and Part II.

Have fun coding!
