Ruby Magic

Introduction to Garbage Collection (Part I)

By Thijs Cadier

Whenever you run your code, you use memory. When you write in a language like Ruby, it seems like the memory available to you is infinite. You can just keep going without thinking about the fixed amount of memory the system running your code has. In this Ruby Magic episode we'll explain how this works!

A bit of history

Back in the day, scripting languages such as Ruby did not exist yet. People only wrote code in languages such as C, a low-level programming language. One of the things that makes these languages low-level is that you have to clean up after yourself. For example, whenever you allocate memory to store a String, you also have to decide when to release that memory.

Manual cleanup

This looks a little something like the following mock Ruby code. It declares a variable and uses the method free –this method does not actually exist in Ruby– to clean up the memory we've used after we're done with the variable.

  1_000_000.times do |i|
    variable = "Variable #{i}"
    puts variable
    free(variable) # hypothetical: Ruby has no free method
  end

A tedious way of programming

You might have already realized there's a risk here: what if you forget to free the variable? In that case the content of that variable will just stick around in memory until the process exits. If you do this often enough, you will run out of memory and your process will crash.

The next example demonstrates another common issue:

  1_000_000.times do |i|
    variable = "Variable #{i}"
    free(variable)
    puts variable # use after free: this memory has already been released
  end

We declare the variable and free it. But then we try to use it again, even though the memory behind it has already been released. If this were C, the behavior would be undefined: your program might crash with a segfault, or it might silently read garbage. Oops!

Humans are mistake machines

Humans are notoriously bad at avoiding these kinds of mistakes. Hence the need for a way to clean up memory automatically. The most popular way to do this, also used in Ruby, is Garbage Collection (GC).

How Garbage Collection (GC) works

In a language that uses GC, you can create objects without manually cleaning them up. Whenever you create an object, it's registered with the Garbage Collector. The Garbage Collector keeps track of all references you make to this object. When it determines that nothing references the object anymore, the object is marked for cleanup. Every once in a while the Garbage Collector pauses your program and cleans up all the marked objects.

Looking at some examples

In the simple loop we used earlier the GC's job is fairly easy. At the end of every iteration of the loop, nothing references the variable anymore, so it can immediately be marked for cleanup.

  1_000_000.times do |i|
    variable = "Variable #{i}"
    puts variable
  end
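
To see the collector at work on a loop like this one, you can compare Ruby's GC counters before and after it runs. The following is only a rough sketch: it uses a smaller loop, calls length instead of puts to keep the output quiet, and forces a final collection with GC.start so the numbers settle. The exact figures depend on your Ruby version and settings.

```ruby
# Snapshot the collector's counters before doing any work.
runs_before  = GC.count
alloc_before = GC.stat(:total_allocated_objects)
freed_before = GC.stat(:total_freed_objects)

# Same shape as the loop above: every iteration creates a new String
# that nothing references once the iteration is over.
500_000.times do |i|
  variable = "Variable #{i}"
  variable.length
end

# Force one more collection so the counters reflect the garbage we made.
GC.start

gc_runs   = GC.count - runs_before
allocated = GC.stat(:total_allocated_objects) - alloc_before
freed     = GC.stat(:total_freed_objects) - freed_before

puts "GC runs during the loop: #{gc_runs}"
puts "Objects allocated:       #{allocated}"
puts "Objects freed:           #{freed}"
```

The number of freed objects should end up close to the number of allocated ones, even though the code never cleaned anything up itself.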

In the next example we pass the variable into the puts_later method which waits for 30 seconds and then puts the variable.

  def puts_later(variable)
    Thread.new do
      sleep 30
      puts variable
    end
  end

  1_000_000.times do |i|
    variable = "Variable #{i}"
    puts_later variable
  end

The Garbage Collector's job is already pretty complicated in this relatively simple example. It has to understand that we reference the variable in the puts_later method. Because the method starts a thread, the Garbage Collector has to keep track of the thread and wait for it to finish. Only then can the variable be marked for cleanup.
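
You can observe part of this yourself with a weak reference, which points at an object without keeping it alive. The sketch below is an illustration under a few assumptions, not a description of Ruby's internals: the gate queue is a hypothetical stand-in for sleep 30 so the example finishes quickly, and the thread returns the variable instead of printing it.

```ruby
require "weakref"

# Like puts_later above, but the thread waits on a queue instead of sleeping.
# `gate` is a hypothetical helper added only for this illustration.
def puts_later(variable, gate)
  Thread.new do
    gate.pop   # block until the main thread lets us continue
    variable   # the block still references the String here
  end
end

gate     = Queue.new
variable = "Variable #{1}"
weak     = WeakRef.new(variable) # does not keep the String alive by itself

thread   = puts_later(variable, gate)
variable = nil                   # the main program drops its own reference

# Force a collection while the thread is still waiting.
GC.start
alive_while_thread_waits = weak.weakref_alive?

gate << :go                      # let the thread finish
result = thread.value            # waits for the thread, returns the String

puts "Alive while the thread waits: #{alive_while_thread_waits}"
puts "Thread returned: #{result}"
```

While the thread is waiting, its block still references the String, so a forced collection leaves it alone. Once the thread has finished and nothing else references the String, it becomes eligible for cleanup, but Ruby makes no promise about when that happens, so the sketch does not check for it.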

When it gets complicated

Without getting into complex examples, trust me when I say the Garbage Collector's job is really hard. This also explains why GC can cause overhead and problems in your production environment. It needs to have a very detailed understanding of what's happening in your program to properly clear memory, which takes quite a few CPU cycles to get right. But hey, it beats cleaning up after yourself!

There's more to Garbage Collection

This was only our introduction to Garbage Collection. In a future article we'll look at how exactly this works in Ruby, and how you can measure and tune GC to improve the performance of your application.

Update: The next episode is available here.
