Mastering Concurrency

Thijs Cadier
Wes Oudshoorn

Multiple people will use your app at the same time, and you want to deliver your app as fast as possible. So you'll need some way to handle concurrency. Fear not! Most web servers already do this by default. But when you need to scale, you want to use concurrency in the most efficient way possible.

Different types of concurrency

There are multiple ways to handle concurrency: multi-process, multi-threading and event-driven. Each of these has its uses, pros and cons. In this article, you'll learn how they differ and when to use which.

Multi-process (Unicorn)

This is the easiest way to handle concurrency. A master process forks itself into multiple worker processes. The worker processes handle the actual requests, while the master manages the workers.

Each worker process has the full codebase in memory. This makes this method pretty memory-intensive, and makes it hard to scale to larger infrastructures.
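
To make the master/worker split concrete, here is a minimal sketch of the forking pattern. It is not how Unicorn is implemented: the `worker_count` value and the pipe-based reporting are assumptions made for illustration, and it needs a Unix-like OS because `fork` is not available on Windows.

```ruby
# Minimal pre-forking sketch. Assumes a Unix-like OS, because fork is not available on Windows.
worker_count = 3 # hypothetical number of workers

workers = worker_count.times.map do |index|
  reader, writer = IO.pipe
  pid = fork do
    reader.close
    # A real worker would loop here and accept requests from a shared socket.
    writer.puts "worker #{index} ran in process #{Process.pid}"
    writer.close
    exit!(0) # leave right away, skipping any at_exit hooks copied from the master
  end
  writer.close # the master keeps only the reading end
  { pid: pid, reader: reader }
end

# The master does not handle requests itself. It collects the reports and
# waits for each worker to exit.
worker_reports = workers.map do |worker|
  report = worker[:reader].read.strip
  worker[:reader].close
  Process.wait(worker[:pid])
  report
end

puts worker_reports
```

Each report carries a different process ID, which shows that every worker is a full, separate process with its own copy of the program.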

Multi-process summary
Use case

One non-Ruby example you probably know is the Chrome browser. It uses multi-process concurrency to give each tab its own process. This allows a single tab to crash without taking the full application down. In Chrome's case, it also helps to isolate exploits to a single tab.

Pros

  • Simplest to implement.

  • Avoids the difficulties of thread safety.

  • Each worker can crash without damaging the rest of the system.

Cons

  • Each process loads the full codebase in memory. This makes it memory-intensive.

  • Hence, it does not scale to large numbers of concurrent connections.

Multi-threading (Puma)

This threading model allows one process to handle multiple requests at the same time. It does so by running multiple threads within a single process.

As opposed to the multi-process approach, all threads run within the same process. This means they share data such as global variables. Therefore, only small chunks of extra memory are used per thread.
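
A small sketch can show what "sharing data" means here. The `app_config` hash and the request numbers are made-up names for illustration. Every thread reports the same process ID and sees the very same config object, so nothing is copied per thread.

```ruby
# Illustrative only: app_config and the request ids are hypothetical.
app_config = { greeting: "hello" }.freeze

responses = Queue.new # Queue is safe to use from several threads

threads = 4.times.map do |request_id|
  Thread.new do
    # All threads see the same app_config object and live in the same process.
    responses << { request_id: request_id, config_id: app_config.object_id, pid: Process.pid }
  end
end
threads.each(&:join)

# All threads have finished, so the queue holds exactly four entries.
results = Array.new(4) { responses.pop }
puts results.map { |result| result[:pid] }.uniq.size
puts results.map { |result| result[:config_id] }.uniq.size
```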

Global Interpreter Lock

This brings us to the global interpreter lock (GIL) in MRI. The GIL is a lock around the execution of all Ruby code. Even though our threads appear to run in parallel, only one thread executes Ruby code at a time.

I/O operates outside of the GIL. While a thread waits for the result of a database query, it doesn't hold the lock, so another thread gets a chance to do some work in the meantime. If you do a lot of math and operations on hashes or arrays in threads, you will only utilize a single core if you use MRI. In most cases you still need multiple processes to fully utilize your machine. Or you could use Rubinius or JRuby, which don't have a GIL.
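
You can get a feel for this with a rough timing sketch. Here `sleep` stands in for waiting on I/O such as a database query, and the 0.2 second wait is an assumed value. Because a waiting thread lets the others run, four waits overlap instead of adding up.

```ruby
# Sketch: sleep stands in for waiting on I/O such as a database query.
io_wait_seconds = 0.2 # assumed latency, for illustration only

monotonic_now = -> { Process.clock_gettime(Process::CLOCK_MONOTONIC) }

started_at = monotonic_now.call
threads = 4.times.map do
  Thread.new { sleep io_wait_seconds } # a waiting thread lets the others run
end
threads.each(&:join)
threaded_elapsed = monotonic_now.call - started_at

sequential_estimate = 4 * io_wait_seconds
puts format("threaded: %.2fs, one after another would take about %.2fs",
            threaded_elapsed, sequential_estimate)
```

If you swap the `sleep` for CPU-heavy Ruby code, you should expect the threaded version on MRI to take about as long as running the work one after another.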

Thread safety

If you use multiple threads, you have to be careful to write all code that manipulates shared data in a thread-safe way. You can do this, for example, by using a Mutex to lock shared data structures before you manipulate them. This ensures that other threads don't base their work on stale data while you're changing it.
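
As a minimal sketch of the Mutex approach, the shared counter below is only ever changed inside `synchronize`. The counter and the thread counts are made up for illustration.

```ruby
counter = 0
lock = Mutex.new

threads = 10.times.map do
  Thread.new do
    1_000.times do
      # Only one thread at a time may read, increment and write the counter.
      lock.synchronize { counter += 1 }
    end
  end
end
threads.each(&:join)

puts counter
```

With the lock in place the final count is always 10 threads times 1,000 increments. Without it, the read-increment-write steps of different threads could interleave, depending on the Ruby implementation.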

Multi-threaded summary
Use case

This is the "middle of the road" option. It suits many standard web applications that need to handle loads of short requests.

Pros

  • Uses less memory than multi-process.

Cons

  • You have to make sure your code is thread-safe.

  • If a thread causes a crash, it can potentially take down your process.

  • In MRI, the GIL locks all operations except I/O.

Event-loop (Thin)

Event-loops are used when you need to do a lot of concurrent I/O operations. The model itself doesn't force multiple requests to be executed at the same time, but it is an efficient way to handle a lot of concurrent users.

Below you'll see a very simple event loop written in Ruby. The loop takes an event from the event_queue and handles it. If there is no event, it sleeps briefly and then checks the queue again.

```ruby
loop do
  if event_queue.any?
    handle_event(event_queue.pop)
  else
    sleep 0.1
  end
end
```

Illustrated version

In this illustration, we're taking it a step further. The event loop now does a beautiful dance with the OS, queue and some memory.

[Illustration: Event loops]

Step by step

  1. The OS keeps track of network and disk availability.
  2. When the OS sees the I/O is ready, it sends an event to the queue.
  3. The queue is a list of events from which the event loop takes the top one.

  4. The event loop handles the event.
  5. It uses some memory to store metadata about the connections.
  6. It can send a new event directly into the event queue again. For example, a message to shut down the queue based on the contents of an event.

  7. If it wants to do an I/O operation, it tells the OS that it's interested in a specific I/O operation. The OS keeps track of the network and disk (see step 1) and adds an event again when I/O is ready.
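
The steps above can be sketched in a few lines of Ruby. This is a simplified illustration, not how Thin works internally: a pipe stands in for a network connection, `IO.select` plays the role of the OS reporting that I/O is ready, and the event names (`:readable`, `:shutdown`) are made up for this example.

```ruby
# Hypothetical sketch: the pipe stands in for a network connection.
reader, writer = IO.pipe
writer.write("ping\n")
writer.close

event_queue = []                              # step 3: the list of pending events
connections = { reader => { bytes_read: 0 } } # step 5: metadata per connection
handled = []
total_bytes = 0
running = true

while running
  if event_queue.empty?
    # Steps 1, 2 and 7: ask the OS which of the watched IO objects are ready.
    ready, = IO.select(connections.keys, nil, nil, 0.1)
    (ready || []).each { |io| event_queue << [:readable, io] }
  end

  type, io = event_queue.shift # step 3: take the next event, if there is one

  case type                    # step 4: handle the event
  when :readable
    data = io.read_nonblock(1024, exception: false)
    if data.nil?
      # The other side closed the connection: clean up and ask the loop to stop.
      total_bytes += connections.delete(io)[:bytes_read]
      io.close
      event_queue << [:shutdown, nil] # step 6: the loop queues an event itself
    elsif data != :wait_readable
      connections[io][:bytes_read] += data.bytesize
      handled << data
    end
  when :shutdown
    running = false
  end
end

puts "handled #{handled.size} message(s), #{total_bytes} bytes"
```

Note that a single thread does all the work here. The loop never blocks on one connection, it only reacts to whatever the OS reports as ready.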

Event-loop summary
Use case

When you keep a lot of concurrent connections to your users open. Think of services like Slack or Chrome notifications.

Pros

  • Almost no memory overhead per connection.

  • Scales to a huge number of concurrent connections.

Cons

  • It's a difficult mental model to understand.

  • Batch sizes must be small and predictable to avoid queues building up.

Which one should you use?

We hope this article has given you a better understanding of the different concurrency models. It's one of the more difficult subjects to grasp as a developer, but understanding it will give you the tools to experiment and use the right setup for your app.

In summary

  • For most apps, threading makes sense. The Ruby/Rails ecosystem seems to (slowly) be moving this way.

  • If you run highly concurrent apps with long-running streams, an event loop allows you to scale.

  • If you don't have a high-traffic site, or you expect your workers to break, go for good old multi-process.

And, it is possible to run an event loop, inside a thread, inside a multi-process setup. So yes, you can have your stroopwafel and eat it too!

If you want to read more about these concurrency models, check out our detailed articles on multi-process, multi-threading and event loops.

