In the first part of this series, we set up a basic Rack app and learned how to process a request and send a response.
In this post, we'll take over connections from Rack and hold persistent connections to enable pathways such as WebSockets.
First, though, let's look at how an HTTP connection actually works.
HTTP Connections
As this diagram shows, a TCP socket is opened, and a request is sent to a server. The server responds and closes the connection. All communication is in plain text.
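To make that exchange concrete, here is a minimal sketch that plays both sides of one HTTP/1.1 request over a raw TCP socket on the loopback interface. The helper name plain_http_exchange and the "hello" body are made up for illustration; only Ruby's standard socket library is used.

```ruby
require "socket"

# One HTTP/1.1 exchange over a raw TCP socket on the loopback interface.
# Returns the request line the server saw and the raw response text.
def plain_http_exchange
  server = TCPServer.new("127.0.0.1", 0) # port 0 lets the OS pick a free port
  port = server.addr[1]

  server_thread = Thread.new do
    client = server.accept
    request_line = client.gets.chomp
    # Skip the remaining request headers, up to the blank line.
    loop do
      line = client.gets
      break if line.nil? || line == "\r\n"
    end

    body = "hello\n"
    client.write "HTTP/1.1 200 OK\r\n" \
                 "content-type: text/plain\r\n" \
                 "content-length: #{body.bytesize}\r\n" \
                 "connection: close\r\n\r\n" \
                 "#{body}"
    client.close # the server closes the connection after responding
    request_line
  end

  socket = TCPSocket.new("127.0.0.1", port)
  socket.write "GET / HTTP/1.1\r\nhost: localhost\r\nconnection: close\r\n\r\n"
  raw_response = socket.read # returns once the server closes the socket
  socket.close

  [server_thread.value, raw_response]
ensure
  server&.close
end

request_line, raw_response = plain_http_exchange
puts request_line
puts raw_response
```

Everything on the wire is readable text: a request line, headers, a blank line, and then the body.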
Using a technique called socket hijacking, we can take control of a socket from Rack when a request comes in. Rack offers two techniques for socket hijacking:
- Partial hijack: Rack sends the HTTP response headers and hands over the connection to the application.
- Full hijack: Rack simply hands over the connection to the application without writing anything to the socket.
Partial Hijacking
This is how you do a partial hijack:
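A minimal sketch, assuming an app class named App (a hypothetical name) that a config.ru file would mount with run App.new. It writes the current time five times at one-second intervals; the count of five is an assumption for the demo.

```ruby
# A sketch of a partial hijack. App is a hypothetical name; in config.ru
# it would be mounted with `run App.new`.
class App
  def call(env)
    # The server writes the status line and the regular headers, then
    # calls this lambda with the client connection instead of sending
    # the body below.
    hijack = lambda do |stream|
      5.times do
        stream.write "#{Time.now}\n"
        sleep 1
      end
    ensure
      stream.close # always release the socket, even if a write fails
    end

    [200, { "content-type" => "text/plain", "rack.hijack" => hijack }, []]
  end
end
```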
rack.hijack is a Rack header, set in the same Hash as the HTTP response headers. Rack will look for such headers and process them as per the specification, instead of writing them to the HTTP response.
Run the above app and curl to it. You'll see that it writes the time at one-second intervals.
Full Hijacking
This is how you'd do a full hijack:
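A minimal sketch, again assuming a hypothetical App class. Because the server writes nothing in this mode, the status line and headers shown here are written by hand; their exact values are assumptions for the demo.

```ruby
# A sketch of a full hijack. App is a hypothetical name.
class App
  def call(env)
    # Calling the proc stored under "rack.hijack" returns the raw socket.
    # The server writes nothing, so the whole HTTP response is up to us.
    io = env["rack.hijack"].call

    io.write "HTTP/1.1 200 OK\r\n"
    io.write "content-type: text/plain\r\n"
    io.write "connection: close\r\n\r\n"

    5.times do
      io.write "#{Time.now}\n"
      sleep 1
    end
    io.close

    # Returned only because an array is expected; the contents are unused.
    [-1, {}, []]
  end
end
```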
In this case, we call the proc passed to us using the rack.hijack key, instead of setting one ourselves in the response. This gives us complete control over the socket. At the end, we return an array with the status -1 only because Rack expects an array to be returned. The contents of this array are ignored since we've taken over the socket.
Full hijacking is bad practice, rife with gotchas and weird behavior. Don't do it. Samuel Williams, who is a maintainer of Rack, recommends against it as well.
Streaming Bodies in Rack for Ruby
While full hijacking is a terrible idea, partial hijacking is a useful tool. But it still feels hacky, so Rack 3 formally adopted that approach into the spec by introducing the concept of streaming bodies.
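A minimal sketch of a streaming body, assuming Rack 3, a server that supports it, and a hypothetical App class. The body is an object that responds to call; the server passes it a stream to write to.

```ruby
# A sketch of a Rack 3 streaming body. App is a hypothetical name.
class App
  def call(env)
    # Instead of an array of strings, the body is a callable that
    # receives the stream and writes to it over time.
    body = proc do |stream|
      5.times do
        stream.write "#{Time.now}\n"
        sleep 1
      end
    ensure
      stream.close # signals that the response is complete
    end

    [200, { "content-type" => "text/plain" }, body]
  end
end
```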
Here we provide a proc as the response body rather than an array. The server calls it with the stream and keeps the connection open until the proc finishes executing and closes the stream.
There's a huge gotcha here when using Puma. Puma is a multi-threaded server that assigns a thread to each incoming request. We're taking over the socket from Rack, but we're still tying up a Puma thread as long as the connection is open.
Puma concurrency can be configured, but threads are limited, and tying one up for long periods is not a good idea. Let's see this in action first.
In two separate terminal windows, run the following command at the same time:
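The experiment comes down to two overlapping requests. As a rough Ruby stand-in for the two terminal windows, the sketch below fires both requests at once and reports when each one finished. The helper name concurrent_gets and the address http://localhost:9292 are assumptions; use whatever address your server prints at boot.

```ruby
require "net/http"

# Fire `count` GET requests at the same time. Returns one
# [status code, seconds since the common start] pair per request.
def concurrent_gets(url, count: 2)
  uri = URI(url)
  started = Process.clock_gettime(Process::CLOCK_MONOTONIC)

  threads = Array.new(count) do
    Thread.new do
      response = Net::HTTP.get_response(uri)
      elapsed = Process.clock_gettime(Process::CLOCK_MONOTONIC) - started
      [response.code, elapsed.round(1)]
    end
  end

  threads.map(&:value) # wait for every request to finish
end

# Usage, with the app running (the address is an assumption):
#   concurrent_gets("http://localhost:9292/").each { |code, secs| puts "#{code} after #{secs}s" }
```

If the server handles the requests one after another, the second elapsed time should be roughly double the first; if it handles them concurrently, the two should be close.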
One request is immediately served, but the other is held until the first one completes. This is because we started Puma with a single worker and single thread, meaning it can only serve a single request at a time.
We can get around this by creating our own thread.
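A minimal sketch of that idea, assuming the same hypothetical App class and a streaming body. The body proc only spawns a thread and returns, so the server's request thread is released while the spawned thread keeps writing.

```ruby
# A sketch of a streaming body that writes from its own thread.
# App is a hypothetical name.
class App
  def call(env)
    body = proc do |stream|
      # The proc returns as soon as the thread is created, which frees
      # the server thread that was handling this request.
      Thread.new do
        5.times do
          stream.write "#{Time.now}\n"
          sleep 1
        end
      ensure
        stream.close # our thread now owns the socket, so it must close it
      end
    end

    [200, { "content-type" => "text/plain" }, body]
  end
end
```

Note that nothing here limits how many threads get created, and an exception inside the thread is easy to miss, which is part of why this approach needs care.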
Now if you try the above experiment again, you'll see both curl requests are served concurrently because they don't tie up a Puma thread.
Once again, I must warn against this approach, unless you know what you're doing. These demonstrations are largely academic, as systems programming is a deep and complex topic.
Falcon Web Server
Since the threading problem is specific to the Puma web server, let's look at another option: Falcon. This is a new, highly concurrent Rack-compliant web server built on the async gem. Instead of Threads, it uses Ruby Fibers, which are cheaper to create and have much lower overhead.
The async gem hooks into all Ruby I/O and other waiting operations, such as sleep, and uses these to switch between different Fibers (ensuring a program is never held up doing nothing).
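To illustrate the hand-off itself, here is a sketch using plain Ruby Fibers and no gems. It is not how async is implemented: async switches automatically whenever a Fiber would wait, while this toy version switches by hand with Fiber.yield and a round-robin loop. The interleave helper is a made-up name.

```ruby
require "fiber" # only needed for Fiber#alive? on Ruby versions before 3.1

# Run one fiber per name, each doing `steps` units of work.
# Every Fiber.yield marks a point where a real scheduler would switch
# away because the fiber is waiting on I/O or sleep.
def interleave(names, steps)
  log = []

  fibers = names.map do |name|
    Fiber.new do
      steps.times do |i|
        log << "#{name} step #{i + 1}"
        Fiber.yield # pause here and let another fiber run
      end
    end
  end

  # A toy round-robin scheduler: keep resuming fibers until all finish.
  while fibers.any?(&:alive?)
    fibers.each { |fiber| fiber.resume if fiber.alive? }
  end

  log
end

puts interleave(%w[a b], 2)
# a step 1
# b step 1
# a step 2
# b step 2
```

Both fibers make progress on a single thread because each one gives up control at the point where it would otherwise sit idle.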
Revert your app to the previous version where we're not spawning a new thread:
Then remove Puma and install Falcon.
Run the Falcon server. We need to explicitly bind it because it only serves https traffic by default.
The server only uses a single thread, which you can confirm with the command below. You'll need to grab your specific pid from Falcon's logs.
The thread count printed by the above command will be 2 because MRI uses a thread internally.
Try the earlier experiment again and run two curl requests simultaneously.
You'll see they're both served at the same time, thanks to Ruby Fibers!
Falcon is relatively new. Fibers have been part of Ruby since 1.9, but the Fiber scheduler interface that makes non-blocking Fibers possible was only introduced in Ruby 3.0. Since Falcon is Rack-compliant, it can be used with Rails too, but the docs recommend using it with v7.1 or newer only. As such, it's a bit risky to use Falcon in production, but it's a very exciting development in the Ruby world, in my opinion. I can't wait to see its progress in the next few years.
We've now learned how to create persistent connections in Rack and how to run them without blocking other requests, but the use cases so far have been academic and contrived. In the next and final part of this series, we'll examine how we can use this technique in a practical way.
Until then, happy coding!