It's 2024, and the HyperText Transfer Protocol (HTTP) is 35 years old. The fact that the vast majority of web traffic still relies on this simple, stateless form of communication is a marvel in itself.
A first set of content retrieval optimizations was added to the protocol when v1.0 was published in 1996. These include the infamous caching instructions (aka headers) that the client and server use to negotiate whether content needs refreshing.
28 years later, many web developers still avoid these headers like the plague. Yet a great deal of traffic and server load (and thus latency) can be avoided by using the web's built-in caching mechanism. Not only does this greatly improve the user experience, it can also be a powerful lever to trim down your server bill. In this post, we'll take a look at:
- The basic concepts around HTTP caching.
- What cache layers we can leverage.
- How web caching is configured and controlled.
- How simple it is to cache in Ruby on Rails.
Let's get started!
Concepts
Before we dive deep into the mechanics of web caching, let's get some definitions out of the way.
Fresh Vs. Stale
Let's assume we have a shared cache server in place that stores responses from our app server for reuse. Later, a client issues a request to our app server. The stored response in our cache is now in one of two states:
- Fresh: The response is still valid (we'll see what that means in a second) and will be used to fulfill the request.
- Stale: The response isn't valid anymore. A new response has to be calculated and served by the upstream server.
HTTP Cache Layers
The HTTP Caching spec defines two types of cache: shared and private.
Shared Caches
Shared caches store responses that are reusable among a group of users. Typically, shared caches are implemented at intermediaries, e.g., Nginx, Caddy, or other reverse proxies. Of course, commercial edge caches like Cloudflare or Fastly also implement shared caches.
Another flavor of shared caches can be implemented in service workers using the Cache API.
We can control which responses are eligible for caching with the Cache-Control header.
Private Caches
Private caches are allotted to a single user. You are most likely to encounter a private cache as a browser component. The main feature here is that the stored responses are not shared with other clients. Any content that the server renders containing personalized data must only be stored in a private cache. In general, this will be the case for any authenticated route in your Rails app, but there might be exceptions.
You can force a response to be cached privately using the Cache-Control: private header.
Controlling Caching Via the Cache-Control Header
Although other headers (like Expires or Pragma) have historically been in use, we will focus on the Cache-Control header here. The specification has an exhaustive description, but we'll focus on the salient parts.
The Cache-Control header sets one or several comma-separated directives that can be further divided into request and response directives.
Request Directives
When requesting a resource from the server, the client (browser) can specify one of the following directives to negotiate how caching should behave:
- no-cache: Advises the cache to revalidate the response against the origin server. Typically, this happens when you force-reload a page or disable caching in the browser's developer tools.
- no-store: Indicates that no cache may store any part of this request or any response to it.
- max-age: In seconds, indicates the maximum age of a stored response that the client is willing to accept. For example, Cache-Control: max-age=3600 tells caches that any response older than an hour cannot be reused.
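As a rough illustration of a client sending a request directive, the sketch below builds (but never sends) a request with Ruby's standard Net::HTTP library. The revalidating_request helper and the URL are placeholders chosen for this example:

```ruby
require "net/http"
require "uri"

# Builds a GET request that asks caches to revalidate before reusing a stored response.
# Nothing is sent over the network here; the request object is only constructed.
def revalidating_request(url)
  request = Net::HTTP::Get.new(URI(url))
  request["Cache-Control"] = "no-cache"
  request
end

request = revalidating_request("https://example.com/index.html")
puts request["Cache-Control"]
```

Swapping the header value for max-age=3600 or no-store would express the other two request directives in the same way.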
Response Directives
Essentially, response directives consist of what's above, with different semantics, plus a few more. A cache must obey these directives coming from the origin server:
- no-cache: A bit counterintuitively, this directive does not imply that the response cannot be cached. Rather, each cache must revalidate the response with the origin server before each reuse. This is the normal case for a response from a Rails application.
- no-store: No cache of any type (shared or private) may store this response.
- max-age: Indicates the period into the future for which the response should count as fresh. So, Cache-Control: max-age=3600 would specify a period of one hour from the moment of generation on the origin server. Afterwards, a cache cannot reuse it without revalidation.
- private: This directive specifies that the response can only be stored in a private cache (i.e., the browser). This should be set for any user-personalized content, especially when a session cookie is required to access a resource.
- public: Indicates that a response can be stored in a shared cache.
- must-revalidate: An edge case worth noting: HTTP allows the reuse of stale responses when caches are disconnected from the origin server. This directive prevents that and forces either a successful revalidation or a 504 Gateway Timeout error.
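To make the response directives more tangible, here is a small, simplified sketch of how a cache might read them. The parse_cache_control and storable_in_shared_cache? helpers are hypothetical, and the storability rule deliberately ignores many details of the specification (for instance, requests carrying an Authorization header):

```ruby
# Parses a Cache-Control header value into a hash of directives.
# Flags map to true, numeric arguments to integers, anything else to a string.
def parse_cache_control(value)
  value.to_s.split(",").each_with_object({}) do |part, directives|
    name, argument = part.strip.split("=", 2)
    next if name.nil? || name.empty?

    directives[name.downcase] =
      if argument.nil? then true
      elsif argument.match?(/\A\d+\z/) then argument.to_i
      else argument.delete('"')
      end
  end
end

# Simplified rule: a shared cache (proxy, CDN) must not store
# responses marked no-store or private.
def storable_in_shared_cache?(directives)
  !directives.key?("no-store") && !directives.key?("private")
end

directives = parse_cache_control("max-age=3600, private, must-revalidate")
puts directives.inspect
puts storable_in_shared_cache?(directives)
```

A browser cache would apply the same parsing but could still store the private response, which is exactly the distinction between the two cache layers.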
(In)validation
Let's say a response has become stale for some reason (it's expired, for example). Even so, there's a chance that it is still valid, but we have to ask the origin server if this is the case. This process is called validation and is performed using a conditional request in one of two ways: by expiration or based on the response content.
By Expiration
The client sends an If-Modified-Since header in the request, and the server uses it to determine whether the stored response is still valid. Let's consider this response:
    HTTP/1.1 200 OK
    ...
    Date: Fri, 29 Mar 2024 10:30:00 GMT
    Last-Modified: Fri, 29 Mar 2024 10:00:00 GMT
    Cache-Control: max-age=3600
    ...
It was retrieved at 10:30 and is stored on the client. The combination of Date and max-age tells us that this response becomes stale at 11:30. If the client sends a request at 11:31, it includes an If-Modified-Since header carrying the stored Last-Modified value:
    GET /index.html HTTP/1.1
    Host: example.com
    Accept: text/html
    If-Modified-Since: Fri, 29 Mar 2024 10:00:00 GMT
The server now calculates the content of the response and, if it hasn't changed, sends a 304 Not Modified response:
    HTTP/1.1 304 Not Modified
    ...
    Date: Fri, 29 Mar 2024 11:31:00 GMT
    Last-Modified: Fri, 29 Mar 2024 10:00:00 GMT
    Cache-Control: max-age=3600
A 304 response carries no payload. It merely tells the client that the stale response has been revalidated and can revert to a fresh state again. Its new expiry time is now 12:31.
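The timestamp comparison behind this exchange can be sketched in plain Ruby. The helper names not_modified_since? and stale_at are made up for this illustration, and a real server or framework handles considerably more cases:

```ruby
require "time"

# True when the resource has not changed since the time the client sent
# in If-Modified-Since, i.e. when a 304 Not Modified is appropriate.
def not_modified_since?(if_modified_since, last_modified)
  return false if if_modified_since.nil?

  last_modified.to_i <= Time.httpdate(if_modified_since).to_i
rescue ArgumentError
  false # unparseable date: fall back to sending a full response
end

# The moment a stored response stops being fresh: its Date header plus max-age seconds.
def stale_at(date_header, max_age)
  Time.httpdate(date_header) + max_age
end

last_modified = Time.utc(2024, 3, 29, 10, 0, 0)
puts not_modified_since?("Fri, 29 Mar 2024 10:00:00 GMT", last_modified)
puts stale_at("Fri, 29 Mar 2024 10:30:00 GMT", 3600).httpdate
```

With the values from the example exchange, the first call confirms that a 304 is appropriate, and the second reproduces the 11:30 expiry of the original response.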
Based On the Response Content
Timing is a problematic matter: servers and clients can drift out of sync, or file system timestamps may not be appropriate. This is solved by using an ETag header. Its value can be arbitrary, but most frequently, it is a digest of the response body. Picking up the example from above, we swap Last-Modified for an ETag header:
    HTTP/1.1 200 OK
    ...
    Date: Fri, 29 Mar 2024 10:30:00 GMT
    ETag: "12345678"
    Cache-Control: max-age=3600
    ...
If this response is stored in a private cache and becomes stale, the client now uses the last known ETag value and asks the server to revalidate it:
    GET /index.html HTTP/1.1
    Host: example.com
    Accept: text/html
    If-None-Match: "12345678"
Now, the server will return a 304 Not Modified response if the values of a freshly computed ETag and the requested If-None-Match header match. Otherwise, it will respond with 200 OK and a new version of the content.
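Content-based validation can be sketched along the same lines. The etag_for and status_for helpers below are hypothetical, SHA-256 is just one possible digest, and the comparison is simplified to ignore the weak marker (W/) on either side:

```ruby
require "digest"

# Builds an ETag (a quoted hex digest) from a response body.
def etag_for(body)
  %("#{Digest::SHA256.hexdigest(body)}")
end

# Picks the status code for a conditional GET that carries If-None-Match.
def status_for(if_none_match, body)
  # The header may list several tags; drop any weak marker before comparing.
  candidates = if_none_match.to_s.split(",").map { |tag| tag.strip.delete_prefix("W/") }
  matched = candidates.include?("*") || candidates.include?(etag_for(body))
  matched ? 304 : 200
end

tag = etag_for("<h1>Hello</h1>")
puts status_for(tag, "<h1>Hello</h1>")
puts status_for(tag, "<h1>Hello again</h1>")
```

Note that the server still has to produce the body (or at least the data feeding the digest) to answer, so this saves bandwidth rather than all of the rendering work.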
I will not go into the details here, but the Rails docs explain weak vs. strong ETags.
Implementation In Rails
We're now ready to implement this knowledge in a Rails app. The relevant module is wired into ActionController and is called ActionController::ConditionalGet. Let's examine the interface it uses to emit the Cache-Control directives discussed above.
expires_in
This method sets the max-age directive, overwriting all others. You can pass it an ActiveSupport::Duration and options such as public and must_revalidate, which will set the respective directives.
When would you want to use this? Typically, when you need to balance cache effectiveness and freshness. For example:
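A minimal sketch of such an action follows. It is written as a plain module that a controller would include, and both the ProductCaching name and the Product model are assumptions made for illustration:

```ruby
# Hypothetical concern; mix it into a controller with `include ProductCaching`.
module ProductCaching
  def index
    # Let the client reuse this listing for 15 minutes before asking again.
    expires_in 15.minutes
    @products = Product.all # Product is an assumed ActiveRecord model
  end
end
```

Passing public: true as an additional option would also allow shared caches to store the listing.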
This exemplifies the compromise you might make between the probability that a new product is added, changed, or removed in the course of 15 minutes and somebody seeing a stale response. Every application will have its own limitations here, but it's a good idea to have application monitoring like AppSignal for Ruby built into your production environment. This will enable you to query how often an endpoint is accessed by the same user vs the data manipulation frequency.
expires_now and no_store
These methods set Cache-Control to no-cache and no-store, respectively. See above for the implications.
http_cache_forever
http_cache_forever sets a max-age of 100 years internally and caches the response. It will still consult stale? (see below), though, to determine if a fresh response should be rendered. You call it like this:
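A minimal sketch, assuming a hypothetical StaticPages module that is included in the controller serving the page:

```ruby
# Hypothetical concern for a controller that serves static pages.
module StaticPages
  def about
    # public: true additionally allows shared caches to store the response.
    http_cache_forever(public: true) do
      render :about
    end
  end
end
```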
This renders the about view and allows intermediary shared caches to store it (because public is set in Cache-Control).
fresh_when and stale?
These methods are siblings and are both concerned with setting appropriate ETag and Last-Modified headers. Thus, they form the heart of Rails' conditional GET implementation. Let's look at each of them now:
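Here is a minimal sketch of fresh_when in a show and an index action. The ArticleCaching module (meant to be included in a controller) and the Article model are assumptions for this example:

```ruby
# Hypothetical concern; mix it into a controller with `include ArticleCaching`.
module ArticleCaching
  def show
    @article = Article.find(params[:id]) # Article is an assumed ActiveRecord model
    # Derives ETag and Last-Modified from the record.
    fresh_when @article
  end

  def index
    @articles = Article.all
    # Works with a relation as well.
    fresh_when @articles
  end
end
```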
The show action exhibits the simplest way to enable conditional GET requests on an endpoint. If fresh_when is passed an ActiveRecord instance, it will extract the updated_at timestamp and reuse it as Last-Modified, and the ETag will be computed from the record as a hex digest.
When working with relations, like in the index action, fresh_when will check for the most recent updated_at in the collection using the maximum method.
If desired, you can override this behavior using explicit options. These match the directives discussed above (the only outlier being the template option, which allows you to specify the template used to calculate the ETag digest). This is useful when your controller action uses a different template than the default.
In either case, before rendering a response, fresh_when will check if a stored response is still fresh and return a 304 in this case.
On the other hand, stale? is a predicate method that calls fresh_when internally and returns true or false based on its evaluation. You can use it to guard against expensive method calls, analogously to the above examples:
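A minimal sketch of such a guard, assuming a hypothetical ArticleCounters module included in a controller, an Article model, and a made-up refresh_counters method standing in for the expensive work:

```ruby
# Hypothetical concern; mix it into a controller with `include ArticleCounters`.
module ArticleCounters
  def show
    @article = Article.find(params[:id]) # Article is an assumed ActiveRecord model

    # stale? answers with 304 Not Modified by itself when the client's copy is
    # still valid, so the body of the if only runs when a full response is needed.
    if stale?(@article)
      @counters = @article.refresh_counters # hypothetical expensive call
    end
  end
end
```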
In this example, since it uses the same internal semantics as fresh_when, it would automatically send a 304 Not Modified response unless the article is stale. If it is stale, though (i.e., if 15 minutes have passed since it was generated), the counters are refreshed and a fresh response is returned. Updating low-priority data in your responses only infrequently, based on a "timeout", is a typical use case.
The etag Class Method
How can caches, even private ones, deal with personalized information? The answer is that data identifying the session in some way has to be included in the ETag. This is exactly what the etag controller class method does: it provides the necessary information to differentiate between private responses to individual users.
Continuing with the example above, imagine that some users are administrators who are served "edit" buttons for the articles in the UI. You don't want that information to spill over to an anonymous user, so you can prevent that:
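A minimal sketch of how this could look. The PersonalizedArticleCaching module and the Article model are assumptions, and the plain included hook stands in for calling etag directly in the controller's class body:

```ruby
# Hypothetical concern; mix it into a controller with `include PersonalizedArticleCaching`.
module PersonalizedArticleCaching
  def self.included(controller)
    # Registers a block whose result is mixed into every ETag the controller computes.
    # Equivalent to writing `etag { current_user&.id }` inside the controller class.
    controller.etag { current_user&.id }
  end

  def show
    @article = Article.find(params[:id]) # Article is an assumed ActiveRecord model
    fresh_when @article
  end
end
```

Because the block runs in the context of the controller instance, anonymous visitors (where current_user is nil) and each logged-in user end up with different ETags for the same article.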
Assuming that current_user is a method returning the currently logged-in user, their id is now added to the ETag before digesting, preventing the leak of sensitive data to unauthorized visitors.
Takeaways
Given all we have learned above, do you feel confident adding HTTP caching to your controllers? I would cautiously assume that you still have some leftover anxiety regarding the leakage of sensitive data. Two techniques can help here: decomposing pages into Turbo Frames and applying personalized bits of the UI after the fact.
Rigorously Decompose into Turbo Frames
If you are using Turbo in your frontend, you can leverage eager or lazy-loaded Turbo Frames. A lot of applications comprise personalized sections of the UI, while others do not. By detaching the non-personalized parts into separate endpoints that employ HTTP caching individually, you not only gain performance benefits but also a clearer application architecture.
Apply Personalized Bits of the UI After the Fact
Another way to deal with user-dependent data is to apply JavaScript after a page has loaded. This is only viable if the amount of personalized data on a page is small. In essence, you'd query an endpoint for a current user's data in JSON format, then use a Stimulus controller to apply it to the relevant bits of the UI.
Wrapping Up
In this post, we demystified HTTP caching. We looked at some basic concepts, cache layers, and configuration, including how to use the Cache-Control header and validation. We also examined the elegant solution that Rails provides in the form of the ActionController::ConditionalGet module.
Until next time, happy coding!