
Track cache hits with custom metrics

Robert Beekman


Every server running an app that uses AppSignal sends a collection of samples and metrics to our Push API every 30 seconds.

Each request has a key we use to determine which app the data came from. To do that, we need to query our database to find the app for each incoming request. With thirty billion requests per month, we're constantly trying to find ways to reduce the number of queries to make AppSignal faster.

We implemented caching to reduce the number of queries on our database clusters. Whenever we fetch an app from the database, we store it in Memcached for one minute. After deploying this change to production, we found that we were doing more queries than before. It seemed like the cache was invalidated too often. To find out where, we added custom metrics at each point where the cache is invalidated.
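The fetch-then-store pattern described above can be sketched in plain Ruby. This is only an illustration, not our production code: `TtlCache` is a hypothetical in-memory stand-in for Memcached behind `Rails.cache`, and `find_app`, the `app:` key prefix, and the loader block are assumed names.

```ruby
# Minimal in-memory stand-in for a cache with expiry (hypothetical; the
# real setup uses Rails.cache backed by Memcached).
class TtlCache
  Entry = Struct.new(:value, :expires_at)

  # clock is injectable so expiry can be exercised without sleeping.
  def initialize(clock: -> { Time.now })
    @store = {}
    @clock = clock
  end

  # Returns the cached value for key if it has not expired. Otherwise runs
  # the block, stores its result for ttl seconds, and returns it.
  def fetch(key, ttl:)
    entry = @store[key]
    return entry.value if entry && entry.expires_at > @clock.call

    value = yield
    @store[key] = Entry.new(value, @clock.call + ttl)
    value
  end

  # Invalidates a single entry.
  def delete(key)
    @store.delete(key)
  end
end

# Hypothetical lookup: the block stands in for the database query and only
# runs on a cache miss. Entries live for one minute, as in the post.
def find_app(cache, push_api_key, &load_from_database)
  cache.fetch("app:#{push_api_key}", ttl: 60, &load_from_database)
end
```

With a cache like this, every `delete` forces the next request for that app back to the database, which is why too many invalidations can cost more queries than having no cache at all.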

There are a couple of places where we invalidate the cache, such as when the push processed time is updated or when we detect a new namespace.

# Update last push processed at time
if app.last_push_processed_at < 5.minutes.ago
  app.set(:last_push_processed_at => Time.now)
  Rails.cache.delete(cache_key)
end
 
if namespaces_diff.any?
  app.add_to_set(:namespaces => namespaces_diff)
  Rails.cache.delete(cache_key)
end

We added multiple counters to determine which of these cache invalidations was the culprit. In this example we increment the app.cache.invalidate counter to count the total number of invalidations, and use specific keys such as app.cache.invalidate_push_time and app.cache.invalidate_namespaces to count each kind of invalidation separately.

# Update last push processed at time
if app.last_push_processed_at < 5.minutes.ago
  app.set(:last_push_processed_at => Time.now)
  Rails.cache.delete(cache_key)
  Appsignal.increment_counter('app.cache.invalidate', 1)
  Appsignal.increment_counter('app.cache.invalidate_push_time', 1)
end
 
if namespaces_diff.any?
  app.add_to_set(:namespaces => namespaces_diff)
  Rails.cache.delete(cache_key)
  Appsignal.increment_counter('app.cache.invalidate', 1)
  Appsignal.increment_counter('app.cache.invalidate_namespaces', 1)
end
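Since both branches repeat the same three calls, the pattern could be pulled into a small helper. The sketch below is one possible shape, not what we actually run: `invalidate_app_cache`, its `reason` argument, and `CounterRecorder` are hypothetical names. The only real API it relies on is `increment_counter(name, value)`, which the `Appsignal` module provides as shown above.

```ruby
# Deletes the cache entry and records both the total and the
# reason-specific invalidation counter.
# `cache` is anything that responds to delete(key), such as Rails.cache.
# `metrics` is anything that responds to increment_counter(name, value),
# such as the Appsignal module. `reason` is a hypothetical short label.
def invalidate_app_cache(cache, cache_key, reason, metrics:)
  cache.delete(cache_key)
  metrics.increment_counter('app.cache.invalidate', 1)
  metrics.increment_counter("app.cache.invalidate_#{reason}", 1)
end

# Hypothetical in-memory recorder, handy for trying the helper locally
# without sending anything to AppSignal.
class CounterRecorder
  attr_reader :counts

  def initialize
    @counts = Hash.new(0)
  end

  def increment_counter(name, value)
    @counts[name] += value
  end
end
```

In the snippet above, the namespace branch would then read `invalidate_app_cache(Rails.cache, cache_key, :namespaces, metrics: Appsignal)`, which keeps the counter names consistent across all invalidation sites.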

With the custom metrics above in place, we were able to graph our cache invalidations over time. It became immediately apparent which invalidation caused the rise in queries: the app.cache.invalidate_namespaces counter was incremented on every request, which means the namespace check invalidated the cache each time.

[Graph: cache metrics before the fix]

The total number of cacheable requests is counted as app.cache.maybe.
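Comparing the invalidation counters against app.cache.maybe gives a rough invalidation rate per reason. The following is a minimal sketch under assumptions: `invalidation_rates` is a hypothetical helper, and `totals` is assumed to be a hash of counter name to count for a single time window, however those totals are obtained.

```ruby
# Share of cacheable requests that triggered an invalidation, per counter.
# `totals` is a hypothetical hash of counter name => count for one window.
def invalidation_rates(totals)
  cacheable = totals.fetch('app.cache.maybe', 0)
  return {} if cacheable.zero?

  totals
    .select { |name, _count| name.start_with?('app.cache.invalidate') }
    .transform_values { |count| count.to_f / cacheable }
end
```

A rate close to 1.0 for a single counter would point at that invalidation firing on almost every cacheable request, which matches what the graph showed for the namespaces counter.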

After deploying a fix for this issue, the number of invalidations dropped to zero as long as the namespaces for an app weren't updated.

[Graph: cache metrics after the fix]

Adding custom metrics makes it easier to understand what's happening where, when, and how often. In this case, counting cache invalidations and showing them in a readable graph allowed us to quickly find an issue. It takes only a couple of lines of code to increment a counter and create a dashboard.

Let us know if you have any questions about custom metrics or if we can help you set them up in your application. We're happy to help!


Robert Beekman

As a co-founder, Robert wrote our very first commit. He's also our support role-model and knows all about the tiny details in code. Travels and photographs (at the same time).

