Measuring the Impact of Feature Flags in Ruby on Rails with AppSignal

Feature flags are a powerful tool in software development, allowing developers to control the behavior of an application at runtime without deploying new code. They enable teams to test new features, perform A/B testing, and roll out changes gradually.

In Ruby on Rails, feature flags can be managed with various tools, one of the most popular being the Flipper gem. This article explores how to implement feature flags and measure their impact in a Solidus storefront using Flipper and AppSignal's custom metrics.

What Are Feature Flags in Rails, Again?

If you are looking for an introduction to the subject, check out the post Add Feature Flags in Ruby on Rails with Flipper.

In a nutshell, though, feature flags are a way to influence how your application behaves at runtime, without having to deploy new code. The simplest kind of feature flag is an environment variable. Every Ruby on Rails application uses them out of the box. One example is the configuration of application server concurrency using ENV['WEB_CONCURRENCY'].
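As a minimal sketch of an environment-variable flag, consider the snippet below. The variable name PULSATING_CHECKOUT and the helper env_flag_enabled? are made up for illustration and are not part of Rails:

```ruby
# Hypothetical helper: treats a handful of truthy strings as "enabled".
# `env` defaults to the process environment, but any Hash-like object works.
def env_flag_enabled?(name, env = ENV)
  %w[1 true yes on].include?(env.fetch(name, "").strip.downcase)
end

# PULSATING_CHECKOUT is an illustrative variable name, not a Rails setting.
css_class = env_flag_enabled?("PULSATING_CHECKOUT") ? "animate-pulse" : ""
puts "checkout button class: #{css_class.inspect}"
```

A flag like this is fixed when the process boots, so changing it usually means restarting the application server.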

However, there are other ways to manage feature flags, such as using a persistence layer like ActiveRecord or Redis. A comprehensive way to do this is offered by the Flipper gem.

The following snippet shows how the performance_improvement feature flag is evaluated for a given user. When the flag is enabled, the products are eager loaded along with the categories:

Ruby

@categories = if Flipper.enabled?(:performance_improvement, user)
  Category.all.includes(:products)
else
  Category.all
end

Next, we will set up a Solidus storefront to start experimenting with feature flags.

Our Example App: A Solidus Storefront

To measure the impact of feature flags in a somewhat realistic scenario, let's quickly bootstrap a Solidus store:

Shell

rails new coder_swag_store && cd coder_swag_store
bundle add solidus
bin/rails g solidus:install

This generator will guide you through the process and ask you a few setup questions.

Choose the starter frontend when queried for the frontend type.
Skip setting up a payment method.
Choose to mount your Solidus application at /, since we are using it as a standalone app.

Afterward, run bin/dev from your terminal and you should be good to go. When you go to http://localhost:3000, you'll see the storefront's home page.

Implement Feature Flags with Flipper

Now let's implement two example use cases for feature flags:

A performance improvement.
An attempt at conversion rate optimization.

First of all, though, we have to add the flipper gem along with its active_record storage adapter:

Shell

bundle add flipper
bundle add flipper-active_record
bin/rails g flipper:setup
bin/rails db:migrate

This will set up the database tables in which Flipper stores its features and "gates", i.e., the concrete conditions that are evaluated when a feature flag is checked.

Testing a Performance Improvement

To assess this scenario, we will simulate a slow request in the storefront by adding a sleep 1 call for the unoptimized case:

Ruby

# app/controllers/products_controller.rb
 
def index
  @searcher = build_searcher(params.merge(include_images: true))
  @products = @searcher.retrieve_products
 
  if Flipper.enabled?(:performance_improvement)
    # 🚅
  else
    sleep 1
  end
end

Now, we will use a "percentage of time" strategy to roll out the optimization across a random set of requests. Open a Rails console and key in the following:

Ruby

Flipper.enable_percentage_of_time(:performance_improvement, 50)
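To build some intuition for this strategy, here is a simplified simulation in plain Ruby. It is not Flipper's implementation; percentage_of_time_open? is a hypothetical stand-in. The point is that every check is an independent random draw, no matter who is asking:

```ruby
# Hypothetical stand-in for a "percentage of time" gate: every call is an
# independent random draw, regardless of the user or the request.
def percentage_of_time_open?(percentage, rng: Random.new)
  rng.rand * 100 < percentage
end

rng = Random.new(42) # fixed seed so the run is repeatable
checks = Array.new(10_000) { percentage_of_time_open?(50, rng: rng) }
share_enabled = checks.count(true).fdiv(checks.size)

puts format("enabled for %.1f%% of checks", share_enabled * 100)
```

Over many checks the share of enabled results approaches the configured percentage, but any single check can go either way.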

Using the oha load testing tool, we can confirm that roughly half of the requests take about one second longer than the others:

Shell

$ oha http://localhost:3000/products -z 30s -q 2
Summary:
  Success rate: 100.00%
  Total:        30.0037 secs
  Slowest:      1.2662 secs
  Fastest:      0.1049 secs
  Average:      0.6260 secs
  Requests/sec: 2.0664
 
  Total data:   2.59 MiB
  Size/request: 44.24 KiB
  Size/sec:     88.47 KiB
 
Response time histogram:
  0.105 [1]  |■
  0.221 [24] |■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■
  0.337 [8]  |■■■■■■■■■■
  0.453 [0]  |
  0.569 [0]  |
  0.686 [0]  |
  0.802 [0]  |
  0.918 [0]  |
  1.034 [0]  |
  1.150 [6]  |■■■■■■■■
  1.266 [21] |■■■■■■■■■■■■■■■■■■■■■■■■■■■■
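As a quick sanity check, we can tally the histogram buckets from the output above. The bucket counts are copied from that run. The 0.5-second cutoff between "fast" and "slow" is an assumption, chosen because no responses fall between 0.337 and 1.150 seconds:

```ruby
# Upper bucket bound (seconds) => number of responses, copied from the oha output.
histogram = {
  0.105 => 1, 0.221 => 24, 0.337 => 8, 0.453 => 0, 0.569 => 0, 0.686 => 0,
  0.802 => 0, 0.918 => 0, 1.034 => 0, 1.150 => 6, 1.266 => 21
}

cutoff = 0.5 # assumed dividing line between fast and slow responses

# Split the buckets at the cutoff, then add up the counts on each side.
slow, fast = histogram.partition { |bound, _count| bound > cutoff }
                      .map { |buckets| buckets.sum { |_bound, count| count } }
total = slow + fast

puts "fast: #{fast}, slow: #{slow}, slow share: #{(slow.fdiv(total) * 100).round}%"
```

With these numbers, 27 of 60 responses (45%) landed in the slow group, which is in line with a 50% random rollout over a small sample.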

Testing Conversion Rate Optimization

When dealing with user-facing features, for example, changes in the UI, it's often advisable to use a "percentage of actors" strategy to roll out flags. This way, every user is consistently offered the same experience.

So to start, we'll create two users for our e-commerce application. Fire up a Rails console and issue the following commands:

Ruby

Spree::User.create(email: "test1@example.com", login: "test1@example.com", password: "super_safe_password", password_confirmation: "super_safe_password")
Spree::User.create(email: "test2@example.com", login: "test2@example.com", password: "super_safe_password", password_confirmation: "super_safe_password")
Flipper.enable_percentage_of_actors(:conversion_rate_optimization, 50)

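This creates two sample users and enables the feature flag for 50% of actors. Flipper decides per actor in a deterministic way, so a given user always gets the same result. With only two users, an even split isn't guaranteed, but in our case the flag ended up enabled for exactly one of them.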
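Under a percentage-of-actors rollout, consistency comes from hashing rather than from chance. The sketch below illustrates the idea with a CRC32 checksum over the feature name and an actor ID. It is a simplified illustration and not Flipper's exact algorithm; actor_in_rollout? and the "User;1"-style IDs are hypothetical:

```ruby
require "zlib"

# Hypothetical helper: maps (feature, actor) to a stable bucket from 0 to 99
# and enables the feature for buckets below the rollout percentage.
def actor_in_rollout?(feature_name, actor_id, percentage)
  Zlib.crc32("#{feature_name}#{actor_id}") % 100 < percentage
end

%w[User;1 User;2 User;3 User;4].each do |actor_id|
  enabled = actor_in_rollout?(:conversion_rate_optimization, actor_id, 50)
  puts "#{actor_id}: #{enabled}"
end
```

Because the bucket depends only on the feature name and the actor, the answer for a given user stays the same between requests, and raising the percentage only ever adds users to the rollout.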

To simulate a feature attempting to drive conversion rates up, we'll make the checkout button pulsate:

erb

<!-- app/views/carts/_cart_footer.html.erb -->
<% order = order_form.object %>
 
<footer class="cart-footer">
  <%= render 'carts/cart_adjustments' %>
  <p class="cart-footer__total flex justify-between mb-3 text-body-20 p-2">
    <%= t('spree.total') %>: <span class="font-sans-md"><%= order.display_total %></span>
  </p>
  <div class="cart-footer__primary-action">
    <%= order_form.button(
      I18n.t('spree.checkout'),
      class: "button-primary w-full #{'animate-pulse' if Flipper.enabled?(:conversion_rate_optimization, spree_current_user)}",
      id: 'checkout-link',
      name: :checkout
    ) %>
  </div>
</footer>

If we log in as both users in separate browser windows and arrange them side by side, we can see that the effect is active for only one of them.

Use AppSignal Custom Metrics to Measure the Impact of Feature Flags

The best feature flag system is useless if there's no way to evaluate its impact. In our example scenario, we simply want to know:

Has the performance improvement led to a significant latency reduction?
Has our pulsating checkout button led to a significantly higher conversion rate?

We will use AppSignal's custom metrics to measure the pay-off of these optimizations.

First of all, create a new application in your AppSignal organization and connect it to your app by following the instructions:

Shell

bundle add appsignal
bundle exec appsignal install YOUR_APPSIGNAL_UUID

Measuring Latency with a Measurement Metric

We have verified how effective our improvement is with the oha CLI above, but to make valid judgments we'll install server-side telemetry that reports latency to AppSignal. A measurement metric allows for exactly that: we will send over response times in milliseconds, and add a metric tag indicating whether our performance optimization was active for a specific request.

There's a small gotcha here: because we're employing the "percentage of time" strategy, every call to Flipper.enabled? may return a different result. We therefore have to capture the flag's state in an instance variable so that the same value is used for execution and for reporting:

Ruby

# app/controllers/products_controller.rb
 
class ProductsController < StoreController
  around_action :measure_response_time, only: :index
 
  # ...
 
  def index
    @searcher = build_searcher(params.merge(include_images: true))
    @products = @searcher.retrieve_products
 
    @performance_improvement_enabled = Flipper.enabled?(:performance_improvement)
 
    if @performance_improvement_enabled
      # 🚅
    else
      sleep 1
    end
  end
 
  # ...
 
  private
 
  # ...
 
  def measure_response_time
    response_time = Benchmark.realtime { yield }
    Appsignal.add_distribution_value(
      "products_response_time",
      response_time * 1000, # seconds to milliseconds
      performance_improvement_enabled: @performance_improvement_enabled
    )
  end
end
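To see why the flag is captured once, here is a framework-free sketch of the same pattern. The flag_check lambda and the reported array are hypothetical stand-ins for the Flipper call and the AppSignal client; nothing here talks to either library:

```ruby
require "benchmark"

reported = []                      # stand-in for the metrics backend
rng = Random.new(7)                # fixed seed so the run is repeatable
flag_check = -> { rng.rand < 0.5 } # stand-in for a percentage-of-time check

handle_request = lambda do
  enabled = nil
  elapsed = Benchmark.realtime do
    enabled = flag_check.call # evaluate the flag exactly once per request
    sleep 0.01 unless enabled # the "unoptimized" path is slower
  end
  # Report the captured value rather than asking the flag a second time.
  reported << {
    value_ms: elapsed * 1000,
    tags: { performance_improvement_enabled: enabled }
  }
end

20.times { handle_request.call }
puts "recorded #{reported.size} measurements"
```

If the reporting step called flag_check again instead of reusing enabled, about half of the measurements would carry a tag that doesn't match the code path that actually ran.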

Now let's repeat the local load testing from above:

Shell

$ oha http://localhost:3000/products -z 30s -q 2

We'll look at charting and evaluating this metric in a bit. Before that, let's turn to our second feature flag.

Tallying Conversions with a Count Metric

We'll use a counter metric to count conversions. This is a great choice if all you want is to keep a tally of an event.

To do this, we'll open CartsController and, for demonstration purposes, add an increment_counter call that runs when the checkout button is clicked:

Ruby

# app/controllers/carts_controller.rb
 
def update
  authorize! :update, @order, cookies.signed[:guest_token]
  if @order.contents.update_cart(order_params)
    # ...
 
    Appsignal.increment_counter("checkout_count", 1, optimization_active: Flipper.enabled?(:conversion_rate_optimization, spree_current_user))
 
    # ...
  else
    render action: :show
  end
end

Now let's test this manually. Open a browser window for each user and click the "Checkout" button three times as the user for whom the flag is enabled, and only once as the other user. Each click is tallied under the matching optimization_active tag value, so the two cases can be told apart later.

Set Up Custom Dashboards in AppSignal

Our final step is to create informative graphics to make data-informed business decisions. We'll use AppSignal's dashboards to achieve this. Let's go through this step by step:

In the left sidebar, click "Add dashboard" and name it "Feature Flag Evaluation".

Click "Add Graph" and select the products_response_time metric. Select "mean" to display only averages and apply the performance_improvement_enabled tag.

Click "Add new Graph" to add a chart for the checkout counts, using the checkout_count metric. Again, apply a tag, this time optimization_active.

Now your custom dashboard is ready. The line graph on the left lets you verify that the performance improvement was effective. The graph on the right shows the higher number of checkouts recorded for the optimized case.
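Conceptually, both graphs boil down to grouping the reported values by their tag. The following sketch mimics that in plain Ruby with made-up sample data; the latency numbers are purely illustrative and were not taken from AppSignal:

```ruby
# Illustrative measurements (milliseconds) with their tag value; not real data.
measurements = [
  { enabled: true, ms: 180.0 }, { enabled: true, ms: 220.0 },
  { enabled: false, ms: 1190.0 }, { enabled: false, ms: 1210.0 }
]

# Mean latency per tag value, similar to a "mean" graph split by tag.
mean_by_tag = measurements
  .group_by { |m| m[:enabled] }
  .transform_values { |rows| rows.sum { |r| r[:ms] } / rows.size }

# Counter increments tagged the same way: three clicks with the flag enabled
# and one without, mirroring the manual test from earlier.
checkouts = [true, true, true, false]
count_by_tag = checkouts.tally

puts mean_by_tag.inspect
puts count_by_tag.inspect
```

Keep in mind that raw checkout counts only become a conversion rate once they are related to how many users saw each variant.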

And that's it!

Wrapping Up

We've seen how feature flags offer a flexible and efficient way to manage and deploy new features in a Ruby on Rails application. By using tools like the Flipper gem and AppSignal's custom metrics, developers can not only control feature rollouts, but also measure their impact on performance and user behavior.

This approach helps ensure that new features are tested and tuned before they are rolled out to everyone, which can lead to a more stable and user-friendly application. It also supports more informed business decisions when gauging the effectiveness of alternative approaches.

Happy coding!

P.S. If you'd like to read Ruby Magic posts as soon as they get off the press, subscribe to our Ruby Magic newsletter and never miss a single post!

Julian Rubisch
