
Measuring the Impact of Feature Flags in Ruby on Rails with AppSignal

Julian Rubisch


Feature flags are a powerful tool in software development, allowing developers to control the behavior of an application at runtime without deploying new code. They enable teams to test new features, perform A/B testing, and roll out changes gradually.

In Ruby on Rails, feature flags can be managed with a variety of tools, the most popular being the Flipper gem. This article explores how to implement feature flags in a Solidus storefront using Flipper, and how to measure their impact with AppSignal's custom metrics.

What Are Feature Flags in Rails, Again?

If you are looking for an introduction to the subject, check out the post Add Feature Flags in Ruby on Rails with Flipper.

In a nutshell, though, feature flags are a way to influence how your application behaves at runtime, without having to deploy new code. The simplest type of feature flags are environment variables. Every Ruby on Rails application uses them out of the box. One example is the configuration of application server concurrency using ENV['WEB_CONCURRENCY'].
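
For illustration, a minimal environment-variable flag might look like the following sketch (NEW_CHECKOUT and the two route helpers are hypothetical names, not part of Rails):

Ruby
# A hypothetical env-var feature flag: the new checkout flow is only
# used when the process was started with NEW_CHECKOUT=enabled.
if ENV["NEW_CHECKOUT"] == "enabled"
  redirect_to new_checkout_path
else
  redirect_to legacy_checkout_path
end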

However, there are other ways to manage feature flags, such as using a persistence layer like ActiveRecord or Redis. A comprehensive way to do this is offered by the Flipper gem.
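
As a sketch of what that looks like with Flipper, the storage adapter can be configured in an initializer. Note that this is optional: when the flipper-active_record gem (which we install below) is present, Flipper picks up the ActiveRecord adapter automatically.

Ruby
# config/initializers/flipper.rb
require "flipper/adapters/active_record"

Flipper.configure do |config|
  # Persist feature flag state in the database via ActiveRecord.
  config.adapter { Flipper::Adapters::ActiveRecord.new }
end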

The following snippet exemplifies how the performance_improvement feature flag is evaluated for a given user:

Ruby
@categories = if Flipper.enabled?(:performance_improvement, user)
  Category.all.includes(:products)
else
  Category.all
end

Next, we will set up a Solidus storefront to start experimenting with feature flags.

Our Example App: A Solidus Storefront

To measure the impact of feature flags in a somewhat realistic scenario, let's quickly bootstrap a Solidus store:

Shell
rails new coder_swag_store && cd coder_swag_store
bundle add solidus
bin/rails g solidus:install

This generator will guide you through the process and ask you a few setup questions.

  1. Choose the starter frontend when queried for the frontend type.
  2. Skip setting up a payment method.
  3. Choose to mount your Solidus application at /, since we are using it as a standalone app.

Afterward, run bin/dev from your terminal and you should be good to go. When you go to http://localhost:3000, you'll see this screen:

Solidus sample store

Implement Feature Flags with Flipper

Now let's implement two example use cases for feature flags:

  • A performance improvement.
  • An attempt at conversion rate optimization.

First of all, though, we have to add the flipper gem along with its active_record storage adapter:

Shell
bundle add flipper
bundle add flipper-active_record
bin/rails g flipper:setup
bin/rails db:migrate

This will set up the required database tables to look up Flipper "gates", i.e., concrete conditionals to evaluate when checking a feature flag.
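
To get a feel for the different gates, here are a few of the strategies Flipper supports, run from a Rails console. The user variable stands in for any object exposing a flipper_id (recent Flipper versions add one to ActiveRecord models automatically), and the :staff group is made up for illustration:

Ruby
Flipper.enable(:performance_improvement)              # boolean gate: on for everyone
Flipper.enable_actor(:performance_improvement, user)  # actor gate: on for a single user

# Group gates require registering the group first, e.g. in an initializer:
Flipper.register(:staff) { |actor| actor.respond_to?(:staff?) && actor.staff? }
Flipper.enable_group(:performance_improvement, :staff)

Flipper.disable(:performance_improvement)             # turn the flag off again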

Testing a Performance Improvement

To assess this scenario, we will simulate a slow request in the storefront by adding a sleep 1 call for the unoptimized case:

Ruby
# app/controllers/products_controller.rb
def index
  @searcher = build_searcher(params.merge(include_images: true))
  @products = @searcher.retrieve_products

  if Flipper.enabled?(:performance_improvement)
    # 🚅
  else
    sleep 1
  end
end

Now, we will use a "percentage of time" strategy to roll out the optimization across a random set of requests. Open a Rails console and key in the following:

Ruby
Flipper.enable_percentage_of_time(:performance_improvement, 50)

Using the oha load testing tool, we can confirm that indeed half of the requests take one second longer than the others:

Shell
$ oha http://localhost:3000/products -z 30s -q 2

Summary:
  Success rate: 100.00%
  Total:        30.0037 secs
  Slowest:      1.2662 secs
  Fastest:      0.1049 secs
  Average:      0.6260 secs
  Requests/sec: 2.0664

  Total data:   2.59 MiB
  Size/request: 44.24 KiB
  Size/sec:     88.47 KiB

Response time histogram:
  0.105 [1]  |■
  0.221 [24] |■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■
  0.337 [8]  |■■■■■■■■■■
  0.453 [0]  |
  0.569 [0]  |
  0.686 [0]  |
  0.802 [0]  |
  0.918 [0]  |
  1.034 [0]  |
  1.150 [6]  |■■■■■■■■
  1.266 [21] |■■■■■■■■■■■■■■■■■■■■■■■■■■■■

Testing Conversion Rate Optimization

When dealing with user-facing features, for example, changes in the UI, it's often advisable to use a "percentage of actors" strategy to roll out flags. This way, every user is consistently offered the same experience.

So to start, we'll create two users for our e-commerce application. Fire up a Rails console and issue the following commands:

Ruby
Spree::User.create(
  email: "test1@example.com",
  login: "test1@example.com",
  password: "super_safe_password",
  password_confirmation: "super_safe_password"
)

Spree::User.create(
  email: "test2@example.com",
  login: "test2@example.com",
  password: "super_safe_password",
  password_confirmation: "super_safe_password"
)

Flipper.enable_percentage_of_actors(:conversion_rate_optimization, 50)

This creates two sample users and enables the flag for 50% of actors. Because the actor gate is deterministic, each user consistently gets the same experience; in our case, the flag happens to be enabled for exactly one of the two users.
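
You can verify this consistency from the console. Flipper hashes each actor's flipper_id against the configured percentage, so repeated checks always yield the same answer for the same user. Which of our two users lands in the enabled bucket depends on that hash; the return values below are just illustrative:

Ruby
user1 = Spree::User.find_by(email: "test1@example.com")
user2 = Spree::User.find_by(email: "test2@example.com")

# Deterministic per actor: the same user always gets the same answer.
Flipper.enabled?(:conversion_rate_optimization, user1) # => true (for example)
Flipper.enabled?(:conversion_rate_optimization, user2) # => false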

To simulate a feature attempting to drive conversion rates up, we'll make the checkout button pulsate:

erb
<!-- app/views/carts/_cart_footer.html.erb -->
<% order = order_form.object %>

<footer class="cart-footer">
  <%= render 'carts/cart_adjustments' %>

  <p class="cart-footer__total flex justify-between mb-3 text-body-20 p-2">
    <%= t('spree.total') %>:
    <span class="font-sans-md"><%= order.display_total %></span>
  </p>

  <div class="cart-footer__primary-action">
    <%= order_form.button(
      I18n.t('spree.checkout'),
      class: "button-primary w-full #{'animate-pulse' if Flipper.enabled?(:conversion_rate_optimization, spree_current_user)}",
      id: 'checkout-link',
      name: :checkout
    ) %>
  </div>
</footer>

If we log in as both users and arrange the browser windows side by side, we can observe that the effect is indeed active for one of them (the user on the left).

Use AppSignal Custom Metrics to Measure the Impact of Feature Flags

The best feature flag system is useless if there's no way to evaluate its impact. In our example scenario, we simply want to know:

  • Has the performance improvement led to a significant latency reduction?
  • Has our pulsating checkout button led to a significantly higher conversion rate?

We will use AppSignal's custom metrics to measure the pay-off of these optimizations.

First of all, create a new application in your AppSignal organization and connect it to your app by following the instructions:

Shell
bundle add appsignal
bundle exec appsignal install YOUR_APPSIGNAL_UUID

Measuring Latency with a Measurement Metric

We verified the effectiveness of our improvement with the oha CLI above, but to make valid judgments, we'll add server-side telemetry that reports latency to AppSignal. A measurement metric allows for exactly that: we will send over response times in milliseconds and add a metric tag indicating whether our performance optimization was active for a specific request.

There's a small gotcha here: because we're using the "percentage of time" strategy, we have to capture the flag's state in an instance variable so that the same value is used both for execution and for reporting:

Ruby
# app/controllers/products_controller.rb
class ProductsController < StoreController
  around_action :measure_response_time, only: :index

  # ...

  def index
    @searcher = build_searcher(params.merge(include_images: true))
    @products = @searcher.retrieve_products

    @performance_improvement_enabled = Flipper.enabled?(:performance_improvement)

    if @performance_improvement_enabled
      # 🚅
    else
      sleep 1
    end
  end

  # ...

  private

  # ...

  def measure_response_time
    response_time = Benchmark.realtime { yield }

    Appsignal.add_distribution_value(
      "products_response_time",
      response_time * 1000,
      performance_improvement_enabled: @performance_improvement_enabled
    )
  end
end

Now let's repeat the local load testing from above:

Shell
$ oha http://localhost:3000/products -z 30s -q 2

We'll look at charting and evaluating this metric in a bit. Before that, let's turn to our second feature flag.

Tallying Conversions with a Count Metric

We'll use a counter metric to tally conversions. This is a great choice when all you need is a running count of an event.

To do this, we'll open CartsController and, for demonstration purposes, add an increment_counter call when the checkout button is clicked:

Ruby
# app/controllers/carts_controller.rb
def update
  authorize! :update, @order, cookies.signed[:guest_token]

  if @order.contents.update_cart(order_params)
    # ...
    Appsignal.increment_counter(
      "checkout_count",
      1,
      optimization_active: Flipper.enabled?(:conversion_rate_optimization, spree_current_user)
    )
    # ...
  else
    render action: :show
  end
end

Now let's test this manually: open a browser window for each user and click the "Checkout" button three times as the user with the optimization enabled, but only once as the other. This gives us tagged data points for both states of the flag.

Set Up Custom Dashboards in AppSignal

Our final step is to create informative graphics to make data-informed business decisions. We'll use AppSignal's dashboards to achieve this. Let's go through this step by step:

  1. In the left sidebar, click "Add dashboard" and name it "Feature Flag Evaluation".
Add dashboard
  2. Click "Add Graph" and select the products_response_time metric. Select "mean" to display only averages, and apply the performance_improvement_enabled tag.
Add graph — mean
  3. Click "Add new Graph" to add a chart for the checkout counts. Again, apply the optimization_active tag.
Add new graph

Now your custom dashboard is ready. In the line graph on the left, you can verify that your performance improvement was effective. On the right, you can see the higher checkout count recorded for the optimized case.

Custom dashboard

And that's it!

Wrapping Up

We've seen how feature flags offer a flexible and efficient way to manage and deploy new features in a Ruby on Rails application. By using tools like the Flipper gem and AppSignal's custom metrics, developers can not only control feature rollouts, but also measure their impact on performance and user behavior.

This approach ensures that new features are thoroughly tested and optimized before being fully deployed, ultimately leading to a more stable and user-friendly application. Finally, it can lead to more informed business decisions when gauging the effectiveness of alternative approaches.

Happy coding!

P.S. If you'd like to read Ruby Magic posts as soon as they get off the press, subscribe to our Ruby Magic newsletter and never miss a single post!

Julian Rubisch

Our guest author Julian is a freelance Ruby on Rails consultant based in Vienna, specializing in Reactive Rails. Part of the StimulusReflex core team, he has been at the forefront of developing cutting-edge HTML-over-the-wire technology since 2020.
