ruby

A Guide to Memoization in Ruby

Abiodun Olowode

Abiodun Olowode on

A Guide to Memoization in Ruby

Memoization is a caching technique to make your Ruby application run more efficiently and faster.

In this post, we'll look at the benefits of memoization and when to use it in your Ruby application. We'll also look at some memoization mistakes to avoid.

Let's first start by looking at code optimization — what it is and some of the different optimization techniques available.

What Is Code Optimization?

Code optimization is the process of improving code quality to make a piece of code or program more efficient and functional. Its advantages include — but are not limited to:

  • Less memory consumption during an expensive calculation
  • Much faster execution
  • Sometimes, less space in terms of codebase size

The need to achieve some of the goals mentioned above arises at one time or another during an app's life cycle. If you're at a loss about where to begin with code optimization, try profiling!

What Is Profiling?

Profiling refers to analyzing a program to measure its space and time complexities. Via profiling, we can obtain information such as:

  • The frequency and duration of function calls
  • The percentage of program execution time spent in a function in comparison with other functions
  • The call stack of every function
  • How many database calls are made for an HTML page to successfully load and how long it takes

This information can guide us to where code optimization is highly required.

Code Optimization Methods for Ruby and Rails

A few Ruby and Rails code optimization techniques include:

  • Getting rid of N+1 queries - This can help improve the speed of an app. The Bullet or Prosopite gems can give a lending hand here. The N+1 Dilemma — Bullet or Prosopite? entails a brief comparison of both.
  • Using static code analyzers - These reduce memory consumption and codebase size, as we're alerted to code duplication, unused variables or method arguments, and the like. Examples include Rubocop and RubyCritic.
  • Caching - Stores the content generated during the request-response cycle and reuses it when responding to similar requests, to improve the speed of an application. In Rails, we have page caching, fragment caching, action caching, low-level caching, and many more.
  • Choosing appropriate data structures - Some data structures perform better in certain situations than others. As a result, a good way to optimize code is to use the most appropriate data structures for each situation, considering their space and time complexities.
  • Memoization - Improves speed by reducing the number of times certain computations are carried out.

Let's now turn our focus to memoization.

An Introduction to Memoization in Ruby

Memoization is the act of caching a method's result so that the next time that method is called, the previous result is returned (as opposed to carrying out a recomputation). This helps save time when running a program.

Let's look at an example involving anagrams.

In the class below, we create a dictionary to pass in any word as an argument, and find the anagram.

class Dictionary
  def words
    puts "creating my dictionary"
    words = File.readlines('/usr/share/dict/words')
    dictionary = Hash.new {|h,k| h[k] = []}
    words.each do |word|
      word = word.chomp
      dictionary[word.chars.sort.join("")] = word
    end
    dictionary
  end
 
  def check(word)
    words[word.chars.sort.join("")]
  end
end

Testing the class above, we obtain:

dictionary = Dictionary.new
dictionary.check('rasp')
# creating my dictionary
=> "spar"
dictionary.check('kame')
# creating my dictionary
=> "make"

We can see that every time we call the check method, we create the dictionary again. This is definitely not optimal, as the dictionary does not change.

What if we created the dictionary once and used it whenever it was needed? We can; using memoization. Memoization allows us to cache the dictionary if it has been previously created.

def words
  @words ||= begin
    puts "creating my dictionary"
    words = File.readlines('/usr/share/dict/words')
    dictionary = Hash.new {|h,k| h[k] = []}
    words.each do |word|
      word = word.chomp
      dictionary[word.chars.sort.join("")] = word
    end
    dictionary
  end
end

Testing the above, we obtain:

dict.check('eat')
#creating my dictionary
=> "tea"
dict.check('kame')
=> "make"
dict.check('live')
=> "vile"

As we can see, the dictionary is created once and the cached version is used after that.

Let's take a benchmark to see how much faster our program has become because of memoization. Name the memoized version memoized_check, and the non-memoized version check:

require "benchmark"
 
dictionary = Dictionary.new
puts Benchmark.measure { 10.times { dictionary.check('rasp') } }
puts Benchmark.measure { 10.times { dictionary.memoized_check('rasp') } }

We get the following result:

5.771061   0.044656   5.815717 (  5.836218)
0.563966   0.000016   0.563982 (  0.564909)

This shows us that the unmemoized version takes 5.83 seconds while the memoized one takes 0.56 seconds, making it about ten times faster.

Memoization Mistakes to Avoid in Your Ruby Application

Let's now look at some mistakes to avoid when using memoization.

Ignoring the False Or Nil Return

In cases where a method's computation returns a false or nil, each call leads to a recomputation whenever the method is called (even though memoized). This is because the comparison is made using an or — and in Ruby, both nil and false are false values.

2 || 4+5
=> 2
nil || 4+5
=> 9
false || 4+5
=> 9

Memoization using ||= doesn't consider false/nil return values, so such cases should be handled.

def do_computation
  puts "I am computing"
  nil
end
 
def check
  @check ||= do_computation
end

Calling the check method, we get the following results:

check
# I am computing
=> nil
check
# I am computing
=> nil

To deal with this, we can ascertain if the variable has been defined. If so, we return early before proceeding to the computation.

def check
  return @check if defined?(@check)
  @check ||= do_computation
end

This results in:

check
# I am computing
=> nil
check
=> nil

Passing Parameters to a Method

Another common mistake is assuming memoization will work differently when parameters are passed to a method.

Let's say that a method's result is memoized using ||=, but that result depends on parameters. If these parameters change, the result does not change.

def change_params(num)
  @params ||= num
end

Let's see what happens here:

change_params(4)
=> 4
change_params(8)
=> 4

The result does not change just because we changed the parameters, and frankly, this is what is expected, considering how memoization works in its basic form.

To deal with cases like this, you have to be comfortable with:

  • storing the parameters as keys in a hash
  • using a gem or module that takes all the different cases to be handled into consideration

Using a hash:

def change_params(num)
  @params_hash ||= {}
  if (@params_hash.has_key?(num))
    @params_hash[num]
  else
    puts 'creating a new key-value pair'
    @params_hash[num] = num
  end
end

Trying this out, we have:

change_params(4)
creating a new key-value pair
=> 4
change_params(8)
creating a new key-value pair
=> 8
change_params(4)
=> 4
change_params(8)
=> 8

Another way to rewrite this would be as the following:

def change_params(num)
  @params_hash ||= {}
  @params_hash[num] ||= num
end

Or you can use the gem Memoist. It handles caching the method's results, considering the arguments passed. It also provides a way to flush the current value or the entire memoization cache for an object.

When To Memoize — and When Not To

To decide when to memoize, take note of the following:

Expensive Operations

Let's say that an expensive operation will most definitely be called multiple times within a class and return the same result.

We can move this into an instance variable initialized upon object instantiation (the expensive operation is also done at this time).

attr_reader :result
 
def initialize
  @result = do_expensive_calculation
end

result is available throughout the lifecycle of the class instance, and the expensive calculation is only done once.

In cases like this, we don't need a separate method to memoize the do_expensive_calculation value.

An expensive calculation that might not happen — but if it does, might happen more than once (and return the same value) — is a good candidate for memoization. This means we only do_expensive_calculation when needed and then cache the result.

def expensive_calculation
  @expensive_calculation ||= do_expensive_calculation
end

Analysing Potential Performance Improvements

Memoization may be seen as necessary only after we've carried out a performance analysis. We need to accurately ascertain that the implemented memoization actually improves performance (as we did within the Dictionary class).

Without this, we could add unnecessary complexities to the code base. Be sure that the space and code complexities incurred from memoization are low compared to the gain in run-time.

Changing Parameters

If the parameters used for computation are constantly changing, memoization is not a good option.

Memoization is more suitable for pure functions whose return values are the same for the same set of arguments.

def calculation(a, b)
  a + b + Time.now.to_i
end

In the example above, let's say we do cache the method result. Every other time we call the method, our cached value is wrong because Time.now changes.

Wrapping Up

In this post, we explored how memoization, like every caching technique, comes with its advantages and disadvantages. We first looked at a range of code optimization techniques before diving into an example involving memoization. Then we touched on some mistakes to look out for. Finally, we explored when memoization is beneficial and when to avoid it.

When memoization is carried out on methods available to class instances, the memoized result is only available for the lifecycle of that object. Where this result is the same for multiple class instances (e.g., multiple web requests), class-level memoization is usually a go-to.

However, this might add more complexities in terms of cache invalidation. Using a cache store might be a better alternative to caching and allow for better optimization.

Before you decide to employ it, you must ascertain that the pros of memoization outweigh the cons for your specific use case.

Happy memoizing!

P.S. If you'd like to read Ruby Magic posts as soon as they get off the press, subscribe to our Ruby Magic newsletter and never miss a single post!

Share this article

RSS
Abiodun Olowode

Abiodun Olowode

Our guest author Abiodun is a software engineer who works with Ruby/Rails and React. She is passionate about sharing knowledge via writing/speaking and spends her free time singing, binge-watching movies, and watching football games.

All articles by Abiodun Olowode

Become our next author!

Find out more

AppSignal monitors your apps

AppSignal provides insights for Ruby, Rails, Elixir, Phoenix, Node.js, Express and many other frameworks and libraries. We are located in beautiful Amsterdam. We love stroopwafels. If you do too, let us know. We might send you some!

Discover AppSignal
AppSignal monitors your apps