Welcome back to another edition of Ruby Magic! A year ago, we learned about Ruby’s Enumerable
module, which provides the methods you use when working with enumerable objects like arrays, ranges, and hashes.
Back then, we created a LinkedList
class to show how to make an object enumerable by implementing the #each
method on it. By including the Enumerable
module, we were able to call methods like #count
, #map
and #select
on any linked list without having to implement them ourselves.
We've learned how to use enumerables, but how do they work? Part of the magic in enumerables in Ruby comes from their internal implementation, which is all based on the single #each
method, and even allows chaining enumerators.
Today, we’ll learn how the methods in the Enumerable
class are implemented and how Enumerator
objects allow chaining enumeration methods.
As you've become accustomed to, we'll dive in deep by implementing our own versions of the Enumerable
module and Enumerator
class. So, put on your over-engineering helmet and let's go!
Linked Lists
Before we begin, let’s start with a new version of the linked list class we wrote previously.
class LinkedList def initialize(head = nil, *rest) @head = head if rest.first.is_a?(LinkedList) @tail = rest.first elsif rest.any? @tail = LinkedList.new(*rest) end end def <<(head) @head ? LinkedList.new(head, self) : LinkedList.new(head) end def inspect [@head, @tail].compact end def each(&block) yield @head if @head @tail.each(&block) if @tail end end
Unlike the previous version, this implementation allows empty lists to be created, as well as lists with more than two items. This version also allows passing a linked list as the tail when initializing another.
irb> LinkedList.new => [] irb> LinkedList.new(1) => [1] irb> LinkedList.new(1, 2) => [1,[2]] irb> LinkedList.new(1, 2, 3) => [1,[2,[3]]] irb> LinkedList.new(1, LinkedList.new(2, 3)) => [1,[2,[3]]] irb> LinkedList.new(1, 2, LinkedList.new(3)) => [1,[2,[3]]]
Previously, our LinkedLIst
class included the Enumerable
module. When mapping over an object using one of Enumerable
's methods, the result are stored in an array. This time, we'll implement our own version to make sure our methods return new linked lists instead.
Enumerable Methods
Ruby's Enumerable
module comes with enumeration methods like #map
, #count
, and #select
. By implementing the #each
method and including the Enumerable
module in our class, we'd be able to use those methods directly on our linked lists.
Instead, we'll implement DIYEnumerable
and import that instead of Ruby's version. This isn't something you'd typically do, but it will give us a clear insight into how enumeration works internally.
Let's start with #count
. Each of the importable methods in the Enumerable
class uses the #each
method we implemented in our LinkedList
class to loop over the object to calculate their results.
module DIYEnumerable def count result = 0 each { |element| result += 1 } result end end
In this example, we've implemented the #count
method on a new DIYEnumerable
module that we’ll include in our linked list. It starts a counter at zero and calls the #each
method to add one to the counter for every loop. After looping over all elements, the method returns the resulting counter.
module DIYEnumerable # ... def map result = LinkedList.new each { |element| result = result << yield(element) } result end end
The #map
method is implemented similarly. Instead of keeping a counter, it uses an accumulator, which starts as an empty list. We'll loop over all elements in the list, and yield the passed block on each element. The result of each yield is appended to the accumulator list.
The method returns the accumulator after looping over all of the elements in the input list.
class LinkedList include DIYEnumerable #... end
After including the DIYEnumerable
in our LinkedList
, we can test our newly added #count
and #map
methods.
irb> list = LinkedList.new(73, 12, 42) => [73, [12, [42]]] irb> list.count => 3 irb> list.map { |element| element * 10 } => [420, [120, [730]]]
Both methods work! The #count
method correctly counts the items in the list, and the #map
method runs a block for each item and returns an updated list.
Reversed lists
However, the #map
method seems to have reverted the list. That’s understandable, as the #<<
method on our linked list class prepends items to the list instead of appending them, which is a feature of the recursive nature of linked lists.
For situations where it’s essential that the order of the list is retained, we need a way to reverse the list when mapping over it. Ruby implements Enumerable#reverse_each
, which loops over an object in reverse. That which sounds like an excellent solution to our problem. Sadly, we can’t use an approach like that because our list is nested. We don’t know how long the list is until we loop over it entirely.
Instead of running the block over the list in reverse, we’ll add a version of #reverse_each
that does this two steps. It first loops over the list to reverse it by creating a new list. After that, it runs the block over the reversed list.
module DIYEnumerable # ... def reverse_each(&block) list = LinkedList.new each { |element| list = list << element } list.each(&block) end def map result = LinkedList.new reverse_each { |element| result = result << yield(element) } result end end
Now, we'll use #reverse_each
in our #map
method, to make sure it's returned in the correct order.
irb> list = LinkedList.new(73, 12, 42) => [73, [12, [42]]] irb> list.map { |element| element * 10 } => [730, [120, [420]]]
It works! Whenever we call our #map
method on a linked list, we'll get a new list back in the same order as the original.
Chaining Enumeration with Enumerators
Through the #each
method implemented on our linked list class and the included DIYEnumerator
, we can now loop both ways and map over linked lists.
irb> list.each { |x| p x } 73 12 42 irb> list.reverse_each { |x| p x } 42 12 73 irb> list.reverse_each.map { |x| x * 10 } => [730, [120, [420]]] => [420, [120, [730]]]
However, what if we need to map over a list in reverse? Since we now reverse the list before mapping over it, it always returns in the same order as the original list. We've already implemented both #reverse_each
and #map
, so we should be able to chain them together to be able to map backwards. Luckily, Ruby's Enumerator
class can help with that.
Last time, we made sure to call Kernel#to_enum
if the LinkedList#each
method was called without a block. This allowed for chaining enumerable methods by returning an Enumerator
object. To find out how the Enumerator
class works, we’ll implement our own version.
class DIYEnumerator include DIYEnumerable def initialize(object, method) @object = object @method = method end def each(&block) @object.send(@method, &block) end end
Like Ruby's Enumerator
, our enumerator class is a wrapper around a method on an object. By bubbling up to the wrapped object, we can chain enumeration methods.
This works because a DIYEnumerator
instance is enumerable itself. It implements #each
by calling the wrapped object, and includes the DIYEnumerable
module so all enumerable methods can be called on it.
We’ll return an instance of our DIYEnumerator
class if no block is passed to the LinkedList#each
method.
class LinkedList # ... def each(&block) if block_given? yield @head @tail.each(&block) if @tail else DIYEnumerator.new(self, :each) end end end
Using our own enumerator, we can now chain enumeration to get the result in the original order without having to pass an empty block to the #reverse_each
method call.
irb> list = LinkedList.new(73, 12, 42) => [73, [12, [42]]] irb> list.map { |element| element * 10 } => [420, [120, [730]]]
Eager and lazy enumeration
This concludes our peek into the implementation of the Enumerable
module and the Enumerator
class for now. We've learned how some of the enumerable methods work and how an enumerator helps chaining enumeration by wrapping an enumerable object.
There are some issues with our approach, though. By its nature, enumeration is eager, meaning it loops over the list as soon as one of the enumerable methods is called on it. While that's fine in most cases, mapping over a list in reverse reverses the list twice, which should be unnecessary.
To lower the number of loops, we could employ Enumerator::Lazy
to delay looping to the last moment, and have duplicate list reversing cancel itself out.
We'll have to save that for a future episode, though. Don't want to miss that, and further expeditions into Ruby's magical inner workings? Subscribe to the Ruby Magic e-mail newsletter, to get new articles delivered to your inbox as soon as they're published.