How did the client get there?
Before diving into the details, let's try to understand how an app might end up in this state. We start with a simple users table. After a few weeks, we need to be able to determine the last sign in time so we add users.last_sign_in_at. Then we need to know the user's name. We add first_name and last_name. Twitter handle? Another column. GitHub profile? Phone number? After a few months the table becomes mind-boggling.
What's wrong with this?
A table that large indicates several problems:
Userhas multiple unrelated responsibilities. This makes it more difficult to understand, change and test.- Exchanging data between the app and the database requires extra bandwidth.
- The app needs more memory to store the bulky model.
The app fetched User on every request for authentication and authorisation purposes but usually used only a handful of columns. Fixing the problem would improve both the design and performance.
Extracting a table
We can solve the problem by extracting seldom-used columns to a new table (or tables). For example, we can extract profile information (first_name, etc.) into profiles with the following steps:
- Create
profileswith columns duplicating profile-related columns inusers. - Add
profile_idtousers. Set it toNULLfor now. - For each row in
users, insert a row toprofilesthat duplicates profile-related columns. - Point
profile_idof the corresponding row inusersto the row inserted in 3. - Do not make
users.profile_idnon-NULL. The app isn't aware of its existence yet so it'd break.
We need to replace references to users.first_name with profiles.first_name and so on. If we're extracting just a few columns with a handful of references then I recommend we do this manually. But as soon as we catch ourselves thinking "Oh, no. This is the worst job ever!" we should look for an alternative.
Don't neglect the problem. A part of the code that everyone avoids will deteriorate further and suffer from even greater inattention. The easiest way to break the vicious circle is to start small.
Read on, if you're curios how my client solved the problem.
Fixing the code one line at a time
The most incremental approach is fixing one reference to the old column at a time. Let's focus on moving first_name from users to profiles.
First, create Profile with:
rails generate model Profile first_name:string
Then add a reference from users to profiles and copy users.first_name to profiles:
class ExtractUsersFirstNameToProfiles < ActiveRecord::Migration # Redefine the models to break dependency on production code. We need # vanilla models without callbacks, etc. Also, removing a model in the future # might break the migration. class User < ActiveRecord::Base; end class Profile < ActiveRecord::Base; end def up add_reference :users, :profile, index: true, unique: true, foreign_key: true User.find_each do |user| profile = Profile.create!(first_name: user.first_name) user.update!(profile_id: profile.id) end change_column_null :users, :profile_id, false end def down remove_reference :users, :profile end end
Because it forces each user to have exactly one profile, a reference from users to profiles is preferable to the opposite reference.
With the database structure in place, we can delegate first_name from User to Profile. My client had several requirements:
- Accessors should use the associated
Profile. They should also log where the deprecated accessor was called from. - Saving
Usershould automatically saveProfilein order to avoid breaking code using the deprecated accessors. User#first_name_changed?and otherActiveModel::Dirtymethods should still work.
This means User should look like this:
class User < ActiveRecord::Base # We need autosave as the client code might be unaware of # Profile#first_name and still reference User#first_name. belongs_to :profile, autosave: true def first_name log_backtrace(:first_name) profile.first_name end def first_name=(new_first_name) log_backtrace(:first_name) # Call super so that User#first_name_changed? and similar still work as # expected. super profile.first_name = new_first_name end private def log_backtrace(name) filtered_backtrace = caller.select do |item| item.start_with?(Rails.root.to_s) end Rails.logger.warn(<<-END) A reference to an obsolete attribute #{name} at: #{filtered_backtrace.join("\n")} END end end
After these changes, the app works the same but may be a bit slower because of the extra references to Profile (if performance becomes an issue just use a tool like AppSignal). The code logs all references to the legacy attributes, even ungreppable ones (e.g. user[attr] = ... or user.send("#{attr}=", ...)) so we'll be able to locate all of them even when grep is unhelpful.
With this infrastructure in place, we can commit to fixing one reference to users.first_name on a regular schedule, e.g. every morning (to start the day with a quick win) or around noon (to work on something easier after a focused morning). This commitment is essential because our goal is to lessen mental barriers to fixing the issue. Leaving the code above in place without taking action will impoverish the app even further.
After removing all deprecated references (and confirming with grep and logs) we can finally drop users.first_name:
class RemoveUsersFirstName < ActiveRecord::Migration def change remove_column :users, :first_name, :string end end
We should also get rid of the code added to User as it's no longer necessary.
Limitations
The method might apply to your case but keep in mind some of its limitations:
- It doesn't handle bulk queries like
User.update_all. - It doesn't handle raw SQL queries.
- It may break monkey-patches (remember that dependencies might introduce them too).
UserandProfilemay be out of sync, ifprofiles.first_nameis updated butusers.first_nameis not.
You might be able to overcome some of them. For example, you might keep the models in sync with a service object or a callback on Profile. Or if you use PostgreSQL you might consider using a materialized view in the interim.
That's it!
The most important lesson of the article is do not avoid code that smells but tackle it head on instead. If the task is overwhelming then work iteratively on a regular schedule. The article presented a method to consider when extracting a table is difficult. If you can't apply it then look for something else. If you have no idea how then just drop me a line. I'll try to help. Don't let your bits rot.
Wondering what you can do next?
Finished this article? Here are a few more things you can do:
- Subscribe to our Ruby Magic newsletter and never miss an article again.
- Start monitoring your Ruby app with AppSignal.
- Share this article on social media
Most popular Ruby articles

What's New in Ruby on Rails 8
Let's explore everything that Rails 8 has to offer.
See more
Measuring the Impact of Feature Flags in Ruby on Rails with AppSignal
We'll set up feature flags in a Solidus storefront using Flipper and AppSignal's custom metrics.
See more
Five Things to Avoid in Ruby
We'll dive into five common Ruby mistakes and see how we can combat them.
See more

Greg Navis
In his consulting business, Greg is exposed to different design decisions, trade-offs, problems, and bugs. In his articles he shares his tips and tricks.
All articles by Greg NavisBecome our next author!
AppSignal monitors your apps
AppSignal provides insights for Ruby, Rails, Elixir, Phoenix, Node.js, Express and many other frameworks and libraries. We are located in beautiful Amsterdam. We love stroopwafels. If you do too, let us know. We might send you some!

