When your product's requirements change, you often need to change not only the codebase but also the data that already lives in production.
If you're performing the changes locally, the whole process seems fairly simple. You test your new feature against a sparkling clean database, the test suite is green, and the feature looks great. Then you deploy, and everything goes to hell because you forgot that production was in a slightly different state.
In this post, you'll learn how to handle migrations that may involve systems other than the database itself, while keeping the entire process idempotent and backward-compatible.
Prerequisites
The post mostly requires knowledge of how database migrations work in frameworks such as Ruby on Rails and Phoenix. It also helps to have some knowledge of the constraints of deploying updates to an application that already has existing data in use.
If you've ever dealt with a production system, this is probably nothing new to you. Other tools, such as Redis, are mentioned, but you don't need to know anything about them, other than the fact that they exist.
Migrating Feature Flags to Redis
Here's an example of a list of feature flags, or settings, for your application:
Feature | Enabled
2fa_sms | true
notifications_beta | false
pagination | true
You now need to migrate these over to Redis, to take advantage of its Pub/Sub feature and notify all of your services when a flag gets toggled.
You need to somehow get these flags over to the other system, without breaking the behavior in production.
One simple way to do this, without patching your code to temporarily manage two storage systems at once, is to perform the migration automatically right after the deploy. This is typically where database migrations happen. But this is no common database migration: it touches a Redis instance, which has nothing to do with your database, so bundling it into your regular schema migrations is not a good idea. To solve this, you can write a mechanism similar to those migrations, but tailored to your own needs. Imagine the following script:
    defmodule DataMigrator.Migrations.MigrateFlagsToRedis do
      alias App.{Flag, Redis, Repo}

      def run do
        # Start the Redis client on its own line; its result must not be
        # piped into Repo.all/1.
        Redis.start_link()

        Flag
        |> Repo.all()
        |> Enum.each(&migrate_flag/1)

        :ok
      end

      # Copy a single flag from PostgreSQL into Redis.
      defp migrate_flag(%Flag{name: name, enabled: enabled}) do
        Redis.put(name, enabled)
      end
    end
But how do we guarantee that this runs automatically? And even more important, how do we make this idempotent? We certainly don’t want this running again by accident a few days later when the database flags have long become outdated.
Data Migrations
This is where the concept of migrations comes in handy. Ecto migrations keep a table in PostgreSQL tracking which migrations have already been executed. On every run, the system checks which ones are new and runs only those. We can do something similar ourselves.
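The core idea can be sketched without a database at all. The example below is only an illustration, not the implementation built later in this post: a `MapSet` stands in for the tracking table, and the `TrackerSketch` module and its `run_pending/2` function are hypothetical names.

```elixir
defmodule TrackerSketch do
  # Hypothetical, in-memory stand-in for the tracking table:
  # `executed` is a MapSet of versions that have already run.

  # Runs every migration whose version is not yet in `executed`,
  # in ascending version order, and returns the updated set.
  def run_pending(migrations, executed) do
    migrations
    |> Enum.sort_by(fn {version, _fun} -> version end)
    |> Enum.reduce(executed, fn {version, fun}, done ->
      if MapSet.member?(done, version) do
        # Already tracked: skip it, which is what makes reruns safe.
        done
      else
        :ok = fun.()
        MapSet.put(done, version)
      end
    end)
  end
end

migrations = [{202002111319, fn -> :ok end}]

# First run executes the migration and records its version.
executed = TrackerSketch.run_pending(migrations, MapSet.new())

# A second run finds nothing new and returns the same set.
^executed = TrackerSketch.run_pending(migrations, executed)
```

Because the set of executed versions is the only thing consulted, running the task twice has the same effect as running it once.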
First, let’s write a function that looks in ./priv/data_migrations for any new migration files:
    defmodule App.DataMigrator do
      import Ecto.Query, only: [from: 2]

      alias App.Repo
      alias DataMigrator.DataMigration

      def new_migrations do
        migrated_versions = Repo.all(from dm in DataMigration, select: dm.version)

        Application.app_dir(:app, "priv/data_migrations/")
        |> Path.join("*")
        |> Path.wildcard()
        |> Enum.map(&extract_migration_info/1)
        |> Enum.reject(fn
          # reject migrations that have already run
          {version, _name, _file} -> Enum.member?(migrated_versions, version)
          # reject files that don't follow the naming scheme
          _ -> true
        end)
        |> Enum.map(fn {version, _name, file} ->
          # compile the Elixir module defined in the given file
          # (Code.compile_file/1 replaces the deprecated Code.load_file/1)
          [{mod, _bytecode}] = Code.compile_file(file)
          {version, mod}
        end)
      end

      # Splits "202002111319_migrate_flags_to_redis.exs" into
      # {version, name, file}; returns nil when there is no numeric prefix.
      defp extract_migration_info(file) do
        file
        |> Path.basename()
        |> Path.rootname()
        |> Integer.parse()
        |> case do
          {version, "_" <> name} -> {version, name, file}
          _ -> nil
        end
      end
    end
With this snippet, we can write data migrations into ./priv/data_migrations using a name similar to what is done for database migrations, such as 202002111319_migrate_flags_to_redis.exs. The new_migrations/0 function will find those files, filter out the ones that are present in the DataMigration model (meaning migrations that have already run), and return the rest as [{202002111319, DataMigrator.Migrations.MigrateFlagsToRedis}].
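To make the filename convention concrete, here is a small sketch that uses only the standard library. The `FilenameSketch` module and its `parse/1` and `pending/2` functions are hypothetical names introduced for illustration, and a plain list of versions stands in for the DataMigration table.

```elixir
defmodule FilenameSketch do
  # Hypothetical helper: splits "202002111319_migrate_flags_to_redis.exs"
  # into {version, name}. Returns nil for files without a numeric prefix
  # followed by an underscore.
  def parse(path) do
    base = path |> Path.basename() |> Path.rootname()

    case Integer.parse(base) do
      {version, "_" <> name} -> {version, name}
      _ -> nil
    end
  end

  # Keeps only parseable files whose version is not in `migrated_versions`.
  def pending(paths, migrated_versions) do
    paths
    |> Enum.map(&parse/1)
    |> Enum.reject(&is_nil/1)
    |> Enum.reject(fn {version, _name} -> version in migrated_versions end)
  end
end
```

Calling `FilenameSketch.pending/2` with the files found on disk and the versions already recorded yields only the migrations that still need to run; files that don't follow the naming scheme are ignored.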
The Migrator
Next, it’s purely a matter of writing a task that runs all new migrations and stores their timestamp in the DataMigration model:
    defmodule App.DataMigrator do
      # Logger.error/1 is a macro, so Logger must be required first.
      require Logger

      # ...

      def run() do
        Application.load(:app)
        Repo.start_link()

        new_migrations()
        |> Enum.each(&execute_data_migration/1)
      end

      def execute_data_migration({version, mod}) do
        case mod.run() do
          :ok ->
            # migration successful, track it
            Repo.insert!(%DataMigration{
              version: version,
              created_at: DateTime.truncate(DateTime.utc_now(), :second)
            })

            :ok

          {:error, error} ->
            # migration failed, most likely a bug that needs fixing
            Logger.error("Data migration failed with error #{inspect(error)}")
            System.halt(1)
        end
      end
    end
Finally, with a release tool that supports pre-start hooks (the script below follows Distillery's conventions), we can easily set this up to run the next time our application is deployed:
    # rel/start/10_data_migrations.sh
    $RELEASE_ROOT_DIR/bin/app command Elixir.App.DataMigrator run
By doing this, we can immediately change our code to read flags from Redis instead of PostgreSQL, with the guarantee that once the refactor is deployed, all flags are migrated automatically. The application code never has to stay backward-compatible with the old storage.
Conclusion
With these suggestions, you should be able to simplify some otherwise painful migrations within your system, particularly those that involve keeping both an old and a new system living in parallel. Instead, you can just automate the migration to the new one, and have this as a hook that gets triggered automatically once you deploy.