Migrating Production Data in Elixir

When your product's requirements change, you often need to change not only the codebase but also the data that already lives in production.

If you're performing the changes locally, the whole process seems fairly simple. You test your new feature against a sparkling clean database, the test suite is green, and the feature looks great. Then you deploy, and everything goes to hell because you forgot that production was in a slightly different state.

In this post, you'll learn how to handle migrations that may involve systems other than the database itself, while keeping the entire process idempotent and backward-compatible.

Prerequisites

This post assumes you know how database migrations work in frameworks such as Ruby on Rails and Phoenix. It also helps to be familiar with the constraints of deploying updates to an application whose existing data is already in use.

If you've ever dealt with a production system, this is probably nothing new to you. Other tools, such as Redis, are mentioned, but you don't need to know anything about them, other than the fact that they exist.

Migrating Feature Flags to Redis

Here's an example of a list of feature flags, or settings, for your application:

Elixir

Feature            | Enabled
2fa_sms            | true
notifications_beta | false
pagination         | true

You now need to migrate these over to Redis, to take advantage of its Pub/Sub and notify all of your services when a flag gets toggled.

The flags have to reach the other system somehow, without breaking the behavior in production.

One simple way to do this, without patching your code to temporarily manage two storage systems at once, is to perform the migration automatically right after the deploy. This is typically where database migrations happen. But this is no common database migration. It’s not even a good idea to bundle it with them, since it touches a Redis instance, which has nothing to do with your database. To solve this, you can write a mechanism similar to those migrations, but tailored to your own needs. Imagine the following script:

Elixir

defmodule DataMigrator.Migrations.MigrateFlagsToRedis do
  import Ecto.{Query, Changeset}
  alias App.{Flag, Repo, Redis}
 
  def run do
    Redis.start_link()

    # load every flag from the database and copy it over to Redis
    Flag
    |> Repo.all()
    |> Enum.each(&migrate_flag/1)

    :ok
  end
 
  defp migrate_flag(%Flag{name: name, enabled: enabled}) do
    Redis.put(name, enabled)
  end
end

But how do we guarantee that this runs automatically? And even more important, how do we make this idempotent? We certainly don’t want this running again by accident a few days later when the database flags have long become outdated.

Data Migrations

This is where the concept of migrations comes in handy. Ecto migrations keep a table in PostgreSQL tracking which migrations have already been executed. On every run, the system checks which ones are new and runs only those. We can do something similar ourselves.
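
The bookkeeping idea can be sketched without a database. This is a minimal sketch, not Ecto's implementation: `PendingSketch` and `pending/2` are hypothetical names, and a plain list stands in for the tracking table.

```elixir
defmodule PendingSketch do
  # Hypothetical helper: given every known version and the versions already
  # recorded as executed, return the ones that still need to run, oldest first.
  def pending(all_versions, executed_versions) do
    executed = MapSet.new(executed_versions)

    all_versions
    |> Enum.reject(&MapSet.member?(executed, &1))
    |> Enum.sort()
  end
end

# Two versions exist on disk, one of them was already recorded as executed.
IO.inspect(PendingSketch.pending([202002111319, 202001010900], [202001010900]))
# prints [202002111319]
```

Running the same call again after recording 202002111319 as executed returns an empty list, which is what makes the process safe to repeat.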

First, let’s write a function that looks in ./priv/data_migrations for migration files that haven’t run yet:

Elixir

defmodule App.DataMigrator do
  import Ecto.Query, only: [from: 2]

  alias App.Repo
  alias DataMigrator.DataMigration
 
  def new_migrations do
    # versions that were already executed and tracked
    migrated_versions = Repo.all(from dm in DataMigration, select: dm.version)
 
    Application.app_dir(:app, "priv/data_migrations/")
    |> Path.join("*")
    |> Path.wildcard()
    |> Enum.map(&extract_migration_info/1)
    |> Enum.reject(fn
      # reject migrations that have already run
      {version, _file} -> Enum.member?(migrated_versions, version)
      # reject files whose name does not start with a version number
      nil -> true
    end)
    |> Enum.map(fn {version, file} ->
      # compile the file and take the single module it defines
      [{mod, _bytecode}] = Code.compile_file(file)
      {version, mod}
    end)
  end
 
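  # Returns {version, file} for names such as
  # 202002111319_migrate_flags_to_redis.exs, or nil when the file name
  # does not start with an integer.
  defp extract_migration_info(file) do
    file
    |> Path.basename()
    |> Path.rootname()
    |> Integer.parse()
    |> case do
      {version, _rest} ->
        {version, file}
 
      _ ->
        nil
    end
  end
end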

With this snippet, we can write data migrations into ./priv/data_migrations using a name similar to what is done for database migrations, such as 202002111319_migrate_flags_to_redis.exs.

The new_migrations/0 function will find those files, filter out the ones that are already recorded in the DataMigration model (meaning migrations that have already run), and return the rest as [{202002111319, DataMigrator.Migrations.MigrateFlagsToRedis}].
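
To make the naming convention concrete, here is a small sketch of how such a file name splits into a version and a name. `FilenameSketch` and `parse/1` are hypothetical names used only for illustration; unlike the function above, this variant also keeps the name part.

```elixir
defmodule FilenameSketch do
  # Hypothetical helper: split "<version>_<name>.exs" into {version, name}.
  # Returns nil for files that do not start with an integer version.
  def parse(file) do
    base = file |> Path.basename() |> Path.rootname()

    case Integer.parse(base) do
      {version, "_" <> name} -> {version, name}
      _ -> nil
    end
  end
end

IO.inspect(FilenameSketch.parse("priv/data_migrations/202002111319_migrate_flags_to_redis.exs"))
# prints {202002111319, "migrate_flags_to_redis"}

IO.inspect(FilenameSketch.parse("priv/data_migrations/README.md"))
# prints nil
```

Because the version is a timestamp, sorting by it also gives the order in which the migrations were written.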

The Migrator

Next, it’s purely a matter of writing a task that runs all new migrations and records the version of each one in the DataMigration table:

Elixir

defmodule App.DataMigrator do
  require Logger
  ...
 
  def run() do
    Application.load(:app)
    Repo.start_link()
 
    new_migrations()
    |> Enum.map(&execute_data_migration/1)
  end
 
  def execute_data_migration({version, mod}) do
    case mod.run() do
      :ok ->
        # migration successful, track it
        Repo.insert!(%DataMigration{
          version: version,
          created_at: DateTime.truncate(DateTime.utc_now(), :second)
        })
 
        :ok
 
      {:error, error} ->
        # migration failed, most likely a bug that needs fixing
        Logger.error("Data migration failed with error #{inspect(error)}")
        System.halt(1)
    end
  end
end
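
The stop-on-first-failure behavior can be tried out in isolation. The sketch below is an assumption-laden variant, not the migrator above: `RunnerSketch` and `run_all/1` are hypothetical names, migrations are plain functions, and it returns a tuple instead of calling System.halt/1 so that it is safe to run in IEx.

```elixir
defmodule RunnerSketch do
  # Hypothetical runner: each migration is a {version, fun} pair where fun
  # returns :ok or {:error, reason}. Stops at the first failure so that later
  # migrations never run on top of a half-migrated state.
  def run_all(migrations) do
    result =
      Enum.reduce_while(migrations, {:ok, []}, fn {version, fun}, {:ok, done} ->
        case fun.() do
          :ok -> {:cont, {:ok, [version | done]}}
          {:error, reason} -> {:halt, {:error, version, reason, done}}
        end
      end)

    # versions were accumulated in reverse, so restore execution order
    case result do
      {:ok, done} -> {:ok, Enum.reverse(done)}
      {:error, version, reason, done} -> {:error, version, reason, Enum.reverse(done)}
    end
  end
end

migrations = [
  {1, fn -> :ok end},
  {2, fn -> {:error, :redis_unavailable} end},
  {3, fn -> :ok end}
]

IO.inspect(RunnerSketch.run_all(migrations))
# prints {:error, 2, :redis_unavailable, [1]}
```

Only version 1 would be tracked here, so the next deploy retries from version 2 once the bug is fixed.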

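Finally, we can set this up as a pre-start hook in our release, so that it runs the next time our application is deployed. The script below uses the command syntax from Distillery; with Mix releases, the equivalent call would be bin/app eval "App.DataMigrator.run()":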

Elixir

#rel/start/10_data_migrations.sh
 
$RELEASE_ROOT_DIR/bin/app command Elixir.App.DataMigrator run

By doing this, we can immediately change our code to read flags from Redis instead of PostgreSQL, with the guarantee that once our refactor is deployed, all flags are migrated automatically, without us having to maintain backward compatibility in the application code.

Conclusion

With these suggestions, you should be able to simplify some otherwise painful migrations within your system, particularly those that involve keeping both an old and a new system living in parallel. Instead, you can just automate the migration to the new one, and have this as a hook that gets triggered automatically once you deploy.