Building Compile-time Tools With Elixir's Compiler Tracing Features

Elixir 1.10 was recently released, and with it came a little-known but very interesting feature: compiler tracing. As the Elixir compiler compiles your code, it can notify a tracer module whenever certain kinds of constructs are compiled, such as imports, macro expansions, and function calls. Knowing what's going on while Elixir compiles our code might seem like a small thing, but it opens the door to customized compile-time tooling for Elixir applications.

In this post, we'll go over several kinds of tooling you might consider building with this new feature, explain why these tools are a great idea, and then show a quick implementation of a basic check one might write to automatically enforce a project-specific rule in an application.

Why You Should Build Your Own Tools

There is already a great deal of excellent tooling in the Elixir community that can help us build better, safer, more consistent applications. But this tooling is very generalized to a lowest common denominator that might apply to any Elixir application, and there are often rules or conventions that a team might want to enforce that are specific to that team or the domain in which they're working. Normally, this is done manually with practices like pull request reviews, pair programming, or manually enforced style guides.

However, there are a great many of these rules that can be easily enforced automatically! Moving this responsibility away from humans and onto computers has many benefits. First and foremost, it frees up time for humans to do what they do best—think about complicated problems. When we're not worrying about formatting and naming rules, we can spend more time thinking about higher-level problems such as design and complex domain problems.

It's also great to get faster feedback when we're working on something. Instead of waiting for someone to review your PR, only to tell you that there's some variable that is named incorrectly according to your team's naming conventions, you can get that feedback quickly with the tools you've built. If you've integrated the tools with whatever editor you're using, the feedback can happen almost instantly.

Lastly, these sorts of tools can help offload a rather difficult interpersonal task from your team members. Having to remind your colleagues to fix things like formatting and styling in a PR review can strain your relationships. Instead, when these reminders come from an automated check, it leads to less frustration with your colleagues and better team morale. This is a frequently under-valued benefit of this kind of automation.

Today, I'm going to show you how to use Elixir 1.10's compiler tracing features to build a simple check that will enforce a rule that all modules that define an Ecto schema must use the application's extension of Ecto.Schema and not Ecto.Schema directly. This is a fairly common pattern in applications that use Ecto. Let's imagine that we've defined a module called MyApp.Schema that looks like this:

# my_app/schema.ex
defmodule MyApp.Schema do
  defmacro __using__(_) do
    quote do
      use Ecto.Schema
      import MyApp.Helpers
      @derive Jason.Encoder
    end
  end
end

Normally, enforcing that all schemas use this macro would need to be done manually. With the tracer we're about to build, anyone who accidentally uses Ecto.Schema directly instead of MyApp.Schema will get a compilation warning.

Implementing a Basic Tracer

A valid tracer for Elixir 1.10's compiler events is any module that implements a trace/2 function. The function receives the event as its first argument and the Macro.Env of the code being compiled as its second, and it should return :ok. Each event the compiler emits has a slightly different shape (which we'll go over later), but for now we're going to focus on the one event we care about for this example: :remote_macro.
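To get a feel for what the compiler emits, here is a minimal sketch of a tracer that only records the kind of each event it receives. The names KindRecorder, :kind_recorder_events, and Shouter are made up for illustration. The sketch also registers the tracer with Code.put_compiler_option/2 and compiles a string with Code.compile_string/1 so that it can run as a standalone script, which is a different setup from the Mix-based one used in the rest of this post.

```elixir
defmodule KindRecorder do
  # Hypothetical tracer: records the kind of every tuple-shaped event.
  @table :kind_recorder_events

  def setup do
    # :public so that any compiling process may write to the table.
    :ets.new(@table, [:named_table, :public, :bag])
    :ok
  end

  # Most events are tuples whose first element names the kind of event.
  def trace(event, env) when is_tuple(event) do
    :ets.insert(@table, {elem(event, 0), env.module})
    :ok
  end

  # Anything else is ignored, but we still return :ok.
  def trace(_event, _env), do: :ok

  def kinds do
    @table |> :ets.tab2list() |> Enum.map(&elem(&1, 0)) |> Enum.uniq() |> Enum.sort()
  end
end

KindRecorder.setup()
Code.put_compiler_option(:tracers, [KindRecorder])

# Shouter is a throwaway module used only to trigger some events.
Code.compile_string("""
defmodule Shouter do
  import String, only: [upcase: 1]
  def shout(text), do: upcase(text) <> "!"
end
""")

# Stop tracing once we are done.
Code.put_compiler_option(:tracers, [])
IO.inspect(KindRecorder.kinds(), label: "event kinds seen")
```

The exact list depends on your Elixir version, but for the sample above it should include :import and :imported_function.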

To get this working quickly, we're going to write a small .exs script to run the check for us (for a number of reasons that we won't go into today). Here's what that script looks like:

# scripts/schema_validator.exs
defmodule SchemaValidator do
  def run do
    # Modules are compiled in separate processes, so the table must be :public.
    :ets.new(:schemas, [:named_table, :public])
    # Forget which tasks have already run, so that compile isn't skipped.
    Mix.Task.clear()
    Mix.Task.run("compile", ["--force", "--tracer", __MODULE__])
  end

  @spec trace(tuple, Macro.Env.t()) :: :ok
  # `use MyApp.Schema` was expanded: remember the module that called it.
  def trace({:remote_macro, _meta, MyApp.Schema, :__using__, 1}, env) do
    :ets.insert(:schemas, {env.module, true})
    :ok
  end

  # `use Ecto.Schema` was expanded: warn unless it came via MyApp.Schema.
  def trace({:remote_macro, meta, Ecto.Schema, :__using__, 1}, env) do
    case :ets.lookup(:schemas, env.module) do
      [] -> IO.warn("#{env.file}:#{meta[:line]} - #{inspect(env.module)} should use `MyApp.Schema`", [])
      _ -> :ok
    end
  end

  # Ignore every other event.
  def trace(_, _), do: :ok
end

SchemaValidator.run()

That's it! With that, we can run mix run scripts/schema_validator.exs and have a very basic version of this check working! So, let's dive in and explain what's going on there, starting with run/0. In that function, we start an ETS table that will hold the global state for our task. It's important to remember that the Elixir compiler runs in parallel, with each module compiled in its own process, so we'll need a place to keep the data we're collecting during compilation, and ETS is as good a place as any—we just need to remember to make it a :public table.

Then we have to run Mix.Task.clear() to clear out mix's record of tasks that have already been run, including compile. To speed things up, mix doesn't run the same task twice in one session. Without the clear, compile would return :noop and not actually compile our code.
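The effect of that record is easy to see in isolation. The following is a small sketch, assuming it is run as a plain script with the elixir executable outside of any Mix project; the hello task is hypothetical and exists only for this demonstration.

```elixir
# Mix keeps its bookkeeping in processes started by the :mix application.
{:ok, _apps} = Application.ensure_all_started(:mix)

defmodule Mix.Tasks.Hello do
  # Hypothetical task, defined only to show how Mix remembers task runs.
  use Mix.Task

  @impl true
  def run(_args), do: :hello
end

# The first run executes the task and returns its result.
first = Mix.Task.run("hello")
# The second run is skipped because Mix remembers the first one.
second = Mix.Task.run("hello")
# After clearing the record, the task runs again.
Mix.Task.clear()
third = Mix.Task.run("hello")

IO.inspect({first, second, third})
```

The three results should be the task's return value, then :noop, then the return value again, which is why the script calls Mix.Task.clear() before asking for another compile.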

Once we've cleared that cache, we run Mix.Task.run("compile", ["--force", "--tracer", __MODULE__]). This compiles our code, forcing a full compilation of all modules in our application, using the current module as the tracer for the compilation process. This is where we're hooking into the trace events emitted by the compiler.

Now, as Elixir is compiling our code, it's going to call trace/2 with the events that take place during compilation. We're looking for two of these events: the expansion of use MyApp.Schema and the expansion of use Ecto.Schema. Because of how macro expansion works, a module that uses MyApp.Schema ends up invoking both __using__/1 macros. In that case, the event for MyApp.Schema.__using__/1 is emitted before the event for Ecto.Schema.__using__/1, and that ordering is what we use for our check.

When we get the message that MyApp.Schema.__using__/1 has been called, we put the module that called it—which we get from env.module—in our ETS table. Then, if we get a message that Ecto.Schema.__using__/1 has been called, we check to see if MyApp.Schema.__using__/1 has been called for the same module. If it has, then we're sure that the module, in that case, is doing the right thing. If it hasn't, then we're sure that the module is using Ecto.Schema directly, and we can emit our warning message letting the developer know what they should be doing instead.

One important note is that we absolutely need the final catch-all clause. Our function is called for every event the compiler emits, and if no clause of trace/2 matches (or the function fails for any other reason), compilation fails at that point. So we simply ignore any event that doesn't match one of the patterns we care about.
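The ordering check and the catch-all clause can be tried out without Ecto or a Mix project. In the sketch below, BaseSchema is a stand-in for Ecto.Schema and AppSchema is a stand-in for MyApp.Schema; those two modules, SchemaCheck, GoodSchema, BadSchema, and the table names are all hypothetical. Instead of warning, this version collects offending modules in a second ETS table so the result is easy to inspect, and it uses Code.put_compiler_option/2 with Code.compile_string/1 rather than the mix compile task.

```elixir
# Stand-in for Ecto.Schema.
defmodule BaseSchema do
  defmacro __using__(_opts) do
    quote do
      def __schema__, do: true
    end
  end
end

# Stand-in for MyApp.Schema: wraps BaseSchema.
defmodule AppSchema do
  defmacro __using__(_opts) do
    quote do
      use BaseSchema
    end
  end
end

defmodule SchemaCheck do
  @wrapped :schema_check_wrapped
  @violations :schema_check_violations

  def setup do
    :ets.new(@wrapped, [:named_table, :public])
    :ets.new(@violations, [:named_table, :public])
    :ok
  end

  def violations do
    @violations |> :ets.tab2list() |> Enum.map(&elem(&1, 0)) |> Enum.sort()
  end

  # The wrapper was expanded: remember the calling module.
  def trace({:remote_macro, _meta, AppSchema, :__using__, 1}, env) do
    :ets.insert(@wrapped, {env.module})
    :ok
  end

  # The base macro was expanded: flag the module unless the wrapper came first.
  def trace({:remote_macro, _meta, BaseSchema, :__using__, 1}, env) do
    if :ets.lookup(@wrapped, env.module) == [] do
      :ets.insert(@violations, {env.module})
    end

    :ok
  end

  # Catch-all: every other event is ignored.
  def trace(_event, _env), do: :ok
end

SchemaCheck.setup()
Code.put_compiler_option(:tracers, [SchemaCheck])

Code.compile_string("""
defmodule GoodSchema do
  use AppSchema
end

defmodule BadSchema do
  use BaseSchema
end
""")

Code.put_compiler_option(:tracers, [])
IO.inspect(SchemaCheck.violations(), label: "modules using BaseSchema directly")
```

Only BadSchema should be reported: GoodSchema triggers both events, but the AppSchema event arrives first and marks the module as fine.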

Events We Can Consume

There is great documentation about the events that the Elixir compiler emits in the documentation of the Code module. The ones I've been most interested in when it comes to potential tooling are {:import, meta, module, opts}, so I can see if we're ever importing a certain module (which can really slow down compilation times), {:compile_env, app, path, return}, to see all the places where we're accessing compile-time application configuration, and {:local_function, meta, name, arity} combined with {:remote_function, meta, module, name, arity}, so that I can find functions that could be safely made private (or even deleted!).
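As a rough starting point for the last idea, here is a sketch of a tracer that collects local and remote function calls. CallCollector, Greeter, and the table name are hypothetical, and the sketch again uses Code.put_compiler_option/2 with Code.compile_string/1 so that it runs as a standalone script. Working out which functions could actually be made private would take further analysis on top of this data.

```elixir
defmodule CallCollector do
  # Hypothetical tracer: collects function calls seen during compilation.
  @table :call_collector_calls

  def setup do
    :ets.new(@table, [:named_table, :public, :bag])
    :ok
  end

  def calls, do: :ets.tab2list(@table)

  # A call to a function in another module, such as String.upcase/1.
  def trace({:remote_function, _meta, module, name, arity}, env) do
    :ets.insert(@table, {:remote, env.module, {module, name, arity}})
    :ok
  end

  # A call to a function defined in the module being compiled.
  # Note that this event carries no module; it is available as env.module.
  def trace({:local_function, _meta, name, arity}, env) do
    :ets.insert(@table, {:local, env.module, {name, arity}})
    :ok
  end

  def trace(_event, _env), do: :ok
end

CallCollector.setup()
Code.put_compiler_option(:tracers, [CallCollector])

Code.compile_string("""
defmodule Greeter do
  def greet(name), do: String.upcase(prefix(name))
  defp prefix(name), do: "hello, " <> name
end
""")

Code.put_compiler_option(:tracers, [])

# Show only the calls made from Greeter.
CallCollector.calls()
|> Enum.filter(fn {_kind, caller, _callee} -> caller == Greeter end)
|> IO.inspect(label: "calls made from Greeter")
```

The collected entries may also include calls generated by the compiler's own macros, so a real tool would likely filter on the modules it cares about.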

These are just some of the ideas out there, and really great tools that take advantage of this new feature are already starting to be released. An example I like to point to is Saša Jurić's boundary package, but I'm sure more are in the works.

Conclusion

Now that we've seen some of the kinds of tooling one can build with compiler tracing, why these tools are a great addition to a project, and how to implement a basic check, it's time to think about what you'd like to automate to make your life easier! What rules or conventions might be easier to enforce in your application with compiler tracing instead of with manual review?
