Structs and Embedded Schemas in Elixir: Beyond Maps

Nikola Begedin

If you work with Elixir, chances are you've used structs plenty of times and are likely aware of Ecto schemas.

However, you may not have explored structs in depth or used Ecto schemas beyond the database context.

In this post, we'll explore the ins and outs of structs and Ecto schemas.

What is a Struct?

Visually, a struct is the closest thing Elixir has to a class. Functionally, it's a named map with some extra features. You'd typically use it in place of a map when your code might benefit from those extra features.

The defstruct macro lives in Kernel and hands most of its work to Kernel.Utils. The implementation is fairly large, but mostly understandable.

Generally, a struct relies on some built-in Elixir features to enhance a map by:

  • Stopping you from calling defstruct in the same module twice
  • Setting up a layer of checks to ensure you only access and use specific keys in a struct
  • Setting up optional key enforcement

Some of these checks rely on :ets, Erlang's in-memory table system (which warrants its own article). The compiler uses it to keep track of module data while the module is being compiled.

None of that survives to runtime, though. What you eventually end up with is a plain map that has one extra hidden key and a few compile-time checks built around it. These features make structs a bit stricter, better defined, and, in some ways, more powerful than plain maps.

Let's see how structs work in practice.

What Happens Behind the Scenes with Structs in Elixir?

Let's start by defining a struct:

Elixir
defmodule User do
  defstruct [:name, :email, :age]
end

With that small bit of code, we've declared a special kind of map. Where a regular map would look like this:

Elixir
map_user = %{name: "John", email: "john@example.com", age: 30}

A struct is declared like this:

Elixir
user = %User{name: "John", email: "john@example.com", age: 30}

Under the hood, the struct is still a map, but has an extra key:

Elixir
iex> Map.keys(user)
[:name, :__struct__, :email, :age]
iex> user.__struct__
User

The __struct__ field is automatically added and helps Elixir distinguish between regular maps and structs.
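Here's a minimal sketch of what that extra key means in practice. It uses a hypothetical Demo.User module (not part of the examples above) and builds the struct with struct!/2 so it can run as a single script:

```elixir
defmodule Demo.User do
  defstruct [:name, :email, :age]
end

# struct!/2 builds the struct at runtime, which keeps this runnable as one script.
user = struct!(Demo.User, name: "John", email: "john@example.com", age: 30)

# A struct is still a map...
IO.inspect(is_map(user))                      # true

# ...but is_struct/1 and is_struct/2 can tell it apart from a plain map.
IO.inspect(is_struct(user))                   # true
IO.inspect(is_struct(user, Demo.User))        # true
IO.inspect(is_struct(%{name: "John"}))        # false

# Map.from_struct/1 drops the :__struct__ key and returns a plain map.
plain = Map.from_struct(user)
IO.inspect(Map.has_key?(plain, :__struct__))  # false
```

Map.from_struct/1 is handy when you need to hand the data to code that expects a plain map, such as a serializer.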

Structs can have strict or required keys: let's look at those next.

Structs Have Strict Keys

Structs provide compile-time guarantees about which keys are valid.

This works:

Elixir
%User{name: "John", email: "john@example.com"}

But this raises an error:

Elixir
%User{name: "John", foo: "value"}
** (KeyError) key :foo not found
    expanding struct: User.__struct__/1

Accessing a value through a key actually works in the same way as with maps:

Elixir
user.foo
# ** (KeyError) key :foo not found in: %User{name: "John", email: "john@example.com", age: 30}

map_user.foo
# ** (KeyError) key :foo not found in: %{name: "John", age: 30, email: "john@example.com"}

Accessing using [], however, is very different:

Elixir
map_user[:foo]
# nil

user[:foo]
# ** (UndefinedFunctionError) function User.fetch/2 is undefined (User does not implement the Access behavior
# You can use the "struct.field" syntax to access struct fields. You can also use Access.key!/1 to access struct fields dynamically inside get_in/put_in/update_in). Make sure the module name is correct and has been specified in full (or that an alias has been defined)

So bracket access doesn't add extra key enforcement. It fails for every key, valid or not, simply because structs don't implement the Access behaviour by default.
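If you do need dynamic access to struct fields, the Map functions and Access.key!/1 (the helper the error message points to) both work. A small sketch, again with a hypothetical Demo.User module:

```elixir
defmodule Demo.User do
  defstruct [:name, :email, :age]
end

user = struct!(Demo.User, name: "John", email: "john@example.com", age: 30)

# Map functions work on structs, since a struct is a map underneath.
IO.inspect(Map.get(user, :name))   # "John"
IO.inspect(Map.get(user, :foo))    # nil
IO.inspect(Map.fetch(user, :foo))  # :error

# Access.key!/1 allows dynamic access inside get_in/put_in/update_in.
field = :email
IO.inspect(get_in(user, [Access.key!(field)]))  # "john@example.com"

updated = put_in(user, [Access.key!(:age)], 31)
IO.inspect(updated.age)            # 31
```

Note that Map.get/2 quietly returns nil for an unknown key, so it gives up the strictness you get from user.foo.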

Matching now works like this:

Elixir
%{foo: foo} = user
# ** (MatchError) no match of right hand side value:
# %User{name: "John", email: "john@example.com", age: 30}

%{foo: foo} = map_user
# ** (MatchError) no match of right hand side value:
# %{name: "John", age: 30, email: "john@example.com"}

So it works mostly in the same way, but there is a slight difference if you include the struct name in the match (which we will get to).

Structs Can Have Required Keys

You can also enforce required keys. If we define our struct like this:

Elixir
defmodule User do
  @enforce_keys [:email]
  defstruct name: nil, email: nil, age: nil
end

This is possible:

Elixir
user = %User{email: "john@example.com"}
user = %User{email: "john@example.com", age: 25}
user = %User{email: "john@example.com", name: "John"}
user = %User{email: "john@example.com", name: "John", age: 25}

But not giving an email when creating a struct raises an error:

Elixir
user = %User{age: 30}
# ** (ArgumentError) the following keys must also be given when building struct User: [:email]
# expanding struct: User.__struct__/1

It's a bit of extra strictness you can give your struct for various use cases.
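One thing worth knowing is that the enforcement only applies when the struct is built with the literal syntax or with struct!/2. The sketch below uses a hypothetical Demo.Account module to show that struct/2 skips the check:

```elixir
defmodule Demo.Account do
  @enforce_keys [:email]
  defstruct name: nil, email: nil, age: nil
end

# struct/2 skips the enforcement check: the missing :email is left as nil.
lenient = struct(Demo.Account, age: 30)
IO.inspect(lenient.email)  # nil

# struct!/2 applies the same check as the struct literal, but at runtime.
result =
  try do
    struct!(Demo.Account, age: 30)
  rescue
    error in ArgumentError -> {:error, Exception.message(error)}
  end

IO.inspect(result)
```

Also note that @enforce_keys only checks that the key was given. Passing email: nil explicitly still builds a struct.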

Structs Can Have Initial Values

You won't need this everywhere, but default values are useful in some cases (for example, when you're using structs to power configuration options):

Elixir
defmodule User do
  defstruct [name: nil, email: nil, age: 20]
end

%User{}
# %User{name: nil, email: nil, age: 20}

It also acts as the basis for Ecto schemas.

Structs Can Be Pattern-matched in Function Clauses

Structs enable a bit of extra pattern matching that goes beyond what maps offer. Here, we define a few basic function clauses for our user struct:

Elixir
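# The is_integer/1 guards matter: in Erlang term ordering, nil >= 18 is true,
# so without them a user with a nil age would count as an adult.
def process_user(%User{age: age}) when is_integer(age) and age >= 18 do
  "Adult user"
end

def process_user(%User{age: age}) when is_integer(age) and age < 18 do
  "Minor user"
end

# Reached when age is not an integer, for example nil
def process_user(%User{name: name, email: email}) do
  "User #{name} with email #{email}"
end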

The compiler can statically verify that you're matching on valid struct fields, catching typos and invalid keys at compile time.

You can't just call process_user/1 with any map that has :name, :email, or :age keys. It has to be the User struct. Not some random map, not some random other struct, specifically the user struct.

Really, the User part of %User{...} is a form of pattern matching. It means that it's a map, that has the __struct__ key, the value of which is User.
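To illustrate, the two hypothetical functions below behave the same at runtime. One matches on the struct name, the other on the :__struct__ key directly (module names are made up for this sketch):

```elixir
defmodule Demo.User do
  defstruct [:name, :email, :age]
end

defmodule Demo.Matcher do
  # Matching on the struct name...
  def by_name(%Demo.User{}), do: :user
  def by_name(_other), do: :not_a_user

  # ...checks the same thing at runtime as matching on the hidden key.
  def by_key(%{__struct__: Demo.User}), do: :user
  def by_key(_other), do: :not_a_user
end

user = struct!(Demo.User, name: "John")
plain_map = %{name: "John"}

IO.inspect(Demo.Matcher.by_name(user))       # :user
IO.inspect(Demo.Matcher.by_key(user))        # :user
IO.inspect(Demo.Matcher.by_name(plain_map))  # :not_a_user
IO.inspect(Demo.Matcher.by_key(plain_map))   # :not_a_user
```

The struct-name form is still preferable in real code, because it is the one that gets the compile-time key checks.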

That means you can do more with it.

The following clause accepts a %User{} and a %PowerUser{}:

Elixir
def process_user(%m{name: name} = _user) when m in [User, PowerUser] do
  # inspect/1 prints the module as User rather than Elixir.User
  "User #{name} is a #{inspect(m)}"
end

And the following accepts any struct, but not regular maps:

Elixir
def process_user(%_{name: name} = _user) do
  "User #{name} is some kind of struct"
end

Note that matching in function clauses follows the same rules with structs as anywhere else. You can't specify keys that the struct doesn't define.

This wouldn't work, and would raise a CompileError:

Elixir
def process_user(%User{foo: foo}), do: nil

Structs Can Have Dialyzer Types Declared

If your project uses Dialyzer, a common pattern (as suggested in the official documentation) is to declare a type t within the struct module.

In our example, we get User.t():

Elixir
defmodule User do
  defstruct [:name, :email, :age]

  @type t :: %__MODULE__{
          name: String.t() | nil,
          email: String.t() | nil,
          age: non_neg_integer() | nil
        }
end

# Elsewhere, inside another module:
@spec create_user(String.t(), String.t(), non_neg_integer()) :: User.t()
def create_user(name, email, age) do
  %User{name: name, email: email, age: age}
end

Dialyzer can use these type specifications to catch type mismatches and provide better static analysis.

With the advent of actual types in Elixir, this pattern is slowly being phased out. Once we have real type declarations, Dialyzer typespecs will probably be replaced. For now, though, it's a useful pattern to follow.

Structs Are Stricter, More Powerful Maps

For a bit of extra boilerplate, structs give you:

  • More power: Compile-time validation, enforced keys, and better pattern matching.
  • Extra checks: Static analysis of field names and types.
  • Clarity and self-documentation: Clear contracts about data shape.

Now let's move on to look at schemas.

What is an Embedded Schema?

There are two types of Ecto schemas:

  • Regular schemas are backed by a database table or view, and used to map database tables to Elixir code.
  • Embedded schemas are not directly associated with a database table and were originally intended to power jsonb columns in databases using embeds_one, embeds_many, etc.

Very obviously, both were originally intended to power Ecto and work with database tables.

With the split of ecto and ecto_sql, embedded schemas started to get used for much more.

Embedded Schemas Behind the Scenes

An embedded schema is essentially a struct with additional Ecto functionality. Here's how you declare one:

Elixir
defmodule Address do
  use Ecto.Schema

  embedded_schema do
    field :street, :string
    field :city, :string
    field :postal_code, :string
    field :country, :string, default: "US"
  end
end

This creates a struct similar to what we saw before, but with Ecto's schema capabilities layered on top.

What actually happens is not a huge amount, but a bit more than what we get with just structs.

The line use Ecto.Schema first calls the __using__ macro, which registers a bunch of accumulating module attributes that the subsequent steps will use.

A module attribute is a named value you can use to store data within a module. Here's an example:

Elixir
defmodule MyModule do
  @foo "bar"
  def get_foo, do: @foo

  @foo "baz"
  def get_foo_also, do: @foo
end

Here, get_foo() will return "bar", while get_foo_also() returns "baz". However, if you set the attribute to accumulate, using this line in the same module:

Elixir
Module.register_attribute(__MODULE__, :foo, accumulate: true)

Then extra values set to the attribute will no longer replace the old value, but will be added to the front of a list instead. So in the same example, get_foo() will return ["bar"], and get_foo_also() will return ["baz", "bar"].
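Here's the accumulating version as a runnable sketch (AccumulateDemo is a made-up module name). Each function captures the attribute's value at the point where it is read during compilation:

```elixir
defmodule AccumulateDemo do
  # Must come before the first use of @foo.
  Module.register_attribute(__MODULE__, :foo, accumulate: true)

  @foo "bar"
  def get_foo, do: @foo

  @foo "baz"
  def get_foo_also, do: @foo
end

# New values are added to the front of the accumulated list.
IO.inspect(AccumulateDemo.get_foo())       # ["bar"]
IO.inspect(AccumulateDemo.get_foo_also())  # ["baz", "bar"]
```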

Eleven such attributes in total are registered, their intent being to hold the schema definition: fields, associations, primary keys, etc. This is then used by the embedded_schema macro.

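The embedded_schema macro goes through the same internal code path as the schema macro, but with the source set to nil. So conceptually, instead of:

Elixir
embedded_schema do
  # ...
end

You can think of it as:

Elixir
schema nil do
  # ...
end

The two are not literally interchangeable, though. As far as I can tell from the Ecto source, embedded_schema also skips the __meta__ field and defaults the primary key type to :binary_id, while schema expects its source to be a string.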

When defining a schema using either of these macros, the following happens:

  • A bit of code is injected into the module to import the various helpers we need, such as field, embeds_one, etc.
  • The block we pass into the macro contains calls to the above helpers. This code now runs and puts values into the accumulating attributes registered, storing all the schema information we specified for later use.
  • The module now contains all the schema information within its attributes. The Ecto.Schema.__schema__/1 function is called, and the module is passed in as an argument. This gives us a tuple containing all of our struct's fields and something called bags_of_clauses.
  • defstruct is called with the list of struct fields, defining a plain struct.
  • __changeset__ and __schema__ functions are defined on the module. The __changeset__ function is what allows our schema struct to be used with Ecto changesets, and the __schema__ function provides introspection for us as well as for usage with Ecto.Query. The bags_of_clauses value defines a few of the clauses for the __schema__ function.

Our Address embedded schema effectively becomes this struct:

Elixir
# embedded_schema also adds an :id primary key field by default
defstruct [id: nil, street: nil, city: nil, postal_code: nil, country: "US"]

Note that @enforce_keys is not set, but you could still set it if you add it right before the call to embedded_schema.

So, effectively, embedded_schema creates a struct, but with extra features that make it work with Ecto.
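You can poke at both halves of that yourself. The sketch below assumes network access so that Mix.install/1 can fetch Ecto (the version requirement is my assumption), and uses a hypothetical Demo.Address module:

```elixir
# Assumption: network access is available, so Mix.install/1 can fetch Ecto.
Mix.install([{:ecto, "~> 3.10"}])

defmodule Demo.Address do
  use Ecto.Schema

  embedded_schema do
    field :street, :string
    field :city, :string
    field :postal_code, :string
    field :country, :string, default: "US"
  end
end

address = struct!(Demo.Address, city: "Springfield")

# It is a regular struct, with the default value applied...
IO.inspect(is_struct(address, Demo.Address))       # true
IO.inspect(address.country)                        # "US"

# ...plus the introspection functions Ecto generated.
IO.inspect(Demo.Address.__schema__(:fields))       # [:id, :street, :city, :postal_code, :country]
IO.inspect(Demo.Address.__schema__(:type, :city))  # :string
IO.inspect(Demo.Address.__schema__(:source))       # nil
```

The nil source is what marks this schema as having no backing table.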

Changesets and Validation

Embedded schemas shine when combined with changesets:

Elixir
defmodule Address do
  use Ecto.Schema
  # Brings cast/3, validate_required/2, and validate_format/3 into scope
  import Ecto.Changeset

  embedded_schema do
    field :street, :string
    field :city, :string
    field :postal_code, :string
    field :country, :string, default: "US"
  end

  def changeset(address, attrs) do
    address
    |> cast(attrs, [:street, :city, :postal_code, :country])
    |> validate_required([:street, :city, :postal_code])
    |> validate_format(:postal_code, ~r/^\d{5}(-\d{4})?$/)
  end
end

This provides type enforcement, format validation, and required field checks. Your changeset will check that all the required fields have a value, validate the format of the postcode, and will mark itself as invalid if anything is invalid.

This is clearly originally intended for databases, but because the embedded schema doesn't need a backing table, it can be used for much more.
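To make that concrete, here's a sketch of such a changeset running with no Repo or database in sight. It assumes network access for Mix.install/1, and Demo.PostalAddress and new_changeset/1 are hypothetical names for this example:

```elixir
# Assumption: network access is available, so Mix.install/1 can fetch Ecto.
Mix.install([{:ecto, "~> 3.10"}])

defmodule Demo.PostalAddress do
  use Ecto.Schema
  import Ecto.Changeset

  embedded_schema do
    field :street, :string
    field :city, :string
    field :postal_code, :string
  end

  # Hypothetical helper: starts from an empty struct.
  def new_changeset(attrs) do
    %__MODULE__{}
    |> cast(attrs, [:street, :city, :postal_code])
    |> validate_required([:street, :city, :postal_code])
    |> validate_format(:postal_code, ~r/^\d{5}(-\d{4})?$/)
  end
end

# Missing city and a malformed postal code.
invalid = Demo.PostalAddress.new_changeset(%{"street" => "1 Main St", "postal_code" => "abc"})
IO.inspect(invalid.valid?)                # false
IO.inspect(Keyword.keys(invalid.errors))  # contains :city and :postal_code

valid =
  Demo.PostalAddress.new_changeset(%{
    "street" => "1 Main St",
    "city" => "Springfield",
    "postal_code" => "12345"
  })

IO.inspect(valid.valid?)                  # true
```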

API Input Prevalidation and the Command Pattern

Embedded schemas with changesets provide a cheap way to parse and sanitize API input. I usually call these special embedded schemas "commands", in that they command our business logic to do something.

Elixir
defmodule CreateUserCommand do
  use Ecto.Schema
  # Brings cast/3 and the validate_* functions into scope
  import Ecto.Changeset

  alias Ecto.Changeset

  embedded_schema do
    field :name, :string
    field :email, :string
    field :age, :integer
  end

  def changeset(params) do
    %__MODULE__{}
    |> cast(params, [:name, :email, :age])
    |> validate_required([:name, :email])
    |> validate_format(:email, ~r/@/)
    |> validate_number(:age, greater_than: 0)
  end

  def validate(params) do
    case changeset(params) do
      %Changeset{valid?: true} = changeset -> {:ok, Changeset.apply_changes(changeset)}
      %Changeset{valid?: false} = changeset -> {:error, changeset}
    end
  end
end

defmodule UserController do
  # Controller setup (the `use` line and action_fallback) is omitted here.

  def create(conn, %{"user" => user_params}) do
    # The FallbackController handles the error response
    with {:ok, command} <- CreateUserCommand.validate(user_params) do
      UserService.create_user(command)
      send_resp(conn, 200, "")
    end
  end
end

This is something you would generally use for checks and parameter sanitization that don't need access to the database (which tends to be true for most of the checks you would want to do).

For example, when creating a user, you might need to check that the email is unique, that any associated foreign keys exist in the database, or that a number is greater than another number in the database. Those all need a database check.

But there are things you can use an embedded schema for, to prevalidate the insert by just checking and parsing the params, such as:

  • The email being a string and of a valid format
  • The first name being provided
  • Age being a number between 18 and 30

Basically, anything that enforces basic constraints and rules you know the input should obey.

That way, you can run the fast and cheap validation first and reject the request if it doesn't pass, instead of wasting resources by running every validation (cheap and expensive) at once. This enables a fail-fast approach in your API.

The added bonus of this pattern is that if your changeset is valid, you can use Changeset.apply_changes/1 to get the CreateUserCommand struct with all of the data set, ready to be passed into the deeper business logic.

So instead of one big complex thing, you get two simpler things, and the more important, more expensive one can now be reused in other places (like, for example, in your LiveView admin dashboard).

Powering Forms in Live (and Dead) Views

LiveView already promotes using %Form{} structs to power forms. These structs are basically wrappers around changesets. Converting a changeset to a form is as simple as calling Phoenix.Component.to_form(changeset).

So understanding that the schemas powering those changesets don't need to be tied to a table unlocks a lot you could do.

Say you have a LiveView page with a list of users, and a form to add a new user. You can create a changeset for that form as early on as in your mount function.

Elixir
def mount(_params, _session, socket) do
  changeset = CreateUserCommand.changeset(%{})
  {:ok, assign(socket, changeset: changeset)}
end

Then you render the form passing in to_form(@changeset), binding change and submit events.

Elixir
# `as: :user` makes the params arrive under the "user" key
def user_form(assigns) do
  ~H"""
  <.form
    :let={f}
    for={to_form(@changeset, as: :user)}
    phx-change="validate"
    phx-submit="create"
  >
    <.input type="text" field={f[:name]} />
    <.input type="email" field={f[:email]} />
    <.input type="number" field={f[:age]} />
  </.form>
  """
end

This isn't really new. We've used changesets to power forms since before LiveView, but the key point is that the schema powering the changeset doesn't need to be tied to a database.

Elixir
def handle_event("validate", %{"user" => params}, socket) do
  changeset =
    params
    |> CreateUserCommand.changeset()
    # Setting an action makes errors visible
    |> Map.put(:action, :validated)

  {:noreply, assign(socket, changeset: changeset)}
end

def handle_event("create", %{"user" => params}, socket) do
  case CreateUserCommand.validate(params) do
    {:ok, command} ->
      # This is LiveView. We expect this to work, as the form is correctly
      # set up and prevalidation passed. If it fails, let it crash, Elixir style!
      {:ok, _user} = UserService.create_user(command)
      {:noreply, put_flash(socket, :info, "User created!")}

    {:error, changeset} ->
      {:noreply, assign(socket, changeset: Map.put(changeset, :action, :validated))}
  end
end

Here, we've reused our CreateUserCommand, so our API and LiveView share roughly two-thirds of the code.

If the command doesn't suit your form, you can always define a custom embedded schema module right in your LiveView, or even use schemaless changesets.

The same principle applies. The only difference with schemaless changesets is that we need to give them a name when converting them to a form using to_form(changeset, as: :my_name).
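For reference, a schemaless changeset is built from a {data, types} tuple instead of a schema struct. A minimal sketch follows, assuming network access for Mix.install/1. The field names are made up, and the to_form/2 step is left out because it needs Phoenix:

```elixir
# Assumption: network access is available, so Mix.install/1 can fetch Ecto.
Mix.install([{:ecto, "~> 3.10"}])

# Field names and types are declared in a plain map instead of a schema.
types = %{name: :string, age: :integer}

changeset =
  {%{}, types}
  |> Ecto.Changeset.cast(%{"name" => "John", "age" => "42"}, Map.keys(types))
  |> Ecto.Changeset.validate_required([:name])

IO.inspect(changeset.valid?)                         # true

# Casting turned the "42" string into an integer.
IO.inspect(Ecto.Changeset.apply_changes(changeset))  # %{age: 42, name: "John"}
```

Since there is no struct to derive a form name from, this is the case where the as: option becomes mandatory.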

Important: Validation errors aren't visible until the changeset's :action field is something other than nil. When Repo.insert(changeset) or Repo.update(changeset) is called, the invalid changeset returned in {:error, changeset} has its action field set to :insert or :update, respectively.

Changesets based on embedded schemas never go through the Repo, so this never happens for them. You need to set the action manually, using Map.put, for example.
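An alternative to Map.put is Ecto.Changeset.apply_action/2, which sets the action on an invalid changeset and returns the same {:ok, struct} or {:error, changeset} tuples as a hand-written validate function. A sketch with a hypothetical Demo.CreateUserCommand (the @primary_key false line is my own choice, not something the pattern requires), assuming network access for Mix.install/1:

```elixir
# Assumption: network access is available, so Mix.install/1 can fetch Ecto.
Mix.install([{:ecto, "~> 3.10"}])

defmodule Demo.CreateUserCommand do
  use Ecto.Schema
  import Ecto.Changeset

  # Optional: a command has no identity, so skip the default :id field.
  @primary_key false
  embedded_schema do
    field :name, :string
    field :email, :string
  end

  def validate(params) do
    %__MODULE__{}
    |> cast(params, [:name, :email])
    |> validate_required([:name, :email])
    # Returns {:ok, struct} or {:error, changeset} with the action set.
    |> apply_action(:validate)
  end
end

{:error, changeset} = Demo.CreateUserCommand.validate(%{"name" => "John"})
IO.inspect(changeset.action)  # :validate

{:ok, command} =
  Demo.CreateUserCommand.validate(%{"name" => "John", "email" => "john@example.com"})

IO.inspect(command.email)     # "john@example.com"
```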

A Word of Caution About Elixir Packages typed_struct and domo

The typed_struct package eliminates some boilerplate by automatically generating type specifications:

Elixir
defmodule User do
  use TypedStruct

  typedstruct do
    field :name, String.t()
    field :email, String.t()
    field :age, non_neg_integer()
  end
end

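This is effectively the same as declaring a User struct using defstruct and declaring a User.t() type. Going by the typed_struct documentation, keys are only enforced if you opt in with enforce: true (on the typedstruct block or on individual fields). Without it, each field's type is also widened to allow nil.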
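For comparison, here is roughly what you would write by hand to get a struct with all keys enforced and a t() type. This is my approximation of the result, not the library's actual generated code, and Demo.TypedUser is a made-up name:

```elixir
defmodule Demo.TypedUser do
  # Hand-written approximation; the library's real output may differ.
  @enforce_keys [:name, :email, :age]
  defstruct [:name, :email, :age]

  @type t :: %__MODULE__{
          name: String.t(),
          email: String.t(),
          age: non_neg_integer()
        }
end

user = struct!(Demo.TypedUser, name: "John", email: "john@example.com", age: 30)
IO.inspect(user.name)  # "John"
```

It's only a few extra lines, which is worth weighing against adding a dependency.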

However, this package hasn't been receiving updates recently, which is a concern for long-term maintenance. Plus, with the advent of Elixir types, its usefulness is likely to decrease.

The domo package builds on top of typed structs by generating utility functions, such as new and new!. But, in my experience, it also adds a significant compilation time overhead.

At V7, we've been using these extensively, and they've proven useful.

They've also proven to increase our compilation time significantly, and domo seems to regularly cause compilation deadlocks, so we're seriously considering removing them.

Let's finally take a quick look at future Elixir types before wrapping up.

Future Elixir Types

With the growth of the Elixir type system, structs are becoming safer and safer. The compiler is inferring more and catching more bugs.

We're likely approaching a point where it becomes questionable whether we should use Dialyzer, for example.

The way I see it, Dialyzer is as useful as ever, and whatever type system improvements we get in the background are free and do not conflict with it at all. If anything, they might reveal where we can improve our type specs.

Once Elixir type specifications become available, we will probably be replacing type specs with those, and hopefully, that will be an easy migration.

In the meantime, we should lean into things that already benefit from the type system and offer the best of both worlds.

Wrapping Up

In this post, we ran through some use cases for Elixir structs and schemas.

Hopefully, you now have a deeper understanding of structs, even if you were already comfortable using them.

As for embedded schemas, maybe you've found a new use case for them.

Happy coding!

Guest author Nikola Begedin is a full stack engineer who actually enjoys the full stack, with a focus on Elixir and Vue. Loves to run, but has two young kids, so doesn't do it as much as he would love to.
