In the previous part of this series about validating data at the boundary of an Elixir application, we covered a few general programming tactics to try and reject invalid and unexpected data in our software.
Continuing with that subject, we'll now explore how two libraries, namely Ecto and NimbleOptions, can further assist us.
Let's get started!
Using Ecto
As we've seen previously, Elixir provides many native techniques to help us guarantee data quality in our systems.
But it's also possible to leverage Ecto to cast, validate, and prune data even if there's no database interaction.
Schemaless Changesets in Ecto
Schemaless changesets (basically Ecto changesets that aren't tied to a database table) are a convenient way to create data structures. They also prevent bad data from making its way into structs (this can happen when bad data makes it past constructors or is added directly to a struct).
This approach is typically helpful when dealing with untrusted input (e.g., from an API request or as part of a module's public API). We can expose a function that will accept a plain map and, after proper vetting, will yield a struct. Other functions in the same module can then accept such a struct instance (rather than a map) to indicate that the data has been vetted and is safe to consume.
Here's what that approach can look like:
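A minimal sketch of such a module, assuming `ecto` is a project dependency and that an account only has `name` and `suspended` fields (the field names and types are assumptions inferred from the call shown later in this post):

```elixir
defmodule Account do
  import Ecto.Changeset

  defstruct [:name, suspended: false]

  def from_params(params) when is_map(params) do
    # Field names and types are assumptions for this sketch.
    types = %{name: :string, suspended: :boolean}

    {%__MODULE__{}, types}
    # Only the keys listed in `types` are kept; anything else is pruned.
    |> cast(params, Map.keys(types))
    |> validate_required([:name])
    # Sets the :insert action, then returns {:ok, %Account{}}
    # or {:error, %Ecto.Changeset{}}.
    |> apply_action(:insert)
  end
end
```

With this sketch, a call such as `Account.from_params(%{suspended: true})` should come back as an `{:error, changeset}` tuple because the name is missing, while unknown keys in the input map are silently dropped by `cast/3`.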
Note that we're applying an :insert action, so we could even display our function's errors directly in a Phoenix form: those require an :insert or :update action to render possible errors with the form's data.
Applying This To Phoenix Forms
Indeed, Phoenix forms inspect the action to determine whether error hints should be displayed: if no action is set, no errors will be rendered in the form. This is useful if an empty changeset is being used to render a Phoenix form: the changeset is invalid, but we don't want to berate the user with errors when they haven't (yet) made any actual errors. But this also means that if you want the validation errors resulting from from_params to show up in a Phoenix form, you need to set the :action value yourself, which is what we're accomplishing with our call to apply_action.
Of course, if you don't plan on using this with a Phoenix-rendered form, it won't be needed and can be safely skipped.
Creating a new validated account is now a simple matter of calling Account.from_params(%{name: "ACME", suspended: false}).
And since the types are dynamic, they could also be passed in as arguments if your domain requires it.
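As a rough illustration of passing the types in as an argument, here is a sketch that assumes `ecto` is a project dependency. The `DynamicParams` module and its `cast_params/3` function are hypothetical names introduced for this example:

```elixir
defmodule DynamicParams do
  import Ecto.Changeset

  # `types` maps field names to Ecto types, e.g. %{age: :integer}.
  # `required` lists the fields that must be present.
  def cast_params(params, types, required \\ []) when is_map(params) and is_map(types) do
    {%{}, types}
    |> cast(params, Map.keys(types))
    |> validate_required(required)
    |> apply_action(:insert)
  end
end

# String input is cast to the requested type.
{:ok, %{age: 42}} = DynamicParams.cast_params(%{"age" => "42"}, %{age: :integer})
```

Since the data here is a plain map rather than a struct, the result is a plain map too.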
Embedded Schemas in Ecto
Let's say you already have an Account struct that isn't persisted to a database but still want to use Ecto to validate the data. In that case, you can transition to using an embedded schema rather than a basic struct. The trade-off between a schemaless changeset and an embedded schema is that the latter provides a bit more convenience at the expense of flexibility.
Namely, embedded schemas require you to have a struct within which to define the schema. Schemaless changesets don't have that requirement since, as their name implies, they don't need a schema definition.
Also, the datatypes for attributes can be dynamic in the schemaless changeset case (and provided as an argument to a function call, for example). In contrast, embedded schema types cannot vary from their defined value. Here's the above example rewritten to use an embedded schema:
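A sketch of what that could look like, again assuming `ecto` is a project dependency and the same hypothetical `name` and `suspended` fields:

```elixir
defmodule Account do
  use Ecto.Schema
  import Ecto.Changeset

  # No primary key: this struct is never persisted.
  @primary_key false
  embedded_schema do
    field :name, :string
    field :suspended, :boolean, default: false
  end

  def from_params(params) when is_map(params) do
    %__MODULE__{}
    # Types now come from the schema, so only the permitted fields are listed.
    |> cast(params, [:name, :suspended])
    |> validate_required([:name])
    |> apply_action(:insert)
  end
end
```

The calling code stays the same as in the schemaless version; the difference is that the field types now live in the schema definition instead of a map built at runtime.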
If you'd like to dig deeper into how Ecto can also help you with your non-persisted data, I highly recommend reading Ecto's Data mapping and validation guide.
Validating Options
There are many circumstances where data is passed in as keyword lists (e.g., options given to OTP modules such as GenServers). These values need to be validated for conformity, but we also want that validation to be easy to understand and communicate. Enter NimbleOptions!
The NimbleOptions Library for Elixir
NimbleOptions is a great tool for validating options, as it is a lightweight library that verifies keyword lists and returns errors on invalid data. Let's see how it can be used!
Say we want to email suspended Account records. The email template to use will differ depending on whether we're emailing a corporate or a personal account. Additionally, we want to specify which values to pass into the template.
We'll want to adapt the signature: corporate accounts, for example, should have an account manager's name attached, while private accounts can simply have a generic signature. We'll also specify which URL to include as the call to action:
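A first attempt could resemble the sketch below. The `SuspensionEmail` module, the `build/2` function, the template file names, and the default values are all hypothetical and only serve to illustrate the option handling:

```elixir
defmodule SuspensionEmail do
  # Hypothetical module: builds the email data rather than sending anything.
  def build(account, opts) do
    # Raises KeyError when :template is missing.
    template = Keyword.fetch!(opts, :template)
    values = Keyword.get(opts, :values, [])

    %{
      to: account.name,
      template: template_file(template),
      signature: Keyword.get(values, :signature, "Customer Service"),
      landing_url:
        Keyword.get(values, :landing_url, URI.parse("https://example.com/reactivate"))
    }
  end

  # Any other template name (a typo, for instance) raises FunctionClauseError.
  defp template_file(:corporate), do: "suspended_corporate.html"
  defp template_file(:personal), do: "suspended_personal.html"
end
```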
This is a great start: we're enforcing the :template to be provided and the program will crash if the template isn't given. This keeps out bad data, but isn't a great experience for callers: if they make a mistake (even as simple as a typo on the template name!) everything is going to crash instead of just generating an error they can handle.
We could, of course, do something like this:
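One way to hand-roll that validation, sketched with a hypothetical `ManualOptions` module and made-up error messages:

```elixir
defmodule ManualOptions do
  @templates [:corporate, :personal]

  # Returns {:ok, validated_opts} or {:error, message} instead of raising.
  def validate(opts) do
    with {:ok, template} <- fetch_template(opts),
         {:ok, signature} <- fetch_signature(opts) do
      {:ok, [template: template, signature: signature]}
    end
  end

  defp fetch_template(opts) do
    case Keyword.fetch(opts, :template) do
      {:ok, template} when template in @templates -> {:ok, template}
      {:ok, other} -> {:error, "invalid template: #{inspect(other)}"}
      :error -> {:error, "the :template option is required"}
    end
  end

  defp fetch_signature(opts) do
    case Keyword.get(opts, :signature, "Customer Service") do
      signature when is_binary(signature) -> {:ok, signature}
      other -> {:error, "invalid signature: #{inspect(other)}"}
    end
  end
end
```

Each new option needs another helper along these lines, and the permitted keys, types, and defaults end up scattered across several function bodies.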
But that doesn't scale very well once the number of options grows: it gets pretty difficult to see what options are permitted, what their expected type is, and so on. Not to mention, this is going to get repetitive really fast.
Bringing In NimbleOptions
Let's try out NimbleOptions. We need to specify a schema that the options are expected to conform to:
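A possible schema for the options discussed so far, assuming `nimble_options` (1.0 or later, for the `{:struct, URI}` type) is a project dependency. The `SuspensionOptions` module name, the default signature, and the default URL are placeholders chosen for this sketch:

```elixir
defmodule SuspensionOptions do
  @schema [
    template: [type: {:in, [:corporate, :personal]}, required: true],
    values: [
      type: :keyword_list,
      # An empty default lets the nested defaults below be filled in.
      default: [],
      keys: [
        signature: [type: :string, default: "Customer Service"],
        landing_url: [
          type: {:struct, URI},
          default: URI.parse("https://example.com/reactivate")
        ]
      ]
    ]
  ]

  # Returns {:ok, opts_with_defaults} or {:error, %NimbleOptions.ValidationError{}}.
  def validate(opts), do: NimbleOptions.validate(opts, @schema)
end
```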
Do note we've specified default: [] in the :values configuration: that way, an empty list will be used as the default, enabling the cascading of default child values such as landing_url to be filled in. Without it, a default landing_url won't be filled in if a :values keyword isn't present at all.
Here it is in action:
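The following sketch shows the defaults being applied. It assumes `nimble_options` is a project dependency and uses a trimmed-down schema with a placeholder default signature:

```elixir
schema = [
  template: [type: {:in, [:corporate, :personal]}, required: true],
  values: [
    type: :keyword_list,
    default: [],
    keys: [signature: [type: :string, default: "Customer Service"]]
  ]
]

# No :values given: the empty default is used and the nested default is filled in.
{:ok, opts} = NimbleOptions.validate([template: :corporate], schema)
"Customer Service" = opts[:values][:signature]

# A provided value is kept as-is.
{:ok, opts} =
  NimbleOptions.validate([template: :corporate, values: [signature: "Jane Doe"]], schema)

"Jane Doe" = opts[:values][:signature]
```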
As we can see, default values are being filled in when required, and they won't overwrite any provided values.
Validation also works out of the box:
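For instance, with a schema reduced to the `:template` option (a sketch assuming `nimble_options` is a project dependency):

```elixir
schema = [template: [type: {:in, [:corporate, :personal]}, required: true]]

# A typo in the template name yields an error tuple rather than a crash.
{:error, %NimbleOptions.ValidationError{} = error} =
  NimbleOptions.validate([template: :corporat], schema)

# The error is an exception struct, so a readable message can be extracted.
message = Exception.message(error)

# Leaving out the required option is reported the same way.
{:error, %NimbleOptions.ValidationError{}} = NimbleOptions.validate([], schema)
```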
It will also verify that we're not passing in unexpected option keys, such as typos:
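A short sketch of that behavior, assuming `nimble_options` is a project dependency; the misspelled `:vlaues` key is deliberate:

```elixir
schema = [
  template: [type: {:in, [:corporate, :personal]}, required: true],
  values: [type: :keyword_list, default: []]
]

# :vlaues is not part of the schema, so validation fails.
{:error, %NimbleOptions.ValidationError{} = error} =
  NimbleOptions.validate([template: :personal, vlaues: []], schema)

message = Exception.message(error)
```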
Allowing Strings for landing_url
While having the landing_url as a URI ensures it at least looks right, having to provide it as a URI struct is a bit of a pain. It would be nice to provide the option as a string, but here's how that currently behaves:
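Assuming the option is typed as `{:struct, URI}` (available in NimbleOptions 1.0 and later), a sketch of the current behavior would be:

```elixir
schema = [landing_url: [type: {:struct, URI}]]

# A %URI{} struct passes validation.
{:ok, _opts} =
  NimbleOptions.validate([landing_url: URI.parse("https://example.com/reactivate")], schema)

# A plain string does not.
{:error, %NimbleOptions.ValidationError{}} =
  NimbleOptions.validate([landing_url: "https://example.com/reactivate"], schema)
```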
Luckily, there's an easy fix — we just have to tweak the type definition:
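One way to express that tweak is NimbleOptions' `{:or, subtypes}` type, sketched here under the assumption that the option was previously typed as `{:struct, URI}`:

```elixir
schema = [
  landing_url: [
    # Accept either a plain string or a %URI{} struct.
    type: {:or, [:string, {:struct, URI}]}
  ]
]
```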
Strings are now accepted for the landing_url value:
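A quick check, assuming `nimble_options` is a project dependency and a schema that uses the `{:or, ...}` type for this option:

```elixir
schema = [landing_url: [type: {:or, [:string, {:struct, URI}]}]]

{:ok, opts} = NimbleOptions.validate([landing_url: "https://example.com/reactivate"], schema)
# The string is passed through unchanged; it is not converted to a URI.
true = is_binary(opts[:landing_url])

{:ok, opts} =
  NimbleOptions.validate([landing_url: URI.parse("https://example.com/reactivate")], schema)

%URI{} = opts[:landing_url]
```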
While a step in the right direction, there are now two minor issues:
- Our code has to handle both string and URI data types (or we need to manually convert one to another after validation).
- Leaving the value as a string doesn't indicate anything. In particular, there's no indication that the value is a valid URI "safe" to process.
Elixir's NimbleOptions To the Rescue!
Once again, NimbleOptions comes to our rescue: we can use a :custom data type that specifies a parser function. In our case, that parser function can simply be URI.new/1, as it already conforms to the expectation set by :custom (namely, returning {:ok, value} or {:error, message}):
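A sketch of such a definition, assuming `nimble_options` is a project dependency and Elixir 1.13 or later (where `URI.new/1` was introduced):

```elixir
schema = [
  landing_url: [
    # NimbleOptions calls URI.new(value) and keeps the parsed %URI{} on success.
    type: {:custom, URI, :new, []}
  ]
]
```

The `{:custom, module, function, args}` tuple tells NimbleOptions to call the function with the option value prepended to `args`, which is why the argument list is empty here.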
Check it out:
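The sketch below exercises a `:custom` type backed by `URI.new/1`, assuming `nimble_options` is a project dependency and Elixir 1.13 or later:

```elixir
schema = [landing_url: [type: {:custom, URI, :new, []}]]

# A string is parsed into a %URI{} during validation.
{:ok, opts} = NimbleOptions.validate([landing_url: "https://example.com/reactivate"], schema)
%URI{host: "example.com", path: "/reactivate"} = opts[:landing_url]

# URI.new/1 also accepts an existing %URI{} struct.
{:ok, opts} =
  NimbleOptions.validate([landing_url: URI.parse("https://example.com/reactivate")], schema)

%URI{host: "example.com"} = opts[:landing_url]

# A string that URI.new/1 rejects becomes a validation error.
{:error, %NimbleOptions.ValidationError{}} =
  NimbleOptions.validate([landing_url: "/invalid_greater_than_in_path/>"], schema)
```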
We've now got the best of both worlds: convenient argument types and a standardized representation post-validation!
Wrapping Up
I hope you've enjoyed these two blog posts covering how you can keep bad data out of your Elixir applications while also making your code more expressive.
As we've seen, both "plain vanilla" Elixir techniques and libraries like Ecto and NimbleOptions can help us keep invalid data from being processed in our code.
By introducing these approaches to our modules, we'll provide more immediate feedback to callers and also make our programs more resilient when faced with unexpected data.
Hopefully, you've now got a few more tools in your belt to try out. Enjoy!