In this post, we'll do a quick overview of monitoring memory issues in Erlang and Elixir setups. We'll do so by monitoring memory usage at three levels: Host, OS, and within the Erlang VM.
Getting the Data
To keep the article short, we'll assume you are using APM software. As an example, we'll be using AppSignal. No surprises there 😉
At AppSignal, we love enabling developers to deeply zoom into issues in their stack, as well as zoom out for an overview. We also like making that as easy as possible, so we automatically instrument a lot of your setup. We added auto instrumentation for Ecto to the AppSignal for Elixir 2.0 release.
Today, we focus on memory data, which is automatically gathered and visualized within AppSignal.
Level 1: Host Level Memory Usage
One way of zooming out to see the forest rather than the trees is to start at the Host level. If there are performance issues, let's first rule out that they are caused by noisy neighbors affecting your host. Under Inspect - Host metrics, you'll find a dashboard with Load Average, CPU, and memory usage graphs. It covers the host as a whole, including the Erlang setups running on it.
Level 2: OS Level Memory Usage
Every Elixir (or Ruby, or Node.js) setup on AppSignal includes an automatically generated dashboard showing RSS memory usage. For Erlang setups, this allows you to see the overall memory usage of each Erlang process that you have running. If you run more than one installation or OS on the same host, this lets you zoom into the processes within each instance.
We graph both the average and the 95th percentile memory usage. This specifically helps you spot situations where there are multiple processes running at OS level, but only a small percentage of them have issues. When the 95th percentile graph sits far above the average graph, a few processes are using much more memory than the rest, and that gap points you toward the perpetrator.
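To make the gap between the two graphs concrete, here is a minimal sketch of the arithmetic. The `MemoryStats` module and the RSS samples are made up for illustration, and the nearest-rank percentile method is an assumption on our part, not necessarily how the dashboard computes its numbers.

```elixir
defmodule MemoryStats do
  # Arithmetic mean of a non-empty list of numbers.
  def mean(values), do: Enum.sum(values) / length(values)

  # Nearest-rank percentile: the smallest sample such that at least
  # `p` percent of all samples are less than or equal to it.
  def percentile(values, p) when p > 0 and p <= 100 do
    sorted = Enum.sort(values)
    rank = ceil(p * length(sorted) / 100)
    Enum.at(sorted, rank - 1)
  end
end

# Hypothetical RSS samples in MB: 18 well-behaved OS processes and 2 outliers.
rss_mb = List.duplicate(120, 18) ++ [900, 900]

IO.inspect(MemoryStats.mean(rss_mb), label: "average")        # 198.0
IO.inspect(MemoryStats.percentile(rss_mb, 95), label: "p95")  # 900
```

The average barely moves because most processes are fine, while the 95th percentile jumps to the outlier's value. That difference is the signal to go looking for a misbehaving process.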
If you are also running Ruby apps, you can, for instance, see the split between Puma and Sidekiq processes at this level.
Level 3: Erlang Memory Usage
The third and final level lets you zoom in the deepest. For every Elixir setup with AppSignal, we automatically create an Erlang VM magic dashboard that pulls in IO, number of schedulers, number of processes, and memory usage.
The memory usage graph plots the memory usage of processes, system, binary, ETS, and code.
This can be especially useful for you to discover situations where a GenServer is hogging memory, for example, when a process holds a lot of data in its state that's kept in memory.
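If you'd like to poke at the same numbers from an IEx session, the sketch below reads the memory categories from the graph with `:erlang.memory/1` and lists the processes that use the most memory with `Process.info/2`. The `VmMemory` module and its function names are hypothetical and only here for illustration; this is not how the AppSignal integration collects its data.

```elixir
defmodule VmMemory do
  @categories [:processes, :system, :binary, :ets, :code]

  # Bytes currently allocated per memory category, as reported by the VM.
  def breakdown do
    Map.new(@categories, fn category -> {category, :erlang.memory(category)} end)
  end

  # The `limit` processes using the most memory, largest first.
  # Each entry is {pid, registered_name, bytes}; the name is [] when
  # the process is not registered.
  def top_processes(limit \\ 5) do
    Process.list()
    |> Enum.map(fn pid -> {pid, Process.info(pid, [:memory, :registered_name])} end)
    # Process.info/2 returns nil for a process that exited in the meantime.
    |> Enum.reject(fn {_pid, info} -> is_nil(info) end)
    |> Enum.map(fn {pid, info} -> {pid, info[:registered_name], info[:memory]} end)
    |> Enum.sort_by(fn {_pid, _name, memory} -> memory end, :desc)
    |> Enum.take(limit)
  end
end

IO.inspect(VmMemory.breakdown(), label: "memory by category (bytes)")
IO.inspect(VmMemory.top_processes(3), label: "top processes")
```

A GenServer that keeps growing its state tends to show up at the top of this list, alongside a rising processes line in the graph.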
Triggers, Alerts, and What Happened Here
When you see a peak in any of these dashboards, you can hover over the peak and click on 'what happened here'. This brings you to an overlay, where you can see what errors and performance incidents happened in the particular timeframe of the peak in that graph.
If you spot a peak that you'd like to be warned about next time, you can add triggers to any graph on any AppSignal dashboard. You can also set a warm-up and cool-down time, which lets you manage the signal-to-noise ratio to your liking.
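To illustrate why warm-up and cool-down cut down on noise, here is a rough sketch of the idea. It assumes that a warm-up means the value has to stay above the threshold for a while before an alert opens, and that a cool-down means it has to stay below for a while before the alert closes. The `TriggerSketch` module is hypothetical, counts samples instead of minutes, and is not AppSignal's actual implementation.

```elixir
defmodule TriggerSketch do
  # Returns the alert status after each sample.
  # `warm_up` and `cool_down` are counted in consecutive samples.
  def run(samples, threshold, warm_up, cool_down) do
    {_state, statuses} =
      Enum.reduce(samples, {{:ok, 0}, []}, fn value, {{status, streak}, acc} ->
        next = step(status, streak, value > threshold, warm_up, cool_down)
        {next, [elem(next, 0) | acc]}
      end)

    Enum.reverse(statuses)
  end

  # `streak` counts consecutive samples that contradict the current status.
  defp step(:ok, streak, true, warm_up, _) when streak + 1 >= warm_up, do: {:alerting, 0}
  defp step(:ok, streak, true, _, _), do: {:ok, streak + 1}
  defp step(:ok, _streak, false, _, _), do: {:ok, 0}
  defp step(:alerting, streak, false, _, cool_down) when streak + 1 >= cool_down, do: {:ok, 0}
  defp step(:alerting, streak, false, _, _), do: {:alerting, streak + 1}
  defp step(:alerting, _streak, true, _, _), do: {:alerting, 0}
end

# Hypothetical memory usage percentages, one sample per minute.
samples = [50, 95, 50, 95, 95, 95, 50, 95, 50, 50]
IO.inspect(TriggerSketch.run(samples, 90, 3, 2))
```

With a warm-up of three samples, the single spike at the second sample never opens an alert. With a cool-down of two samples, the brief dip in the middle of the incident doesn't close it either.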
Sweet Memories (Pun Intended)
With no real-life conferences in sight, we have to live on the sweet memories of the Code Beam conferences where we were able to feed you all some sweet stroopwafels IRL.
If talking about memory and memories has made you interested in AppSignal, just give it a spin with a free 30-day trial. It doesn't have any limits on the number of users or the volume of requests. We have a fresh stash of stroopwafels, so once your trial is set up, just reach out and we'll send you some.