Today we're adding another Magic Dashboard! Here's what we've added, and which metrics are useful to track.
Heroku (host) metrics
Making it magic
While it was possible to set up a dedicated Postgres dashboard from the Heroku Host metrics, it's more convenient if we do it for you. Starting today, we do!
The dashboard will show up as a Magic Dashboard under your dashboards navigation. There's no setup required; if you have enabled Heroku host metrics and use the (paid) Heroku Postgres add-on, it just works.
Going through the metrics
Besides the basic metrics such as table count, database size and active connection count, we also track a number of other metrics that we think are useful. Let's go over them and quickly touch on why you'd want to track these.
Waiting connections: Connections waiting for a lock. A consistently high number of waiting connections could point to overuse of connections, for example because the app isn't using a connection pool.
Index/Table cache hit rate: Ratio of index and table lookups served from the buffer cache. Ideally, this value is always 0.99 or higher. If it drops below 0.99 consistently, you may need to upgrade your database plan or add more RAM. Cache hit rate is a great metric to set an anomaly detection trigger on.
Memory usage: We track both memory used by Postgres and the system itself. Postgres memory includes buffer cache and memory per connection. For multi-tenant plans, system metrics may include other databases and might be misleading.
Load average: Average system load of the Heroku database server. For more information on how to read load averages, see the blog post we wrote a while ago: "Understanding system load and load averages".
I/O read/write operations: Number of read/write operations, in I/O sizes of 16KB blocks. Each Postgres plan has a limit on the IOPS it can perform (see the Heroku docs here), so this is an excellent candidate for an anomaly detection trigger.
For more information about these metrics, see the Heroku documentation page.
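To make the 0.99 cache hit rate guideline concrete, here is a minimal Python sketch that pulls `sample#name=value` pairs out of a Heroku Postgres metrics log line and flags cache hit rates below the threshold. The log line and its values are made up for illustration, and `parse_samples` and `low_cache_hit_rates` are hypothetical helper names; the dashboard does all of this for you, so treat it only as a way to see what the numbers mean.

```python
import re

# Illustrative log line; the values are invented for this example.
LOG_LINE = (
    "source=HEROKU_POSTGRESQL_VIOLET "
    "sample#active-connections=35 sample#waiting-connections=0 "
    "sample#index-cache-hit-rate=0.87263 sample#table-cache-hit-rate=0.99548 "
    "sample#load-avg-1m=0.15"
)

# Matches "sample#<name>=<number>" and captures the name and numeric value.
SAMPLE_PATTERN = re.compile(r"sample#([\w-]+)=([0-9]+(?:\.[0-9]+)?)")


def parse_samples(line):
    """Return a dict mapping metric name to its value as a float."""
    return {name: float(value) for name, value in SAMPLE_PATTERN.findall(line)}


def low_cache_hit_rates(samples, threshold=0.99):
    """Return the sorted names of cache hit rate metrics below the threshold."""
    return sorted(
        name
        for name, value in samples.items()
        if name.endswith("cache-hit-rate") and value < threshold
    )


samples = parse_samples(LOG_LINE)
print(low_cache_hit_rates(samples))
```

In this made-up sample only the index cache hit rate falls below 0.99; the table cache hit rate is above it and is not reported.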
Diving deeper into an issue
You can dive into issues from each graph by clicking a peak and clicking 'what happened here' on the graph legend. This will show a "snapshot" of your application with lists of errors, actions performed in that minute/hour and host metrics.
You can also set up triggers to warn you when a metric goes over (or under) a specific value, with our Anomaly Detection.
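As a rough illustration of what a "consistently below a value" check means, the Python sketch below reports a breach only when a metric stays under a threshold for several consecutive samples. This is not how Anomaly Detection is implemented; `breaches_consistently` and its `min_consecutive` parameter are assumptions made up for this example.

```python
def breaches_consistently(values, threshold, min_consecutive=3):
    """Return True if `values` stays below `threshold` for at least
    `min_consecutive` samples in a row."""
    run = 0  # length of the current streak of below-threshold samples
    for value in values:
        run = run + 1 if value < threshold else 0
        if run >= min_consecutive:
            return True
    return False


# A single dip does not count; a sustained drop does.
print(breaches_consistently([0.995, 0.98, 0.995, 0.98], threshold=0.99))
print(breaches_consistently([0.995, 0.98, 0.97, 0.96], threshold=0.99))
```

Requiring a streak rather than a single sample keeps short-lived dips, such as a cold cache right after a restart, from raising an alert.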
We hope this dashboard makes Heroku's Postgres metrics easier to use. Please let us know which other dashboards you'd like to see added, or which existing ones are most valuable to you.