---
title: "Homelab 2: Monitoring"
date: 2023-09-01T21:15:40-07:00
draft: true
summary: "How metrics and logs work for my homelab"
tags:
---
Every good production environment needs robust monitoring in place before it can be considered production, and I feel like my homelab shouldn't be an exception.
With that in mind, I built the same monitoring stack I've used professionally at companies I've worked at, and I'm pretty happy with it.
## stackem
The different components of the system are thus:
- Prometheus
- Grafana
- Prometheus's `node_exporter`
- Consul
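To sketch how these pieces fit together: Consul acts as the service registry, and Prometheus can discover its scrape targets from it instead of hardcoding addresses. Here's a hypothetical `prometheus.yml` fragment showing that wiring — the Consul address and the `node-exporter` service name are assumptions for illustration, not my actual configuration:

```yaml
# prometheus.yml (sketch): discover scrape targets via Consul service discovery.
scrape_configs:
  - job_name: "consul-services"
    consul_sd_configs:
      - server: "127.0.0.1:8500"    # local Consul agent (assumed address)
        services: ["node-exporter"] # hypothetical registered service name
    relabel_configs:
      # Carry the Consul service name through as a label on scraped metrics.
      - source_labels: [__meta_consul_service]
        target_label: service
```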
Here is what I want to collect metrics for, from most critical to least:
- The base host resources. This includes available memory, CPU, disk space, network traffic, etc. Also includes the ZFS pool.
- Nomad.
- The services that run on the host via Nomad.
In essence, monitoring starts with the platform used to deliver my applications and then moves upward to the end services I expose to myself and my friends.
## base system monitoring
Starting from the base, the Prometheus project supports collecting metrics from Linux-based hosts through its `node_exporter` project. It's a Golang binary that exposes an absolute treasure trove of data, including ZFS stats! It covers my first two monitoring needs.
While I could run `node_exporter` via systemd, I instead opted to use the `exec` driver for Nomad. There is a Nix package for installing the exporter, so the job definition just executes the binary itself. Doing it this way means I get visibility into the exporter's process and logs via the Nomad dashboard.
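A minimal sketch of what such a Nomad job could look like, assuming the Nix-installed `node_exporter` binary is already on the host's path — the datacenter name, listen port, and resource numbers here are illustrative, not my actual job file:

```hcl
job "node-exporter" {
  datacenters = ["dc1"]  # assumed datacenter name
  type        = "system" # run one instance on every client node

  group "exporter" {
    task "node_exporter" {
      driver = "exec"

      config {
        # Assumes the binary is resolvable on the host; adjust the path as needed.
        command = "node_exporter"
        args    = ["--web.listen-address=:9100"]
      }

      resources {
        cpu    = 100 # MHz
        memory = 64  # MB
      }
    }
  }
}
```

Using `type = "system"` keeps the exporter running on every client node, which is the usual pattern for host-level collectors.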
## everything else
Nomad and Consul both expose their own metrics, so it's easy enough to add a Prometheus stat