My reaction when I first came across terms like gauge, and graphs with colors and numbers labeled "mean" and "upper 90," was one of avoidance. I saw them, but I didn't care because I didn't understand them or how they might be useful. Since my job didn't require me to pay attention to them, they remained ignored.
That was about two years ago. As I progressed in my career, I wanted to understand more about our network applications, and that is when I started learning about metrics.
The three stages of my journey to understanding monitoring (so far) are:
- Stage 1: What? (Looks elsewhere)
- Stage 2: Without metrics, we are really flying blind.
- Stage 3: How do we keep from doing metrics wrong?
I am currently in Stage 2 and will share what I have learned so far. I'm moving gradually toward Stage 3, and I will offer some of my resources on that part of the journey at the end of this article.
Let's get started!
Software prerequisites
All the demos discussed in this article are available on my GitHub repo. You will need to have docker-compose installed to play with them.
Why should I monitor?
The top reasons for monitoring are:
- Understanding normal and abnormal system and service behavior
- Doing capacity planning, scaling up or down
- Assisting in performance troubleshooting
- Understanding the effect of software/hardware changes
- Changing system behavior in response to a measurement
- Alerting when a system exhibits unexpected behavior
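To make the last two reasons a little more concrete, here is a minimal sketch in Python of summarizing measurements and deciding whether to alert. Everything in it is an assumption made for illustration: the sample latencies are made up, the names percentile and should_alert are hypothetical helpers, and the nearest-rank percentile is just one common way to compute a 90th-percentile-style summary (a label like "upper 90" on a graph may be calculated differently by your monitoring tool).

```python
import math
import statistics


def percentile(values, pct):
    """Nearest-rank percentile: the smallest sample such that at least
    pct percent of all samples are less than or equal to it."""
    if not values:
        raise ValueError("need at least one sample")
    ordered = sorted(values)
    # Rank is 1-based; clamp to 1 so pct=0 still returns the minimum.
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def should_alert(latencies_ms, threshold_ms):
    """Hypothetical alert rule: fire when the 90th percentile latency
    exceeds the given threshold."""
    return percentile(latencies_ms, 90) > threshold_ms


# Made-up request latencies in milliseconds, with one slow outlier.
samples = [12, 15, 11, 14, 13, 250, 12, 16, 13, 14]

mean_ms = statistics.mean(samples)
p90_ms = percentile(samples, 90)

print("mean:", mean_ms)
print("90th percentile:", p90_ms)
print("alert at 100 ms threshold:", should_alert(samples, 100))
```

In this sample the single slow request pulls the mean well above what most requests experienced, while the 90th percentile stays close to the typical value. That difference is one reason dashboards tend to show both numbers rather than only an average.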
Metrics and metric types
For our purposes, a metric is an observed value of