Prometheus Concepts
What Is Prometheus?
Prometheus is an open-source systems monitoring and alerting toolkit. It collects numeric metrics by scraping HTTP endpoints on a schedule, stores them as time series, and lets you query and alert on them.
Prometheus works well when you need to understand how a system behaves over time, especially in infrastructure, Kubernetes, and microservice environments.
Core Architecture
- Exporters expose metrics from hosts, services, and applications.
- The Prometheus server pulls (scrapes) those metrics over HTTP at a configured interval.
- Samples are stored in a time-series database.
- The Prometheus server evaluates alerting rules and sends firing alerts to Alertmanager, which handles grouping, routing, and notification.
- Grafana is commonly used for visualization.
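To make the scrape flow above concrete, here is a minimal sketch of the exporter side using only the Python standard library. It renders samples in the Prometheus text exposition format and can serve them at /metrics. The metric name demo_requests_total, the METRICS dictionary, and the serve() helper are hypothetical names chosen for illustration; a real exporter would normally use an official client library instead.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical in-memory state. A real exporter would read these values
# from the host or application it instruments.
METRICS = {
    ("demo_requests_total", (("method", "GET"),)): 42.0,
    ("demo_requests_total", (("method", "POST"),)): 7.0,
}


def render_metrics(metrics):
    """Render samples as 'name{label="value"} value' lines.

    Assumes label values need no escaping (no quotes, backslashes, newlines).
    """
    lines = []
    for (name, labels), value in sorted(metrics.items()):
        if labels:
            label_text = ",".join(f'{key}="{val}"' for key, val in labels)
            lines.append(f"{name}{{{label_text}}} {value}")
        else:
            lines.append(f"{name} {value}")
    return "\n".join(lines) + "\n"


class MetricsHandler(BaseHTTPRequestHandler):
    """Answers scrapes on /metrics and returns 404 for any other path."""

    def do_GET(self):
        if self.path != "/metrics":
            self.send_error(404)
            return
        body = render_metrics(METRICS).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; version=0.0.4; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port=8000):
    """Block forever serving metrics; the port is an arbitrary choice."""
    HTTPServer(("", port), MetricsHandler).serve_forever()


print(render_metrics(METRICS), end="")
```

Calling serve() would expose the endpoint so that a Prometheus server configured with this host and port as a scrape target could pull from it. The sketch only prints the rendered text so that it terminates.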
Data Model
Prometheus stores data as time series.
- A metric name identifies the measurement.
- Labels add dimensions such as job, instance, service, or namespace.
- Each sample is a value with a timestamp.
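That model is powerful, but every unique combination of label values is stored as a separate time series, so labels with many distinct values quickly multiply memory and storage costs.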
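The cost of overly specific labels can be estimated with simple arithmetic. The sketch below treats a series as a metric name plus its full label set and computes an upper bound on the number of series as the product of distinct values per label. The helper names series_key and max_series, and the label values, are made up for illustration.

```python
from math import prod


def series_key(name, labels):
    """A time series is identified by its metric name plus all its labels."""
    return (name, tuple(sorted(labels.items())))


def max_series(label_values):
    """Upper bound on series count: product of distinct values per label."""
    return prod(len(values) for values in label_values.values())


# Same metric name, different label value: two separate time series.
a = series_key("http_requests_total", {"job": "api", "instance": "10.0.0.1:8080"})
b = series_key("http_requests_total", {"job": "api", "instance": "10.0.0.2:8080"})
print(a != b)  # True

bounded = {
    "job": ["api"],
    "instance": ["a", "b", "c"],
    "status": ["200", "404", "500"],
}
print(max_series(bounded))  # 9

# A hypothetical user_id label with 10,000 values multiplies the bound.
unbounded = dict(bounded, user_id=[str(i) for i in range(10_000)])
print(max_series(unbounded))  # 90000
```

The bound is only reached if every combination actually occurs, but it shows why one unbounded label can dominate the total.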
Metric Types
- Counter: a cumulative value that only increases, or resets to zero when the process restarts, such as request totals.
- Gauge: a value that can go up or down, such as memory usage.
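- Histogram: observations counted into cumulative buckets, used for latency and size distributions. Quantiles are estimated at query time, and buckets can be aggregated across instances.
- Summary: an alternative that computes quantiles in-process on the client. Those precomputed quantiles cannot be meaningfully aggregated across instances.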
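The behavior of the first three types can be modeled in a few lines. This is a simplified sketch for intuition, not the API of any official client library: the classes are illustrative, and the bucket bounds and observed values are invented. The histogram stores cumulative counts, meaning each bucket counts all observations less than or equal to its upper bound.

```python
import bisect


class Counter:
    """Only goes up; negative increments are rejected."""

    def __init__(self):
        self.value = 0.0

    def inc(self, amount=1.0):
        if amount < 0:
            raise ValueError("counters cannot decrease")
        self.value += amount


class Gauge:
    """Can be set to any value, higher or lower than before."""

    def __init__(self):
        self.value = 0.0

    def set(self, value):
        self.value = value


class Histogram:
    """Counts observations per bucket and tracks their sum."""

    def __init__(self, bounds):
        self.bounds = sorted(bounds)
        self.counts = [0] * (len(self.bounds) + 1)  # last slot is +Inf
        self.sum = 0.0

    def observe(self, value):
        # Index of the first bound that is >= value ("less than or equal").
        index = bisect.bisect_left(self.bounds, value)
        self.counts[index] += 1
        self.sum += value

    def cumulative(self):
        """Return (upper_bound, cumulative_count) pairs, ending with +Inf."""
        result, running = [], 0
        for bound, count in zip(self.bounds + [float("inf")], self.counts):
            running += count
            result.append((bound, running))
        return result


requests = Counter()
requests.inc()
requests.inc(3)
print(requests.value)  # 4.0

memory = Gauge()
memory.set(512.0)
memory.set(256.0)
print(memory.value)  # 256.0

latency = Histogram([0.1, 0.5, 1.0])
for seconds in (0.05, 0.3, 0.5, 2.0):
    latency.observe(seconds)
print(latency.cumulative())  # [(0.1, 1), (0.5, 3), (1.0, 3), (inf, 4)]
```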
Main Components
- Prometheus server
- Client libraries
- Exporters
- Pushgateway for short-lived jobs
- Alertmanager
- Service discovery integrations
PromQL
PromQL is the query language used to analyze metrics.
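Examples:
- Per-second request rate, averaged over the last 5 minutes:
  rate(http_requests_total[5m])
- CPU usage rate summed per job:
  sum by (job) (rate(container_cpu_usage_seconds_total[5m]))
- Estimated 95th-percentile request latency from histogram buckets:
  histogram_quantile(0.95, sum by (le) (rate(http_request_duration_seconds_bucket[5m])))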
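To give some intuition for what rate() computes over a counter, here is a simplified sketch. It sums the increases between consecutive samples, treats any drop in value as a counter reset, and divides by the elapsed time. This is an approximation under stated assumptions: PromQL's rate() additionally extrapolates toward the edges of the range window, which this version does not attempt. The function name simple_rate and the sample data are hypothetical.

```python
def simple_rate(samples):
    """Per-second increase over a list of (timestamp_seconds, value) samples.

    Handles counter resets, but does not extrapolate to the window
    boundaries the way PromQL's rate() does.
    """
    if len(samples) < 2:
        return None
    increase = 0.0
    previous = samples[0][1]
    for _, value in samples[1:]:
        if value < previous:
            # A drop means the counter restarted from zero, so the whole
            # new value counts as increase.
            increase += value
        else:
            increase += value - previous
        previous = value
    elapsed = samples[-1][0] - samples[0][0]
    return increase / elapsed if elapsed > 0 else None


# Samples every 15 seconds; the counter resets between t=30 and t=45.
samples = [(0, 100), (15, 130), (30, 160), (45, 30), (60, 60)]
print(simple_rate(samples))  # 2.0
```

Reset handling is the reason counters are queried through rate() rather than by subtracting raw values: a restart would otherwise appear as a large negative change.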
What Prometheus Is Not
- It is not a log store.
- It is not a tracing backend.
- It is not ideal for high-cardinality data such as user IDs or email addresses.
Practical Notes
- Keep metric names and labels consistent.
- Prefer alerting on symptoms that matter to users, such as error rate or latency, over internal causes such as high CPU usage.
- Use recording rules to precompute expensive queries that are reused often, such as those behind dashboards and alerts.