Prometheus Concepts

What Is Prometheus?

Prometheus is an open-source systems monitoring and alerting toolkit. It collects numeric metrics by scraping HTTP endpoints on a schedule (a pull model), stores them as time series, and provides a query language for analyzing them.

Prometheus works well when you need to understand how a system behaves over time, especially in infrastructure, Kubernetes, and microservice environments.

Core Architecture

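Instrumented applications expose their own metrics over HTTP; exporters do the same on behalf of hosts and third-party systems that cannot be instrumented directly.
The Prometheus server scrapes those metrics on a configured interval.
Samples are stored in a local time-series database on the server.
The Prometheus server evaluates alerting rules; Alertmanager deduplicates, groups, and routes the resulting alerts to notification channels.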
Grafana is commonly used for visualization.
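
To make the scrape step concrete, here is a minimal sketch of the text format that a scrape target returns in its HTTP response. The helper names (render_metric, escape_label_value) and the sample values are hypothetical, and the sketch covers only the basic HELP, TYPE, and sample lines; real services would normally use an official client library rather than hand-rolling this.

```python
def escape_label_value(value: str) -> str:
    """Escape backslash, double quote, and newline inside a label value."""
    return value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def render_metric(name, help_text, metric_type, samples):
    """Render one metric family in the Prometheus text exposition format.

    samples is a list of (labels_dict, value) pairs (hypothetical shape).
    """
    lines = [f"# HELP {name} {help_text}", f"# TYPE {name} {metric_type}"]
    for labels, value in samples:
        if labels:
            pairs = ",".join(
                f'{key}="{escape_label_value(val)}"'
                for key, val in sorted(labels.items())
            )
            lines.append(f"{name}{{{pairs}}} {value}")
        else:
            lines.append(f"{name} {value}")
    return "\n".join(lines) + "\n"


# Illustrative values only; a real target would serve this body over HTTP.
page = render_metric(
    "http_requests_total",
    "Total HTTP requests handled.",
    "counter",
    [
        ({"method": "get", "code": "200"}, 1027),
        ({"method": "post", "code": "500"}, 3),
    ],
)
print(page)
```

Each line after the two comment lines becomes one sample in one time series when the server scrapes the target.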

Data Model

Prometheus stores data as time series.

A metric name identifies the measurement.
Labels add dimensions such as job, instance, service, or namespace.
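Each sample is a 64-bit floating-point value with a millisecond-precision timestamp.

That model is powerful, but every unique combination of metric name and label values is stored as a separate time series, so memory and query cost grow quickly when a label can take many distinct values.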
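
A rough way to reason about this cost is to multiply the number of distinct values each label can take. The sketch below does that for a hypothetical metric; the label names and counts are made up for illustration, and the result is an upper bound, since not every combination necessarily occurs in practice.

```python
from math import prod


def max_series(distinct_values_per_label: dict[str, int]) -> int:
    """Upper bound on time series for one metric name:
    the product of the distinct-value counts of its labels."""
    return prod(distinct_values_per_label.values())


# Hypothetical label sets for a single metric.
bounded = {"job": 1, "instance": 20, "method": 5, "code": 8}
with_user_id = {**bounded, "user_id": 100_000}

print(max_series(bounded))       # 800
print(max_series(with_user_id))  # 80000000
```

Adding one unbounded label turns a few hundred series into tens of millions, which is why identifiers such as user IDs belong in logs or traces rather than in labels.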

Metric Types

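Counter: a cumulative value that only increases (or resets to zero on restart), such as request totals.
Gauge: a value that can go up or down, such as memory usage.
Histogram: counts observations into configurable buckets and tracks their sum; used for latency and other distributions, and can be aggregated across instances.
Summary: computes quantiles in the client process; cheaper to query, but its quantiles cannot be meaningfully aggregated across instances.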
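
The semantics of a counter and a classic histogram can be sketched in a few lines. This is a toy model, not a client library's API: the class names and the bucket bounds are assumptions chosen for illustration. The point it shows is that exposed histogram buckets are cumulative, so each observation counts toward every bucket whose upper bound (the le label) is at least the observed value.

```python
import math


class Counter:
    """Toy counter: may only go up."""

    def __init__(self):
        self.value = 0.0

    def inc(self, amount: float = 1.0) -> None:
        if amount < 0:
            raise ValueError("counters can only increase")
        self.value += amount


class Histogram:
    """Toy histogram with cumulative buckets, as exposed to Prometheus."""

    def __init__(self, upper_bounds):
        # A +Inf bucket always exists and ends up equal to the total count.
        self.upper_bounds = sorted(upper_bounds) + [math.inf]
        self.bucket_counts = [0] * len(self.upper_bounds)
        self.sum = 0.0
        self.count = 0

    def observe(self, value: float) -> None:
        self.sum += value
        self.count += 1
        for i, bound in enumerate(self.upper_bounds):
            if value <= bound:
                self.bucket_counts[i] += 1


requests = Counter()
latency = Histogram([0.1, 0.5, 1.0])  # hypothetical bounds, in seconds
for seconds in (0.05, 0.3, 0.3, 0.7, 2.0):
    requests.inc()
    latency.observe(seconds)

print(requests.value)  # 5.0
print(list(zip(latency.upper_bounds, latency.bucket_counts)))
# [(0.1, 1), (0.5, 3), (1.0, 4), (inf, 5)]
```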

Main Components

Prometheus server
Client libraries
Exporters
Pushgateway for short-lived jobs
Alertmanager
Service discovery integrations

PromQL

PromQL is the query language used to analyze metrics.

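Examples:

Per-second request rate, averaged over the last 5 minutes:
rate(http_requests_total[5m])

CPU usage rate summed per job:
sum by (job) (rate(container_cpu_usage_seconds_total[5m]))

Estimated 95th-percentile request latency from histogram buckets:
histogram_quantile(0.95, sum by (le) (rate(http_request_duration_seconds_bucket[5m])))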
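
The histogram_quantile call estimates a quantile by finding the bucket that contains the target rank and interpolating linearly inside it. The following is a simplified sketch of that idea for classic histograms, under stated assumptions: the input shape is hypothetical, q must be in (0, 1], and several of Prometheus's edge cases are left out. In the PromQL query the bucket values are per-second rates rather than raw counts, but the interpolation works the same way.

```python
import math


def histogram_quantile(q, buckets):
    """Estimate the q-quantile from cumulative buckets.

    buckets: list of (upper_bound, cumulative_count), sorted by upper_bound,
    with the last upper_bound equal to +Inf (hypothetical input shape).
    """
    if not 0 < q <= 1:
        raise ValueError("this sketch only handles 0 < q <= 1")
    if len(buckets) < 2 or buckets[-1][0] != math.inf:
        return math.nan
    total = buckets[-1][1]
    if total == 0:
        return math.nan

    rank = q * total
    # First bucket whose cumulative count reaches the target rank.
    index = next(i for i, (_, count) in enumerate(buckets) if count >= rank)

    if index == len(buckets) - 1:
        # Rank falls in the +Inf bucket: report the highest finite bound.
        return buckets[-2][0]

    upper, count = buckets[index]
    lower = buckets[index - 1][0] if index > 0 else 0.0
    below = buckets[index - 1][1] if index > 0 else 0.0
    # Assume observations are spread evenly inside the bucket.
    return lower + (upper - lower) * (rank - below) / (count - below)


example = [(0.1, 1), (0.5, 3), (1.0, 4), (math.inf, 5)]
print(histogram_quantile(0.5, example))
print(histogram_quantile(0.95, example))  # 1.0
```

Because the result depends on where the bucket boundaries sit, the estimate is only as precise as the buckets around the quantile of interest.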

What Prometheus Is Not

It is not a log store.
It is not a tracing backend.
It is not suited to high-cardinality label values such as user IDs or email addresses.

Practical Notes

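Keep metric names and labels consistent: use base units (seconds, bytes) and a _total suffix for counters.
Prefer alerting on symptoms that matter to users, such as error rate or latency, over alerting on every internal cause.
Use recording rules to precompute expensive queries that are reused often, for example in dashboards and alerts.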