
Counters and Gauges: Essential Rust Metric Types

Counters and gauges are the two simplest and most essential metric types in OpenTelemetry. A counter is a monotonically increasing number (requests processed, errors logged, bytes transmitted); a gauge is a point-in-time snapshot (current memory usage, connected clients, database connections in pool). Understanding when to use each is fundamental to building effective dashboards.

Both have negligible overhead (nanoseconds per operation) and are the building blocks of observability. If you only emit counters and gauges, you already have enough data to alert on outages, measure throughput, and track resource utilization.

Counters: Monotonic Totals

A counter is a metric that only increases. Once incremented, it never decreases. Counters answer the question: "How much total work has been done?"

Creating and Using Counters

use opentelemetry::global;

let meter = global::meter("my_app");

// Create a counter that tracks HTTP requests
let http_requests = meter.u64_counter("http_requests_total")
    .with_description("Total HTTP requests handled")
    .with_unit("1") // dimensionless
    .init();

// Increment it each request
http_requests.add(1, &[]); // increment by 1, no attributes

The counter is created once (usually in your initialization) and reused throughout the application. Every call to .add() increments the running total.

Counters come in two flavors: u64_counter (unsigned 64-bit integer) and f64_counter (floating-point). Use u64_counter for anything you count in whole units (requests, errors, cache hits, bytes). Use f64_counter for totals that are naturally fractional, such as seconds of CPU time.

Adding Attributes to Counters

Counters become powerful when you split them by attribute. Instead of one http_requests_total, emit separate totals per HTTP method and status code:

use opentelemetry::KeyValue;

let http_requests = meter.u64_counter("http_requests_total")
    .with_description("HTTP requests by method and status")
    .init();

// In your request handler, after getting the response:
let attributes = [
    KeyValue::new("http.method", "GET"),
    KeyValue::new("http.status_code", 200),
];
http_requests.add(1, &attributes);

// For a different request:
let attributes = [
    KeyValue::new("http.method", "POST"),
    KeyValue::new("http.status_code", 500),
];
http_requests.add(1, &attributes);

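When Prometheus scrapes your metrics, it sees series like these (the Prometheus exporter rewrites the dots in attribute names as underscores; the values are illustrative):

http_requests_total{http_method="GET",http_status_code="200"} 1523
http_requests_total{http_method="POST",http_status_code="500"} 42
http_requests_total{http_method="POST",http_status_code="201"} 319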

You can now graph requests per status code, alert on error rate (500s / total), and measure throughput per method.
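To make the "500s / total" calculation concrete, here is a minimal sketch that computes an error ratio from per-attribute counter totals. The `Series` alias and the `error_ratio` function are hypothetical names used only for illustration; they are not part of the OpenTelemetry API, and a real dashboard would normally compute this in the backend from rates rather than raw totals.

```rust
use std::collections::HashMap;

/// One counter split by attributes: (method, status code) -> running total.
/// Hypothetical in-memory stand-in for what the backend stores.
type Series = HashMap<(String, u16), u64>;

/// Fraction of all recorded requests whose status code is in the 5xx range.
/// Returns None when nothing has been recorded, to avoid dividing by zero.
fn error_ratio(series: &Series) -> Option<f64> {
    let total: u64 = series.values().sum();
    if total == 0 {
        return None;
    }
    let errors: u64 = series
        .iter()
        .filter(|((_, status), _)| (500..600).contains(status))
        .map(|(_, count)| *count)
        .sum();
    Some(errors as f64 / total as f64)
}
```

With the three sample series shown earlier (1523, 42, and 319), the ratio is 42 / 1884, or roughly 2.2%.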

Best Practices for Counters

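  1. Use consistent attribute names: The OpenTelemetry semantic conventions define standard names like http.method, http.status_code, db.operation (https://opentelemetry.io/docs/specs/semconv/). Newer versions of the conventions rename the HTTP attributes to http.request.method and http.response.status_code, so check which version your team targets. Follow them so your metrics interoperate with other teams' dashboards.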

  2. Keep cardinality bounded: Every unique combination of attribute values becomes its own time series. If you add attributes with unbounded values (user IDs, message content), the number of series grows without limit and becomes unmanageable. Never use user ID as a label; instead, aggregate (a users_active gauge, a users_created_total counter).

  3. Name counters with _total suffix: By convention, counters end in _total (http_requests_total, errors_total). This signals to operators that the metric is monotonically increasing and suitable for rate calculations.

  4. Account for resets: A counter starts again from zero when the process restarts. When monitoring, use rate() functions (rate(http_requests_total[5m])) to compute requests per second rather than reading raw totals; rate() detects the drop and compensates for it.
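The reset handling behind rate() can be sketched in a few lines. This is a simplified illustration, not Prometheus's actual algorithm (which also extrapolates to the window boundaries); `Sample` and `rate_per_second` are hypothetical names.

```rust
/// A scraped counter sample: (time in seconds, counter value).
type Sample = (f64, u64);

/// Average per-second increase across `samples`, which must be in time order.
/// Any drop in value is treated as a process restart.
fn rate_per_second(samples: &[Sample]) -> Option<f64> {
    let (first, last) = (samples.first()?, samples.last()?);
    let elapsed = last.0 - first.0;
    if elapsed <= 0.0 {
        return None; // fewer than two distinct points in time
    }
    let mut increase = 0u64;
    for pair in samples.windows(2) {
        let (prev, curr) = (pair[0].1, pair[1].1);
        // A lower value means the counter began again from zero,
        // so the whole current value counts as new increase.
        increase += if curr >= prev { curr - prev } else { curr };
    }
    Some(increase as f64 / elapsed)
}
```

Reading the raw total across a restart would show a sudden drop; summing the increases between consecutive samples keeps the computed rate continuous.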

Gauges: Snapshots of Current State

A gauge is a metric that can increase or decrease, representing a current value at a point in time. Gauges answer: "What is the state right now?"

Creating and Using Gauges

use opentelemetry::global;

let meter = global::meter("my_app");

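// Track connected clients with an up-down counter (Prometheus exposes it as a gauge).
// The instrument is signed (i64) so that it can accept negative deltas.
let connected_clients = meter.i64_up_down_counter("clients_connected")
    .with_description("Number of connected WebSocket clients")
    .init();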

// When a client connects:
connected_clients.add(1, &[]);

// When a client disconnects:
connected_clients.add(-1, &[]);

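Up-down counters come in two flavors: i64_up_down_counter (for integers) and f64_up_down_counter (for floats). Strictly speaking, OpenTelemetry treats the up-down counter and the gauge as separate instruments: an up-down counter receives deltas and keeps a running sum, while a gauge receives absolute readings. For a value you track by incrementing and decrementing, such as connected clients, the up-down counter is the right choice, and Prometheus exposes it as a gauge.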
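One way to see how an instrument that takes deltas differs from one that takes absolute readings is a small standard-library model. `UpDownTotal` and `LastValue` are hypothetical types written for this sketch; they only mimic the semantics and are not OpenTelemetry types.

```rust
use std::sync::atomic::{AtomicI64, Ordering};

/// Delta semantics: callers report changes, the instrument keeps the sum.
struct UpDownTotal(AtomicI64);

impl UpDownTotal {
    fn new() -> Self {
        Self(AtomicI64::new(0))
    }
    fn add(&self, delta: i64) {
        self.0.fetch_add(delta, Ordering::Relaxed);
    }
    fn value(&self) -> i64 {
        self.0.load(Ordering::Relaxed)
    }
}

/// Absolute semantics: callers report the current reading, the last one wins.
struct LastValue(AtomicI64);

impl LastValue {
    fn new() -> Self {
        Self(AtomicI64::new(0))
    }
    fn record(&self, reading: i64) {
        self.0.store(reading, Ordering::Relaxed);
    }
    fn value(&self) -> i64 {
        self.0.load(Ordering::Relaxed)
    }
}
```

Two connects and one disconnect leave `UpDownTotal` at 1, whereas recording 5 and then 3 on `LastValue` leaves it at 3. Calling an add-style method with an absolute reading such as a ratio would keep accumulating, which is rarely what is intended.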

Common Gauge Examples

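use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::Arc;

// Assumes an opentelemetry release whose builders offer .with_callback() and .init();
// an observable gauge reports nothing unless a callback supplies its readings.

// Memory usage
let memory_gauge = meter.u64_observable_gauge("memory_usage_bytes")
    .with_description("Process memory usage in bytes")
    .with_callback(|observer| observer.observe(42_000_000, &[])) // placeholder reading
    .init();

// Cache hit ratio (float gauge): report the current ratio instead of adding to a sum
let hits = Arc::new(AtomicU64::new(95));
let lookups = Arc::new(AtomicU64::new(100));
let (h, l) = (hits.clone(), lookups.clone());
let cache_hit_ratio = meter.f64_observable_gauge("cache_hit_ratio")
    .with_description("Proportion of cache hits (0.0 to 1.0)")
    .with_callback(move |observer| {
        let total = l.load(Ordering::Relaxed).max(1); // avoid dividing by zero
        observer.observe(h.load(Ordering::Relaxed) as f64 / total as f64, &[]);
    })
    .init();

// Database connection pool utilization
let active = Arc::new(AtomicU64::new(0)); // updated by the pool elsewhere
let a = active.clone();
let pool_connections = meter.u64_observable_gauge("db_connections_active")
    .with_description("Active database connections in pool")
    .with_callback(move |observer| observer.observe(a.load(Ordering::Relaxed), &[]))
    .init();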

The name pattern for gauges varies: some end in the unit (_bytes, _ratio), others in the entity (_connections, _clients). The key is consistency within your application.

Observable Gauges: Callback-Based Measurement

For metrics that are expensive or complex to calculate, use observable gauges, which lazily compute values when scraped:

// Measure memory lazily (only when metrics are collected, e.g. on a Prometheus scrape).
// The callback is attached while building the instrument; this assumes an
// opentelemetry release whose builders offer .with_callback() and .init().
let memory_gauge = meter.u64_observable_gauge("memory_usage_bytes")
    .with_description("Current process memory")
    .with_callback(|observer| {
        // Runs once per collection cycle
        observer.observe(current_process_memory_bytes(), &[]);
    })
    .init();

fn current_process_memory_bytes() -> u64 {
    // Query the OS here, for example by parsing /proc/self/status on Linux
    // For demo: return a dummy value
    42_000_000 // 42 MB
}

Observable gauge callbacks run only during metric collection (e.g., when Prometheus scrapes). For values that are costly to read, this is more efficient than updating a gauge constantly in your business logic.
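The callback model can be illustrated without the SDK. The sketch below uses a hypothetical `CallbackRegistry` type (not an OpenTelemetry type) to show the key property: registering a callback measures nothing, and the measurement happens only when a collection cycle runs.

```rust
/// Hypothetical stand-in for the SDK's list of observable-instrument callbacks.
struct CallbackRegistry {
    callbacks: Vec<(String, Box<dyn Fn() -> u64 + Send + Sync>)>,
}

impl CallbackRegistry {
    fn new() -> Self {
        Self { callbacks: Vec::new() }
    }

    /// Store the callback under a metric name; nothing is measured yet.
    fn register(&mut self, name: &str, callback: impl Fn() -> u64 + Send + Sync + 'static) {
        self.callbacks.push((name.to_string(), Box::new(callback)));
    }

    /// Run every callback once, as an exporter would on each collection cycle.
    fn collect(&self) -> Vec<(String, u64)> {
        self.callbacks
            .iter()
            .map(|(name, callback)| (name.clone(), callback()))
            .collect()
    }
}
```

If collection happens every 15 seconds, an expensive reading is taken four times a minute regardless of how busy the application is.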

Counters vs. Gauges: Decision Matrix

Question                                       Use Counter                        Use Gauge
Does the value only increase?                  Yes                                No
Should I use rate() to calculate per-second?   Yes                                No
Is it a total?                                 Yes                                No
Is it a current/instantaneous value?           No                                 Yes
Does it reset on process restart?              Yes (implicitly)                   Yes (resets to 0 by default, or you manage it)
Example: HTTP requests handled                 Counter                            No
Example: Connected users                       Counter (only for the total ever   Gauge (for the current count)
                                               connected)
Example: Bytes transmitted                     Counter                            No
Example: Queue length                          No                                 Gauge
Example: Errors                                Counter                            No

Complete Working Example

Here is a small HTTP server using axum that emits counters and gauges:

// Assumed crates: axum 0.7, tokio, prometheus, opentelemetry, opentelemetry_sdk, and
// opentelemetry-prometheus, at versions where SdkMeterProvider and .init() exist.
use axum::{response::Html, routing::get, Router};
use opentelemetry::global;
use opentelemetry_sdk::metrics::SdkMeterProvider;
use prometheus::{Encoder, Registry, TextEncoder};
use std::time::Duration;

#[tokio::main]
async fn main() {
    // Route OpenTelemetry metrics into a Prometheus registry
    let registry = Registry::new();
    let exporter = opentelemetry_prometheus::exporter()
        .with_registry(registry.clone())
        .build()
        .expect("Failed to build Prometheus exporter");
    let provider = SdkMeterProvider::builder().with_reader(exporter).build();
    global::set_meter_provider(provider);

    let meter = global::meter("http_server");
    let requests_counter = meter.u64_counter("http_requests_total")
        .with_description("Total HTTP requests")
        .init();
    let clients_gauge = meter.i64_up_down_counter("clients_connected")
        .with_description("Connected clients")
        .init();

    // Simulate a client that connects, stays two seconds, then disconnects
    tokio::spawn(async move {
        loop {
            clients_gauge.add(1, &[]);
            tokio::time::sleep(Duration::from_secs(2)).await;
            clients_gauge.add(-1, &[]);
            tokio::time::sleep(Duration::from_secs(2)).await;
        }
    });

    let app = Router::new()
        // Count every request to the index page
        .route("/", get(move || {
            let counter = requests_counter.clone();
            async move {
                counter.add(1, &[]);
                Html("Hello")
            }
        }))
        // Render the registry in the Prometheus text format
        .route("/metrics", get(move || {
            let registry = registry.clone();
            async move {
                let mut buffer = Vec::new();
                TextEncoder::new()
                    .encode(&registry.gather(), &mut buffer)
                    .expect("Failed to encode metrics");
                String::from_utf8(buffer).expect("Metrics are not valid UTF-8")
            }
        }));

    let listener = tokio::net::TcpListener::bind("127.0.0.1:8080")
        .await
        .expect("Failed to bind");

    axum::serve(listener, app).await.expect("Server error");
}

Visit http://localhost:8080/metrics and see live counters and gauges.

Key Takeaways

  • Counters track monotonically increasing totals (requests, errors, bytes); use rate() in dashboards.
  • Gauges track current/instantaneous values (connected users, pool size, memory); use as-is in dashboards.
  • Add attributes to split metrics by dimension (method, status code, service) for richer dashboards.
  • Keep attribute cardinality bounded; never use unbounded values like user IDs.
  • Observable gauges lazily compute expensive metrics only during scrape.

Frequently Asked Questions

What happens if a gauge gets negative?

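Gauges can be negative (e.g., an account balance or a temperature). However, many physical quantities should never be negative (connections, memory). Up-down counters are always signed (i64 or f64), so a connection count that goes negative usually means a decrement ran without a matching increment. For readings that can never be negative, a u64 observable gauge lets the type system rule negative values out.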

Can I sum counters across multiple instances?

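Yes. When Prometheus scrapes multiple instances of your service, use sum() to aggregate. For example, sum without (instance) (rate(http_requests_total[5m])) gives the request rate across all replicas. Apply rate() to each instance's series first and sum the results, so that a restart of one replica is handled per series. Counters are designed for this.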

How often should I update a gauge?

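Gauges should be updated whenever the value meaningfully changes, but not constantly (that is wasteful). For slowly changing values such as background job state, updating every few seconds is enough, or use an observable gauge so the value is read only at collection time. Error rates are better derived from a counter with rate() than stored in a gauge.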

Are counters and gauges the only metric types?

No. Histograms (distributions of values) are a third type covered in the next article. Histograms are more powerful than counters/gauges but slightly more complex.

How do attribute cardinality limits work?

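Limits depend on the backend and the SDK. Prometheus does not cap the number of series by default (its per-scrape sample_limit setting is off unless you configure it), but memory use and query time grow with every series. The OpenTelemetry specification also describes an SDK-side cardinality limit, 2,000 attribute sets per instrument by default in SDKs that implement it, beyond which measurements are folded into an overflow series. Always bound attributes; use aggregation if you have many dimensions.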
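A quick way to estimate cardinality before shipping a metric is to multiply the number of distinct values each attribute can take. The helper below is a hypothetical sketch of that arithmetic; it gives an upper bound, since not every combination necessarily occurs in practice.

```rust
/// Upper bound on the number of time series for one metric:
/// the product of the distinct-value counts of its attributes.
fn max_series(distinct_values_per_attribute: &[u64]) -> u64 {
    distinct_values_per_attribute
        .iter()
        .fold(1u64, |acc, n| acc.saturating_mul(*n))
}
```

For example, 5 HTTP methods and 10 status codes give at most 50 series. Adding a user ID attribute with 100,000 distinct users raises the bound to 5,000,000, which is why unbounded attributes are best left out.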

Further Reading