BenchSpy - Standard Prometheus Metrics

Now that we've seen how to query and assert on load-related metrics, let's explore how to query and assert on resource usage by our Application Under Test (AUT).

If you're unsure why this is important, consider the following situation: the p95 latency of a new release matches the previous version, but memory consumption is 34% higher. Not ideal, right?

Step 1: Prometheus Configuration

Since WASP has no built-in integration with Prometheus, we need to pass its configuration separately:

promConfig := benchspy.NewPrometheusConfig("node[^0]")

This constructor loads the URL from the environment variable PROMETHEUS_URL and adds a single regex pattern to match containers by name. In this case, it excludes the bootstrap Chainlink node (named node0 in the CTFv2 stack).
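For example, with a local CTFv2 observability stack, a minimal setup might look like the sketch below (the URL is an assumption for a local Prometheus instance):

// Point BenchSpy at your Prometheus instance; the constructor reads this variable.
// http://localhost:9090 is an assumed default for a local observability stack.
os.Setenv("PROMETHEUS_URL", "http://localhost:9090")

// Match containers whose names contain "node" followed by anything other than "0",
// which excludes the bootstrap node0.
promConfig := benchspy.NewPrometheusConfig("node[^0]")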

warning

This example assumes that you have both the observability stack and basic node set running. If you have the CTF CLI, you can start it by running: ctf b ns.

note

Matching containers by name should work for most Kubernetes and Docker setups that use the CTFv2 observability stack.

Step 2: Fetching and Storing a Baseline Report

As in previous examples, we'll use the built-in Prometheus metrics to fetch and store a baseline report:

baseLineReport, err := benchspy.NewStandardReport(
    "91ee9e3c903d52de12f3d0c1a07ac3c2a6d141fb",
    // notice the different standard query executor type
    benchspy.WithStandardQueries(benchspy.StandardQueryExecutor_Prometheus),
    benchspy.WithPrometheusConfig(promConfig),
    // Required to calculate test time range based on generator start/end times.
    benchspy.WithGenerators(gen),
)
require.NoError(t, err, "failed to create baseline report")

fetchCtx, cancelFn := context.WithTimeout(context.Background(), 60*time.Second)
defer cancelFn()

fetchErr := baseLineReport.FetchData(fetchCtx)
require.NoError(t, fetchErr, "failed to fetch baseline report")

path, storeErr := baseLineReport.Store()
require.NoError(t, storeErr, "failed to store baseline report", path)

note

Standard metrics for Prometheus differ from those used by Loki or Direct query executors. Prometheus metrics focus on resource usage by the AUT, while Loki/Direct metrics measure load characteristics.

Standard Prometheus metrics include:

  • median_cpu_usage
  • median_mem_usage
  • max_cpu_usage
  • p95_cpu_usage
  • p95_mem_usage
  • max_mem_usage

These are calculated at the container level, based on total usage (user + system).
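When you run the test again for a new commit, you'll need a current report plus the stored baseline to compare against. Here's a minimal sketch, assuming the same convenience wrapper used for the Loki and Direct executors (FetchNewStandardReportAndLoadLatestPrevious; treat the exact name and options as assumptions):

// Fetch a fresh report for the current commit and load the latest stored one.
currentReport, previousReport, err := benchspy.FetchNewStandardReportAndLoadLatestPrevious(
    fetchCtx,
    "current-commit-sha", // placeholder commit or tag for the new release
    benchspy.WithStandardQueries(benchspy.StandardQueryExecutor_Prometheus),
    benchspy.WithPrometheusConfig(promConfig),
    benchspy.WithGenerators(gen),
)
require.NoError(t, err, "failed to fetch current report or load the previous one")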

Step 3: Handling Prometheus Result Types

Unlike the Loki and Direct query executors, Prometheus queries can return results of various data types:

  • scalar
  • string
  • vector
  • matrix

This makes asserting results a bit more complex.
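If you're unsure what a given query returns, you can branch on the concrete type before asserting. Below is a minimal sketch using the github.com/prometheus/common/model package; value is assumed to be a single model.Value taken from a report:

// Branch on the concrete Prometheus result type.
switch v := value.(type) {
case *model.Scalar:
    fmt.Printf("scalar: %v\n", v.Value)
case *model.String:
    fmt.Printf("string: %s\n", v.Value)
case model.Vector:
    for _, sample := range v {
        fmt.Printf("vector sample %s = %v\n", sample.Metric, sample.Value)
    }
case model.Matrix:
    for _, series := range v {
        fmt.Printf("matrix series %s with %d samples\n", series.Metric, len(series.Values))
    }
default:
    t.Fatalf("unexpected Prometheus result type: %s", value.Type())
}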

Converting Results to model.Value

First, convert results to the model.Value interface using convenience functions:

currentAsValues := benchspy.MustAllPrometheusResults(currentReport)
previousAsValues := benchspy.MustAllPrometheusResults(previousReport)

Casting to Specific Types

Next, determine the data type returned by your query and cast it accordingly:

// Fetch a single metric
currentMedianCPUUsage := currentAsValues[string(benchspy.MedianCPUUsage)]
previousMedianCPUUsage := previousAsValues[string(benchspy.MedianCPUUsage)]

assert.Equal(t, currentMedianCPUUsage.Type(), previousMedianCPUUsage.Type(), "types of metrics should be the same")

// In this case, we know the query returns a Vector
currentMedianCPUUsageVector := currentMedianCPUUsage.(model.Vector)
previousMedianCPUUsageVector := previousMedianCPUUsage.(model.Vector)

Since these metrics are not related to load generation, the convenience function returns a map[string]model.Value, where the key is the resource metric name.
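Because the map is keyed by metric name, you can also iterate over everything that was fetched, which is handy for a quick sanity check. A minimal sketch:

// Print every standard Prometheus metric fetched for the current report.
for name, value := range currentAsValues {
    fmt.Printf("%s (%s): %s\n", name, value.Type(), value.String())
}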

warning

All standard Prometheus metrics bundled with BenchSpy return model.Vector. However, if you use custom queries, you must manually verify their return types.

Skipping Assertions for Resource Usage

We skip the assertion part because, unless you're comparing resource usage under stable loads, significant differences between reports are likely. For example:

  • The first report might be generated right after the node set starts.
  • The second report might be generated after the node set has been running for some time.
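That said, if your scenario does produce a stable load, a tolerance-based comparison could look roughly like the sketch below (the 10% threshold and the single-sample assumption are illustrative, not part of BenchSpy):

// Compare median CPU usage between runs, allowing up to a 10% increase (illustrative threshold).
// Assumes each vector holds a single sample; adjust if your regex matches multiple containers.
previousValue := float64(previousMedianCPUUsageVector[0].Value)
currentValue := float64(currentMedianCPUUsageVector[0].Value)

assert.LessOrEqual(t, currentValue, previousValue*1.10,
    "median CPU usage regressed by more than 10%")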

What’s Next?

In the next chapter, we’ll explore custom Prometheus queries.

note

You can find the full example here.