# Intro
The Chainlink Testing Framework is a toolset designed for end-to-end testing of Chainlink products, focusing on functionality, resiliency, and performance.
This documentation is intended for:
- Chainlink engineers writing end-to-end tests in Golang
- Engineers using other languages who want to integrate with the Chainlink platform
To get started with writing tests, refer to the Framework chapter, where we guide you from basic to more complex scenarios.
If you want to integrate with Chainlink using a language other than Golang, please refer to our Interactive chapter.
The repository contains two major pieces: the **Framework** and the **Interactive** setup.

# Framework
The primary focus of the Chainlink Testing Framework is to reduce the complexity of end-to-end testing, making complex system-level tests appear straightforward. It enables tests to run in any environment and serves as a single source of truth for system behavior as defined by requirements.
## Features

- **Straightforward and sequential test composition**: tests are readable and give you precise control over key aspects in a strict step-by-step order.
- **Modular configuration**: no arcane knowledge of framework settings is required; the config is simply a reflection of the components being used in the test. Components declare their own configuration: what you see is what you get.
- **Component isolation**: components are decoupled via input/output structs, without exposing internal details.
- **Replaceability and extensibility**: since components are decoupled via outputs, any deployment component can be swapped with a real service without altering the test code.
- **Quick local environments**: a common setup can be launched in just 15 seconds 🚀 \*.
- **Caching**: any component can use cached configs to skip setup for even faster test development.
- **Integrated observability stack**: get all the info you need to develop end-to-end tests: metrics, logs, traces, profiles.

\* Assuming all images are cached, you are using OrbStack with an M1/M2/M3 chip, and at least 8 CPUs are dedicated to Docker.
## 🚀 Getting started

### Prerequisites

- Docker: OrbStack or Docker Desktop; we recommend OrbStack (faster, smaller memory footprint)
- Golang

Tested with:

- Docker version 27.3.1
- OrbStack version 1.8.2 (1080200)
### Test setup

To start writing tests, create a directory for your project with a `go.mod` and add the framework package:

```bash
go get github.com/smartcontractkit/chainlink-testing-framework/framework
```
Download our CLI:

OS X `arm64` (M1/M2/M3 MacBooks):

```bash
curl -L https://github.com/smartcontractkit/chainlink-testing-framework/releases/download/framework%2Fv0.4.1/framework-v0.4.1-darwin-arm64.tar.gz | tar -xz
```

OS X `amd64` (older Intel chips):

```bash
curl -L https://github.com/smartcontractkit/chainlink-testing-framework/releases/download/framework%2Fv0.4.1/framework-v0.4.1-darwin-amd64.tar.gz | tar -xz
```

Linux `arm64`:

```bash
curl -L https://github.com/smartcontractkit/chainlink-testing-framework/releases/download/framework%2Fv0.4.1/framework-v0.4.1-linux-arm64.tar.gz | tar -xz
```

Linux `amd64`:

```bash
curl -L https://github.com/smartcontractkit/chainlink-testing-framework/releases/download/framework%2Fv0.4.1/framework-v0.4.1-linux-amd64.tar.gz | tar -xz
```

On OS X, allow the binary to run in System Settings -> Security Settings.
Create an `.envrc` file and run `source .envrc` (we recommend using direnv, so you don't need to load it every time):

```bash
export TESTCONTAINERS_RYUK_DISABLED=true # do not remove containers while we develop locally
```
Now you are ready to write your first test.
### Tools setup (optional)

This setup is optional; it explains how to create a local observability stack for on-chain and off-chain components.

Spin up your local observability stack (Grafana LGTM):

```bash
ctf obs up
```

More docs

Spin up your Blockscout stack:

```bash
ctf bs up
```

More docs
## Interactive

If you're a non-technical user or want to integrate with Chainlink using a language other than Golang, please follow our Interactive setup guide.
## Writing your first test

The Chainlink Testing Framework (CTF) is a modular, data-driven tool that lets you explicitly define and configure various Chainlink components.

Let's spin up a simple component.

Create your configuration in `smoke.toml`:

```toml
[blockchain_a]
# choose between "anvil", "geth" or "besu"
type = "anvil"
```
Create your test in `smoke_test.go`:

```go
package mymodule_test

import (
	"testing"

	"github.com/smartcontractkit/chainlink-testing-framework/framework"
	"github.com/smartcontractkit/chainlink-testing-framework/framework/components/blockchain"
	"github.com/stretchr/testify/require"
)

type Config struct {
	BlockchainA *blockchain.Input `toml:"blockchain_a" validate:"required"`
}

func TestMe(t *testing.T) {
	in, err := framework.Load[Config](t)
	require.NoError(t, err)

	bc, err := blockchain.NewBlockchainNetwork(in.BlockchainA)
	require.NoError(t, err)

	t.Run("test something", func(t *testing.T) {
		require.NotEmpty(t, bc.Nodes[0].HostHTTPUrl)
	})
}
```
Run the test:

```bash
CTF_CONFIGS=smoke.toml go test -v -run TestMe
```

Remove containers (read more about cleanup here):

```bash
ctf d rm
```
Summary:

- We defined configuration for `BlockchainNetwork`
- We used one CTF component in a test and checked that it works

Now let's connect the Chainlink node! Learn more about the EVM Client component.
## Connecting a Chainlink Node

Now let's look at an example of a Chainlink node connected to a local blockchain.

Create your configuration in `smoke.toml`:

```toml
[blockchain_a]
type = "anvil"
docker_cmd_params = ["-b", "1"]

[cl_node]

[cl_node.db]
image = "postgres:12.0"

[cl_node.node]
image = "public.ecr.aws/chainlink/chainlink:v2.17.0"
```
Create your test in `smoke_test.go`:

```go
package capabilities_test

import (
	"fmt"
	"testing"

	"github.com/smartcontractkit/chainlink-testing-framework/framework"
	"github.com/smartcontractkit/chainlink-testing-framework/framework/components/blockchain"
	"github.com/smartcontractkit/chainlink-testing-framework/framework/components/clnode"
	"github.com/stretchr/testify/require"
)

type Config struct {
	BlockchainA *blockchain.Input `toml:"blockchain_a" validate:"required"`
	CLNode      *clnode.Input     `toml:"cl_node" validate:"required"`
}

func TestNode(t *testing.T) {
	in, err := framework.Load[Config](t)
	require.NoError(t, err)

	bc, err := blockchain.NewBlockchainNetwork(in.BlockchainA)
	require.NoError(t, err)

	networkCfg, err := clnode.NewNetworkCfgOneNetworkAllNodes(bc)
	require.NoError(t, err)
	in.CLNode.Node.TestConfigOverrides = networkCfg

	output, err := clnode.NewNodeWithDB(in.CLNode)
	require.NoError(t, err)

	t.Run("test something", func(t *testing.T) {
		fmt.Printf("node url: %s\n", output.Node.HostURL)
		require.NotEmpty(t, output.Node.HostURL)
	})
}
```
Select your configuration by setting `CTF_CONFIGS=smoke.toml` and run it:

```bash
go test -v -run TestNode
```

Check for `node url: ...` in the logs, open the URL, and log in using the default credentials:

```
notreal@fakeemail.ch
fj293fbBnlQ!f9vNs
```
Summary:

- We defined configuration for `BlockchainNetwork` and `NodeWithDB` (Chainlink + PostgreSQL)
- We connected them together by creating a common network config in `NewNetworkCfgOneNetworkAllNodes`
- We explored the Chainlink node UI

Let's proceed with another example using node sets.
## Chainlink Node Set Environment Test

Let's create a full-fledged set of Chainlink nodes connected to a blockchain.

Create a configuration file `smoke.toml`:

```toml
[blockchain_a]
type = "anvil"
docker_cmd_params = ["-b", "1"]

[nodeset]
nodes = 5
override_mode = "all"

[nodeset.db]
image = "postgres:12.0"

[[nodeset.node_specs]]

[nodeset.node_specs.node]
image = "public.ecr.aws/chainlink/chainlink:v2.17.0"
```
Create a file `smoke_test.go`:

```go
package yourpackage_test

import (
	"testing"

	"github.com/smartcontractkit/chainlink-testing-framework/framework"
	"github.com/smartcontractkit/chainlink-testing-framework/framework/components/blockchain"
	ns "github.com/smartcontractkit/chainlink-testing-framework/framework/components/simple_node_set"
	"github.com/stretchr/testify/require"
)

type Config struct {
	BlockchainA *blockchain.Input `toml:"blockchain_a" validate:"required"`
	NodeSet     *ns.Input         `toml:"nodeset" validate:"required"`
}

func TestNodeSet(t *testing.T) {
	in, err := framework.Load[Config](t)
	require.NoError(t, err)

	bc, err := blockchain.NewBlockchainNetwork(in.BlockchainA)
	require.NoError(t, err)

	out, err := ns.NewSharedDBNodeSet(in.NodeSet, bc)
	require.NoError(t, err)

	t.Run("test something", func(t *testing.T) {
		for _, n := range out.CLNodes {
			require.NotEmpty(t, n.Node.HostURL)
			require.NotEmpty(t, n.Node.HostP2PURL)
		}
	})
}
```
Run it:

```bash
CTF_CONFIGS=smoke.toml go test -v -run TestNodeSet
```

Check the logs to access the UI:

```
12:41AM INF UI=["http://127.0.0.1:10000","http://127.0.0.1:10001", ...]
```

Use the default credentials to authorize:

```
notreal@fakeemail.ch
fj293fbBnlQ!f9vNs
```
Summary:

- We deployed a fully-fledged set of Chainlink nodes connected to a blockchain
- We explored the Chainlink node UI
## Chainlink Node Set with Capabilities

Let's use external capability binaries in our tests, extending the previous example.
We'll use a private repository example, so you should be authorized with `gh`:

```bash
gh auth login
gh auth setup-git
```

Download an example capability binary:

```bash
export GOPRIVATE=github.com/smartcontractkit/capabilities
go get github.com/smartcontractkit/capabilities/kvstore && go install github.com/smartcontractkit/capabilities/kvstore
```
Create a configuration file `smoke.toml`:

```toml
[blockchain_a]
type = "anvil"
docker_cmd_params = ["-b", "1"]

[nodeset]
nodes = 5
override_mode = "all"

[nodeset.db]
image = "postgres:12.0"

[[nodeset.node_specs]]

[nodeset.node_specs.node]
# path to your capability binaries
capabilities = ["./kvstore"]
# default capabilities directory
# capabilities_container_dir = "/home/capabilities"
image = "public.ecr.aws/chainlink/chainlink:v2.17.0"
```
Run it:

```bash
CTF_CONFIGS=smoke.toml go test -v -run TestNodeSet
```

Now you can configure your capability using `clclient.CreateJobRaw($raw_toml)`. Capabilities are uploaded to `/home/capabilities` by default.

Summary:

- We deployed a node set with some capabilities
## Local Docker Image Builds

In addition to the common setup, you can provide a local image path; the image will be rebuilt automatically before the test starts.

Create a configuration file `smoke.toml`:

```toml
[blockchain_a]
type = "anvil"
docker_cmd_params = ["-b", "1"]

[nodeset]
nodes = 5
override_mode = "all"

[nodeset.db]
image = "postgres:12.0"

[[nodeset.node_specs]]

[nodeset.node_specs.node]
# Dockerfile path is relative to "docker_ctx"
docker_file = "core/chainlink.Dockerfile"
docker_ctx = "../.."
```

These paths will work for `e2e/capabilities` in our main repository.

Also check how you can add a rebuild step to your own components.
Summary:

- We learned how to quickly rebuild a local Docker image for the CL node
## Chainlink Node Set Compatibility Testing Environment

The difference from the basic node set configuration is that here you can provide custom configuration for each CL node.

Create a configuration file `smoke.toml`:

```toml
[blockchain_a]
type = "anvil"
docker_cmd_params = ["-b", "1"]

[nodeset]
nodes = 5
override_mode = "each"

[nodeset.db]
image = "postgres:12.0"

[[nodeset.node_specs]]
[nodeset.node_specs.node]
image = "public.ecr.aws/chainlink/chainlink:v2.17.0"
user_config_overrides = " [Log]\n level = 'info'\n "
user_secrets_overrides = ""

[[nodeset.node_specs]]
[nodeset.node_specs.node]
image = "public.ecr.aws/chainlink/chainlink:v2.17.0"
user_config_overrides = " [Log]\n level = 'info'\n "
user_secrets_overrides = ""

[[nodeset.node_specs]]
[nodeset.node_specs.node]
image = "public.ecr.aws/chainlink/chainlink:v2.17.0"
user_config_overrides = " [Log]\n level = 'info'\n "
user_secrets_overrides = ""

[[nodeset.node_specs]]
[nodeset.node_specs.node]
image = "public.ecr.aws/chainlink/chainlink:v2.17.0"
user_config_overrides = " [Log]\n level = 'info'\n "
user_secrets_overrides = ""

[[nodeset.node_specs]]
[nodeset.node_specs.node]
image = "public.ecr.aws/chainlink/chainlink:v2.17.0"
user_config_overrides = " [Log]\n level = 'info'\n "
user_secrets_overrides = ""
```
You can reuse `smoke_test.go` from the previous setup.

Run it:

```bash
CTF_CONFIGS=smoke.toml go test -v -run TestNodeSet
```
Summary:

- We deployed a fully-fledged set of Chainlink nodes connected to a blockchain
- We learned how to test different versions of Chainlink nodes for compatibility and how to override their configs
## Developing Components

To build a scalable framework that enables the reuse of our product deployments (contracts or services in Docker), we need to establish a clear component structure:

```go
package mycomponent

import (
	"fmt"

	"github.com/smartcontractkit/chainlink-testing-framework/framework"
)

type Input struct {
	// input fields that the component exposes for configuration
	// ...

	// outputs are embedded into inputs so the framework can automatically save them
	Out *Output `toml:"out"`
}

type Output struct {
	UseCache bool `toml:"use_cache"`
	// outputs that will be dumped to config and cached
}

func NewComponent(input *Input) (*Output, error) {
	if input.Out != nil && input.Out.UseCache {
		return input.Out, nil
	}

	// component logic here:
	// deploy a docker container(s)
	// or deploy a set of smart contracts

	input.Out = &Output{
		UseCache: true,
		// other fields
	}
	return input.Out, nil
}
```
Each component can define inputs and outputs, following these rules:

- Outputs should be included within inputs.
- If your component is used only for side effects, the output can be omitted.
- `input.Out.UseCache` should be added if you'd like to use caching; see more here.
## Building Local Images

Use `framework.BuildImage` or `framework.BuildImageOnce` to build the Docker image.

Do not use `testcontainers.NewDockerProvider()` methods; see issues: #1, #2.
Good practices for Docker components built with testcontainers-go:

- Inputs should include at least `image`, `tag` and `pull_image` fields:

```go
Image     string `toml:"image" validate:"required"`
Tag       string `toml:"tag" validate:"required"`
PullImage bool   `toml:"pull_image" validate:"required"`
```

- `ContainerRequest` must contain the labels, network, and alias required for the local observability stack and deployment isolation:

```go
Labels:   framework.DefaultTCLabels(),
Networks: []string{framework.DefaultNetworkName},
NetworkAliases: map[string][]string{
	framework.DefaultNetworkName: {containerName},
},
```
- To copy files into a container, use `framework.WriteTmpFile(data string, fileName string)`:

```go
userSecretsOverridesFile, err := WriteTmpFile(in.Node.UserSecretsOverrides, "user-secrets-overrides.toml")
if err != nil {
	return nil, err
}
```
- The output of a Docker component must contain all the URLs the component exposes, both for internal Docker usage and for external test (host) usage:

```go
host, err := framework.GetHost(c)
if err != nil {
	return nil, err
}
mp, err := c.MappedPort(ctx, nat.Port(bindPort))
if err != nil {
	return nil, err
}

return &NodeOut{
	UseCache:  true,
	DockerURL: fmt.Sprintf("http://%s:%s", containerName, in.Node.Port),
	HostURL:   fmt.Sprintf("http://%s:%s", host, mp.Port()),
}, nil
```
Examples:

- An example of a simple component
- An example of a complex component
- An example of a composite component
## Fork Testing

We verify our on-chain and off-chain changes using forks of various networks.

Go to the example project dir to try the examples yourself.

### On-chain Only

In this example, we:

- Create two `anvil` networks, each targeting the desired network (change URLs and `anvil` settings as required; see the full anvil reference)
- Connect two clients to the respective networks
- Deploy two test contracts
- Interact with the deployed contracts
- Demonstrate interactions using the `anvil` RPC client (more client method examples are here)

Run it:

```bash
CTF_CONFIGS=fork.toml go test -v -run TestFork
```
### On-chain + Off-chain

The chain setup remains the same as in the previous example, but now we have 5 Chainlink nodes connected to 2 networks.

Run it:

```bash
CTF_CONFIGS=fork_plus_offchain.toml go test -v -run TestOffChainAndFork
```

Be mindful of RPC rate limits, as your provider may enforce restrictions. Use the `docker_cmd_params` field to configure appropriate rate limiting and retries with the following parameters:

```
--compute-units-per-second <CUPS>
--fork-retry-backoff <BACKOFF>
--retries <retries>
--timeout <timeout>
```

If the network imposes limits, the container will panic, producing messages indicating that the container health check has failed.
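As an illustrative sketch, the rate-limit flags above can be passed to `anvil` through `docker_cmd_params`. The fork URL and the numeric values below are placeholders, not recommendations; tune them to your provider's limits:

```toml
[blockchain_a]
type = "anvil"
docker_cmd_params = [
  "--fork-url", "https://your-rpc-provider.example",
  "--compute-units-per-second", "330",
  "--fork-retry-backoff", "1000",
  "--retries", "5",
  "--timeout", "45000",
]
```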
## Quick Contracts Deployment

You can control the mining pace to accelerate contract deployment. Start `anvil` with the following configuration:

```toml
[blockchain_a]
type = "anvil"
```

Then set the miner speed:

```go
// start periodic mining so nodes can receive heads (async)
miner := rpc.NewRemoteAnvilMiner(bcSrc.Nodes[0].HostHTTPUrl, nil)
miner.MinePeriodically(5 * time.Second)
```

Then use this template.
## Verifying Contracts

Check out our example of programmatically verifying contracts using Blockscout and Foundry. You'll need to provide:

- The path to your Foundry directory
- The path to the contract
- The contract name

```go
err := blockchain.VerifyContract(blockchainComponentOutput, c.Addresses[0].String(),
	"example_components/onchain",
	"src/Counter.sol",
	"Counter",
)
require.NoError(t, err)
```
## CLI

To keep the documentation simple, we provide CLI docs in "help" format:

```bash
ctf -h
```
## Configuration

### Environment variables

| Name | Description | Possible values | Default | Required? |
|---|---|---|---|---|
| `TESTCONTAINERS_RYUK_DISABLED` | Disables the Testcontainers-Go reaper container that removes all containers after the test exits | `true`, `false` | `false` | 🚫 |
| `CTF_CONFIGS` | Path(s) to test config files. Can be more than one, e.g. `smoke.toml,smoke_1.toml,smoke_2.toml`. The first filepath will hold all the merged values | Any valid TOML file path | - | ✅ |
| `CTF_LOG_LEVEL` | Harness log level | `info`, `debug`, `trace` | `info` | 🚫 |
| `CTF_PROMTAIL_DEBUG` | Set to `true` if you are integrating with a remote Loki push API and need to debug Promtail | `true`, `false` | `false` | 🚫 |
| `CTF_IGNORE_CRITICAL_LOGS` | Ignore all logs with CRIT, FATAL or PANIC levels (Chainlink nodes only!) | `true`, `false` | `false` | 🚫 |
| `CTF_CHAINLINK_IMAGE` | Flag to override the Chainlink Docker image in `$repository:$tag` format | `$repository:$tag` | - | 🚫 |
| `CTF_JD_IMAGE` | Job Distributor service image in `$repository:$tag` format | `$repository:$tag` | - | 🚫 |
| `CTF_CLNODE_DLV` | Use the debug entrypoint to allow Delve debugger connections; works only with the "plugins" image of the CL node | `true`, `false` | `false` | 🚫 |
| `LOKI_URL` | URL to the Loki push API, e.g. `${host}/loki/api/v1/push` | URL | `http://host.docker.internal:3030/loki/api/v1/push` | 🚫 |
| `LOKI_TENANT_ID` | Tenant ID used when streaming all component logs to Loki | string | `promtail` | 🚫 |
| `LOKI_BASIC_AUTH` | Basic auth in `$user:$password` format | `$user:$password` | - | 🚫 |
| `RESTY_DEBUG` | Log all Resty client HTTP calls | `true`, `false` | `false` | 🚫 |
## Test Configuration

Since end-to-end system-level test configurations can become complex, we use a single, generic method to marshal any configuration.

Here's an example of how you can extend your configuration:

```go
type Cfg struct {
	// Usually you'll have basic components
	BlockchainA        *blockchain.Input `toml:"blockchain_a" validate:"required"`
	MockerDataProvider *fake.Input       `toml:"data_provider" validate:"required"`
	NodeSet            *ns.Input         `toml:"nodeset" validate:"required"`
	// And some custom test-related data
	MyCustomTestData string `toml:"my_custom_test_data" validate:"required"`
}

func TestSmoke(t *testing.T) {
	in, err := framework.Load[Cfg](t)
	require.NoError(t, err)
	// use in.MyCustomTestData here
}
```
We use validator to make sure anyone can properly configure your test.

All basic components' configuration is described in our docs, but it's recommended to comment your configuration in TOML. Additionally, use `validate:"required"` or `validate:"required,oneof=anvil geth"` for fields with strict value options in your custom configuration.

```toml
# My custom config does X
[MyCustomConfig]
# Can be a number
a = 1
# Can be Y or Z
b = "Z"
```
### Overriding Test Configuration

To override any test configuration, we merge multiple files into a single struct. You can specify multiple file paths using `CTF_CONFIGS=path1,path2,path3`. The framework applies these configurations from right to left and marshals them into a single test config structure.

Use it to structure the variations of your test, e.g.:

```bash
export CTF_CONFIGS=smoke-test-feature-a-simulated-network.toml
export CTF_CONFIGS=smoke-test-feature-a-simulated-network.toml,smoke-test-feature-a-testnet.toml

export CTF_CONFIGS=smoke-test-feature-a.toml
export CTF_CONFIGS=smoke-test-feature-a.toml,smoke-test-feature-b.toml

export CTF_CONFIGS=load-profile-api-service-1.toml
export CTF_CONFIGS=load-profile-api-service-1.toml,load-profile-api-service-2.toml
```

This helps reduce duplication in the configuration.
> **Note:** We designed overrides to be as simple as possible, as frameworks like envconfig and viper offer extensive flexibility but can lead to inconsistent configurations prone to drift. This feature is meant to override test setup configurations, not test logic. Avoid using TOML to alter test logic. Tests should remain straightforward, readable, and perform a single set of actions (potentially across different CI/CD environments). If variations in test logic are required, consider splitting them into separate tests.

> **Warning:** When overriding slices, remember that you must replace the full slice; it won't be extended by default!
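To illustrate the merge and the slice rule with two hypothetical files: given `CTF_CONFIGS=smoke.toml,override.toml`, values from `override.toml` take precedence, and any slice it defines replaces the base slice wholesale rather than extending it:

```toml
# smoke.toml (base)
[blockchain_a]
type = "anvil"
docker_cmd_params = ["-b", "1"]
```

```toml
# override.toml (applied on top)
[blockchain_a]
docker_cmd_params = ["-b", "5"]
```

The merged result keeps `type = "anvil"` from the base file, while `docker_cmd_params` becomes `["-b", "5"]`.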
### Overriding Components Configuration

The same override logic applies across components, files, and configuration fields in code. Configs are applied in order:

1. Implicit component defaults defined inside the application
2. Component defaults defined in the framework or an external component, e.g. CLNode
3. `Test.*Override` fields
4. `user_.*_overrides` fields

Use `Test.*Override` in test code to override component configurations, and `user_.*_overrides` in TOML for the same purpose.
### Exposing Components (Data and Ports)

We use static port ranges and volumes for all components to simplify Docker port management for developers. This approach allows us to apply chaos testing to any container, ensuring it reconnects and retains the data needed for your tests.

When deploying a component, you can explicitly configure port ranges if the default ports don't meet your needs. The defaults are:

- NodeSet (Node HTTP API): `10000..100XX`
- NodeSet (Node P2P API): `12000..120XX`
- NodeSet (Delve debugger): `40000..400XX` (if you are using a debug image)
- The shared PostgreSQL volume is called `postgresql_data`

```toml
[nodeset]
# HTTP API port range start; each new node gets an incremented port (host machine)
http_port_range_start = 10000
# P2P API port range start; each new node gets an incremented port (host machine)
p2p_port_range_start = 12000
```

- PostgreSQL: `13000` (we do not allow multiple databases for now, for simplicity)

```toml
[nodeset.node_specs.db]
# PostgreSQL volume name
volume_name = "a"
# PostgreSQL port (host machine)
port = 13000
```

When you run `ctf d rm`, the database volume will be removed.
One node set is enough for any kind of testing; if you need more nodes, consider extending your existing node set:

```toml
[nodeset]
nodes = 10
```
### Custom Ports

You can also define a custom set of ports for any node:

```toml
[nodeset]
nodes = 5
override_mode = "each"

[nodeset.db]
image = "postgres:12.0"

[[nodeset.node_specs]]
[nodeset.node_specs.node]
# here we define 2 new ports to listen on and map them to the host machine
# syntax is "host:docker"; if you provide only a host port, we map it 1-to-1
custom_ports = ["14000:15000", "20000"]
image = "public.ecr.aws/chainlink/chainlink:v2.16.0"
```
## Debugging Tests

All container logs are saved in a directory named `logs`, which will appear in the directory where you ran the test after it completes.

For verifying your on-chain components, refer to the Blockscout documentation on smart contract verification. It provides detailed instructions on uploading your ABI and verifying your contracts.

- Use `CTF_LOG_LEVEL=trace|debug|info|warn` to debug the framework.
- Use `RESTY_DEBUG=true` to debug any API calls.
- Use `SETH_LOG_LEVEL=trace|debug|info|warn` to debug Seth.

### Using the Delve Debugger

If you are using a Chainlink image with Delve available in the path, you can use ports `40000..400XX` to connect to any node.
## Components Cleanup

Managing state is challenging, especially in end-to-end testing, so we use ryuk and the following simple rules:

- If `TESTCONTAINERS_RYUK_DISABLED=true`, no cleanup occurs: containers, volumes, and networks remain on your machine. Feel free to use `ctf d rm` to remove containers when you are ready.
- If `TESTCONTAINERS_RYUK_DISABLED` is unset, the test environment will be automatically cleaned up a few seconds after the test completes.

Keep in mind that all components are mapped to static ports, so without cleanup, only one environment can run at a time. This design choice simplifies debugging.

A simplified command is available to prune unused volumes, containers, and build caches. Use it when you're running low on space on your machine:

```bash
ctf d c
```

The framework manages cleanup for both on-chain and off-chain Docker components. However, if your test involves actions like configuring Chainlink jobs, it's best practice to make these actions idempotent so they can be applied reliably in any environment.
## Component Caching

We use component caching to accelerate test development and enforce idempotent test actions. Each component is isolated by means of inputs and outputs. If a cached config has any outputs with `use_cache = true`, they will be used instead of deploying the component again.

```bash
export CTF_CONFIGS=smoke-cache.toml
```
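As a sketch, a cached config carries the component outputs inline; the field names below follow the EVM blockchain output documented later in this chapter, with placeholder URLs:

```toml
[blockchain_a]
type = "anvil"

[blockchain_a.out]
use_cache = true
chain_id = "1337"

[[blockchain_a.out.nodes]]
http_url = "http://127.0.0.1:33955"
ws_url = "ws://127.0.0.1:33955"
```

On the next run the framework returns these outputs instead of deploying the container again.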
## Mocking Services

The framework aims to equip you with all the necessary tools to write end-to-end system-level tests, while still allowing the flexibility to mock third-party services that are not critical to your testing scope.

Configuration:

```toml
[fake]
# port to start the Gin server on
port = 9111
```

Usage: see the full example.

Note that `host.docker.internal` is Docker-platform dependent! Use `framework.HostDockerInternal()` to reference `host.docker.internal` in your tests, so they can also work in GHA CI.
## Copying Files

You can copy files to containers by using the `ContainerName` from the output and specifying the source `src` and destination `dst` paths.

However, using this API is discouraged and will be deprecated in the future, as it violates the principles of "black-box" testing. If your service relies on this functionality, consider designing a configuration or API to address the requirement instead.

```go
bc, err := blockchain.NewBlockchainNetwork(&blockchain.Input{
	// ...
})
require.NoError(t, err)

err = dockerClient.CopyFile(bc.ContainerName, "local_file.txt", "/home")
require.NoError(t, err)
```
## External Environment

### Using Remote Components

Because components are decoupled through outputs, you can use a cached config and switch the outputs to any deployed infrastructure, such as staging. This allows you to reuse the same testing logic for behavior validation.

For example, to integrate with a remote `k8s` environment, you can use `CTF_CONFIGS=smoke_external.toml` and override all the component outputs to connect to your remote environment:

```toml
[blockchain_a]

[blockchain_a.out]
chain_id = "1337"
use_cache = true

[[blockchain_a.out.nodes]]
# set up your RPC URLs
http_url = "http://127.0.0.1:8545"
ws_url = "ws://127.0.0.1:8545"

[nodeset]

[[nodeset.node_specs]]
...

[nodeset.out]
use_cache = true

[[nodeset.out.cl_nodes]]
use_cache = true

[nodeset.out.cl_nodes.node]
# set up your user/password for API authorization
api_auth_user = 'notreal@fakeemail.ch'
api_auth_password = 'fj293fbBnlQ!f9vNs'
# set up each node's URLs
p2p_url = "http://127.0.0.1:12000"
url = "http://127.0.0.1:10000"

[nodeset.out.cl_nodes.postgresql]
# set up a database URL so tests can connect to your database if needed
url = "postgresql://chainlink:thispasswordislongenough@127.0.0.1:13000/db_0?sslmode=disable"

# more nodes in this array; the configuration is the same ...
```
## Troubleshooting

### Can't run `anvil`: issue with Rosetta

```
2024/11/27 15:20:27 ⏳ Waiting for container id 79f8a68c07cc image: f4hrenh9it/foundry:latest. Waiting for: &{Port:8546 timeout:0x14000901278 PollInterval:100ms skipInternalCheck:false}
2024/11/27 15:20:27 container logs (all exposed ports, [8546/tcp], were not mapped in 5s: port 8546/tcp is not mapped yet
wait until ready: check target: retries: 1, port: "", last err: container exited with code 133):
rosetta error: Rosetta is only intended to run on Apple Silicon with a macOS host using Virtualization.framework with Rosetta mode enabled
```

Solution: update your Docker to `Docker version 27.3.1, build ce12230`.

### Can't see any Docker metrics with cAdvisor

The cAdvisor container can't mount some layers:

```
E1209 13:14:49.908130 1 manager.go:1116] Failed to create existing container: /docker/aa40875c382af890861447fa8aaf6908978041b9077bb81029971d23929f8c4d: failed to identify the read-write layer ID for container "aa40875c382af890861447fa8aaf6908978041b9077bb81029971d23929f8c4d". - open /rootfs/var/lib/docker/image/overlayfs/layerdb/mounts/aa40875c382af890861447fa8aaf6908978041b9077bb81029971d23929f8c4d/mount-id: no such file or directory
```

Solution: disable containerd images (Settings -> General); see the issue.
## Local Observability Stack

You can use a local observability stack; the framework is connected to it by default.

```bash
ctf obs up
```

To remove it, use:

```bash
ctf obs down
```

Read more about how to check logs and profiles.

### Metrics

We use Prometheus to collect metrics from Chainlink nodes and other services.

- Check the Prometheus UI.
- Check the Grafana example query.

Queries:

- All metrics of all containers: `{job="ctf"}`

### Docker Resources

We use cAdvisor to monitor the resources of test containers. The cAdvisor UI can be found here.

### Logs

We use Loki for logging; check `localhost:3000`.

Queries:

- Particular node logs: `{job="ctf", container=~"node0"}`
- All node logs: `{job="ctf", container=~"node.*"}`
- Filter by log level: `{job="ctf", container=~"node.*"} |= "WARN|INFO|DEBUG"`

### Profiling

We use Pyroscope for profiling. Go to `localhost:4040` and choose the Chainlink application.
## Blockscout

You can use a local Blockscout instance to debug EVM smart contracts:

```bash
ctf bs up
```

Your Blockscout instance is up on `localhost`.

To remove it (we also clean up all Blockscout databases to prevent stale data when restarting your tests):

```bash
ctf bs down
```

### Selecting a Blockchain Node

By default, we connect to the first `anvil` node, but you can select the node explicitly:

```bash
ctf bs -r http://host.docker.internal:8545 d
ctf bs -r http://host.docker.internal:8555 d
```

Blockscout isn't ideal for local, ephemeral environments, as it won't re-index blocks and transactions on test reruns. The easiest approach is to set up Blockscout first, initialize the test environment, switch to the cache config, and run tests without restarting RPC nodes. Otherwise, use `ctf bs r` each time you restart your test with a fresh Docker environment.
## Components

CTF contains a lot of useful components, organized into three groups:

- Off-chain services like `CL Node`, `NodeSet`, `JobDistributor`, etc.
- On-chain wrappers for the chainlink-deployments repository
- Test components: blockchain simulators, fakes, etc.
### Blockchain Components

Here we keep the blockchain simulators and real nodes we use for integration testing.

### EVM Blockchain Clients

We support 3 EVM clients at the moment: Geth, Anvil, and Besu.
Configuration:

```toml
[blockchain_a]
# Blockchain node type, can be "anvil", "geth" or "besu"
type = "anvil"
# Chain ID
chain_id = "1337"
# Anvil command line params, e.g.: docker_cmd_params = ['--block-time=1', '...']
docker_cmd_params = []
# Docker image and tag
image = "f4hrenh9it/foundry:latest"
# External port to expose (HTTP API)
port = "8545"
# External port to expose (WS API)
port_ws = "8546"
# Pulls the image every time if set to 'true'; used like that in CI. Can be set to 'false' to speed up local runs
pull_image = false

# Outputs are the results of deploying a component that can be used by another component
[blockchain_a.out]
# If 'use_cache' equals 'true' we skip component setup when we run the test and return the outputs
use_cache = true
# Chain ID
chain_id = "1337"
# Chain family: "evm", "solana", "cosmos", "op", "arb"
family = "evm"

[[blockchain_a.out.nodes]]
# URLs to access the node(s) inside the docker network, used by other components
docker_internal_http_url = "http://anvil-14411:8545"
docker_internal_ws_url = "ws://anvil-14411:8545"
# URLs to access the node(s) on your host machine or in CI
http_url = "http://127.0.0.1:33955"
ws_url = "ws://127.0.0.1:33955"
```
Usage
```go
package my_test

import (
	"testing"

	"github.com/smartcontractkit/chainlink-testing-framework/framework"
	"github.com/smartcontractkit/chainlink-testing-framework/framework/components/blockchain"
	"github.com/stretchr/testify/require"
)

type Config struct {
	BlockchainA *blockchain.Input `toml:"blockchain_a" validate:"required"`
}

func TestDON(t *testing.T) {
	in, err := framework.Load[Config](t)
	require.NoError(t, err)

	// deploy the Anvil blockchain simulator
	bc, err := blockchain.NewBlockchainNetwork(in.BlockchainA)
	require.NoError(t, err)
	_ = bc
}
```
Test Private Keys
For Geth and Anvil we use the same key:
```
Public:  0xf39fd6e51aad88f6f4ce6ab8827279cfffb92266
Private: ac0974bec39a17e36ba4a6b4d238ff944bacb478cbed5efcae784d7bf4f2ff80
```
For Besu the keys are:
```
Public:  0xfe3b557e8fb62b89f4916b721be55ceb828dbd73
Private: 0x8f2a55949038a9610f50fb23b5883af3b4ecb3c3bb792cbcefbd1542c692be63

Public:  0x627306090abaB3A6e1400e9345bC60c78a8BEf57
Private: 0xc87509a1c067bbde78beb793e6fa76530b6382a4c0241e5e4a9ec0a0f44dc0d3

Public:  0xf17f52151EbEF6C7334FAD080c5704D77216b732
Private: 0xae6ae8e5ccbfb04590405997ee2d52d2b330726137b875053c36d94e974d162f
```
More docs for Besu can be found here
Solana Blockchain Client
Since Solana doesn't publish an official image for arm64, we built our own. The images we use are:
- amd64: `solanalabs/solana:v1.18.26` - used in CI
- arm64: `f4hrenh9it/solana:latest` - used locally
Configuration
```toml
[blockchain_a]
type = "solana"
# public key for mint
public_key = "9n1pyVGGo6V4mpiSDMVay5As9NurEkY283wwRk1Kto2C"
# contracts directory, programs
contracts_dir = "."
# optional, in case you need some custom image
# image = "solanalabs/solana:v1.18.26"
# Optional. To deploy a Solana program, provide a mapping of program name to program ID.
# The program name has to match the name of the .so file in the contracts_dir directory.
# This example assumes there is an mcm.so in the mounted contracts_dir directory.
# The value is the program_id to set for the deployed program; any valid public key can be used here.
# This allows lookup via program_id during testing.
# Alternatively, `solana deploy` can be used instead of this field to deploy a program,
# but the program_id will then be auto-generated.
[blockchain_a.solana_programs]
mcm = "6UmMZr5MEqiKWD5jqTJd1WCR5kT8oZuFYBLJFi1o6GQX"
```
Usage
```go
package examples

import (
	"context"
	"fmt"
	"testing"

	"github.com/blocto/solana-go-sdk/client"
	"github.com/smartcontractkit/chainlink-testing-framework/framework"
	"github.com/smartcontractkit/chainlink-testing-framework/framework/components/blockchain"
	"github.com/stretchr/testify/require"
)

type CfgSolana struct {
	BlockchainA *blockchain.Input `toml:"blockchain_a" validate:"required"`
}

func TestSolanaSmoke(t *testing.T) {
	in, err := framework.Load[CfgSolana](t)
	require.NoError(t, err)
	bc, err := blockchain.NewBlockchainNetwork(in.BlockchainA)
	require.NoError(t, err)

	t.Run("test something", func(t *testing.T) {
		// use the internal URL to connect Chainlink nodes
		_ = bc.Nodes[0].DockerInternalHTTPUrl
		// use the host URL to deploy contracts
		c := client.NewClient(bc.Nodes[0].HostHTTPUrl)
		latestSlot, err := c.GetSlotWithConfig(context.Background(), client.GetSlotConfig{Commitment: "processed"})
		require.NoError(t, err)
		fmt.Printf("Latest slot: %v\n", latestSlot)
	})
}
```
Test Private Keys
```
Public:  9n1pyVGGo6V4mpiSDMVay5As9NurEkY283wwRk1Kto2C
Private: [11,2,35,236,230,251,215,68,220,208,166,157,229,181,164,26,150,230,218,229,41,20,235,80,183,97,20,117,191,159,228,243,130,101,145,43,51,163,139,142,11,174,113,54,206,213,188,127,131,147,154,31,176,81,181,147,78,226,25,216,193,243,136,149]
```
Chainlink
Here we store Chainlink components: Node, NodeSet, JobDistributor, and other services.
Node
Here we provide the full configuration parameters for Node.
Configuration
```toml
[cl_node]

[cl_node.db]
# PostgreSQL image version and tag
image = "postgres:12.0"
# Pulls the image every time if set to 'true', used like that in CI. Can be set to 'false' to speed up local runs
pull_image = false

[cl_node.node]
# custom ports that plugins may need to expose and map to the host machine
custom_ports = [14000, 14001]
# A list of paths to capability binaries
capabilities = ["./capability_1", "./capability_2"]
# Default capabilities directory inside the container
capabilities_container_dir = "/home/capabilities"
# Image to use; you can provide either the "image" or the "docker_file" + "docker_ctx" fields
image = "public.ecr.aws/chainlink/chainlink:v2.17.0"
# Path to your Chainlink Dockerfile
docker_file = "../../core/chainlink.Dockerfile"
# Path to the docker context that should be used to build from
docker_ctx = "../.."
# Optional name for the image we build, default is "ctftmp"
docker_image_name = "ctftmp"
# Pulls the image every time if set to 'true', used like that in CI. Can be set to 'false' to speed up local runs
pull_image = false
# Overrides Chainlink node TOML configuration
# can be multiline, see example
user_config_overrides = """
[Log]
level = 'info'
"""
# Overrides Chainlink node secrets TOML configuration
# you can only add fields; overriding existing fields is prohibited by the Chainlink node
user_secrets_overrides = """
[AnotherSecret]
mySecret = 'a'
"""

# Outputs are the results of deploying a component that can be used by another component
[cl_node.out]
# If 'use_cache' equals 'true' we skip component setup when we run the test and return the outputs
use_cache = true

# Describes a deployed or external Chainlink node
[cl_node.out.node]
# API user name
api_auth_user = 'notreal@fakeemail.ch'
# API password
api_auth_password = 'fj293fbBnlQ!f9vNs'
# Host Docker URLs the test uses
# in case of using an external component you can replace these URLs with another deployment
p2p_url = "http://127.0.0.1:32812"
url = "http://127.0.0.1:32847"

# Describes the PostgreSQL instance
[cl_node.out.postgresql]
# PostgreSQL connection string
# in case of using an external database it can be overridden
url = "postgresql://chainlink:thispasswordislongenough@127.0.0.1:32846/chainlink?sslmode=disable"
```
Usage
```go
package yourpackage_test

import (
	"fmt"
	"testing"

	"github.com/smartcontractkit/chainlink-testing-framework/framework"
	"github.com/smartcontractkit/chainlink-testing-framework/framework/components/blockchain"
	"github.com/smartcontractkit/chainlink-testing-framework/framework/components/clnode"
	"github.com/stretchr/testify/require"
)

type Step2Cfg struct {
	BlockchainA *blockchain.Input `toml:"blockchain_a" validate:"required"`
	CLNode      *clnode.Input     `toml:"cl_node" validate:"required"`
}

func TestMe(t *testing.T) {
	in, err := framework.Load[Step2Cfg](t)
	require.NoError(t, err)
	bc, err := blockchain.NewBlockchainNetwork(in.BlockchainA)
	require.NoError(t, err)
	networkCfg, err := clnode.NewNetworkCfgOneNetworkAllNodes(bc)
	require.NoError(t, err)
	in.CLNode.Node.TestConfigOverrides = networkCfg

	output, err := clnode.NewNodeWithDB(in.CLNode)
	require.NoError(t, err)

	t.Run("test something", func(t *testing.T) {
		fmt.Printf("node url: %s\n", output.Node.HostURL)
		require.NotEmpty(t, output.Node.HostURL)
	})
}
```
NodeSet
Here we provide the full configuration parameters for NodeSet.
Configuration
This component requires a blockchain to be deployed; add this to your config:
```toml
[blockchain_a]
# Blockchain node type, can be "anvil" or "geth"
type = "anvil"
# Chain ID
chain_id = "1337"
# Anvil command line params, ex.: docker_cmd_params = ['--block-time=1', '...']
docker_cmd_params = []
# Docker image and tag
image = "f4hrenh9it/foundry:latest"
# External port to expose
port = "8545"
# Pulls the image every time if set to 'true', used like that in CI. Can be set to 'false' to speed up local runs
pull_image = false

# Outputs are the results of deploying a component that can be used by another component
[blockchain_a.out]
chain_id = "1337"
# If 'use_cache' equals 'true' we skip component setup when we run the test and return the outputs
use_cache = true

[[blockchain_a.out.nodes]]
# URLs to access the node(s) inside the docker network, used by other components
docker_internal_http_url = "http://anvil-14411:8545"
docker_internal_ws_url = "ws://anvil-14411:8545"
# URLs to access the node(s) on your host machine or in CI
http_url = "http://127.0.0.1:33955"
ws_url = "ws://127.0.0.1:33955"
```
Then configure the NodeSet:
```toml
[nodeset]
# amount of Chainlink nodes to spin up
nodes = 5
# Override mode: can be "all" or "each"
# defines how we override configs: either we apply the first node's fields to all of them,
# or we define each node's custom configuration (used in compatibility testing)
override_mode = "all"
# HTTP API port range start; each new node gets an incremented port (host machine)
http_port_range_start = 10000
# P2P API port range start; each new node gets an incremented port (host machine)
p2p_port_range_start = 12000

[nodeset.db]
# PostgreSQL image version and tag
image = "postgres:12.0"
# Pulls the image every time if set to 'true', used like that in CI. Can be set to 'false' to speed up local runs
pull_image = false
# PostgreSQL volume name
volume_name = ""

[[nodeset.node_specs]]

[nodeset.node_specs.node]
# custom ports that plugins may need to expose and map to the host machine
custom_ports = [14000, 14001]
# A list of paths to capability binaries
capabilities = ["./capability_1", "./capability_2"]
# Default capabilities directory inside the container
capabilities_container_dir = "/home/capabilities"
# Image to use; you can provide either the "image" or the "docker_file" + "docker_ctx" fields
image = "public.ecr.aws/chainlink/chainlink:v2.17.0"
# Path to your Chainlink Dockerfile
docker_file = "../../core/chainlink.Dockerfile"
# Path to the docker context that should be used to build from
docker_ctx = "../.."
# Optional name for the image we build, default is "ctftmp"
docker_image_name = "ctftmp"
# Pulls the image every time if set to 'true', used like that in CI. Can be set to 'false' to speed up local runs
pull_image = false
# Overrides Chainlink node TOML configuration
# can be multiline, see example
user_config_overrides = """
[Log]
level = 'info'
"""
# Overrides Chainlink node secrets TOML configuration
# you can only add fields; overriding existing fields is prohibited by the Chainlink node
user_secrets_overrides = """
[AnotherSecret]
mySecret = 'a'
"""

# Outputs are the results of deploying a component that can be used by another component
[nodeset.out]
# If 'use_cache' equals 'true' we skip component setup when we run the test and return the outputs
use_cache = true

# Describes deployed or external Chainlink nodes
[[nodeset.out.cl_nodes]]
use_cache = true

# Describes a deployed or external Chainlink node
[nodeset.out.cl_nodes.node]
# API user name
api_auth_user = 'notreal@fakeemail.ch'
# API password
api_auth_password = 'fj293fbBnlQ!f9vNs'
# Host Docker URLs the test uses
# in case of using an external component you can replace these URLs with another deployment
p2p_url = "http://127.0.0.1:32996"
url = "http://127.0.0.1:33096"

# Describes the PostgreSQL instance
[nodeset.out.cl_nodes.postgresql]
# PostgreSQL connection string
# in case of using an external database it can be overridden
url = "postgresql://chainlink:thispasswordislongenough@127.0.0.1:33094/chainlink?sslmode=disable"

# Can have more than one node; fields are the same, see above ^^
[[nodeset.out.cl_nodes]]
[nodeset.out.cl_nodes.node]
[nodeset.out.cl_nodes.postgresql]
# ...
```
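The port-range settings above can be illustrated with a small sketch of how each node's host port is derived. This is a hypothetical helper for illustration, not a framework API:

```go
package main

import "fmt"

// nodePorts sketches the documented scheme: each node in a NodeSet gets the
// range start port incremented by its index (illustrative helper only).
func nodePorts(start, nodes int) []int {
	ports := make([]int, nodes)
	for i := range ports {
		ports[i] = start + i
	}
	return ports
}

func main() {
	fmt.Println(nodePorts(10000, 5)) // HTTP ports for 5 nodes
	fmt.Println(nodePorts(12000, 5)) // P2P ports for 5 nodes
}
```

With `nodes = 5` and `http_port_range_start = 10000`, nodes are reachable on ports 10000 through 10004 on the host machine.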
Usage
```go
package capabilities_test

import (
	"testing"

	"github.com/smartcontractkit/chainlink-testing-framework/framework"
	"github.com/smartcontractkit/chainlink-testing-framework/framework/components/blockchain"
	ns "github.com/smartcontractkit/chainlink-testing-framework/framework/components/simple_node_set"
	"github.com/stretchr/testify/require"
)

type Config struct {
	BlockchainA *blockchain.Input `toml:"blockchain_a" validate:"required"`
	NodeSet     *ns.Input         `toml:"nodeset" validate:"required"`
}

func TestMe(t *testing.T) {
	in, err := framework.Load[Config](t)
	require.NoError(t, err)
	bc, err := blockchain.NewBlockchainNetwork(in.BlockchainA)
	require.NoError(t, err)
	out, err := ns.NewSharedDBNodeSet(in.NodeSet, bc)
	require.NoError(t, err)

	t.Run("test something", func(t *testing.T) {
		for _, n := range out.CLNodes {
			require.NotEmpty(t, n.Node.HostURL)
			require.NotEmpty(t, n.Node.HostP2PURL)
		}
	})
}
```
Testing Maturity Model
Read this doc to understand the rationale behind our testing approach and to explore the stages of maturity in end-to-end testing.
The following chapters detail specific testing types and best practices for system-level end-to-end testing.
Interactive
For non-technical users or those building with Chainlink products outside of Golang, we offer an interactive method to deploy a NodeSet.
If you're on OS X, we recommend using OrbStack; otherwise, use Docker Desktop.
Download our CLI
OS X arm64 (M1/M2/M3 MacBooks)
```shell
curl -L https://github.com/smartcontractkit/chainlink-testing-framework/releases/download/framework%2Fv0.4.1/framework-v0.4.1-darwin-arm64.tar.gz | tar -xz
```
OS X amd64 (old Intel chips)
```shell
curl -L https://github.com/smartcontractkit/chainlink-testing-framework/releases/download/framework%2Fv0.4.1/framework-v0.4.1-darwin-amd64.tar.gz | tar -xz
```
Linux arm64
```shell
curl -L https://github.com/smartcontractkit/chainlink-testing-framework/releases/download/framework%2Fv0.4.1/framework-v0.4.1-linux-arm64.tar.gz | tar -xz
```
Linux amd64
```shell
curl -L https://github.com/smartcontractkit/chainlink-testing-framework/releases/download/framework%2Fv0.4.1/framework-v0.4.1-linux-amd64.tar.gz | tar -xz
```
Allow it to run in System Settings -> Security Settings (OS X), then spin up a NodeSet:
```shell
./ctf b ns
```
Use `Tab`/`Shift+Tab` to select an option, `<-`/`->` to change it, and `Enter` to submit. Press `Ctrl+C` to remove the services.
Libraries
CTF monorepository contains a set of libraries:
- WASP - scalable, protocol-agnostic load testing library for Go
- Havoc - chaos testing library
- Seth - Ethereum client library with transaction tracing and gas bumping
Seth
Reliable and debug-friendly Ethereum client
Content
- Goals
- Features
- Examples
- Setup
- Configuration
- Automated gas price estimation
- DOT Graphs of transactions
- Using multiple private keys
- Experimental features
- Gas bumping for slow transactions
- CLI
- Manual gas price estimation
- Block Stats
- Single transaction tracing
- Bulk transaction tracing
- RPC traffic logging
- Read-only mode
- ABI Finder
- Contract Map
- Contract Store
Goals
- Be a thin, debuggable and battle-tested wrapper on top of `go-ethereum`
- Decode all transaction inputs/outputs/logs for all ABIs you are working with, automatically
- Simple synchronous API
- Do not handle nonces on the client side; trust the server
- Do not wrap `bind`-generated contracts; provide a small set of additional debug APIs
- Resilient: should execute transactions even if there is a gas spike or an RPC outage (failover)
- Well tested: should provide a suite of e2e tests that can be run on testnets to check integration
Features
- Decode named inputs
- Decode named outputs
- Decode anonymous outputs
- Decode logs
- Decode indexed logs
- Decode old string reverts
- Decode new typed reverts
- EIP-1559 support
- Multi-keys client support
- CLI to manipulate test keys
- Simple manual gas price estimation
- Fail over client logic
- Decode collided event hashes
- Tracing support (4byte)
- Tracing support (callTracer)
- Tracing support (prestate)
- Tracing decoding
- Tracing tests
- More tests for corner cases of decoding/tracing
- Saving of deployed contracts mapping (`address -> ABI_name`) for live networks
- Reading of deployed contracts mappings for live networks
- Automatic gas estimator (experimental)
- Block stats CLI
- Check if address has a pending nonce (transaction) and panic if it does
- DOT graph output for tracing
- Gas bumping for slow transactions
You can read more about how ABI finding and the contract map work here, and about the contract store here.
Examples
Check the examples folder.
The library provides a small set of helpers for decoding that you can use with vanilla go-ethereum generated wrappers:
```go
// Decode waits for a transaction and decodes all the data/errors
Decode(tx *types.Transaction, txErr error) (*DecodedTransaction, error)

// NewTXOpts returns a new sequential transaction options wrapper,
// sets opts.GasPrice and opts.GasLimit from seth.toml or overrides with options
NewTXOpts(o ...TransactOpt) *bind.TransactOpts

// NewCallOpts returns a new call options wrapper
NewCallOpts(o ...CallOpt) *bind.CallOpts
```
By default, we use the root key `0`, but you can also use any of the private keys passed as part of the `Network` configuration in `seth.toml`, or ephemeral keys:
```go
// NewCallKeyOpts returns a new sequential call options wrapper from the key N
NewCallKeyOpts(keyNum int, o ...CallOpt) *bind.CallOpts

// NewTXKeyOpts returns a new transaction options wrapper called from the key N
NewTXKeyOpts(keyNum int, o ...TransactOpt) *bind.TransactOpts
```
Start Geth in a separate terminal, then run the examples:
```shell
make GethSync
cd examples
go test -v
```
Setup
We are using Nix. Enter the shell:
```shell
nix develop
```
Building test contracts
We have go-ethereum and Foundry tools inside the Nix shell:
```shell
make build
```
Testing
To run tests on a local network, first start it:
```shell
make AnvilSync
```
Or use the latest Geth:
```shell
make GethSync
```
You can use the default hardhat key `ac0974bec39a17e36ba4a6b4d238ff944bacb478cbed5efcae784d7bf4f2ff80` to run tests.
Run the decode tests:
```shell
make network=Anvil root_private_key=ac0974bec39a17e36ba4a6b4d238ff944bacb478cbed5efcae784d7bf4f2ff80 test
make network=Geth root_private_key=ac0974bec39a17e36ba4a6b4d238ff944bacb478cbed5efcae784d7bf4f2ff80 test
```
Check the other params in seth.toml, select any network, and use your own key for testnets.
User-facing API tests are here:
```shell
make network=Anvil root_private_key=ac0974bec39a17e36ba4a6b4d238ff944bacb478cbed5efcae784d7bf4f2ff80 test_api
make network=Geth root_private_key=ac0974bec39a17e36ba4a6b4d238ff944bacb478cbed5efcae784d7bf4f2ff80 test_api
```
CLI tests:
```shell
make network=Anvil root_private_key=ac0974bec39a17e36ba4a6b4d238ff944bacb478cbed5efcae784d7bf4f2ff80 test_cli
make network=Geth root_private_key=ac0974bec39a17e36ba4a6b4d238ff944bacb478cbed5efcae784d7bf4f2ff80 test_cli
```
Tracing tests:
```shell
make network=Anvil root_private_key=ac0974bec39a17e36ba4a6b4d238ff944bacb478cbed5efcae784d7bf4f2ff80 test_trace
make network=Geth root_private_key=ac0974bec39a17e36ba4a6b4d238ff944bacb478cbed5efcae784d7bf4f2ff80 test_trace
```
Config
Simplified configuration
If you do not want to set all the parameters, you can use a simplified programmatic configuration. Here's an example:
```go
cfg := seth.DefaultConfig("ws://localhost:8546", []string{"ac0974bec39a17e36ba4a6b4d238ff944bacb478cbed5efcae784d7bf4f2ff80"})
client, err := seth.NewClientWithConfig(cfg)
if err != nil {
	log.Fatal(err)
}
```
This config uses what we consider reasonable defaults, such as:
- 5-minute transaction confirmation timeout
- 1-minute RPC node dial timeout
- enabled EIP-1559 dynamic fees and automatic gas price estimation (with 200 blocks of history; auto-disables itself if the RPC doesn't support EIP-1559)
- tracing only of reverted transactions, to console and DOT graphs
- checking of RPC node health on client creation
- no ephemeral keys
ClientBuilder
You can also use a `ClientBuilder` to build a config programmatically. Here's an extensive example:
```go
client, err := NewClientBuilder().
	// network
	WithNetworkName("my network").
	// if empty we will ask the RPC node for the chain ID
	WithNetworkChainId(1337).
	WithRpcUrl("ws://localhost:8546").
	WithPrivateKeys([]string{"ac0974bec39a17e36ba4a6b4d238ff944bacb478cbed5efcae784d7bf4f2ff80"}).
	WithRpcDialTimeout(10*time.Second).
	WithTransactionTimeouts(1*time.Minute).
	// addresses
	WithEphemeralAddresses(10, 10).
	// tracing
	WithTracing(seth.TracingLevel_All, []string{seth.TraceOutput_Console}).
	// protections
	WithProtections(true, true, seth.MustMakeDuration(2*time.Minute)).
	// artifacts folder
	WithArtifactsFolder("some_folder").
	// folder with gethwrappers for ABI decoding
	WithGethWrappersFolders([]string{"./gethwrappers/ccip", "./gethwrappers/keystone"}).
	// nonce manager
	WithNonceManager(10, 3, 60, 5).
	// EIP-1559 and gas estimations
	WithEIP1559DynamicFees(true).
	WithDynamicGasPrices(120_000_000_000, 44_000_000_000).
	WithGasPriceEstimations(true, 10, seth.Priority_Fast).
	// gas bumping: retries, max gas price, bumping strategy function
	WithGasBumping(5, 100_000_000_000, PriorityBasedGasBumpingStrategyFn).
	Build()
if err != nil {
	log.Fatal(err)
}
```
By default, it uses the same values as the simplified configuration, but you can override them by calling the appropriate methods. The builder includes only the options we thought most useful; it's not a 1:1 mapping of all fields in the `Config` struct. If you need to set more advanced options, create the `Config` struct directly, use the TOML config, or manually set the fields on the `Config` struct returned by the builder.
It's also possible to use the builder to create a new config from an existing one:
```go
client, err := NewClientBuilderWithConfig(&existingConfig).
	UseNetworkWithChainId(1337).
	WithEIP1559DynamicFees(false).
	Build()
if err != nil {
	log.Fatal(err)
}
```
This can be useful if you already have a config, but want to modify it slightly. It can also be useful if you read TOML config with multiple Networks
and you want to specify which one you want to use.
Supported env vars
Some crucial data is stored in env vars; create an `.envrc` and use `source .envrc`, or use direnv:
```shell
export SETH_LOG_LEVEL=info # global logger level
export SETH_CONFIG_PATH=seth.toml # path to the toml config
export SETH_NETWORK=Geth # selected network
export SETH_ROOT_PRIVATE_KEY=ac0974bec39a17e36ba4a6b4d238ff944bacb478cbed5efcae784d7bf4f2ff80 # root private key
alias seth="SETH_CONFIG_PATH=seth.toml go run cmd/seth/seth.go" # useful alias for CLI
```
Find the log level options here
Alternatively, if you don't have a network defined in the TOML you can still use the CLI by providing these two key env vars:
```shell
export SETH_URL=https://rpc.fuji.testnet.anyswap.exchange
export SETH_CHAIN_ID=43113
go run cmd/seth/seth.go ... # your command
```
In that case you should still pass the network name with the `-n` flag.
TOML configuration
Set up your ABI directory (relative to `seth.toml`):
```toml
abi_dir = "contracts/abi"
```
Set up your BIN directory (relative to `seth.toml`):
```toml
bin_dir = "contracts/bin"
```
Decide whether you want to generate any ephemeral keys:
```toml
# Set the number of ephemeral keys to be generated (0 for no ephemeral keys). Each key will receive
# a proportion of native tokens from the root private key's balance, with the value equal to
# `(root_balance / ephemeral_keys_number) - transfer_fee * ephemeral_keys_number`.
ephemeral_addresses_number = 10
```
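As a quick worked example of the funding formula in the comment above (illustrative numbers, not framework code):

```go
package main

import "fmt"

// fundingPerKey applies the documented formula:
// (root_balance / ephemeral_keys_number) - transfer_fee * ephemeral_keys_number
// The units are whatever the root balance is denominated in; values are illustrative.
func fundingPerKey(rootBalance, transferFee, keys int64) int64 {
	return rootBalance/keys - transferFee*keys
}

func main() {
	// e.g. a root balance of 10_000_000_000 units, a 21_000 transfer fee, 10 keys:
	// 10_000_000_000/10 - 21_000*10 = 999_790_000 per key
	fmt.Println(fundingPerKey(10_000_000_000, 21_000, 10))
}
```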
You can enable auto-tracing for all transactions meeting the configured level, which means that every time you use `Decode()` we will decode the transaction and also trace all calls made within it, together with all inputs, outputs, logs and events. Three tracing levels are available:
- `all` - trace all transactions
- `reverted` - trace only reverted transactions (the default if you don't set `tracing_level`)
- `none` - don't trace any transactions

Example:
```toml
tracing_level = "reverted"
```
Additionally, you can decide where tracing/decoding data goes. There are three options:
- `console` - print all tracing data to the console
- `json` - save tracing data for each transaction to a JSON file
- `dot` - save tracing data for each transaction to a DOT file (graph)

Example:
```toml
trace_outputs = ["console", "json", "dot"]
```
For info on viewing DOT files, please check the DOT graphs section below.
These options should be used with care when `tracing_level` is set to `all`, as they might generate a lot of data.
If you want to check that the RPC is healthy on start, you can enable it with:
```toml
check_rpc_health_on_start = true
```
It executes a simple check: transferring 10k wei from the root key to itself and verifying that the transaction succeeds.
You can also enable pending nonce protection, which checks whether a given key has any pending transactions. By default, we wait 1 minute for all transactions to be mined; if any of them are still pending, we panic. Enable it with:
```toml
pending_nonce_protection_enabled = true
pending_nonce_protection_timeout = "5m"
```
If you want to use HTTP instead of WS, you can do so by setting:
```toml
force_http = true
```
You can add more networks like this:
```toml
[[Networks]]
name = "Fuji"
transaction_timeout = "30s"
# gas limit should be explicitly set only if you are connecting to a node that's incapable of estimating gas limit itself (should only happen for very old versions)
# gas_limit = 9_000_000
# hardcoded gas limit for sending funds that will be used if estimation of gas limit fails
transfer_gas_fee = 21_000
# legacy transactions
gas_price = 1_000_000_000
# EIP-1559 transactions
eip_1559_dynamic_fees = true
gas_fee_cap = 25_000_000_000
gas_tip_cap = 1_800_000_000
urls_secret = ["..."]
# if set to true we will dynamically estimate gas for every transaction (explained in more detail below)
gas_price_estimation_enabled = true
# how many of the last blocks to use when estimating gas for a transaction
gas_price_estimation_blocks = 1000
# priority of the transaction, can be "fast", "standard" or "slow" (the higher the priority, the higher the adjustment factor and buffer used for gas estimation) [default: "standard"]
gas_price_estimation_tx_priority = "slow"
```
If you don't add any, we will use the default settings from the `Default` network.
ChainID is not needed, as it's fetched from the node.
If you want to save addresses of deployed contracts, you can enable it with:
```toml
save_deployed_contracts_map = true
```
If you want to re-use previously deployed contracts, you can indicate the file name in `seth.toml`:
```toml
contract_map_file = "deployed_contracts_mumbai.toml"
```
Both features only work for live networks. Otherwise, they are ignored, and nothing is saved/read from for simulated networks.
Automatic Gas Estimator
This section explains how to configure and understand the automatic gas estimator, which is crucial for executing transactions on Ethereum-based networks. Here’s what you need to know:
Configuration Requirements
Before using the automatic gas estimator, it's essential to set the default gas-related parameters for your network:
- Non-EIP-1559 networks: set the `gas_price` to define the cost per unit of gas if your network doesn't support EIP-1559.
- EIP-1559 networks: if your network supports EIP-1559, set the following:
  - `eip_1559_dynamic_fees`: enables the dynamic fee structure
  - `gas_fee_cap`: the maximum fee you're willing to pay per gas
  - `gas_tip_cap`: an optional tip to prioritize your transaction within a block (although if it's set to `0` there's a high chance your transaction will take longer to execute, as it will be less attractive to miners, so do set it)
These settings act as a fallback if the gas estimation fails. Additionally, always specify `transfer_gas_fee` for the fee associated with token transfers.
If you don't know whether your network supports EIP-1559 but want to give it a try, it's recommended that you also set `gas_price` as a fallback. When we try to use EIP-1559 during gas price estimation and it fails, we fall back to non-EIP-1559 logic. If that fails as well, we use the hardcoded `gas_price` value.
How Gas Estimation Works
Gas estimation varies based on whether the network is a private Ethereum Network or a live network.
- Private Ethereum Networks: no estimation is needed. We always use hardcoded values.
For real networks, the estimation process differs for legacy transactions and those compliant with EIP-1559:
Legacy Transactions
- Initial price: query the network node for the current suggested gas price.
- Priority adjustment: modify the initial price based on `gas_price_estimation_tx_priority`. Higher priority increases the price to ensure faster inclusion in a block.
- Congestion analysis: examine the last X blocks (as specified by `gas_price_estimation_blocks`) to determine network congestion, calculating the usage rate of gas in each block and giving recent blocks more weight. Disabled if `gas_price_estimation_blocks` equals `0`.
- Buffering: add a buffer to the adjusted gas price to increase transaction reliability during high congestion.
EIP-1559 Transactions
- Tip fee query: ask the node for the current recommended tip fee.
- Fee history analysis: gather the base fee and tip history from recent blocks to establish a fee baseline.
- Fee selection: use the greater of the node's suggested tip or the historical average tip for upcoming calculations.
- Priority and adjustment: increase the base and tip fees based on transaction priority (`gas_price_estimation_tx_priority`), which influences how much you are willing to spend to expedite your transaction.
- Final fee calculation: sum the base fee and adjusted tip to set the `gas_fee_cap`.
- Congestion buffer: similar to legacy transactions, analyze congestion and apply a buffer to both the fee cap and the tip to secure transaction inclusion.
Understanding and setting these parameters correctly ensures that your transactions are processed efficiently and cost-effectively on the network.
When fetching historical base fee and tip data, we use the last `gas_price_estimation_blocks` blocks. If it's set to `0`, we default to the last `100` blocks. If the blockchain has fewer than `100` blocks, we use all of them.
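The block-count rule above can be sketched as a small helper (illustrative only, not Seth's API):

```go
package main

import "fmt"

// blocksToUse mirrors the documented rule: a configured value of 0 means
// "default to 100", and we never request more blocks than the chain has.
func blocksToUse(configured, chainHeight int) int {
	if configured == 0 {
		configured = 100
	}
	if chainHeight < configured {
		return chainHeight
	}
	return configured
}

func main() {
	fmt.Println(blocksToUse(0, 1000))   // default kicks in: 100
	fmt.Println(blocksToUse(0, 42))     // young chain: all 42 blocks
	fmt.Println(blocksToUse(250, 1000)) // explicit setting: 250
}
```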
Finally, `gas_price_estimation_tx_priority` is also used when deciding which percentile of historical fee data to use for the base fee and tip. Here's how that looks:
```go
case Priority_Fast:
	baseFee = stats.GasPrice.Perc99
	historicalGasTipCap = stats.TipCap.Perc99
case Priority_Standard:
	baseFee = stats.GasPrice.Perc50
	historicalGasTipCap = stats.TipCap.Perc50
case Priority_Slow:
	baseFee = stats.GasPrice.Perc25
	historicalGasTipCap = stats.TipCap.Perc25
```
Adjustment factor
All values are multiplied by an adjustment factor, which is calculated based on `gas_price_estimation_tx_priority`:
```go
case Priority_Fast:
	return 1.2
case Priority_Standard:
	return 1.0
case Priority_Slow:
	return 0.8
```
For fast transactions we will increase gas price by 20%, for standard we will use the value as is and for slow we will decrease it by 20%.
Buffer percents
If `gas_price_estimation_blocks` is higher than `0`, we further adjust the gas price by adding a buffer to it, based on the congestion rate:
```go
case Congestion_Low:
	return 1.10, nil
case Congestion_Medium:
	return 1.20, nil
case Congestion_High:
	return 1.30, nil
case Congestion_VeryHigh:
	return 1.40, nil
```
For a low congestion rate we increase the gas price by 10%, for medium by 20%, for high by 30% and for very high by 40%. We cache block header data in an in-memory cache, so we don't have to fetch it every time we estimate gas. The cache capacity equals `gas_price_estimation_blocks`, and every time we add a new element we evict the one that is least frequently used and oldest (since the chain only moves forward, it makes no sense to keep old blocks). Note that in order to use congestion metrics we need to fetch at least 80% of the requested blocks; if that fails, we skip this part of the estimation and only adjust the gas price based on priority.
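Putting the two multipliers together, here is a worked example (illustrative only; the real implementation lives in Seth):

```go
package main

import "fmt"

// adjustedGasPrice combines the two multipliers described above:
// a priority-based adjustment factor and a congestion-based buffer.
func adjustedGasPrice(suggested, adjustment, buffer float64) float64 {
	return suggested * adjustment * buffer
}

func main() {
	// 100 gwei suggested, "fast" priority (1.2), medium congestion buffer (1.2)
	// -> roughly 144 gwei
	fmt.Println(adjustedGasPrice(100, 1.2, 1.2))
}
```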
For both transaction types if any of the steps fails, we fall back to hardcoded values.
DOT graphs
There are multiple ways of visualising DOT graphs:
- the `xdot` application [recommended]
- VSCode extensions
- online viewers

xdot
To install it, simply run `brew install xdot`, then run `xdot <path_to_dot_file>`. This tool seems to be the best for the job, since the viewer is interactive and supports tooltips, which in our case contain extra tracing information.
VSCode Extensions
There are multiple extensions that can be used to view DOT files in VSCode. We recommend using Graphviz Preview. The downside is that it doesn't support tooltips.
Goland
We were unable to find any (working) plugins for DOT graph visualization. If you do know any, please let us know.
Online viewers
There are at least a dozen available, but none of them support tooltips and most can't handle our multi-line labels. These two are known to work, though:
Using multiple keys
If you want to use multiple existing keys (instead of ephemeral ones), you can pass them as part of the network configuration. In that case it's recommended not to read them from the TOML file; if you need to read them from the filesystem or OS, it's best to use environment variables.
Once you've read them in a safe manner, add them programmatically to Seth's Config struct (the safe parts of which can be read from a TOML file). You can either add them directly to `Network`, if it's already set up, or add them to the network you intend to use in the `Networks` slice.
For example you could start by reading the TOML configuration first:
cfg, err := seth.ReadCfg()
if err != nil {
log.Fatal(err)
}
Then read the private keys in a safe manner. For example from a secure vault or environment variables:
privateKeys, err := some_utils.ReadPrivateKeysFromEnv()
if err != nil {
    log.Fatal(err)
}
and then add them to the Network
you plan to use. Let's assume it's called Sepolia
:
for i, network := range cfg.Networks {
if network.Name == "Sepolia" {
cfg.Networks[i].PrivateKeys = privateKeys
}
}
Or if you aren't using [[Networks]]
in your TOML config and have just a single Network
:
cfg.Network.PrivateKeys = privateKeys
Or... you can use the convenience function AppendPksToNetwork()
to have them added to both the Network
and Networks
slice:
added := cfg.AppendPksToNetwork(privateKeys, "Sepolia")
if !added {
log.Fatal("Network Sepolia not found in the config")
}
Finally, proceed to create a new Seth instance:
seth, err := seth.NewClientWithConfig(cfg)
if err != nil {
log.Fatal(err)
}
A working example can be found here as TestSmokeExampleMultiKeyFromEnv
test.
Currently, there's no safe way to pass multiple keys to the CLI. In that case TOML is the only option, but be mindful: if you commit a TOML file with keys in it, you should assume they are compromised and all funds on those keys are lost.
Experimental features
To enable an experimental feature, you need to pass its name in the config. This is a global setting; you cannot enable features per network. Example:
# other settings before...
tracing_level = "reverted"
trace_outputs = ["console"]
experiments_enabled = ["slow_funds_return", "eip_1559_fee_equalizer"]
Here's what they do:
- `slow_funds_return` works only in `core`; when enabled, it changes tx priority to `slow` and increases the transaction timeout to 30 minutes.
- `eip_1559_fee_equalizer`: for EIP-1559 transactions, if it detects that the historical base fee and the suggested/historical tip are more than 3 orders of magnitude apart, it uses the higher value for both (this helps when the base fee is almost 0 and the transaction is never processed).
Gas bumping for slow transactions
Seth has a built-in gas bumping mechanism for slow transactions. If a transaction is not mined within a certain time frame (the `Network`'s transaction timeout), Seth will automatically bump the gas price and resubmit it. This feature is disabled by default and can be enabled by setting `[gas_bumps] retries` to a non-zero number:
[gas_bumps]
retries = 5
Once enabled, the amount by which the gas price is bumped depends by default on the `gas_price_estimation_tx_priority` setting and is calculated as follows:
- `Priority_Fast`: 30% increase
- `Priority_Standard`: 15% increase
- `Priority_Slow`: 5% increase
- everything else: no increase
You can cap the maximum gas price with this setting (in wei):
[gas_bumps]
max_gas_price = 1000000000000
Once a bump would take the gas price above the limit, we stop bumping and use the last gas price that was below it.
How the gas price is calculated depends on the transaction type:
- for legacy transactions, it's just the gas price
- for EIP-1559 transactions, it's the sum of the gas fee cap and the tip cap
- for Blob transactions (EIP-4844), it's the sum of the gas fee cap, the tip cap, and the max fee per blob
- for AccessList transactions (EIP-2930), it's just the gas price
Please note that Blob and AccessList support remains experimental and is not tested.
If you want to use a custom bumping strategy, you can provide a function of type `GasBumpStrategyFn`. Here's an example of a custom strategy that bumps the gas price by 100% on every retry:
var customGasBumpStrategyFn = func(gasPrice *big.Int) *big.Int {
return new(big.Int).Mul(gasPrice, big.NewInt(2))
}
To use this strategy, you need to pass it to the WithGasBumping
function in the ClientBuilder
:
var hundredGwei int64 = 100_000_000_000
client, err := builder.
// other settings...
WithGasBumping(5, hundredGwei, customGasBumpStrategyFn).
Build()
Or set it directly on Seth's config:
// assuming sethClient is already created
sethClient.Config.GasBumps.StrategyFn = customGasBumpStrategyFn
Since the strategy function accepts only a single parameter, if you want to base its behaviour on anything else, you need to capture those values in the context where you define the function. For example, you can use a closure to capture a gas oracle client:
gasOracleClient := NewGasOracleClient()
var oracleGasBumpStrategyFn = func(gasPrice *big.Int) *big.Int {
// get the current gas price from the oracle
suggestedGasPrice := gasOracleClient.GetCurrentGasPrice()
// if oracle suggests a higher gas price, use it
if suggestedGasPrice.Cmp(gasPrice) == 1 {
return suggestedGasPrice
}
// otherwise bump by 100%
return new(big.Int).Mul(gasPrice, big.NewInt(2))
}
The same strategy is applied to all transaction types, regardless of whether it involves the gas price, gas fee cap, gas tip cap, or max blob fee.
When enabled, gas bumping is used in two places:
- during contract deployment via the `DeployContract` function
- inside the `Decode()` function
It is recommended to decrease the transaction timeout when using gas bumping, as the effective wait is multiplied by the number of retries. So if you were running with a 5-minute timeout and 0 retries, you should set it to 1 minute and 5 retries, or 30 seconds and 10 retries.
Don't worry if the original transaction gets mined while the bumping logic executes. In that case, sending the replacement transaction with a higher gas price will fail (because it uses the same nonce as the original transaction), and we will go back to waiting for the original transaction to be mined.
Gas bumping is only applied to submitted transactions. If a transaction was rejected by the node (e.g. because the base fee was too low), we will neither bump the gas price nor try to resubmit it, because the original transaction submission happens outside of Seth.
CLI
You can either define the network you want to interact with in your TOML config and then refer to it in the CLI command, or pass all network parameters via environment variables. Most of the examples below use the former approach.
Manual gas price estimation
To manually estimate the gas price for a transaction, you can use the `seth gas` command:
seth -n Fuji gas -b 10000 -tp 0.99
This will analyze the last 10,000 blocks and give you the 25th/50th/75th/99th/max percentiles for base fees and tip fees.
`-tp 0.99` requests the 99th tip percentile across all transactions in each block, then calculates the 25th/50th/75th/99th/max across all blocks.
Block Stats
If you need insights into network stats to create a realistic load/chaos profile with simulators (`anvil`, for example), you can use the `stats` CLI command.
Define your network in seth.toml
Edit your seth.toml
[[networks]]
name = "MyCustomNetwork"
urls_secret = ["..."]
[block_stats]
rpc_requests_per_second_limit = 5
Then check the stats for the last N blocks
seth -n MyCustomNetwork stats -s -10
To check stats for the interval (A, B)
seth -n MyCustomNetwork stats -s A -e B
Pass all network parameters via env vars
If you don't have a network defined in the TOML, you can still use the CLI by providing the RPC URL via a command-line argument.
Then check the stats for the last N blocks
seth -u "https://my-rpc.network.io" stats -s -10
To check stats for the interval (A, B)
seth -u "https://my-rpc.network.io" stats -s A -e B
The results can help you understand whether the network is stable, and what its average block time, gas price, block utilization, and transactions per second are.
# Stats
perc_95_tps = 8.0
perc_95_block_duration = '3s'
perc_95_block_gas_used = 1305450
perc_95_block_gas_limit = 15000000
perc_95_block_base_fee = 25000000000
avg_tps = 2.433333333333333
avg_block_duration = '2s'
avg_block_gas_used = 493233
avg_block_gas_limit = 15000000
avg_block_base_fee = 25000000000
# Recommended performance/chaos test parameters
duration = '2m0s'
block_gas_base_fee_initial_value = 25000000000
block_gas_base_fee_bump_percentage = '100.00% (no bump required)'
block_gas_usage_percentage = '3.28822000% gas used (no congestion)'
avg_tps = 3.0
max_tps = 8.0
Single transaction tracing
You can trace a single transaction using seth trace
command. Example with seth
alias mentioned before:
seth -u "https://my-rpc.network.io" trace -t 0x4c21294bf4c0a19de16e0fca74e1ea1687ba96c3cab64f6fca5640fb7b84df65
or if you want to use a predefined network:
seth -n=Geth trace -t 0x4c21294bf4c0a19de16e0fca74e1ea1687ba96c3cab64f6fca5640fb7b84df65
Bulk transaction tracing
You can trace multiple transactions at once using seth trace
command for a predefined network named Geth
. Example:
seth -n=Geth trace -f reverted_transactions.json
or by passing all the RPC parameters with a flag:
seth -u "https://my-rpc.network.io" trace -f reverted_transactions.json
You need to pass a file with a list of transaction hashes to trace. The file should be a JSON array of transaction hashes, like this:
[
"0x...",
"0x...",
"0x...",
...
]
(Note that currently Seth automatically creates reverted_transactions_<network>_<date>.json
with all reverted transactions, so you can use this file as input for the trace
command.)
RPC Traffic logging
With SETH_LOG_LEVEL=trace
we will also log all traffic between Seth and the RPC node to the console. This can be useful for debugging, as you can see all requests and responses.
Read-only mode
It's possible to use Seth in read-only mode, only for transaction confirmation and tracing. The following operations will fail:
- contract deployment (we need a pk to sign the transaction)
- new transaction options (we need the pk/address to check nonce)
- RPC health check (we need a pk to send a transaction to ourselves)
- pending nonce protection (we need an address to check pending transactions)
- ephemeral keys (we need a pk to fund them)
- gas bumping (we need a pk to sign the transaction)
The easiest way to enable read-only mode is to build the client via the `ClientBuilder`:
:
client, err := builder.
WithNetworkName("my network").
WithRpcUrl("ws://localhost:8546").
WithEphemeralAddresses(10, 1000).
WithPrivateKeys([]string{"ac0974bec39a17e36ba4a6b4d238ff944bacb478cbed5efcae784d7bf4f2ff80"}).
WithReadOnlyMode().
Build()
When the builder is called with `WithReadOnlyMode()`, it disables all the operations mentioned above along with all configuration settings related to them.
Additionally, when the client is built and `cfg.ReadOnly = true` is set, we validate that:
- no addresses and private keys are passed
- no ephemeral addresses are to be created
- RPC health check is disabled
- pending nonce protection is disabled
- gas bumping is disabled
ABI Finder
To decode and trace transactions and calls between smart contracts, we need their ABIs. Unfortunately, two or more contracts may have methods with the same signatures, which can result in incorrect tracing. To make this problem less severe, we added a single point of entry for contract deployment in Seth to track which contract is deployed at which address, and thus minimise incorrect tracing due to potentially ambiguous method signatures.
There are two possible starting points:
1. We don't know what contract (ABI) is located at a given address. This is the case when the contract either wasn't deployed via Seth or we haven't supplied Seth with a contract map as part of its configuration (more on that later).

   a. We sequentially iterate over all known ABIs (map: `name -> ABI_name`), checking whether each has a method with the given signature. Once we get a first match, we upsert that (`address -> ABI_name`) entry into the contract map and return the ABI. The caveat is that if the method we are searching for is present in more than one ABI, we might associate the address with an incorrect ABI (we use the first match).

   b. If no match is found, we return an error.

2. We know what ABI is located at a given address. This is the case when we have either deployed the contract via Seth, provided Seth with a contract map, or already traced a transaction to that address and found an ABI with a matching method signature.

   a. We fetch the corresponding ABI and check whether it indeed contains the method we are looking for (as mentioned earlier, it might not, because we may have associated the ABI with the address via a method it shares with some other contract).

   b. If it does, we return the ABI.

   c. If it doesn't, we iterate again over all known ABIs, in the same way as in 1a. If we find a match, we update the (`address -> ABI_name`) association in the contract map and return the ABI. This may happen multiple times if we have multiple contracts with identical methods, but given a sufficiently diverse set of called methods, we should eventually arrive at a fully correct contract map.

   d. If no match is found, we return an error.
Example
Contract Map
We support an in-memory contract map and a TOML file-based one that keeps the (`address -> ABI_name`) association. The latter is only used for non-simulated networks. Every time we deploy a contract, we save the (`address -> ABI_name`) entry in the in-memory map; if the network is not simulated, we also save it to a file. That file can later be referenced in the Seth configuration, and we will load the contract map from it (currently without validating that we have all the ABIs mentioned in the file).
When saving contract deployment information, we either generate the filename for you (if you didn't configure Seth to use a particular file) using the pattern `deployed_contracts_${network_name}_${timestamp}.toml`, or use the filename provided in the Seth TOML configuration file.
Note that the file-based contract map is currently updated only when new contracts are deployed. There's no mechanism for updating it if the mapping turns out to be invalid (which might happen if you created an entry in the file manually).
Contract Store
The contract store is a component that stores ABIs and contracts' bytecode. In theory, Seth can be used without it, but its usage would be very limited, as transaction decoding and tracing cannot work without ABIs. Thus, in practice, we enforce a non-empty contract store during Seth initialisation.
Another use of the contract store is simplified contract deployment, for which we also need the contract's bytecode. The store can hold the bytecode, allowing deployment via the `DeployContractFromContractStore(auth *bind.TransactOpts, name string, backend bind.ContractBackend, params ...interface{})` method. When Seth is initialised with the contract store but no bytecode files (`*.bin`) are provided, it logs a warning but initialises successfully nonetheless.
If a bytecode file wasn't provided, you need to use the `DeployContract(auth *bind.TransactOpts, name string, abi abi.ABI, bytecode []byte, backend bind.ContractBackend, params ...interface{})` method, which expects you to provide the contract name (ideally equal to the name of the ABI file), the bytecode, and the ABI.
WASP - Overview
WASP is a lightweight load testing harness that makes it easy to compose load profiles of varying complexity using arbitrary Go
code.
Main Features
- Predictable performance footprint
- Flexible, non-opinionated reporting that pushes data to
Loki
- Supports both stateless and stateful protocols
- Compatible with open and closed load models
- Includes a default Grafana dashboard with reusable components
Sounds interesting? Let's get started!
WASP - Getting Started
Prerequisites
Setup
To start writing tests, create a directory for your project with a go.mod
file, then add the WASP package:
go get github.com/smartcontractkit/chainlink-testing-framework/wasp
warning
To execute any of the tests from the next chapters, you need access to Loki and Grafana.
You can find instructions on setting them up locally here.
That was simple, wasn't it? Time to write your first test.
WASP - First Test (RPS Test)
Requirements
- Go installed
- Loki
Let's start by creating a simple test that sends 5 HTTP requests per second for 60 seconds.
We will use a Gun
, which is designed for stateless protocols like HTTP or measuring throughput.
A Gun
only needs to implement this single-method interface:
type Gun interface {
Call(l *Generator) *Response
}
Defining the Gun
First, let's define a struct that will hold our Gun
implementation:
type ExampleGun struct {
target string
client *resty.Client
Data []string
}
Our Gun
will send a GET
request to the target URL. If the request is successful, we return a *wasp.Response
containing the response data. If it fails or responds with an HTTP status other than 200
, we also return a *wasp.Response
with the response and an error.
Here’s the implementation of the Gun
interface:
func (m *ExampleGun) Call(l *wasp.Generator) *wasp.Response {
var result map[string]interface{}
r, err := m.client.R().
SetResult(&result).
Get(m.target)
if err != nil {
return &wasp.Response{Data: result, Error: err.Error()}
}
if r.Status() != "200 OK" {
return &wasp.Response{Data: result, Error: "not 200"}
}
return &wasp.Response{Data: result}
}
note
By default, WASP stores all successful and failed responses, but you can use a Sampler
to store only some of the successful ones. You can read more about it here.
Writing the Test
Now that we have a Gun
, let’s write the test:
func TestGun(t *testing.T) {
// start mock HTTP server
srv := wasp.NewHTTPMockServer(nil)
srv.Run()
// define labels to differentiate one run from another
labels := map[string]string{
// check variables in dashboard/dashboard.go
"go_test_name": "TestGun",
"gen_name": "test_gun",
"branch": "my-awesome-branch",
"commit": "f3729fa",
}
// create generator
gen, err := wasp.NewGenerator(&wasp.Config{
LoadType: wasp.RPS,
// plain line profile - 5 RPS for 60s
Schedule: wasp.Plain(5, 60*time.Second),
Gun: NewExampleHTTPGun(srv.URL()),
Labels: labels,
LokiConfig: wasp.NewEnvLokiConfig(),
})
if err != nil {
panic(err)
}
// run the generator and wait until it finishes
gen.Run(true)
}
note
We used the LoadType
of wasp.RPS
since this is the only type of load that a Gun
can handle.
You can read more about load types here.
note
You can learn more about different labels and their functions here.
note
wasp.NewEnvLokiConfig()
configures Loki using environment variables. You can read about them here.
Conclusion
And that's it! You’ve just created your first test using WASP. You can now run it using:
go test -v -run TestGun
You can find the full example here.
What’s Next?
What if you want to test a more complex scenario?
First, we’ll look at a stateful protocol test, like WebSocket
. Then, we’ll explore a hypothetical user journey test, where multiple requests need to be executed.
WASP - Stateful Protocol Testing
WASP allows you to test stateful protocols like WebSocket
. These protocols are more complex than stateless protocols, which is reflected in the slightly higher complexity of the interface to be implemented: the VirtualUser
interface.
type VirtualUser interface {
Call(l *Generator)
Clone(l *Generator) VirtualUser
Setup(l *Generator) error
Teardown(l *Generator) error
Stop(l *Generator)
StopChan() chan struct{}
}
Defining the Virtual User
As before, let's start by defining a struct that will hold our VirtualUser
implementation:
type WSVirtualUser struct {
target string
*wasp.VUControl
conn *websocket.Conn
Data []string
}
Implementing the Clone Method
We will begin by implementing the Clone()
function, used by WASP to create new instances of the VirtualUser
:
func (m *WSVirtualUser) Clone(_ *wasp.Generator) wasp.VirtualUser {
return &WSVirtualUser{
VUControl: wasp.NewVUControl(),
target: m.target,
Data: make([]string, 0),
}
}
Implementing the Setup Method
Next, we implement the Setup()
function, which establishes a connection to the WebSocket server:
func (m *WSVirtualUser) Setup(l *wasp.Generator) error {
var err error
m.conn, _, err = websocket.Dial(context.Background(), m.target, &websocket.DialOptions{})
if err != nil {
l.Log.Error().Err(err).Msg("failed to connect from vu")
_ = m.conn.Close(websocket.StatusInternalError, "")
return err
}
return nil
}
We will omit the Teardown()
function for brevity, but it should be used to close the connection to the WebSocket server.
Additionally, we do not need to implement the Stop()
or StopChan()
functions because they are already implemented in the VUControl
struct.
Implementing the Call Method
Now, we implement the Call()
function, which is used to receive messages from the WebSocket server:
func (m *WSVirtualUser) Call(l *wasp.Generator) {
startedAt := time.Now()
v := map[string]string{}
err := wsjson.Read(context.Background(), m.conn, &v)
if err != nil {
l.Log.Error().Err(err).Msg("failed read ws msg from vu")
l.ResponsesChan <- &wasp.Response{StartedAt: &startedAt, Error: err.Error(), Failed: true}
return
}
l.ResponsesChan <- &wasp.Response{StartedAt: &startedAt, Data: v}
}
As you can see, instead of returning a single response directly from Call()
, we send it to the ResponsesChan
channel.
This is done so that we can send each response to Loki independently instead of waiting for the whole call to finish.
Writing the Test
Now, let's write the test:
func TestVirtualUser(t *testing.T) {
// start mock WebSocket server
s := httptest.NewServer(wasp.MockWSServer{
Sleep: 50 * time.Millisecond,
})
defer s.Close()
time.Sleep(1 * time.Second)
// some parts omitted for brevity
// create generator
gen, err := wasp.NewGenerator(&wasp.Config{
LoadType: wasp.VU,
// plain line profile - 5 VUs for 60s
Schedule: wasp.Plain(5, 60*time.Second),
VU: NewExampleVirtualUser(url),
Labels: labels,
LokiConfig: wasp.NewEnvLokiConfig(),
})
if err != nil {
panic(err)
}
// run the generator and wait until it finishes
gen.Run(true)
}
Conclusion
That wasn’t so difficult, was it? You can now test your WebSocket server with WASP. You can find a full example here.
What’s Next?
Now, let’s explore how to test a more complex scenario where a VirtualUser
needs to perform various operations.
WASP - Testing User Journeys
Let's explore a more complex scenario where a user needs to authenticate first before performing an action. Additionally, we will introduce a slightly more advanced load profile:
- 1 user for the first 30 seconds
- 2 users for the next 30 seconds
- 3 users for the final 30 seconds
Since this is a "user journey," we will use a VirtualUser
implementation to represent a user.
Defining the Virtual User
First, let's define the VirtualUser
struct:
type VirtualUser struct {
*wasp.VUControl
target string
Data []string
rateLimit int
rl ratelimit.Limiter
client *resty.Client
}
Here, we’ve added a rate limiter to the struct, which will help limit the number of requests per second. This is useful to prevent overloading the server, especially in this test scenario where the requests are simple and fast. Without a limiter, even a small number of Virtual Users (VUs) could result in a very high RPS.
note
Since VirtualUser
does not inherently limit RPS (it depends on the server's processing speed), you should implement a rate-limiting mechanism if needed.
For brevity, we'll skip the implementation of the Clone()
and Teardown()
functions, as they are similar to previous examples. Additionally, no Setup()
is required because we are using HTTP.
Implementing Requests
User Authentication
// requestOne represents user login
func (m *VirtualUser) requestOne(l *wasp.Generator) {
var result map[string]interface{}
r, err := m.client.R().
SetResult(&result).
Get(m.target)
if err != nil {
l.Responses.Err(r, GroupAuth, err)
return
}
l.Responses.OK(r, GroupAuth)
}
Authenticated Action (e.g., Balance Check)
// represents authenticated user action
func (m *VirtualUser) requestTwo(l *wasp.Generator) {
var result map[string]interface{}
r, err := m.client.R().
SetResult(&result).
Get(m.target)
if err != nil {
l.Responses.Err(r, GroupUser, err)
return
}
l.Responses.OK(r, GroupUser)
}
Combining the Requests
func (m *VirtualUser) Call(l *wasp.Generator) {
m.rl.Take() // apply rate limiting
m.requestOne(l)
m.requestTwo(l)
}
Writing the Test
Now, let’s write the test. Pay attention to how the three phases of the load profile are defined under Schedule
:
func TestScenario(t *testing.T) {
srv := wasp.NewHTTPMockServer(nil)
srv.Run()
_, err := wasp.NewProfile().
Add(wasp.NewGenerator(&wasp.Config{
T: t,
LoadType: wasp.VU,
VU: NewExampleScenario(srv.URL()),
Schedule: wasp.Combine(
wasp.Plain(1, 30*time.Second),
wasp.Plain(2, 30*time.Second),
wasp.Plain(3, 30*time.Second),
),
LokiConfig: wasp.NewEnvLokiConfig(),
})).Run(true)
require.NoError(t, err)
}
Load Profile
We generate load in three phases:
- 1 user for the first 30 seconds
- 2 users for the next 30 seconds
- 3 users for the final 30 seconds
Conclusion
And that's it! We’ve created a test that simulates a user journey with authentication and an action requiring authentication, while varying the load during execution. You can find the full example code here.
But what if you wanted to combine multiple load generators—or even mix a Gun
with a VirtualUser
? Could you do that? Find out in the next chapter.
WASP - Using Profiles
In this section, we’ll explore the most complex scenario: using a VirtualUser
to represent a user and a Gun
to generate background load. We’ll also introduce a new load segment type that varies over time.
To bind together different Generators
(such as a Gun
or a VirtualUser
), we’ll use a Profile
. Think of it as the highest-level abstraction that fully describes the load profile.
Gun
and VirtualUser
We’ll skip defining both the Gun
and the VirtualUser
, as they are nearly identical to previous examples. However, for the VirtualUser
, we’ll use a handy wrapper for sending Response
back to the channel. See if you can spot it:
// represents user login
func (m *VirtualUser) requestOne(l *wasp.Generator) {
var result map[string]interface{}
r, err := m.client.R().
SetResult(&result).
Get(m.target)
if err != nil {
l.Responses.Err(r, GroupAuth, err)
return
}
l.Responses.OK(r, GroupAuth)
}
note
You might have noticed GroupAuth
constant passed to OK()
and Err()
methods. This is used to group responses in the dashboard.
You can read more about it here.
Background Load Schedule for the Gun
For the RPS Gun
, we’ll define a schedule where:
- During the first 10 seconds, the RPS steps up by 1 every 2.5 seconds.
- For the next 40 seconds, it remains steady at 5 RPS.
epsilonSchedule := wasp.Combine(
// wasp.Steps(from, increase, steps, duration)
wasp.Steps(1, 1, 4, 10*time.Second), // Start at 1 RPS, increment by 1 RPS in 4 steps over 10 seconds (1 increment every 2.5 seconds)
// wasp.Plain(count, duration)
wasp.Plain(5, 40*time.Second)) // Hold 5 RPS for 40 seconds
Virtual User Schedule
For the VirtualUser
, the schedule will:
- Start with 1 user for the first 10 seconds.
- Add 1 user every 3 seconds for 30 seconds.
- Gradually reduce from 10 users to 0 over 10 seconds.
thetaSchedule := wasp.Combine(
// wasp.Plain(count, duration)
wasp.Plain(1, 10*time.Second), // 1 user for the first 10 seconds
// wasp.Steps(from, increase, steps, duration)
wasp.Steps(1, 1, 9, 30*time.Second), // Increment by 1 user every ~3 seconds over 30 seconds
// wasp.Steps(from, increase, steps, duration)
wasp.Steps(10, -1, 10, 10*time.Second)) // Decrement by 1 user every second over 10 seconds
Defining the Profile
We’ll now define our Profile
to combine both the Gun
and the VirtualUser
:
_, err := wasp.NewProfile().
Add(wasp.NewGenerator(&wasp.Config{
T: t,
LoadType: wasp.VU,
GenName: "Theta",
Schedule: thetaSchedule,
VU: NewExampleScenario(srv.URL()),
LokiConfig: wasp.NewEnvLokiConfig(),
})).
Add(wasp.NewGenerator(&wasp.Config{
T: t,
LoadType: wasp.RPS,
GenName: "Epsilon",
Schedule: epsilonSchedule,
Gun: NewExampleHTTPGun(srv.URL()),
LokiConfig: wasp.NewEnvLokiConfig(),
})).
Run(true)
Conclusion
And that’s it! You’ve created a complex load profile that simulates a growing number of users alongside a background load that varies over time. Notice the .Run(true)
method, which blocks until all the Profile
's generators have finished.
You can find the full example here.
note
The error
returned by the .Run(true)
function of a Profile
might indicate one of the following:
- a load generator did not start and instead returned an error
- if the `Profile` was created with the `WithGrafana` option and `CheckDashboardAlertsAfterRun` was enabled, some alerts fired

To check for errors during load generation, you need to call the `Errors()` method on each `Generator` within the `Profile`.
note
Currently, it’s not possible to have a "waiting" schedule that doesn’t generate any load.
To implement such logic, start one `Profile` in a goroutine and use `time.Sleep` before starting the second `Profile`, or use a combination of the `.Run(false)` and `.Wait()` methods.
Examples of both approaches can be found here.
What’s Next?
Now that you know how to generate load, it’s time to learn how to monitor and assert on it. Let’s move on to the next section: Testing Alerts.
WASP - Configuration
WASP can be configured using environment variables for most commonly used settings. However, to fully leverage the flexibility of WASP, you may need to use programmatic configuration for more advanced features.
Required Environment Variables
At a minimum, you need to provide the following environment variables, as WASP requires Loki:
LOKI_URL
- The Loki endpoint to which logs are pushed (e.g., http://localhost:3100/loki/api/v1/push).LOKI_TOKEN
- The authorization token.
Optionally, you can also provide the following:
LOKI_TENANT_ID
- A tenant ID that acts as a bucket identifier for logs, logically separating them from other sets of logs. If the tenant ID doesn't exist, Loki will create it. Can be empty if log separation is not a concern.LOKI_BASIC_AUTH
- Basic authentication credentials.
Alert Configuration
To enable alert checking, you need to provide the following additional environment variables, as alerts are an integral part of Grafana:
GRAFANA_URL
- The base URL of the Grafana instance.GRAFANA_TOKEN
- An API token with permissions to access the following namespaces:
/api/alertmanager/
,/api/annotations/
,/api/dashboard/
,/api/datasources/
,/api/v1/provisioning/
, and/api/ruler/
.
Grafana Dashboard Creation
If you want WASP to create a Grafana dashboard for you, provide the following environment variables:
DATA_SOURCE_NAME
- The name of the data source (currently, onlyLoki
is supported).DASHBOARD_FOLDER
- The folder in which to create the dashboard.DASHBOARD_NAME
- The name of the dashboard.
Log Level Control
You can control the log level using this environment variable:
WASP_LOG_LEVEL
- Sets the log level (trace
,debug
,info
,warn
,error
; defaults toinfo
).
And that's it! You are now ready to start using WASP to its full potential.
note
Remember, WASP offers much more configurability through its programmatic API.
WASP - Components
In this section, we will briefly describe the main logical components that are part of the WASP library:
Brief overview of how WASP works and how different components interact with each other:
WASP - Dashboard
warning
The API used to check and create alerts is unstable, and the information related to them may be out of date.
WASP comes with a built-in dashboard that allows you to monitor test runs in real-time.
The dashboard includes several built-in metrics that integrate seamlessly with the AlertChecker
component.
It is built using the Grabana library, which you can use to further customize and extend the dashboard by adding your own rows and panels.
note
To create new dashboards, you need to set certain dashboard-specific variables as described in the Configuration section.
Predefined Alerts
WASP comes with predefined alerts for:
- 99th percentile of the response time
- Response errors
- Response timeouts
You can use these predefined metrics to add simple alerts to your dashboard for conditions such as:
- Values above or below a threshold
- Averages above or below a threshold
- Percentages of the total above or below a threshold
For a complete list of available conditions, refer to Grabana's ConditionEvaluator.
Custom Alerts
Custom alerts can be composed of:
- The simple conditions mentioned above
- Arbitrary Loki queries
Custom alerts use Grabana's timeseries.Alert and must be timeseries-based.
note
Adding a built-in alert will also add a new row to the dashboard to display the monitored metric.
In contrast, custom alerts do not automatically add rows to the dashboard to prevent clutter.
Each generator has its own metrics, matched by the generator name.
Default Dashboard Panels
The default dashboard includes the following panels:
- Current RPS/VUs (depending on the generator)
- Responses per second
- Total successful requests
- Total failed requests
- Total timeout requests
- RPS/VUs per schedule segment
- Responses per second
- Latency quantiles over groups (p99, p90, p50)
- Response latencies over time
- Logs size per second
- Sampling statistics
- Failed & timed-out responses
- Logs of the statistics-pushing service
Where applicable, these panels group results by generator name (`gen_name` label) and call group (`call_group` label).
note
You can read more about using labels in the Use labels section.
Creating a New Dashboard with Alerts
For a practical example of how to create a new dashboard with alerts, see the Testing Alerts section.
WASP - Generator
A Generator is a component that encapsulates all the characteristics of load generation, including:
- Load type:
- RPS (requests per second)
- VUs (virtual users)
- Load schedule
- Call logic
- Response data
- Sampling
- Timeouts
warning
The RPS load type can only be used with a `Gun`, while VUs can only be used with a `VirtualUser`.
Choosing Between `Gun` and `VirtualUser`
`Gun`
- Best for stateless protocols (e.g., HTTP).
- Simplistic in nature; ideal for executing a single operation that does not require setup or teardown.
- Operates using an open model, where:
- The number of requests is fixed.
- The load adjusts to meet the target RPS, regardless of the system's response time.
- There's no feedback from the system.
- Recommended for scenarios focused on measuring throughput.
`VirtualUser`
- Designed for stateful protocols (e.g., `WebSocket`) or workflows involving multiple operations (e.g., authenticating, executing tasks, and logging out).
- More complex, with dedicated methods for setup and teardown.
- Operates using a closed model, where:
- New iterations start only after the previous one completes.
- The RPS fluctuates based on the system's response time. Longer response times reduce RPS.
- Feedback from the system is used to adjust the load.
Closed vs. Open Models
- A `Gun` follows an open model:
  - It controls the rate of requests being sent.
  - The system's response time does not impact the load generation rate.
- A `VirtualUser` follows a closed model:
  - It controls the rate of receiving responses.
  - The system's response time directly impacts the load generation rate. If the system slows down, iterations take longer, reducing the RPS.
Summary
In simpler terms:
- A `Gun` limits the load during the sending phase, making it ideal for throughput measurements.
- A `VirtualUser` limits the load during the receiving phase, reflecting the system's performance under load.
This distinction helps you decide which tool to use based on the protocol type and the goals of your test.
WASP - Loki
Loki is a component responsible for pushing batches of statistics to a Loki instance. It operates in the background using a `promtail` client.
Key features include:
- Optional basic authentication support.
- Configurable batch sizes.
- Support for configurable backoff retries, among others.
By default, a test will fail on the first error encountered while pushing data. You can modify this behavior by setting the maximum allowed errors:
- Setting the value to `-1` disables error checks entirely.
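For example, a sketch of how that could look in a generator's Loki configuration. The field name `MaxErrors` is an assumption based on WASP's `LokiConfig`; verify it against the version you are using:

```go
// Read the Loki config from environment variables, then relax error checking.
lokiCfg := wasp.NewEnvLokiConfig()
// -1 disables Loki push-error checks entirely (field name assumed; check your WASP release).
lokiCfg.MaxErrors = -1
```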
note
While it is technically possible to execute a WASP test without Loki, doing so means you won't have access to load-generation-related metrics.
Unless your sole interest is in the metrics sent by your application, it is highly recommended to use Loki.
For this reason, Loki is considered an integral part of the WASP stack.
WASP - Profile
A Profile allows you to combine load from different generators. Each generator operates according to its own schedule, and you can even mix RPS and VU load types, as shown in this example.
note
The `error` returned by the `.Run(true)` function of a `Profile` might indicate the following:
- a load generator did not start and instead returned an error,
- or, if the `Profile` was created with the `WithGrafana` option and `CheckDashboardAlertsAfterRun` was enabled, that some alerts fired.
To check for errors during load generation, you need to call the `Errors()` method on each `Generator` within the `Profile`.
Timing Considerations
It is not possible to have different generators in the same `Profile` start load generation at different times.
If you need staggered start times, you must use separate `Profiles` and handle timing in your `Go` code.
An example demonstrating this can be found here.
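A rough sketch of staggering two separate Profiles in plain Go follows. Treat it as pseudocode: the `Profile` API shape is based on common WASP examples, and `firstGenCfg`/`secondGenCfg` are hypothetical `wasp.Config` values you would define yourself:

```go
// Start the first profile in the background.
go func() {
	_, err := wasp.NewProfile().
		Add(wasp.NewGenerator(firstGenCfg)). // firstGenCfg: your own wasp.Config
		Run(true)
	if err != nil {
		log.Fatal(err)
	}
}()

// Wait before starting the second profile to stagger the load.
time.Sleep(30 * time.Second)

_, err := wasp.NewProfile().
	Add(wasp.NewGenerator(secondGenCfg)). // secondGenCfg: your own wasp.Config
	Run(true)
if err != nil {
	log.Fatal(err)
}
```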
WASP - Sampler
By default, WASP saves all successful, failed, and timed-out responses in each Generator.
The Sampler component allows you to programmatically set the sampling ratio for successful responses (a value between 0 and 100).
For example, if you set the sampling ratio to 10, only 10% of successful responses will be saved:
samplerCfg := &SamplerConfig{
SuccessfulCallResultRecordRatio: 10,
}
g := &Generator{
sampler: NewSampler(samplerCfg),
// other fields
}
WASP - Schedule
The Schedule component allows you to define the load characteristics for each Generator.
A schedule is composed of various `Segments`, each characterized by the number of requests (or VUs) and a duration.
Helper Functions for Defining Schedules
WASP provides several helper functions to define schedules in a human-readable way:
- `Plain`: Defines a segment with a stable load.
- `Steps`: Defines a segment with increasing or decreasing load, using a specified step size.
Custom Schedules
You can also define schedules programmatically by directly composing segments to suit your specific requirements.
BenchSpy
BenchSpy (short for Benchmark Spy) is a WASP-coupled tool designed for easy comparison of various performance metrics.
Key Features
- Three built-in data sources:
  - `Loki`
  - `Prometheus`
  - `Direct`
- Standard/pre-defined metrics for each data source.
- Ease of extensibility with custom metrics.
- Ability to load the latest performance report based on Git history.
BenchSpy does not include any built-in comparison logic beyond ensuring that performance reports are comparable (e.g., they measure the same metrics in the same way), offering complete freedom to the user for interpretation and analysis.
Why might you need it?
`BenchSpy` was created with two main goals in mind:
- measuring application performance programmatically,
- finding performance-related changes or regression issues between different commits or releases.
BenchSpy - Getting Started
The following examples assume you have access to the following applications:
- Grafana
- Loki
- Prometheus
note
The easiest way to run these locally is by using CTFv2's observability stack.
Be sure to install the `CTF CLI` first, as described in the CTFv2 Getting Started guide.
Since BenchSpy is tightly coupled with WASP, we highly recommend that you get familiar with it first if you haven't already.
Ready? Let's get started!
BenchSpy - Your First Test
Let's start with the simplest case, which doesn't require any part of the observability stack—only `WASP` and the application you are testing.
`BenchSpy` comes with built-in `QueryExecutors`, each of which also has predefined metrics that you can use. One of these executors is the `DirectQueryExecutor`, which fetches metrics directly from `WASP` generators, which means you can run it without Loki.
note
Not sure whether to use the `Loki` or `Direct` query executor? Read this!
Test Overview
Our first test will follow this logic:
- Run a simple load test.
- Generate a performance report and store it.
- Run the load test again.
- Generate a new report and compare it to the previous one.
We'll use very simplified assertions for this example and expect the performance to remain unchanged.
Step 1: Define and Run a Generator
Let's start by defining and running a generator that uses a mocked service:
gen, err := wasp.NewGenerator(&wasp.Config{
T: t,
GenName: "vu",
CallTimeout: 100 * time.Millisecond,
LoadType: wasp.VU,
Schedule: wasp.Plain(10, 15*time.Second),
VU: wasp.NewMockVU(&wasp.MockVirtualUserConfig{
CallSleep: 50 * time.Millisecond,
}),
})
require.NoError(t, err)
gen.Run(true)
Step 2: Generate a Baseline Performance Report
With load data available, let's generate a baseline performance report and store it in local storage:
baseLineReport, err := benchspy.NewStandardReport(
    // a version tag or commit hash identifying the Application Under Test (AUT)
"v1.0.0",
// use built-in queries for an executor that fetches data directly from the WASP generator
benchspy.WithStandardQueries(benchspy.StandardQueryExecutor_Direct),
// WASP generators
benchspy.WithGenerators(gen),
)
require.NoError(t, err, "failed to create baseline report")
fetchCtx, cancelFn := context.WithTimeout(context.Background(), 60*time.Second)
defer cancelFn()
fetchErr := baseLineReport.FetchData(fetchCtx)
require.NoError(t, fetchErr, "failed to fetch data for baseline report")
path, storeErr := baseLineReport.Store()
require.NoError(t, storeErr, "failed to store baseline report", path)
note
There's a lot to unpack here, and you're encouraged to read more about the built-in `QueryExecutors` and the standard metrics they provide, as well as about the `StandardReport`, here.
For now, it's enough to know that the standard metrics provided by `StandardQueryExecutor_Direct` include:
- Median latency
- P95 latency (95th percentile)
- Max latency
- Error rate
Step 3: Run the Test Again and Compare Reports
With the baseline report ready, let's run the load test again. This time, we'll use a wrapper function to automatically load the previous report, generate a new one, and ensure they are comparable.
// define a new generator using the same config values
newGen, err := wasp.NewGenerator(&wasp.Config{
T: t,
GenName: "vu",
CallTimeout: 100 * time.Millisecond,
LoadType: wasp.VU,
Schedule: wasp.Plain(10, 15*time.Second),
VU: wasp.NewMockVU(&wasp.MockVirtualUserConfig{
CallSleep: 50 * time.Millisecond,
}),
})
require.NoError(t, err)
// run the load
newGen.Run(true)
fetchCtx, cancelFn = context.WithTimeout(context.Background(), 60*time.Second)
defer cancelFn()
// currentReport is generated from the run we just performed; previousReport is the stored baseline
currentReport, previousReport, err := benchspy.FetchNewStandardReportAndLoadLatestPrevious(
fetchCtx,
// commit or tag of the new application version
"v2.0.0",
benchspy.WithStandardQueries(benchspy.StandardQueryExecutor_Direct),
benchspy.WithGenerators(newGen),
)
require.NoError(t, err, "failed to fetch current report or load the previous one")
note
In a real-world case, once you've generated the first report, you should only need to use the benchspy.FetchNewStandardReportAndLoadLatestPrevious
function.
What's Next?
Now that we have two reports, how do we ensure that the application's performance meets expectations? Find out in the next chapter.
BenchSpy - Simplest Metrics
As mentioned earlier, `BenchSpy` doesn't include any built-in comparison logic. It's up to you to decide how to compare metrics, as there are various ways to approach it and different data formats returned by queries.
For example, if your query returns a time series, you could:
- Compare each data point in the time series individually.
- Compare aggregates like averages, medians, or min/max values of the time series.
Working with Built-in QueryExecutors
Each built-in `QueryExecutor` returns a different data type, and we use the `interface{}` type to reflect this. Since the `Direct` executor always returns `float64`, we have added a convenience function that checks whether any of the standard metrics has degraded by more than the given threshold. If the performance has improved, no error will be returned.
hasErrors, errors := benchspy.CompareDirectWithThresholds(
// maximum differences in percentages for:
1.0, // median latency
1.0, // p95 latency
1.0, // max latency
1.0, // error rate
currentReport,
previousReport,
)
require.False(t, hasErrors, fmt.Sprintf("errors found: %v", errors))
If there are errors, they will be returned as a `map[string][]error`, where the key is the name of a generator.
note
Both `Direct` and `Loki` query executors support the following standard performance metrics out of the box:
- `median_latency`
- `p95_latency`
- `max_latency`
- `error_rate`
The function also prints a table with the differences between the two reports, regardless of whether they were meaningful:
Generator: vu1
==============
+-------------------------+---------+---------+---------+
| METRIC | V1 | V2 | DIFF % |
+-------------------------+---------+---------+---------+
| median_latency | 50.1300 | 50.1179 | -0.0242 |
+-------------------------+---------+---------+---------+
| 95th_percentile_latency | 50.7387 | 50.7622 | 0.0463 |
+-------------------------+---------+---------+---------+
| max_latency | 55.7195 | 51.7248 | -7.1692 |
+-------------------------+---------+---------+---------+
| error_rate | 0.0000 | 0.0000 | 0.0000 |
+-------------------------+---------+---------+---------+
Wrapping Up
And that's it! You've written your first test that uses `WASP` to generate load and `BenchSpy` to ensure that the median latency, 95th percentile latency, max latency, and error rate haven't changed significantly between runs. You accomplished this without even needing a Loki instance. But what if you wanted to leverage the power of `LogQL`? We'll explore that in the next chapter.
note
You can find the full example here.
BenchSpy - Standard Loki Metrics
warning
This example assumes you have access to Loki and Grafana instances. If you don't, learn how to launch them using CTFv2's observability stack.
In this example, our Loki workflow will differ from the previous one in just a few details:
- The generator will include a Loki configuration.
- The standard query executor type will be `benchspy.StandardQueryExecutor_Loki`.
- All results will be cast to `[]string`.
- We'll calculate averages for all metrics.
Ready?
Step 1: Define a New Load Generator
Let's start by defining a new load generator:
label := "benchspy-std"
gen, err := wasp.NewGenerator(&wasp.Config{
T: t,
// read Loki config from environment
LokiConfig: wasp.NewEnvLokiConfig(),
GenName: "vu",
// set unique labels
Labels: map[string]string{
"branch": label,
"commit": label,
},
CallTimeout: 100 * time.Millisecond,
LoadType: wasp.VU,
Schedule: wasp.Plain(10, 15*time.Second),
VU: wasp.NewMockVU(&wasp.MockVirtualUserConfig{
CallSleep: 50 * time.Millisecond,
}),
})
require.NoError(t, err)
Step 2: Run the Generator and Save the Baseline Report
gen.Run(true)
baseLineReport, err := benchspy.NewStandardReport(
"v1.0.0",
// notice the different standard query executor type
benchspy.WithStandardQueries(benchspy.StandardQueryExecutor_Loki),
benchspy.WithGenerators(gen),
)
require.NoError(t, err, "failed to create baseline report")
fetchCtx, cancelFn := context.WithTimeout(context.Background(), 60*time.Second)
defer cancelFn()
fetchErr := baseLineReport.FetchData(fetchCtx)
require.NoError(t, fetchErr, "failed to fetch data for baseline report")
path, storeErr := baseLineReport.Store()
require.NoError(t, storeErr, "failed to store baseline report", path)
Step 3: Skip to Metrics Comparison
Since the next steps are very similar to those in the first test, we’ll skip them and go straight to metrics comparison.
By default, the `LokiQueryExecutor` returns results as the `[]string` data type. Let's use dedicated convenience functions to cast them from `interface{}` to string slices:
allCurrentAsStringSlice := benchspy.MustAllLokiResults(currentReport)
allPreviousAsStringSlice := benchspy.MustAllLokiResults(previousReport)
require.NotEmpty(t, allCurrentAsStringSlice, "current report is empty")
require.NotEmpty(t, allPreviousAsStringSlice, "previous report is empty")
currentAsStringSlice := allCurrentAsStringSlice[gen.Cfg.GenName]
previousAsStringSlice := allPreviousAsStringSlice[gen.Cfg.GenName]
An explanation is needed here: this function separates metrics for each generator, hence it returns a `map[string]map[string][]string`. Let's break it down:
- the outer map's key is the generator name,
- the inner map's key is the metric name, and the value is a series of measurements.
In our case there's only a single generator, but in a complex test there might be a few.
Step 4: Compare Metrics
Step 4: Compare Metrics
Now, let's compare metrics. Since we have `[]string`, we'll first convert it to `[]float64`, calculate the average, and ensure the difference between the averages is less than 1%. Again, this is just an example—you should decide the best way to validate your metrics. Here we are explicitly aggregating them using an average to get a single-number representation of each metric, but for your case a median, a percentile, or some other aggregate might be more appropriate.
var compareAverages = func(t *testing.T, metricName string, currentAsStringSlice, previousAsStringSlice map[string][]string, maxPercentageDiff float64) {
	require.NotEmpty(t, currentAsStringSlice[metricName], "%s results were missing from current report", metricName)
	require.NotEmpty(t, previousAsStringSlice[metricName], "%s results were missing from previous report", metricName)
	currentFloatSlice, err := benchspy.StringSliceToFloat64Slice(currentAsStringSlice[metricName])
	require.NoError(t, err, "failed to convert %s results to float64 slice", metricName)
	currentAverage, err := stats.Mean(currentFloatSlice)
	require.NoError(t, err, "failed to calculate average for %s results", metricName)
	previousFloatSlice, err := benchspy.StringSliceToFloat64Slice(previousAsStringSlice[metricName])
	require.NoError(t, err, "failed to convert %s results to float64 slice", metricName)
	previousAverage, err := stats.Mean(previousFloatSlice)
	require.NoError(t, err, "failed to calculate average for %s results", metricName)
	var diffPercentage float64
	if previousAverage != 0.0 && currentAverage != 0.0 {
		diffPercentage = (currentAverage - previousAverage) / previousAverage * 100
	} else if previousAverage == 0.0 && currentAverage == 0.0 {
		diffPercentage = 0.0
	} else {
		diffPercentage = 100.0
	}
	assert.LessOrEqual(t, math.Abs(diffPercentage), maxPercentageDiff, "%s averages differ by %.4f%%, more than the allowed threshold", metricName, math.Abs(diffPercentage))
}
compareAverages(
t,
string(benchspy.MedianLatency),
currentAsStringSlice,
previousAsStringSlice,
1.0,
)
compareAverages(t, string(benchspy.Percentile95Latency), currentAsStringSlice, previousAsStringSlice, 1.0)
compareAverages(t, string(benchspy.MaxLatency), currentAsStringSlice, previousAsStringSlice, 1.0)
compareAverages(t, string(benchspy.ErrorRate), currentAsStringSlice, previousAsStringSlice, 1.0)
warning
Standard Loki metrics are all calculated using a 10-second moving window, which results in smoothing of values due to aggregation. To learn what that means in detail, please refer to the To Loki or Not to Loki chapter.
Also, due to the HTTP API endpoint used, namely `query_range`, all query results are always returned as a slice. Execution of `instant queries` that return a single data point is currently not supported.
What’s Next?
In this example, we used standard metrics, which are the same as in the first test. Now, let’s explore how to use your custom LogQL queries.
note
You can find the full example here.
BenchSpy - Custom Loki Metrics
In this chapter, we’ll explore how to use custom LogQL
queries in the performance report. For this more advanced use case, we’ll manually compose the performance report.
The load generation part is the same as in the standard Loki metrics example and will be skipped.
Defining Custom Metrics
Let’s define two illustrative metrics:
- `vu_over_time`: The rate of virtual users generated by WASP, using a 10-second window.
- `responses_over_time`: The number of the AUT's responses, using a 1-second window.
lokiQueryExecutor := benchspy.NewLokiQueryExecutor(
map[string]string{
"vu_over_time": fmt.Sprintf("max_over_time({branch=~\"%s\", commit=~\"%s\", go_test_name=~\"%s\", test_data_type=~\"stats\", gen_name=~\"%s\"} | json | unwrap current_instances [10s]) by (node_id, go_test_name, gen_name)", label, label, t.Name(), gen.Cfg.GenName),
"responses_over_time": fmt.Sprintf("sum(count_over_time({branch=~\"%s\", commit=~\"%s\", go_test_name=~\"%s\", test_data_type=~\"responses\", gen_name=~\"%s\"} [1s])) by (node_id, go_test_name, gen_name)", label, label, t.Name(), gen.Cfg.GenName),
},
gen.Cfg.LokiConfig,
)
note
These LogQL
queries use the standard labels that WASP
applies when sending data to Loki.
Creating a `StandardReport` with Custom Queries
Now, let's create a `StandardReport` using our custom queries:
baseLineReport, err := benchspy.NewStandardReport(
"v1.0.0",
// notice the different functional option used to pass Loki executor with custom queries
benchspy.WithQueryExecutors(lokiQueryExecutor),
benchspy.WithGenerators(gen),
)
require.NoError(t, err, "failed to create baseline report")
Wrapping Up
The rest of the code remains unchanged, except for the names of the metrics being asserted. You can find the full example here.
Now it's time to look at the last of the bundled `QueryExecutors`. Proceed to the next chapter to read about Prometheus.
BenchSpy - Standard Prometheus Metrics
Now that we've seen how to query and assert load-related metrics, let's explore how to query and assert on resource usage by our Application Under Test (AUT).
If you're unsure why this is important, consider the following situation: the p95 latency of a new release matches the previous version, but memory consumption is 34% higher. Not ideal, right?
Step 1: Prometheus Configuration
Since `WASP` has no built-in integration with `Prometheus`, we need to pass its configuration separately:
promConfig := benchspy.NewPrometheusConfig("node[^0]")
This constructor loads the URL from the environment variable `PROMETHEUS_URL` and adds a single regex pattern to match containers by name. In this case, it excludes the bootstrap Chainlink node (named `node0` in the `CTFv2` stack).
warning
This example assumes that you have both the observability stack and basic node set running.
If you have the CTF CLI, you can start it by running: `ctf b ns`.
note
Matching containers by name should work for most k8s and Docker setups using the `CTFv2` observability stack.
Step 2: Fetching and Storing a Baseline Report
As in previous examples, we'll use built-in Prometheus metrics and fetch and store a baseline report:
baseLineReport, err := benchspy.NewStandardReport(
"91ee9e3c903d52de12f3d0c1a07ac3c2a6d141fb",
// notice the different standard query executor type
benchspy.WithStandardQueries(benchspy.StandardQueryExecutor_Prometheus),
benchspy.WithPrometheusConfig(promConfig),
// Required to calculate test time range based on generator start/end times.
benchspy.WithGenerators(gen),
)
require.NoError(t, err, "failed to create baseline report")
fetchCtx, cancelFn := context.WithTimeout(context.Background(), 60*time.Second)
defer cancelFn()
fetchErr := baseLineReport.FetchData(fetchCtx)
require.NoError(t, fetchErr, "failed to fetch baseline report")
path, storeErr := baseLineReport.Store()
require.NoError(t, storeErr, "failed to store baseline report", path)
note
Standard metrics for Prometheus differ from those used by the `Loki` or `Direct` query executors.
Prometheus metrics focus on resource usage by the AUT, while `Loki`/`Direct` metrics measure load characteristics.
Standard Prometheus metrics include:
median_cpu_usage
median_mem_usage
max_cpu_usage
p95_cpu_usage
p95_mem_usage
max_mem_usage
These are calculated at the container level, based on total usage (user + system).
Step 3: Handling Prometheus Result Types
Unlike the Loki and Direct (generator-based) query executors, Prometheus results can have various data types:
scalar
string
vector
matrix
This makes asserting results a bit more complex.
Converting Results to `model.Value`
First, convert results to the `model.Value` interface using convenience functions:
currentAsValues := benchspy.MustAllPrometheusResults(currentReport)
previousAsValues := benchspy.MustAllPrometheusResults(previousReport)
Casting to Specific Types
Next, determine the data type returned by your query and cast it accordingly:
// Fetch a single metric
currentMedianCPUUsage := currentAsValues[string(benchspy.MedianCPUUsage)]
previousMedianCPUUsage := previousAsValues[string(benchspy.MedianCPUUsage)]
assert.Equal(t, currentMedianCPUUsage.Type(), previousMedianCPUUsage.Type(), "types of metrics should be the same")
// In this case, we know the query returns a Vector
currentMedianCPUUsageVector := currentMedianCPUUsage.(model.Vector)
previousMedianCPUUsageVector := previousMedianCPUUsage.(model.Vector)
Since these metrics are not related to load generation, the convenience function returns a `map[string]model.Value`, where the key is the resource metric name.
warning
All standard Prometheus metrics bundled with `BenchSpy` return `model.Vector`.
However, if you use custom queries, you must manually verify their return types.
Skipping Assertions for Resource Usage
We skip the assertion part because, unless you're comparing resource usage under stable loads, significant differences between reports are likely. For example:
- The first report might be generated right after the node set starts.
- The second report might be generated after the node set has been running for some time.
What’s Next?
In the next chapter, we’ll explore custom Prometheus queries.
note
You can find the full example here.
BenchSpy - Custom Prometheus Metrics
Similar to what we did with Loki, we can use custom metrics with Prometheus.
Most of the code remains the same as in the previous example. However, the differences begin with the need to manually create a `PrometheusQueryExecutor` with our custom queries:
// No need to pass the name regex pattern, as we provide it directly in the queries
// Remember that you are free to use any other matching values or labels, or none at all
promConfig := benchspy.NewPrometheusConfig()
customPrometheus, err := benchspy.NewPrometheusQueryExecutor(
map[string]string{
// Scalar value
"95p_cpu_all_containers": "scalar(quantile(0.95, rate(container_cpu_usage_seconds_total{name=~\"node[^0]\"}[5m])) * 100)",
// Matrix value
"cpu_rate_by_container": "rate(container_cpu_usage_seconds_total{name=~\"node[^0]\"}[1m])[30m:1m]",
},
promConfig,
)
Passing Custom Queries to the Report
Next, pass the custom queries as a query executor:
baseLineReport, err := benchspy.NewStandardReport(
"91ee9e3c903d52de12f3d0c1a07ac3c2a6d141fb",
// notice the different functional option used to pass Prometheus executor with custom queries
benchspy.WithQueryExecutors(customPrometheus),
benchspy.WithGenerators(gen),
    // notice that no Prometheus config is passed here
)
require.NoError(t, err, "failed to create baseline report")
note
When using custom Prometheus queries, you don't need to pass the `PrometheusConfig` to `NewStandardReport()`, as the URL has already been set during the creation of the `PrometheusQueryExecutor`.
Fetching and Casting Metrics
Fetching the current and previous reports remains unchanged, as does casting Prometheus metrics to their specific types:
currentAsValues := benchspy.MustAllPrometheusResults(currentReport)
previousAsValues := benchspy.MustAllPrometheusResults(previousReport)
assert.Equal(t, len(currentAsValues), len(previousAsValues), "number of metrics in results should be the same")
Handling Different Data Types
Here’s where things differ. While all standard query results are instances of model.Vector
, the two custom queries introduce new types:
model.Matrix
*model.Scalar
These differences are reflected in the further casting process before accessing the final metrics:
current95CPUUsage := currentAsValues["95p_cpu_all_containers"]
previous95CPUUsage := previousAsValues["95p_cpu_all_containers"]
assert.Equal(t, current95CPUUsage.Type(), previous95CPUUsage.Type(), "types of metrics should be the same")
assert.IsType(t, &model.Scalar{}, current95CPUUsage, "current metric should be a scalar")
currentCPUByContainer := currentAsValues["cpu_rate_by_container"]
previousCPUByContainer := previousAsValues["cpu_rate_by_container"]
assert.Equal(t, currentCPUByContainer.Type(), previousCPUByContainer.Type(), "types of metrics should be the same")
assert.IsType(t, model.Matrix{}, currentCPUByContainer, "current metric should be a matrix")
currentCPUByContainerAsMatrix := currentCPUByContainer.(model.Matrix)
previousCPUByContainerAsMatrix := previousCPUByContainer.(model.Matrix)
assert.Equal(t, len(currentCPUByContainerAsMatrix), len(previousCPUByContainerAsMatrix), "number of samples in matrices should be the same")
warning
When casting to Prometheus' final types, it's crucial to remember the distinction between pointer and value types:
Pointer types:
- `*model.String`
- `*model.Scalar`
Value types:
- `model.Vector`
- `model.Matrix`
And that's it! You now know everything you need to unlock the full power of `BenchSpy`!
note
You can find the full example here.
BenchSpy - To Loki or Not to Loki?
You might be wondering whether to use the `Loki` or `Direct` query executor if all you need are basic latency metrics.
Rule of Thumb
You should opt for the `Direct` query executor if all you need is a single number, such as the median latency or error rate, and you're not interested in:
- Comparing time series directly,
- Examining minimum or maximum values over time, or
- Performing advanced calculations on raw data.
Why Choose `Direct`?
The `Direct` executor returns a single value for each standard metric using the same raw data that Loki would use. It accesses data stored in the `WASP` generator, which is later pushed to Loki.
This means you can:
- Run your load test without a Loki instance.
- Avoid calculating metrics like the median, 95th percentile latency, or error ratio yourself.
By using `Direct`, you save resources and simplify the process when advanced analysis isn't required.
warning
Metrics calculated by the two query executors may differ slightly due to differences in their data processing and calculation methods:
- `Direct` QueryExecutor: This method processes all individual data points from the raw dataset, ensuring that every value is taken into account for calculations like averages, percentiles, or other statistics. It provides the most granular and precise results but may also be more sensitive to outliers and noise in the data.
- `Loki` QueryExecutor: This method aggregates data using a default window size of 10 seconds. Within each window, multiple raw data points are combined (e.g., through averaging, summing, or other aggregation functions), which reduces the granularity of the dataset. While this approach can improve performance and reduce noise, it also smooths the data, which may obscure outliers or small-scale variability.
Why This Matters for Percentiles:
Percentiles, such as the 95th percentile (p95), are particularly sensitive to the granularity of the input data:
- In the `Direct` QueryExecutor, the p95 is calculated across all raw data points, capturing the true variability of the dataset, including any extreme values or spikes.
- In the `Loki` QueryExecutor, the p95 is calculated over aggregated data (i.e., using the 10-second window). As a result, the raw values within each window are smoothed into a single representative value, potentially lowering or altering the calculated p95. For example, an outlier that would significantly affect the p95 in the `Direct` calculation might be averaged out in the `Loki` window, leading to a slightly lower percentile value.
`Direct` caveats:
- Buffer limitations: the `WASP` generator uses a fixed-size `SliceBuffer` to store responses. Once full capacity is reached, the oldest entries are replaced with incoming ones. The buffer size can be set in the generator's config; by default, it is limited to 50k entries to lower resource consumption and avoid potential OOMs.
- Sampling: `WASP` generators support optional sampling of successful responses. It is disabled by default, but if you enable it, calculations are no longer performed over the full dataset.
Key Takeaway:
The difference arises because `Direct` prioritizes precision by using raw data, while `Loki` prioritizes efficiency and scalability by using aggregated data. When interpreting results, it’s essential to consider how the smoothing effect of `Loki` might impact the representation of variability or extremes in the dataset. This is especially important for metrics like percentiles, where such details can significantly influence the outcome.
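To make the smoothing effect concrete, here is a small self-contained sketch (plain Go, independent of `BenchSpy`; the sample values are invented for illustration) that compares the p95 of raw latency samples with the p95 computed after averaging the samples in fixed windows:

```go
package main

import (
	"fmt"
	"sort"
)

// p95 returns the 95th percentile of the samples (nearest-rank method).
func p95(samples []float64) float64 {
	s := append([]float64(nil), samples...)
	sort.Float64s(s)
	idx := int(0.95 * float64(len(s)))
	if idx >= len(s) {
		idx = len(s) - 1
	}
	return s[idx]
}

// windowAverages replaces each fixed-size window of samples with its mean,
// loosely mimicking a 10-second Loki aggregation window.
func windowAverages(samples []float64, window int) []float64 {
	var out []float64
	for i := 0; i < len(samples); i += window {
		end := i + window
		if end > len(samples) {
			end = len(samples)
		}
		sum := 0.0
		for _, v := range samples[i:end] {
			sum += v
		}
		out = append(out, sum/float64(end-i))
	}
	return out
}

func main() {
	// 100 latencies of 50ms with six 500ms outliers spread across windows.
	samples := make([]float64, 100)
	for i := range samples {
		samples[i] = 50
	}
	for _, i := range []int{5, 25, 45, 65, 85, 95} {
		samples[i] = 500
	}

	fmt.Println("raw p95:     ", p95(samples))                     // lands on an outlier: 500
	fmt.Println("windowed p95:", p95(windowAverages(samples, 10))) // outliers diluted into window means: 95
}
```

Over the raw samples, the p95 lands on an outlier (500 ms), while over 10-sample windows the same spikes are diluted into window means (95 ms) — which is exactly why Loki-derived percentiles can come out lower than Direct ones.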
BenchSpy - Real-World Example
Now that we've covered all possible usages, you might wonder how to write a test that compares performance between different releases of your application. Here’s a practical example to guide you through the process.
Typical Workflow
- Write a performance test.
- At the end of the test, generate a performance report, store it, and commit it to Git.
- Modify the test to fetch both the latest report and create a new one.
- Write assertions to validate your performance metrics.
Writing the Performance Test
We'll use a simple mock for the application under test. This mock waits for `50 ms` before returning a 200 response code.
    generator, err := wasp.NewGenerator(&wasp.Config{
        T:           t,
        GenName:     "vu",
        CallTimeout: 100 * time.Millisecond,
        LoadType:    wasp.VU,
        Schedule:    wasp.Plain(10, 15*time.Second),
        VU: wasp.NewMockVU(&wasp.MockVirtualUserConfig{
            CallSleep: 50 * time.Millisecond,
        }),
    })
    require.NoError(t, err)
    generator.Run(true)
Generating the First Report
Here, we'll generate a performance report for version `v1.0.0` using the `Direct` query executor. The report will be saved to a custom directory named `test_reports`. This report will later be used to compare the performance of new versions.
    fetchCtx, cancelFn := context.WithTimeout(context.Background(), 60*time.Second)
    defer cancelFn()

    baseLineReport, err := benchspy.NewStandardReport(
        "v1.0.0",
        benchspy.WithStandardQueries(benchspy.StandardQueryExecutor_Direct),
        benchspy.WithReportDirectory("test_reports"),
        benchspy.WithGenerators(generator),
    )
    require.NoError(t, err, "failed to create baseline report")

    fetchErr := baseLineReport.FetchData(fetchCtx)
    require.NoError(t, fetchErr, "failed to fetch data for original report")

    path, storeErr := baseLineReport.Store()
    require.NoError(t, storeErr, "failed to store current report", path)
Modifying Report Generation
With the baseline report for `v1.0.0` stored, we'll modify the test to support future releases. The code from the previous step changes as follows:
    currentVersion := os.Getenv("CURRENT_VERSION")
    require.NotEmpty(t, currentVersion, "No current version provided")

    fetchCtx, cancelFn := context.WithTimeout(context.Background(), 60*time.Second)
    defer cancelFn()

    currentReport, previousReport, err := benchspy.FetchNewStandardReportAndLoadLatestPrevious(
        fetchCtx,
        currentVersion,
        benchspy.WithStandardQueries(benchspy.StandardQueryExecutor_Direct),
        benchspy.WithReportDirectory("test_reports"),
        benchspy.WithGenerators(generator),
    )
    require.NoError(t, err, "failed to fetch current report or load the previous one")
This function fetches the current report (for the version passed via the `CURRENT_VERSION` environment variable) and loads the latest stored report from the `test_reports` directory.
Adding Assertions
Let’s assume you want to ensure that none of the performance metrics degrade by more than 1% between releases (and that the error rate has not changed at all). Here's how you can write assertions using a convenient function for the `Direct` query executor:
    hasErrors, errors := benchspy.CompareDirectWithThresholds(
        1.0, // Max 1% worse median latency
        1.0, // Max 1% worse p95 latency
        1.0, // Max 1% worse maximum latency
        0.0, // No increase in error rate
        currentReport, previousReport)
    require.False(t, hasErrors, fmt.Sprintf("errors found: %v", errors))
Conclusion
You’re now ready to use `BenchSpy` to ensure that your application’s performance does not degrade below your specified thresholds!
note
Here you can find an example test where performance has degraded significantly, because the mock's latency has been increased from `50ms` to `60ms`.
This test passes because it is designed to expect performance degradation. Of course, in a real application, your goal should be to prevent such regressions. 😊
BenchSpy - Reports
`BenchSpy` was designed with composability in mind. It includes a `StandardReport` implementation that should cover the majority of use cases, while also allowing for the creation of fully customized reports with ease.
Learn more about the `StandardReport` in the next chapter.
BenchSpy - Standard Report
`StandardReport` supports three types of data sources:
- `Direct`
- `Loki`
- `Prometheus`
Each allows you to use either pre-defined metrics or create custom ones.
Pre-defined (Standard) Metrics
Direct and Loki
Both query executors focus on the characteristics of the load generated by `WASP`. Their datasets are nearly identical, because the `Direct` executor queries load-specific data before it is sent to Loki. The `Loki` executor, however, offers richer querying options (via `LogQL`) and access to the actual load profile (rather than the configured one).
Predefined metrics for both include:
- Median latency
- 95th percentile latency
- Max latency
- Error rate
Latency is the round-trip time from sending a request to receiving a response from the Application Under Test (AUT).
Error rate is the ratio of failed responses to total responses. This includes requests that timed out or returned errors from `Gun` or `Vu` implementations.
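The four predefined metrics above can be sketched over a slice of raw latencies (a stdlib-only illustration with made-up numbers, not BenchSpy's actual implementation):

```go
package main

import (
	"fmt"
	"sort"
)

// standardMetrics computes the four predefined WASP metrics over raw
// latencies (ms) plus a count of failed requests (errors + timeouts).
// Percentiles use the simple nearest-rank method.
func standardMetrics(latenciesMs []float64, failed int) (median, p95, max, errRate float64) {
	s := append([]float64(nil), latenciesMs...)
	sort.Float64s(s)
	median = s[len(s)/2]
	idx := int(0.95 * float64(len(s)))
	if idx >= len(s) {
		idx = len(s) - 1
	}
	p95 = s[idx]
	max = s[len(s)-1]
	// Error rate: failed responses over all responses (successful + failed).
	errRate = float64(failed) / float64(len(s)+failed)
	return
}

func main() {
	latencies := []float64{48, 50, 50, 51, 52, 55, 60, 75, 90, 120}
	median, p95, max, errRate := standardMetrics(latencies, 2)
	fmt.Println(median, p95, max, errRate)
}
```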
Prometheus
Prometheus metrics focus on resource consumption by the AUT rather than load generation. These metrics include:
- Median CPU usage
- 95th percentile CPU usage
- Max CPU usage
- Median memory usage
- 95th percentile memory usage
- Max memory usage
Prometheus queries focus on total consumption, which combines the usage of the underlying system and your application.
How to Use Pre-defined Metrics
To use pre-defined metrics, call the `NewStandardReport` method:
    report, err := benchspy.NewStandardReport(
        "v1.1.1",
        // Query executor types for which standard metrics should be generated
        benchspy.WithStandardQueries(
            benchspy.StandardQueryExecutor_Prometheus,
            benchspy.StandardQueryExecutor_Loki,
        ),
        // Prometheus configuration is required if using standard Prometheus metrics
        benchspy.WithPrometheusConfig(benchspy.NewPrometheusConfig("node[^0]")),
        // WASP generators
        benchspy.WithGenerators(gen),
    )
    require.NoError(t, err, "failed to create the report")
Custom Metrics
WASP Generator
Since `WASP` stores AUT responses in each generator, you can create custom metrics based on this data. For example, to calculate the timeout ratio:
    var generator *wasp.Generator

    var timeouts = func(responses *wasp.SliceBuffer[*wasp.Response]) (float64, error) {
        if len(responses.Data) == 0 {
            return 0, nil
        }
        timeoutCount := 0.0
        inTimeCount := 0.0
        for _, response := range responses.Data {
            if response.Timeout {
                timeoutCount++
            } else {
                inTimeCount++
            }
        }
        return timeoutCount / (timeoutCount + inTimeCount), nil
    }

    directExecutor, err := NewDirectQueryExecutor(generator, map[string]DirectQueryFn{
        "timeout_ratio": timeouts,
    })
    require.NoError(t, err, "failed to create Direct Query Executor")
Loki
Creating custom `LogQL` queries is even simpler. Use `NewLokiQueryExecutor` with a map of desired queries:
    var generator *wasp.Generator

    lokiQueryExecutor := benchspy.NewLokiQueryExecutor(
        map[string]string{
            "responses_over_time": fmt.Sprintf("sum(count_over_time({my_label=~\"%s\", test_data_type=~\"responses\", gen_name=~\"%s\"} [5s])) by (node_id, gen_name)", label, generator.Cfg.GenName),
        },
        generator.Cfg.LokiConfig,
    )
note
To write effective `LogQL` queries for `WASP`, familiarize yourself with generator labeling and the `test_data_types` used by `WASP`.
Prometheus
Adding custom `PromQL` queries is equally straightforward:
    promConfig := benchspy.NewPrometheusConfig()
    prometheusExecutor, err := benchspy.NewPrometheusQueryExecutor(
        map[string]string{
            "cpu_rate_by_container": "rate(container_cpu_usage_seconds_total{name=~\"chainlink.*\"}[5m])[30m:1m]",
        },
        promConfig,
    )
    require.NoError(t, err)
Using Custom Queries with StandardReport
To use custom queries in a `StandardReport`, pass the custom `QueryExecutors` created above using the `WithQueryExecutors` option instead of the `StandardQueryExecutorType`:
    report, err := benchspy.NewStandardReport(
        "v1.2.3",
        benchspy.WithQueryExecutors(directExecutor, lokiQueryExecutor, prometheusExecutor),
        benchspy.WithGenerators(gen),
    )
    require.NoError(t, err, "failed to create baseline report")
BenchSpy - Adding a New QueryExecutor
QueryExecutor interface
As mentioned earlier, the `StandardReport` supports three different data sources:
- `Direct`
- `Loki`
- `Prometheus`
Each of these implements the `QueryExecutor` interface:
    type QueryExecutor interface {
        // Kind returns the type of the QueryExecutor
        Kind() string
        // Validate checks if the QueryExecutor has all the necessary data and configuration to execute the queries
        Validate() error
        // Execute executes the queries and populates the QueryExecutor with the results
        Execute(ctx context.Context) error
        // Results returns the results of the queries, where the key is the query name and the value is the result
        Results() map[string]interface{}
        // IsComparable checks whether both QueryExecutors can be compared (e.g., they have the same type, queries, etc.),
        // and returns an error if any differences are found
        IsComparable(other QueryExecutor) error
        // TimeRange sets the time range for the queries
        TimeRange(startTime, endTime time.Time)
    }
When creating a new `QueryExecutor`, most functions in this interface are straightforward. Below, we focus on two that may require additional explanation.
Kind
The `Kind` function should return a unique string identifier for your `QueryExecutor`. This identifier is crucial because `StandardReport` uses it when unmarshalling JSON files with stored reports.
Additionally, you need to extend the `StandardReport.UnmarshalJSON` function to support your new executor.
note
If your `QueryExecutor` includes interfaces, `interface{}` or `any` types, or fields that should not or cannot be serialized, ensure you implement custom `MarshalJSON` and `UnmarshalJSON` functions. Existing executors can provide useful examples.
TimeRange
The `TimeRange` method is called by `StandardReport` just before invoking `Execute()` for each executor. This method sets the time range for the queries (required for Loki and Prometheus).
By default, `StandardReport` calculates the test time range automatically by analyzing the schedules of all generators and determining the earliest start time and latest end time. This eliminates the need for manual calculations.
With these details in mind, you should have a clear path to implementing your own `QueryExecutor` and integrating it seamlessly with `BenchSpy`'s `StandardReport`.
NamedGenerator interface
Executors that query load generation metrics should also implement this simple interface:
    type NamedGenerator interface {
        // GeneratorName returns the name of the generator
        GeneratorName() string
    }
It is used primarily when casting results from `map[string]interface{}` to the target type, while splitting them between different generators.
Currently, this interface is implemented by the `Direct` and `Loki` executors, but not by `Prometheus`.
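A minimal sketch of how results keyed by query name might be split per generator using this interface (hypothetical types, stdlib-only; BenchSpy's actual casting helpers may differ):

```go
package main

import "fmt"

// NamedGenerator is reproduced from the interface above.
type NamedGenerator interface {
	GeneratorName() string
}

// namedResults pairs an executor's generator name with its raw results,
// mimicking what a Direct or Loki executor would expose.
type namedResults struct {
	genName string
	results map[string]interface{}
}

func (n *namedResults) GeneratorName() string { return n.genName }

// splitByGenerator groups raw float64 results by generator name,
// casting each value from interface{} to the target type.
func splitByGenerator(executors []*namedResults) map[string]map[string]float64 {
	out := map[string]map[string]float64{}
	for _, ex := range executors {
		byQuery := map[string]float64{}
		for query, raw := range ex.results {
			if v, ok := raw.(float64); ok {
				byQuery[query] = v
			}
		}
		out[ex.GeneratorName()] = byQuery
	}
	return out
}

func main() {
	executors := []*namedResults{
		{genName: "vu", results: map[string]interface{}{"median_latency": 51.0}},
		{genName: "rps", results: map[string]interface{}{"median_latency": 48.0}},
	}
	fmt.Println(splitByGenerator(executors)["vu"]["median_latency"]) // 51
}
```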
Defining a New Report
Each `BenchSpy` report must implement the `Reporter` interface, which handles three primary responsibilities:
- Storage and retrieval (`Storer` interface)
- Data fetching (`DataFetcher` interface)
- Comparison (`Comparator` interface)
Reporter Interface
Definition
    type Reporter interface {
        Storer
        DataFetcher
        Comparator
    }
The comparison of actual performance data should not be part of the report itself. It should be done independently, ideally using Go's `require` and `assert` statements.
Storer Interface
Definition
    type Storer interface {
        // Store stores the report in persistent storage and returns the path to it, or an error
        Store() (string, error)
        // Load loads the report from persistent storage based on the test name and commit/tag, or returns an error
        Load(testName, commitOrTag string) error
        // LoadLatest loads the latest report from persistent storage for the given test name, or returns an error
        LoadLatest(testName string) error
    }
Usage
- If storing reports locally under Git satisfies your requirements, you can reuse the `LocalStorage` implementation of `Storer`.
- If you need to store reports in S3 or a database, you will need to implement the interface yourself.
DataFetcher Interface
Definition
    type DataFetcher interface {
        // FetchData populates the report with data from the test
        FetchData(ctx context.Context) error
    }
Purpose
This interface is solely responsible for fetching data from the data source and populating the report with results.
Comparator Interface
Definition
    type Comparator interface {
        // IsComparable checks whether two reports can be compared (e.g., test configuration, app resources, queries, and metrics are identical),
        // and returns an error if any differences are found
        IsComparable(otherReport Reporter) error
    }
Purpose
This interface ensures that both reports are comparable by verifying:
- Both use generators with identical configurations (e.g., load type, load characteristics).
- Both reports feature the same data sources and queries.
This design provides flexibility and composability, allowing you to store, fetch, and compare reports in a way that fits your specific requirements.
WASP - How to's
In this section, we address the most common issues and questions:
- How to start the local observability stack?
- How to try some tests quickly?
- How to choose between RPS and VUs?
- How to define NFRs and check alerts?
- How to use labels?
- How to incorporate load tests in your workflow?
- How to reuse dashboard components?
- How to parallelize load?
- How to debug Loki errors?
WASP - How to Start Local Observability Stack
To execute all examples or tests locally, you need to configure a local observability stack. This stack includes Loki, Grafana, and Pyroscope.
Prerequisites
Ensure you have the following installed:
Grafana and Loki
To start the local observability stack, run the following command:
make start
This command will download and run Docker images for Loki and Grafana. Once completed, it will output something like:
Service account id: 2
Grafana token: "<grafana token>"
Setting Up Environment Variables
Next, set up the required environment variables.
warning
Replace `<Grafana token>` with the token provided in the previous step.
export LOKI_TOKEN=
export LOKI_URL=http://localhost:3030/loki/api/v1/push
export GRAFANA_URL=http://localhost:3000
export GRAFANA_TOKEN=<Grafana token>
export DATA_SOURCE_NAME=Loki
export DASHBOARD_FOLDER=LoadTests
export DASHBOARD_NAME=Wasp
Accessing Services
- Grafana: http://localhost:3000/
- Loki: http://localhost:3030/
Stopping the Containers
To stop both containers, run:
make stop
Pyroscope
To start Pyroscope, execute:
make pyro_start
note
Pyroscope is available at: http://localhost:4040/
Stopping Pyroscope
To stop Pyroscope, run:
make pyro_stop
WASP - Try It Out Quickly
Prerequisites
Ensure you have the following installed:
Install Dependencies with Nix
To install dependencies with Nix, simply execute:
nix develop
Running Grafana and Loki Tests
Assuming you have already started and configured the local observability stack, you can run sample Loki and Grafana tests with:
make test_loki
Once the test is running, you can view the results in the dashboard.
warning
If deploying to your own Grafana instance, verify the `DASHBOARD_FOLDER` and `DASHBOARD_NAME` settings.
The defaults are the `LoadTests` folder and a dashboard named `Wasp`.
Running Pyroscope Tests
If Grafana, Loki, and Pyroscope are already running, you can execute sample Pyroscope tests with:
make test_pyro_rps
or
make test_pyro_vu
During the test, you can view the results in the Pyroscope dashboard.
Additional Debugging Option
You can also add the `-trace trace.out` flag when running any of your tests with `go test` for additional tracing information.
WASP - How to Choose Between RPS and VUs
When writing a test, you need to decide whether to use RPS or VUs as your load type.
warning
The RPS load type can only be used with a `Gun`, and VUs can only be used with a `VirtualUser`.
RPS (Requests Per Second)
You should use RPS when your goal is to measure throughput. This load type implements an open model, where:
- The number of requests is fixed.
- The load is adjusted to meet the target RPS.
- The system's response time has no impact on the rate of load generation.
In an RPS-based test, new requests are sent continuously, regardless of how long previous requests take.
Example RPS Test Diagram
---
title: RPS test
---
sequenceDiagram
participant Profile(Test)
participant Scheduler
participant Generator(Gun)
participant Promtail
participant Loki
participant Grafana
loop Test Execution
Profile(Test) ->> Generator(Gun): Start with (API, TestData)
loop Schedule
Scheduler ->> Scheduler: Process schedule segment
Scheduler ->> Generator(Gun): Set new RPS target
loop RPS load
Generator(Gun) ->> Generator(Gun): Execute Call() in parallel
Generator(Gun) ->> Promtail: Save CallResult
end
Promtail ->> Loki: Send batch<br/>when ready or timeout
end
Scheduler ->> Scheduler: All segments done<br/>wait all responses<br/>test ends
Profile(Test) ->> Grafana: Check alert groups<br/>FAIL or PASS the test
end
VirtualUser (VUs)
Use VUs when working with stateful protocols or workflows. With `VirtualUser`, the requests-per-second rate cannot be guaranteed because:
- Each iteration begins only after the previous one finishes.
- The load follows a closed model, where the system's response time affects the RPS.
If the system takes longer to respond, iterations take longer, and the RPS decreases.
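The closed-model relationship can be sketched numerically: with a fixed number of VUs, throughput is bounded by response time (a simplified illustration that ignores pacing and think time):

```go
package main

import "fmt"

// effectiveRPS approximates the throughput of a closed model:
// each VU issues one request at a time, so rate ≈ VUs * 1000 / responseTimeMs.
func effectiveRPS(vus int, responseTimeMs int) float64 {
	return float64(vus) * 1000 / float64(responseTimeMs)
}

func main() {
	// 8 VUs against a 50ms endpoint → 160 req/s
	fmt.Println(effectiveRPS(8, 50))
	// the same 8 VUs when latency doubles to 100ms → 80 req/s
	fmt.Println(effectiveRPS(8, 100))
}
```

Doubling the response time halves the effective request rate, which is exactly why a VU-based test cannot guarantee a fixed RPS.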
Example VUs Test Diagram
---
title: VUs test
---
sequenceDiagram
participant Profile(Test)
participant Scheduler
participant Generator(VUs)
participant VU1
participant VU2
participant Promtail
participant Loki
participant Grafana
loop Test Execution
Profile(Test) ->> Generator(VUs): Start with (API, TestData)
loop Schedule
Scheduler ->> Scheduler: Process schedule segment
Scheduler ->> Generator(VUs): Set new VUs target
loop VUs load
Generator(VUs) ->> Generator(VUs): Add/remove VUs
Generator(VUs) ->> VU1: Start/end
Generator(VUs) ->> VU2: Start/end
VU1 ->> VU1: Run loop, execute multiple calls
VU1 ->> Promtail: Save []CallResult
VU2 ->> VU2: Run loop, execute multiple calls
VU2 ->> Promtail: Save []CallResult
Promtail ->> Loki: Send batch<br/>when ready or timeout
end
end
Scheduler ->> Scheduler: All segments done<br/>wait all responses<br/>test ends
Profile(Test) ->> Grafana: Check alert groups<br/>FAIL or PASS the test
end
In this model, `VU2` starts only after `VU1` has finished its iteration.
Summary
-
Use RPS for:
- Stateless protocols.
- Measuring throughput.
- Scenarios where the load generation rate must remain constant.
-
Use VUs for:
- Stateful protocols.
- Workflows that involve multiple steps.
- Scenarios where response time impacts the load generation rate.
note
Learn more about open and closed models here.
WASP - How to Define NFRs and Check Alerts
WASP allows you to define and monitor Non-Functional Requirements (NFRs) by grouping them into categories for independent checks. Ideally, these NFRs should be defined as dashboard-as-code and committed to your Git repository. However, you can also create them programmatically on an existing or new dashboard.
Defining Alerts
WASP supports two types of alerts:
Built-in Alerts
These alerts are automatically supported by every generator and are based on the following metrics:
- 99th quantile (p99)
- Errors
- Timeouts
You can specify simple alert conditions such as:
- Value above or below
- Average
- Median
These conditions use Grabana's ConditionEvaluator.
Custom Alerts
Custom alerts can be much more complex and can:
- Combine multiple simple conditions.
- Execute Loki queries.
Custom alerts use Grabana's timeseries.Alert and must be timeseries-based.
note
For a programmatic example, check the alerts example.
Checking Alerts
Once you have a dashboard with alerts, you can choose between two approaches:
1. Automatic Checking with `Profile` and `GrafanaOpts`
You can automatically check alerts and add dashboard annotations by utilizing the `WithGrafana()` function with a `wasp.Profile`. This approach integrates dashboard annotations and evaluates alerts after the test run.
Example:
    _, err = wasp.NewProfile().
        WithGrafana(grafanaOpts).
        Add(wasp.NewGenerator(getLatestReportByTimestampCfg)).
        Run(true)
    require.NoError(t, err)
Where `GrafanaOpts` is defined as:
    type GrafanaOpts struct {
        GrafanaURL                   string        `toml:"grafana_url"`
        GrafanaToken                 string        `toml:"grafana_token_secret"`
        WaitBeforeAlertCheck         time.Duration `toml:"grafana_wait_before_alert_check"`                  // Cooldown period before checking for alerts
        AnnotateDashboardUIDs        []string      `toml:"grafana_annotate_dashboard_uids"`                  // Dashboard UIDs to annotate the start and end of the run
        CheckDashboardAlertsAfterRun []string      `toml:"grafana_check_alerts_after_run_on_dashboard_uids"` // Dashboard UIDs to check for alerts after the run
    }
2. Manual Checking with AlertChecker
You can manually check alerts using the AlertChecker component.
Summary
WASP provides flexibility in defining and monitoring NFRs:
- Use built-in alerts for standard metrics like p99, errors, and timeouts.
- Use custom alerts for complex conditions or Loki queries.
- Automate alert checks using `Profile` with `GrafanaOpts`, or manually verify them with `AlertChecker`.
By combining these tools, you can ensure your application's performance and reliability align with your NFRs.
WASP - How to Use Labels
The default WASP dashboard utilizes the following labels to differentiate between various generators and requests:
- `go_test_name`
- `gen_name`
- `branch`
- `commit`
- `call_group`
go_test_name
This label is used to differentiate between different tests. If an instance of `*testing.T` is passed to the generator configuration, the test name will automatically be added to the `go_test_name` label:
    gen, err := wasp.NewGenerator(&wasp.Config{
        LoadType:   wasp.RPS,
        T:          t, // < ----------------------- HERE
        Schedule:   wasp.Plain(5, 60*time.Second),
        Gun:        NewExampleHTTPGun(srv.URL()),
        Labels:     labels,
        LokiConfig: wasp.NewEnvLokiConfig(),
    })
gen_name
This label differentiates between different generators:
    gen, err := wasp.NewGenerator(&wasp.Config{
        LoadType:   wasp.RPS,
        T:          t,
        GenName:    "my_super_generator", // < ----------------------- HERE
        Schedule:   wasp.Plain(5, 60*time.Second),
        Gun:        NewExampleHTTPGun(srv.URL()),
        Labels:     labels,
        LokiConfig: wasp.NewEnvLokiConfig(),
    })
branch and commit
These labels are primarily used to distinguish between different branches and commits when running tests in CI.
Since there is no automated way to fetch this data, it must be passed manually:
    branch := os.Getenv("BRANCH")
    commit := os.Getenv("COMMIT")

    labels := map[string]string{
        "branch": branch, // < ----------------------- HERE
        "commit": commit, // < ----------------------- HERE
    }

    gen, err := wasp.NewGenerator(&wasp.Config{
        LoadType:   wasp.RPS,
        T:          t,
        GenName:    "my_super_generator",
        Schedule:   wasp.Plain(5, 60*time.Second),
        Gun:        NewExampleHTTPGun(srv.URL()),
        Labels:     labels, // < ----------------------- HERE
        LokiConfig: wasp.NewEnvLokiConfig(),
    })
note
Make sure to set the `BRANCH` and `COMMIT` environment variables in your CI workflow.
call_group
This label allows differentiation between different requests within a Virtual User's `Call()` function.
Example:
    // represents user login
    func (m *VirtualUser) requestOne(l *wasp.Generator) {
        var result map[string]interface{}
        r, err := m.client.R().
            SetResult(&result).
            Get(m.target)
        if err != nil {
            l.Responses.Err(r, "auth", err) // < ----------------------- HERE
            return
        }
        l.Responses.OK(r, "auth") // < ----------------------- HERE
    }

    // represents authenticated user action
    func (m *VirtualUser) requestTwo(l *wasp.Generator) {
        var result map[string]interface{}
        r, err := m.client.R().
            SetResult(&result).
            Get(m.target)
        if err != nil {
            l.Responses.Err(r, "user", err) // < ----------------------- HERE
            return
        }
        l.Responses.OK(r, "user") // < ----------------------- HERE
    }
In this example, the authentication request and the user action request are differentiated by the `call_group` label.
Custom Labels
note
You can add as many custom labels as needed for use with your dashboard.
    labels := map[string]string{
        "custom_label_1": "value1",
        "custom_label_2": "value2",
    }
WASP - How to Incorporate Load Tests in Your Workflow
You might be wondering where load tests belong in the Software Development Life Cycle (SDLC) and how to incorporate them effectively. The diagram below illustrates the recommended approach:
---
title: Workload execution. P - Profile, G - Generator, VU - VirtualUser
---
sequenceDiagram
participant Product repo
participant Runner
participant K8s
participant Loki
participant Grafana
participant Devs
Product repo->>Product repo: Define NFR for different workloads<br/>Define application dashboard<br/>Define dashboard alerts<br/>Define load tests
Product repo->>Grafana: Upload app dashboard<br/>Alerts have "requirement_name" label<br/>Each "requirement_name" group is based on some NFR
loop CI runs
Product repo->>Runner: CI runs small load test
Runner->>Runner: Execute load test logic<br/>Run multiple generators
Runner->>Loki: Stream load test data
Runner->>Grafana: Check alerts for "requirement_name": "baseline"
Grafana->>Devs: Notify devs (Dashboard URL/Alert groups)
Product repo->>Runner: CI runs large load test
Runner->>K8s: Split workload into multiple jobs<br/>Monitor job statuses
K8s->>Loki: Stream load test data
Runner->>Grafana: Check alerts for "requirement_name": "stress"
Grafana->>Devs: Notify devs (Dashboard URL/Alert groups)
end
Workflow Explanation
-
Define Requirements and Dashboards:
- Define NFRs (Non-Functional Requirements) for different workloads.
- Create an application dashboard and define alerts grouped by `requirement_name`. Each `requirement_name` group corresponds to a specific NFR.
-
Integrate into CI/CD:
- Small Load Tests: run small load tests as part of your CI pipeline:
  - Execute load test logic with multiple generators.
  - Stream test data to Loki.
  - Check Grafana alerts for the "baseline" group.
  - Notify developers with dashboard URLs and alert group details.
- Large Load Tests: for stress testing:
  - Split the workload into multiple Kubernetes jobs and monitor their statuses.
  - Stream test data to Loki.
  - Check Grafana alerts for the "stress" group.
  - Notify developers with dashboard URLs and alert group details.
-
Notification and Iteration:
- Alerts notify developers about potential issues, directing them to relevant dashboards and alert groups.
Key Points
- Scalable Testing: Use small load tests for quick validation in CI and larger tests for stress validation.
- Real-Time Monitoring: Leverage Grafana to monitor metrics and alerts during test execution.
- Team Collaboration: Automate notifications to keep the team informed of any issues.
This approach ensures that load testing is seamlessly integrated into your SDLC and provides actionable insights for maintaining performance and reliability.
WASP - How to Reuse WASP Components in Your Own Dashboard
You can reuse components from the default WASP dashboard to build your custom dashboard. Here’s an example:
    import (
        waspdashboard "github.com/smartcontractkit/wasp/dashboard"

        "github.com/K-Phoen/grabana/dashboard"
    )

    func BuildCustomLoadTestDashboard(dashboardName string) (dashboard.Builder, error) {
        // Custom key-value pairs used to query panels
        panelQuery := map[string]string{
            "branch":       `=~"${branch:pipe}"`,
            "commit":       `=~"${commit:pipe}"`,
            "network_type": `="testnet"`,
        }

        return dashboard.New(
            dashboardName,
            waspdashboard.WASPLoadStatsRow("Loki", panelQuery),       // WASP component
            waspdashboard.WASPDebugDataRow("Loki", panelQuery, true), // WASP component
            // Your custom panels and rows go here
        )
    }
Available Components
You can find all reusable components in the following files:
Key Points
- WASP components like `WASPLoadStatsRow` and `WASPDebugDataRow` can be directly included in your dashboard.
- Extend the default dashboard by adding your own rows and panels alongside WASP components.
WASP - How to Parallelize Load
Parallelizing load can be achieved using the `Profile` component, which allows you to combine multiple generators. However, this approach works only if all generators start at the same time. If you need to space out the generators for various reasons, you must use native Go concurrency primitives like goroutines and channels.
Concept
To parallelize load:
- Split the load into multiple parts.
- Run each part in a separate goroutine, either using a `Profile` or directly with a `Generator`.
- Use channels to coordinate the timing and synchronization between the goroutines.
Example Scenario
Suppose you want to execute the following scenario:
- Gradually ramp up an RPS load over 10 seconds from 1 to 10 RPS and hold it for 50 seconds.
- When RPS reaches 10, gradually ramp up a VU load over 16 seconds from 2 to 8 VUs (`VU'`) and hold it for 30 seconds.
- Once `VU'` reaches 8, introduce another user interaction and ramp up a VU load over 14 seconds from 3 to 9 VUs (`VU''`) and hold it for 20 seconds.
Here’s how you can achieve this:
func TestParallelLoad(t *testing.T) {
labels := map[string]string{
"branch": "parallel_load",
"commit": "parallel_load",
}
// Define RPS schedule
rpsSchedule := wasp.Combine(
// wasp.Steps(from, increase, steps, duration)
wasp.Steps(1, 1, 9, 10*time.Second), // Start with 1 RPS, increment by 1 RPS in 9 steps over 10 seconds
// wasp.Plain(count, duration)
wasp.Plain(9, 50*time.Second), // Hold 9 RPS for 50 seconds
)
// Define VU' schedule
vuSchedule := wasp.Combine(
// wasp.Steps(from, increase, steps, duration)
wasp.Steps(2, 1, 8, 16*time.Second), // Start with 2 VUs, increment by 1 VU in 8 steps over 16 seconds
// wasp.Plain(count, duration)
wasp.Plain(10, 30*time.Second), // Hold 10 VUs for 30 seconds
)
// Define VU'' schedule
vu2Schedule := wasp.Combine(
// wasp.Steps(from, increase, steps, duration)
wasp.Steps(3, 1, 6, 14*time.Second), // Start with 3 VUs, increment by 1 VU in 6 steps over 14 seconds
// wasp.Plain(count, duration)
wasp.Plain(9, 20*time.Second), // Hold 9 VUs for 20 seconds
)
// Create generators
rpsGen, err := wasp.NewGenerator(&wasp.Config{
LoadType: wasp.RPS,
Schedule: rpsSchedule,
GenName: "Kappa",
Labels: labels,
Gun: NewExampleHTTPGun(srv.URL()),
LokiConfig: wasp.NewEnvLokiConfig(),
})
require.NoError(t, err)
vuGen, err := wasp.NewGenerator(&wasp.Config{
LoadType: wasp.VU,
Schedule: vuSchedule,
GenName: "Lambda",
Labels: labels,
VU: NewExampleScenario(srv.URL()),
LokiConfig: wasp.NewEnvLokiConfig(),
})
require.NoError(t, err)
vu2Gen, err := wasp.NewGenerator(&wasp.Config{
LoadType: wasp.VU,
Schedule: vu2Schedule,
GenName: "Mu",
Labels: labels,
VU: NewExampleScenario(srv.URL()), // Use the same VirtualUser implementation for simplicity
LokiConfig: wasp.NewEnvLokiConfig(),
})
require.NoError(t, err)
wg := sync.WaitGroup{}
// Run RPS load in a separate goroutine
go func() {
wg.Add(1)
rpsGen.Run(true)
wg.Done()
}()
// Wait for RPS load to stabilize
time.Sleep(10 * time.Second)
// Run VU' load in a separate goroutine
go func() {
wg.Add(1)
vuGen.Run(true)
wg.Done()
}()
// Wait for VU' load to stabilize
time.Sleep(16 * time.Second)
// Run VU'' load
vu2Gen.Run(true)
// Wait for all goroutines to complete
wg.Wait()
// Check for load generation errors
require.Equal(t, 0, len(rpsGen.Errors()), "RPS generator errors")
require.Equal(t, 0, len(vuGen.Errors()), "VU' generator errors")
require.Equal(t, 0, len(vu2Gen.Errors()), "VU'' generator errors")
}
Key Points
- Parallel Execution: Run each generator in its own goroutine.
- Synchronization: Use `time.Sleep` or channels to coordinate timing between load segments.
- Error Handling: Always check for errors in load generation, though they may not necessarily indicate a problem (e.g., in aggressive load tests).
Full Example
For a complete example, refer to this file.
WASP - How to Debug Loki Push Errors
To troubleshoot Loki push errors, follow these steps:
Step 1: Enable Trace Logging
Set the logging level to `trace` to see all messages sent to Loki:
WASP_LOG_LEVEL=trace
Step 2: Adjust Error Limits
If the Loki client fails to deliver a batch, the test will continue until the maximum number of errors is reached.
You can control this using `LokiConfig.MaxErrors`:
- Set `LokiConfig.MaxErrors` to a desired limit.
- Set it to `-1` to disable error checks entirely.
Often, simply increasing `LokiConfig.Timeout` can resolve the issue.
Step 3: Handle Specific Errors
If you encounter errors like the following:
ERR Malformed promtail log message, skipping Line=["level",{},"component","client","host","...","msg","batch add err","tenant","","error",{}]
Take the following actions:
- Increase `LokiConfig.MaxStreams`.
- Verify the validity of your Loki configuration.
Since `LokiConfig` is specific to each `Generator`, remember to adjust the configuration for all of your generators.
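Putting these knobs together, a minimal sketch of a generator's Loki settings (assuming the `wasp` package and the field names referenced in this section; verify them against your wasp version):

```go
// Sketch only: field names come from the docs above, not a pinned wasp API.
lokiCfg := wasp.NewEnvLokiConfig()
lokiCfg.MaxErrors = 10             // stop the test after 10 failed batch pushes; -1 disables the check
lokiCfg.Timeout = 30 * time.Second // raising the timeout often resolves delivery failures
lokiCfg.MaxStreams = 1000          // raise this if you see "batch add err" messages
```

Since `LokiConfig` is per-generator, pass the adjusted config to every `wasp.NewGenerator` call.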
By following these steps, you can effectively debug and resolve Loki push errors.
Havoc
The `havoc` package is designed to facilitate chaos testing within Kubernetes environments using Chaos Mesh. It offers a structured way to define, execute, and manage chaos experiments as code, directly integrated into Go applications or testing suites, simplifying the creation and control of Chaos Mesh experiments.
Features
- Chaos Object Management: Create, update, pause, resume, and delete chaos experiments using Go structures and methods.
- Lifecycle Hooks: Utilize chaos listeners to hook into the lifecycle of chaos experiments.
- Different Experiments: Create and manage different types of chaos experiments to affect network, IO, K8s pods, and more.
- Active Monitoring: Monitor and react to the status of chaos experiments programmatically.
Requirements
- Go
- A Kubernetes cluster with Chaos Mesh installed
Active Monitoring
`havoc` enhances chaos experiment observability through structured logging and Grafana annotations by implementing the ChaosListener interface.
The ChaosLogger is the default implementation. It uses zerolog to provide structured, queryable logging of chaos events. It automatically logs key lifecycle events such as creation, start, pause, and termination of chaos experiments with detailed contextual information.
Grafana Annotations
We recommend using Grafana dashboards to monitor your chaos experiments, and provide the SingleLineGrafanaAnnotator, a ChaosListener
that annotates dashboards with chaos experiment events so you can see in real time what your chaos experiment is doing.
You can also use the RangeGrafanaAnnotator to show the full range of a chaos event's duration rather than a single line.
Creating a Chaos Experiment
To create a chaos experiment, define the chaos object options, initialize a chaos experiment with NewChaos, and then call Create to start the experiment.
See this runnable example of defining a chaos experiment.
Preparing to Run Tests on Staging
Ensure you complete the following steps before executing tests on the staging environment:
1. Connect to the VPN
2. AWS Login with Staging Profile
   Authenticate to AWS using your staging profile, specifically with the `StagingEKSAdmin` role. Execute the following command:
   aws sso login --profile staging
3. Verify Authorization
   Confirm your authorization status by listing the namespaces in the staging cluster. Run `kubectl get namespaces`. If you see a list of namespaces, you have access to the staging cluster.
Running Tests
Creating an Image with the Test Binary
Before running tests, you must create a Docker image containing the test binary. To do this, execute the `create-test-image` command and provide the path to the test folder you wish to package. This command:
- Compiles the test binary under `<path-to-test-folder>`
- Creates a Docker image with the test binary
- Pushes the Docker image to the image registry (e.g., staging ECR)
go run ./cmd/main.go create-test-image --image-registry-url <staging-ecr-registry-url> --image-name "<image-name>" --image-tag "<image-tag>" "<path-to-test-folder>"
Where `image-tag` should be a descriptive name for your test, such as "mercury-load-tests".
Running the Test in Kubernetes
If a Docker image containing the test binary is available in an image registry (such as staging ECR), use the `run` command to execute the test in K8s.
go run ./cmd/main.go run -c "<path-to-test-runner-toml-config>"
The TOML config should specify the test runner configuration as follows:
namespace = "e2e-tests"
rbac_role_name = "" # RBAC role name for the chart
rbac_service_account_name = "" # RBAC service account name for the chart
image_registry_url = "" # URL to the ECR containing the test binary image, e.g., staging ECR URL
image_name = "k8s-test-runner"
image_tag = "" # The image tag to use, like "mercury-load-tests" (see readme above)
job_count = "1"
test_name = "TestMercuryLoad/all_endpoints"
test_timeout = "24h"
test_config_base64_env_name = "LOAD_TEST_BASE64_TOML_CONTENT"
test_config_file_path = "/Users/lukasz/Documents/test-configs/load-staging-testnet.toml"
resources_requests_cpu = "1000m"
resources_requests_memory = "512Mi"
resources_limits_cpu = "2000m"
resources_limits_memory = "1024Mi"
[envs]
WASP_LOG_LEVEL = "info"
TEST_LOG_LEVEL = "info"
MERCURY_TEST_LOG_LEVEL = "info"
[metadata.labels]
"chain.link/component" = "test-runner"
"chain.link/product" = "<your-product-name>"
"chain.link/team" = "<name-of-the-team-you're-running-the-test-for>"
"chain.link/cost-center" = "test-tooling-<testType>-test"
[NOTE] Make sure to quote labels containing "/"; otherwise, parsing them will fail.
Where:
- `test_name` is the name of the test to run (must be included in the test binary).
- `test_config_base64_env_name` is the name of the environment variable used to provide the test configuration to the test (optional).
- `test_config_file_path` is the path to the configuration file for the test (optional).
Using K8s Test Runner on CI
Example
This example demonstrates the process step by step. First, it shows how to download the Kubernetes Test Runner. Next, it details the use of the Test Runner to create a test binary specifically for the Mercury "e2e_tests/staging_prod/tests/load" test package. Finally, it describes executing the test in Kubernetes using a customized test runner configuration.
- name: Download K8s Test Runner
run: |
mkdir -p k8s-test-runner
cd k8s-test-runner
curl -L -o k8s-test-runner.tar.gz https://github.com/smartcontractkit/chainlink-testing-framework/releases/download/v0.2.4/test-runner.tar.gz
tar -xzf k8s-test-runner.tar.gz
chmod +x k8s-test-runner-linux-amd64
Alternatively, you can place the k8s-test-runner package within your repository and unpack it:
- name: Unpack K8s Test Runner
run: |
cd e2e_tests
mkdir -p k8s-test-runner
tar -xzf k8s-test-runner-v0.0.1.tar.gz -C k8s-test-runner
chmod +x k8s-test-runner/k8s-test-runner-linux-amd64
Then:
- name: Build K8s Test Runner Image
if: github.event.inputs.test-type == 'load' && github.event.inputs.rebuild-test-image == 'yes'
run: |
cd e2e_tests/k8s-test-runner
./k8s-test-runner-linux-amd64 create-test-image --image-registry-url "${{ secrets.AWS_ACCOUNT_ID_STAGING }}.dkr.ecr.${{ secrets.AWS_REGION }}.amazonaws.com" --image-tag "mercury-load-test" "../staging_prod/tests/load"
- name: Run Test in K8s
run: |
cd e2e_tests/k8s-test-runner
cat << EOF > config.toml
namespace = "e2e-tests"
rbac_role_name = "" # RBAC role name for the chart
rbac_service_account_name = "" # RBAC service account name for the chart
image_registry_url = "${{ secrets.AWS_ACCOUNT_ID_STAGING }}.dkr.ecr.${{ secrets.AWS_REGION }}.amazonaws.com"
image_name = "k8s-test-runner"
image_tag = "mercury-load-test"
job_count = "1"
chart_path = "./chart"
test_name = "TestMercuryLoad/all_endpoints"
test_timeout = "24h"
resources_requests_cpu = "1000m"
resources_requests_memory = "512Mi"
resources_limits_cpu = "2000m"
resources_limits_memory = "1024Mi"
test_config_base64_env_name = "LOAD_TEST_BASE64_TOML_CONTENT"
test_config_base64 = "${{ steps.conditional-env-vars.outputs.LOAD_TEST_BASE64_TOML_CONTENT }}"
[envs]
WASP_LOG_LEVEL = "info"
TEST_LOG_LEVEL = "info"
MERCURY_TEST_LOG_LEVEL = "info"
[metadata.labels]
"chain.link/component" = "test-runner"
"chain.link/product" = "data-streamsv0.3"
"chain.link/team" = "Data Streams"
"chain.link/cost-center" = "test-tooling-load-test"
EOF
./k8s-test-runner-linux-amd64 run -c config.toml
Release
Run ./package <version>
Sentinel
🚧 Beta Notice: Sentinel is currently in Beta mode. The API is subject to change, and users are advised to stay updated with the latest releases and documentation.
Overview
Sentinel is a centralized orchestrator that manages multiple blockchain poller services, each responsible for a specific blockchain network (e.g., Ethereum, Optimism, Arbitrum). It provides a unified interface for subscribing to blockchain events, ensuring efficient log polling and event broadcasting to subscribers.
Key Features
- Multi-Chain Support: Manage multiple blockchain networks concurrently.
- Event Broadcasting: Relay blockchain events to subscribers via a thread-safe subscription system.
- Flexible Subscriptions: Dynamically subscribe and unsubscribe to events based on addresses and topics.
- Graceful Lifecycle Management: Start, stop, and clean up resources across services effortlessly.
- Comprehensive Testing: Ensures reliability through extensive unit and integration tests.
- Scalable Architecture: Designed to handle polling multiple chains with multiple users subscribed to multiple events.
System Architecture
How Components Interact
graph TD
Sentinel["Sentinel<br/>(Coordinator)"]
subgraph Ethereum
ChainPollerSvc_Ethereum["ChainPollerSvc<br/>(Ethereum)"]
ChainPoller_Ethereum["ChainPoller<br/>(Log Fetching)"]
SubscriptionManager_Ethereum["Subscription Manager"]
ChainPollerSvc_Ethereum --> ChainPoller_Ethereum
ChainPollerSvc_Ethereum --> SubscriptionManager_Ethereum
ChainPoller_Ethereum --> Blockchain_Ethereum["Blockchain<br/>(Ethereum)"]
end
subgraph Polygon
ChainPollerSvc_Polygon["ChainPollerSvc<br/>(Polygon)"]
ChainPoller_Polygon["ChainPoller<br/>(Log Fetching)"]
SubscriptionManager_Polygon["Subscription Manager"]
ChainPollerSvc_Polygon --> ChainPoller_Polygon
ChainPollerSvc_Polygon --> SubscriptionManager_Polygon
ChainPoller_Polygon --> Blockchain_Polygon["Blockchain<br/>(Polygon)"]
end
subgraph Arbitrum
ChainPollerSvc_Arbitrum["ChainPollerSvc<br/>(Arbitrum)"]
ChainPoller_Arbitrum["ChainPoller<br/>(Log Fetching)"]
SubscriptionManager_Arbitrum["Subscription Manager"]
ChainPollerSvc_Arbitrum --> ChainPoller_Arbitrum
ChainPollerSvc_Arbitrum --> SubscriptionManager_Arbitrum
ChainPoller_Arbitrum --> Blockchain_Arbitrum["Blockchain<br/>(Arbitrum)"]
end
Sentinel --> Ethereum
Sentinel --> Polygon
Sentinel --> Arbitrum
Core Components
- Sentinel:
  - Role: Central coordinator managing multiple `ChainPollerService` instances.
  - Visibility: External
  - Responsibilities:
    - Handles adding and removing blockchain chains.
    - Manages global subscriptions.
    - Orchestrates communication between components.
- ChainPollerService:
  - Role: Manages the polling process for a specific blockchain.
  - Visibility: Internal
  - Responsibilities:
    - Polls blockchain logs based on filter queries.
    - Integrates the internal `ChainPoller` and `SubscriptionManager`.
    - Broadcasts fetched logs to relevant subscribers.
- ChainPoller:
  - Role: Fetches logs from blockchain networks.
  - Visibility: Internal
  - Responsibilities:
    - Interacts with the blockchain client to retrieve logs.
    - Processes filter queries to fetch relevant logs.
- SubscriptionManager:
  - Role: Manages event subscriptions for a specific chain.
  - Visibility: Internal
  - Responsibilities:
    - Tracks subscriptions to blockchain events.
    - Ensures thread-safe management of subscribers.
    - Broadcasts logs to all relevant subscribers.
Usage
Initialize Sentinel
Set up a Sentinel
instance:
package main
import (
"github.com/rs/zerolog"
"os"
"github.com/smartcontractkit/chainlink-testing-framework/sentinel"
)
func main() {
// Initialize logger
logger := zerolog.New(os.Stdout).With().Timestamp().Logger()
// Initialize Sentinel
sentinelCoordinator := sentinel.NewSentinel(sentinel.SentinelConfig{
Logger: &logger,
})
defer sentinelCoordinator.Close()
}
Add a Chain
Add a blockchain to monitor:
package main
import (
"time"
"github.com/ethereum/go-ethereum/ethclient"
"github.com/smartcontractkit/chainlink-testing-framework/sentinel/blockchain_client_wrapper"
"github.com/smartcontractkit/chainlink-testing-framework/sentinel/sentinel"
)
func main() {
// Initialize logger and Sentinel as shown above
// Setup blockchain client (e.g., Geth)
client, err := ethclient.Dial("https://mainnet.infura.io/v3/YOUR-PROJECT-ID")
if err != nil {
panic("Failed to connect to blockchain client: " + err.Error())
}
wrappedClient := blockchain_client_wrapper.NewGethClientWrapper(client)
// Add a new chain to Sentinel
err = sentinelCoordinator.AddChain(sentinel.AddChainConfig{
ChainID: 1, // Ethereum Mainnet
PollInterval: 10 * time.Second,
BlockchainClient: wrappedClient,
})
if err != nil {
panic("Failed to add chain: " + err.Error())
}
}
Subscribe to Events
Subscribe to blockchain events:
package main
import (
"fmt"
"github.com/ethereum/go-ethereum/common"
"github.com/smartcontractkit/chainlink-testing-framework/sentinel/api"
)
func main() {
// Initialize logger, Sentinel, and add a chain as shown above
// Define the address and topic to subscribe to
address := common.HexToAddress("0x1234567890abcdef1234567890abcdef12345678")
topic := common.HexToHash("0xabcdefabcdefabcdefabcdefabcdefabcdefabcdef")
// Subscribe to the event
logCh, err := sentinelCoordinator.Subscribe(1, address, topic)
if err != nil {
panic("Failed to subscribe: " + err.Error())
}
defer sentinelCoordinator.Unsubscribe(1, address, topic, logCh)
// Listen for logs in a separate goroutine
go func() {
for log := range logCh {
fmt.Printf("Received log: %+v\n", log)
}
}()
}
Unsubscribe
Unsubscribe from events:
package main
func main() {
// Initialize logger, Sentinel, add a chain, and subscribe as shown above
// Assume logCh is the channel obtained from Subscribe
err = sentinelCoordinator.Unsubscribe(1, address, topic, logCh)
if err != nil {
panic("Failed to unsubscribe: " + err.Error())
}
}
Remove a Chain
Remove a blockchain from monitoring:
package main
func main() {
// Initialize logger, Sentinel, add a chain, and subscribe as shown above
// Remove the chain
err = sentinelCoordinator.RemoveChain(1)
if err != nil {
panic("Failed to remove chain: " + err.Error())
}
}
API Reference
Sentinel
- `NewSentinel(config SentinelConfig) *Sentinel`: Initializes a new Sentinel instance.
- `AddChain(config AddChainConfig) error`: Adds a new blockchain chain to Sentinel.
- `RemoveChain(chainID int64) error`: Removes an existing chain from Sentinel.
- `Subscribe(chainID int64, address common.Address, topic common.Hash) (chan api.Log, error)`: Subscribes to a specific event on a given chain.
- `Unsubscribe(chainID int64, address common.Address, topic common.Hash, ch chan api.Log) error`: Unsubscribes from a specific event.
Testing
Run Tests
Run the comprehensive test suite using:
go test -race ./... -v
Releasing Go modules
The Chainlink Testing Framework (CTF) repository contains multiple independent modules. To release any of them, we follow some best practices about breaking changes.
Release strategy
Use only lightweight tags
Do not move tags between commits. If something needs to be fixed, increment the patch or minor version.
Steps to release:
- When all your PRs are merged to `main`, check the `main` branch breaking-changes badge.
- If there are no breaking changes for external methods, create a branch and explain all your module changes in `vX.X.X.md` committed under the `.changeset` dir in your module. If the changes are really short, and you run the script locally, you can push `.changeset` as part of your final feature PR.
- If there are accidental breaking changes, and it is possible to make them backward compatible, fix them.
- If there are breaking changes and we must release them, change the `go.mod` path by adding the prefix `/vX`, then merge your PR(s).
- When all the changes are merged, and there are no breaking changes in the pipeline, proceed with releasing.
- Tag the `main` branch in the format `$pkg/$subpkg/vX.X.X` according to your changes and push it, for example:
  git tag $pkg/$subpkg/v1.1.0 && git push --tags
  git tag $pkg/$subpkg/v1.1.1 && git push --tags
  git tag $pkg/$subpkg/v2.0.0 && git push --tags
- Check the release page
Binary releases
If your module has a `cmd/main.go`, we automatically build binaries for various platforms and attach them to the release page.
Debugging the release pipeline and the `gorelease` tool
Checkout `dummy-release-branch` and release it:
- git tag dummy-module/v0.X.0
- Add `vX.X.X.md` in `.changeset`
- git push --no-verify --force && git push --tags
- Check releases
Pipelines:
Check breaking changes locally
We have a simple wrapper to check breaking changes for all the packages. Commit all your changes and run:
go run ./tools/breakingchanges/cmd/main.go
go run ./tools/breakingchanges/cmd/main.go --subdir wasp # check recursively starting with subdir
go run ./tools/breakingchanges/cmd/main.go --ignore tools,wasp,havoc,seth # to ignore some packages
DEPRECATED: This is the v1 version, and it is not actively maintained
The purpose of this framework is to:
- Interact with different blockchains
- Configure CL jobs
- Deploy using `docker`
- Deploy using `k8s`
If you're looking to implement a new chain integration for the testing framework, head over to the blockchain directory for more info.
k8s package
We have a k8s package that we use in tests. It provides:
- cdk8s based wrappers
- High-level k8s API
- Automatic port forwarding
You can also use this package to spin up standalone environments.
Local k8s cluster
Read here about how to spin up a local cluster
Install
Set up dependencies; you need to have node `14.x.x`, helm, and yarn.
Then use:
make install_deps
Optional Nix
We have set up a nix shell which produces a reliable environment that behaves the same locally and in CI. To use it instead of the above, you will need to install nix.
To start the nix shell run:
make nix_shell
If you install direnv, your environment will start the nix shell as soon as you `cd` into the directory, once you have allowed it via:
direnv allow
Running tests in k8s
To read how to run a test in k8s, read here
Usage
With env vars (deprecated)
Create an env in a separate file and run it
export CHAINLINK_IMAGE="public.ecr.aws/chainlink/chainlink"
export CHAINLINK_TAG="1.4.0-root"
export CHAINLINK_ENV_USER="Satoshi"
go run k8s/examples/simple/env.go
For more features, follow the tutorial.
With TOML config
It should be noted that using env vars for configuring CL nodes in k8s is deprecated. TOML config should be used instead:
[ChainlinkImage]
image="public.ecr.aws/chainlink/chainlink"
version="v2.8.0"
Check the example here: env.go
Development
Running standalone example environment
go run k8s/examples/simple/env.go
If you have another env of that type, you can connect to it by overriding the environment name:
ENV_NAMESPACE="..." go run k8s/examples/chainlink/env.go
Add more presets here
Add more programmatic examples here
If you have Chaos Mesh installed in your cluster, you can pull and generate CRDs in Go like this:
make chaosmesh
If you need to check your system tests' coverage, use this:
Chainlink Charts
This repository contains helm charts used by the chainlink organization mostly in QA.
Chart Repository
You can add the published chart repository by pointing helm to the gh-pages
branch with a personal access token (PAT) that has at least read-only access to the repository.
helm repo add chainlink-qa https://raw.githubusercontent.com/smartcontractkit/qa-charts/gh-pages/
helm search repo chainlink
Releasing Charts
The following cases will trigger a chart release once a PR is merged into the `main` branch:
- An existing chart is version bumped
- A new chart is added
Modified or new packages get packaged and pushed to the `gh-pages` branch of the qa-charts repository.
Removed charts do not trigger a re-publish; the packages have to be removed and the index file regenerated in the `gh-pages` branch of the qa-charts repository.
Note: The qa-charts repository is scheduled to look for changes to the charts once every hour. This can be expedited by going to that repo and running the cd action via github UI.
Simulated EVM chains
We have extended support for execution layer clients in simulated networks. The following ones are supported:
- Geth
- Nethermind
- Besu
- Erigon
When it comes to the consensus layer, we currently support only `Prysm`.
The easiest way to start a simulated network is to use a builder. It allows you to configure the network in a fluent way and then start it. For example:
builder := NewEthereumNetworkBuilder()
cfg, err := builder.
WithEthereumVersion(EthereumVersion_Eth2).
WithExecutionLayer(ExecutionLayer_Geth).
Build()
Since we support both `eth1` (aka pre-Merge) and `eth2` (aka post-Merge) client versions, you need to specify which one you want to use by calling the `WithEthereumVersion` method. There's no default. The only exception is when you use custom Docker images (instead of the default ones), because then we can determine the version from the image.
If you want your test to execute as fast as possible, go for `eth1`, since it uses either a fake PoW or PoA consensus and is much faster than `eth2`, which uses PoS consensus (where there is a minimum viable slot/block length of 4 seconds; for `eth1` it's 1 second). If you want to test the latest features, changes, or forks in the Ethereum network, and have your tests run on a network as close as possible to Ethereum Mainnet, go for `eth2`.
Every component has a default Docker image it uses, but the builder has a method that allows you to pass a custom one:
builder := NewEthereumNetworkBuilder()
cfg, err := builder.
WithEthereumVersion(EthereumVersion_Eth2).
WithConsensusLayer(ConsensusLayer_Prysm).
WithExecutionLayer(ExecutionLayer_Geth).
WithCustomDockerImages(map[ContainerType]string{
ContainerType_Geth: "my-custom-geth-pos-image:my-version"}).
Build()
When using a custom image, you can simplify the builder even further by calling only the `WithCustomDockerImages` method. Based on the image name and version, we will determine which execution layer client it is and whether it's an `eth1` or `eth2` client:
builder := NewEthereumNetworkBuilder()
cfg, err := builder.
WithCustomDockerImages(map[ContainerType]string{
ContainerType_Geth: "ethereum/client-go:v1.13.10"}).
Build()
In the case above, we would launch a `Geth` client on an `eth2` network with the `Prysm` consensus layer.
You can also configure the epochs at which hard forks will happen. Currently only `Deneb` is supported. The epoch must be >= 1. Example:
builder := NewEthereumNetworkBuilder()
cfg, err := builder.
WithConsensusType(ConsensusType_PoS).
WithConsensusLayer(ConsensusLayer_Prysm).
WithExecutionLayer(ExecutionLayer_Geth).
WithEthereumChainConfig(EthereumChainConfig{
HardForkEpochs: map[string]int{"Deneb": 1},
}).
Build()
Command line
You can start a simulated network with a single command:
go run docker/test_env/cmd/main.go start-test-env private-chain
By default, it will start a network with 1 node running `Geth` and `Prysm`. It will use the default chain ID of `1337` and won't wait for the chain to finalize at least one epoch. Once the chain is started, it will save the network configuration in a `JSON` file, which you can then use in your tests to connect to that chain (and thus save the time it takes to start a new chain on each test run).
The following command-line flags are available:
-c, --chain-id int chain id (default 1337)
-l, --consensus-layer string consensus layer (prysm) (default "prysm")
-t, --consensus-type string consensus type (pow or pos) (default "pos")
-e, --execution-layer string execution layer (geth, nethermind, besu or erigon) (default "geth")
-w, --wait-for-finalization wait for finalization of at least 1 epoch (might take up to 5 minutes)
--consensus-client-image string custom Docker image for consensus layer client
--execution-layer-image string custom Docker image for execution layer client
--validator-image string custom Docker image for validator
To connect to that environment in your tests use the following code:
builder := NewEthereumNetworkBuilder()
cfg, err := builder.
WithExistingConfigFromEnvVar().
Build()
if err != nil {
return err
}
net, rpc, err := cfg.Start()
if err != nil {
return err
}
The builder will read the location of the chain configuration from an env var named `PRIVATE_ETHEREUM_NETWORK_CONFIG_PATH` (it will be printed in the console once the chain starts).
`net` is an instance of `blockchain.EVMNetwork`, which contains the characteristics of the network and can be used to connect to it with an EVM client. The `rpc` variable contains arrays of public and private RPC endpoints, where "private" means a URL that's accessible from the same Docker network the chain is running in.
Logs
By default, we save the logs of all Docker containers running on the host machine when the test ends (regardless of whether it failed or succeeded). They will be available in the `./logs/<test-name><date>` directory. The same goes for dumping the databases of the `PostgresDBs` used by the Chainlink nodes; these will be saved in the `./db_dumps/<test-name><date>` directory.
Loki and Grafana
If you need to pass Loki or Grafana configuration to your tests you can do that by providing the following config:
[Logging.Loki]
tenant_id="promtail"
url="https://change.me"
basic_auth_secret="my-secret-auth"
bearer_token_secret="bearer-token"
Also, remember that a different URL should be used when running in CI than elsewhere: in CI it should be a public endpoint, while in a local environment it should be a private one.
If your test has a Grafana dashboard you should provide the following config:
[Logging.Grafana]
base_url="https://your-grafana-url"
dashboard_url="/my-dashboard"
dashboard_uid="my-dashboard-uid" # optional
Grouping test execution
When running tests in CI, you're probably interested in grouping logs by test execution, so that you can easily find the logs in Loki. To do that, your job should set the `RUN_ID` environment variable. In GHA, it's recommended to set it to the workflow ID. If that variable is not set, a run ID will be automatically generated and saved in the `.run.id` file, so that it can be shared by tests that are part of the same execution but run in different processes.
Test Summary
To facilitate displaying information in GH's step summary, the `testsummary` package was added. It exposes a single function, `AddEntry(testName, key string, value interface{})`. When you call it, it either creates a test summary JSON file or appends to it. The result is a map of keys with values.
Example:
{
"file":[
{
"test_name":"TestOCRv2Basic",
"value":"./logs/TestOCRv2Basic-2023-12-01T18-00-59-TestOCRv2Basic-38ac1e52-d0a6-48"
}
],
"loki":[
{
"test_name":"TestOCRv2Basic",
"value":"https://grafana.ops.prod.cldev.sh/d/ddf75041-1e39-42af-aa46-361fe4c36e9e/ci-e2e-tests-logs?orgId=1\u0026var-run_id=TestOCRv2Basic-38ac1e52-d0a6-48\u0026var-container_id=cl-node-a179ca7d\u0026var-container_id=cl-node-76798f87\u0026var-container_id=cl-node-9ff7c3ae\u0026var-container_id=cl-node-43409b09\u0026var-container_id=cl-node-3b6810bd\u0026var-container_id=cl-node-69fed256\u0026from=1701449851165\u0026to=1701450124925"
}
]
}
In GHA, after the tests have ended, we can use tools like `jq` to extract the information we need and display it in the step summary.
TOML Config
Basic and universal building blocks for TOML-based config are provided by the `config` package. For more information, read this.
ECR Mirror
An ECR mirror can be used to push frequently used images in order to bypass rate-limit issues from Docker Hub. The list of image mirrors can be found in the matrix here. This currently works for Docker Hub images with version numbers; support for GCR is coming in the future. The images must have a version number, so `latest` will not work. We have a separate list for one-offs, which can be added to here, and which does work with GCR and `latest` images. Note, however, that a `latest` image is only pulled once and will not be updated in our mirror if `latest` on the public repository changes; in that case it is preferable to update it manually, when you know you need the new `latest` and the update will not break your tests.
For images in the mirrors, you can set the `INTERNAL_DOCKER_REPO` environment variable when running tests, and it will use that mirror in place of the public repository.
We have two ways to add new images to the ECR. In both cases, you must first create the ECR repository in AWS with the same name as the one in Docker Hub, and then add that ECR to the infra permissions (ask TT if you don't know how to do this).
- If the image does not have version numbers, or is on GCR, you can add it here.
- You can add the new image name to the mirror matrix, along with an expression to pull in the latest versions when the workflow runs. Check the postgres entry there for an example; the expression should filter out only the latest image (or two) for a particular version when calling the Docker Hub endpoint. Example curl call:
curl -s "https://hub.docker.com/v2/repositories/${image_name}/tags/?page_size=100" | jq -r '.results[].name' | grep -E ${image_expression}
where `image_name` could be `library/postgres` and `image_expression` could be `'^[0-9]+\.[0-9]+$'`. Adding your ECR to this matrix ensures we always have the latest versions matching that expression.
Debugging HTTP and RPC calls
export SETH_LOG_LEVEL=info
export RESTY_DEBUG=true
Loki Client
The `LokiClient` allows you to easily query Loki logs from your tests. It supports basic authentication and custom queries, and can be configured for (Resty) debug mode.
Debugging Resty and Loki Client
export LOKI_CLIENT_LOG_LEVEL=info
export RESTY_DEBUG=true
Example usage:
auth := LokiBasicAuth{
Username: os.Getenv("LOKI_LOGIN"),
Password: os.Getenv("LOKI_PASSWORD"),
}
queryParams := LokiQueryParams{
Query: `{namespace="test"} |= "test"`,
StartTime: time.Now().AddDate(0, 0, -1),
EndTime: time.Now(),
Limit: 100,
}
lokiClient := NewLokiClient("https://loki.api.url", "my-tenant", auth, queryParams)
logEntries, err := lokiClient.QueryLogs(context.Background())
Blockchain Clients
This documentation is deprecated; we use it in Chainlink Integration Tests.
If you want to test our new products, use v2.
This folder contains the bulk of code that handles integrating with different EVM chains. If you're looking to run tests on a new EVM chain, and are having issues with the default implementation, you've come to the right place.
Some Terminology
- L2 Chain: A Layer 2 chain "branching" off Ethereum.
- EVM: Ethereum Virtual Machine that underpins the Ethereum blockchain.
- EVM Compatible: A chain that has some large, underlying differences from how base Ethereum works, but can still be interacted with largely the same way as Ethereum.
- EIP-1559: The Ethereum Improvement Proposal that changed how gas fees are calculated and paid on Ethereum
- Legacy Transactions: Transactions that are sent using the old gas fee calculation method, the one used before EIP-1559.
- Dynamic Fee Transaction: Transactions that are sent using the new gas fee calculation method, the one used after EIP-1559.
How Client Integrations Work
In order to test Chainlink nodes, the chainlink-testing-framework
needs to be able to interact with the chain that the node is running on. This is done through the blockchain.EVMClient
interface. The EVMClient
interface is a wrapper around geth to interact with the blockchain. We conduct all our testing blockchain operations through this wrapper, like sending transactions and monitoring on-chain events. The primary implementation of this wrapper is built for Ethereum. Most others, like the Metis and Optimism integrations, extend and modify the base Ethereum implementation.
Do I Need a New Integration?
If you're reading this, probably. The default EVM integration is designed to work with mainnet Ethereum, which covers most other EVM chain interactions, but it's not guaranteed to work with all of them. If you're on a new chain and the test framework is throwing errors while doing basic things like sending transactions, receiving new headers, or deploying contracts, you'll likely need to create a new integration. The most common issues with new chains (especially L2s) are gas estimation and lack of support for dynamic fee transactions.
Creating a New Integration
Take a look at the Metis integration as an example. Metis is an L2, EVM compatible chain. It's largely the same as the base Ethereum integration, so we'll extend from that.
type MetisMultinodeClient struct {
*EthereumMultinodeClient
}
type MetisClient struct {
*EthereumClient
}
Now we need to let other libraries (like our tests in the main Chainlink repo) know that this integration exists. So we add the new implementation to the known_networks.go file. We can then add that network to our tests' own known_networks.go file (it's annoying; there are plans to simplify this).
Now our Metis integration is exactly the same as our base Ethereum one, which doesn't do us much good. I'm assuming you came here to make some changes, so first let's find out what we need to change. This is a mix of reading the developer documentation for the chain you're testing and trial and error (mostly the latter in later stages). In the case of Metis, like many L2s, they have their own spin on gas fees, and they only support legacy transactions. So we'll need to override any methods that deal with gas estimations: Fund
, DeployContract
, and ReturnFunds
.
Concurrency
It's a small library that simplifies dividing N tasks between X workers, with only two options:
- payload
- early exit on first error
It was created for parallelising chain interaction with Seth, where strict ordering and association of a given private key with a specific contract is required.
note
This library is overkill if all you need to do is deploy 10 contracts, where each deployment
consists of a single transaction.
But... if your deployment flow is multi-stepped and it's crucial that all operations are executed
using the same private key (e.g. due to privileged access), it might be a good pick. Especially if
you don't want to extensively test a native WaitGroup
/ErrGroup
-based solution.
No payload
If the task to be executed requires no payload (or it's the same for each task), using the tool is much simpler.
First you need to create an instance of the executor:
l := logging.GetTestLogger(nil)
executor := concurrency.NewConcurrentExecutor[ContractInstance, contractResult, concurrency.NoTaskType](l)
Where generic parameters represent (from left to right):
- type of execution result
- type of channel that holds the results
- type of task payload
In our case, we want the execution to return ContractInstance
s, which will be stored in this type:
type contractResult struct {
instance ContractInstance
}
And which won't use any payload, as indicated by a no-op concurrency.NoTaskType
.
Then, we need to define a function that will be executed for each task. For example:
var deployContractFn = func(channel chan contractResult, errorCh chan error, executorNum int) {
keyNum := executorNum + 1 // key 0 is the root key
instance, err := client.deployContractFromKey(keyNum)
if err != nil {
errorCh <- err
return
}
channel <- contractResult{instance: instance}
}
It needs to have the following signature:
type SimpleTaskProcessorFn[ResultChannelType any] func(resultCh chan ResultChannelType, errorCh chan error, executorNum int)
and send results of successful execution to resultCh
and errors to errorCh
.
Once the processing function is defined all that's left is the execution:
results, err := executor.ExecuteSimple(client.getConcurrency(), numberOfContracts, deployContractFn)
Parameters for ExecuteSimple
(without payload) are as follows (from left to right):
- concurrency count (number of parallel executors)
- total number of executions
- function to execute
results
contain a slice with results of each execution with ContractInstance
type.
err
will be non-nil if any of the executions failed. To get all errors you should call executor.GetErrors()
.
With payload
If your tasks need payload, then two things change.
First, you need to pass task type, when creating the executor instance:
executor := concurrency.NewConcurrentExecutor[ContractInstance, contractResult, contractConfiguration](l)
Here, it's set to a dummy type:
type contractConfiguration struct{}
Second, the signature of processing function:
type TaskProcessorFn[ResultChannelType, TaskType any] func(resultCh chan ResultChannelType, errorCh chan error, executorNum int, payload TaskType)
Which now includes a fourth parameter representing the payload, which the function's implementation can make use of.
note
You can find the full example here.
Client
We support a variety of clients that ease communication with the following applications:
AWS Secrets Manager
This simple client makes it even easier to:
- read
- create
- remove secrets from AWS Secrets Manager.
Creating a new instance is straightforward. You can use either environment variables or shared configuration and credentials.
note
Environment variables take precedence over shared credentials.
Using environment variables
You can pass the required configuration as the following environment variables:
AWS_ACCESS_KEY_ID
AWS_SECRET_ACCESS_KEY
AWS_REGION
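For example (all three values below are placeholders):

```shell
export AWS_ACCESS_KEY_ID="AKIA-placeholder"
export AWS_SECRET_ACCESS_KEY="placeholder-secret"
export AWS_REGION="us-east-1"
```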
Using shared credentials
If you have shared credentials stored in .aws/credentials
file, then the easiest way to configure the client is by setting
AWS_PROFILE
environment variable with the profile name. If that environment variable is not set, the SDK will try to use the default profile.
warning
Remember that you will most likely need to manually create a new session for that profile before running your application.
note
You can read more about configuring the AWS SDK here.
Once you have an instance of the AWS Secrets Manager client, you gain access to the following functions:
CreateSecret(key string, val string, override bool) error
GetSecret(key string) (AWSSecret, error)
RemoveSecret(key string, noRecovery bool) error
Github
This small client makes it easy to get N
latest releases or tags from any Github.com repository. To use it, all you need to have
is a properly scoped access token.
publicRepoClient := NewGithubClient(WITHOUT_TOKEN)
// "smartcontractkit", "chainlink"
latestCLReleases, err := publicRepoClient.ListLatestCLCoreReleases(10)
if err != nil {
panic(err)
}
// "smartcontractkit", "chainlink"
latestCLTags, err := publicRepoClient.ListLatestCLCoreTags(10)
if err != nil {
panic(err)
}
privateRepoClient := NewGithubClient("my-secret-PAT")
myLatestReleases, err := privateRepoClient.ListLatestReleases("my-org", "my-private-repo", 5)
if err != nil {
panic(err)
}
There's really not much more to it...
Grafana
note
Contrary to other clients, you will find the Grafana client in a separate package & Go module.
The Grafana client encapsulates the following functionalities:
- Dashboard creation
- Managing dashboard annotations (CRUD)
- Checking alerts
New instance
In order to create a new instance you will need:
- URL
- API token
For example:
url := "http://grafana.io"
apiToken := "such-a-secret-1&11n"
gc := NewGrafanaClient(url, apiToken)
Dashboard creation
You can create a new dashboard defined in JSON with:
//define your dashboard here
dashboardJson := ``
request := PostDashboardRequest{
	Dashboard: dashboardJson,
	FolderId:  5, // change to your folder id
}
dr, rawResponse, err := gc.PostDashboard(request)
if err != nil {
panic(err)
}
if rawResponse.StatusCode() != 200 {
	panic(fmt.Sprintf("response code wasn't 200, but %d", rawResponse.StatusCode()))
}
fmt.Println("Dashboard slug is " + *dr.Slug)
Posting annotations
You can post annotations in the following way:
annDetails := PostAnnotation{
	DashboardUID: "some-uid",
	Time:         time.Now(),
	TimeEnd:      time.Now().Add(1 * time.Second),
	Text:         "my test annotation",
}
r, rawResponse, err := gc.PostAnnotation(annDetails)
if err != nil {
	panic(err)
}
if rawResponse.StatusCode() != 200 {
	panic(fmt.Sprintf("response code wasn't 200, but %d", rawResponse.StatusCode()))
}
fmt.Printf("Created annotation with id: %v\n", r.Id)
Checking alerts
You can check alerts firing for a dashboard with UID:
alerts, rawResponse, err := gc.AlertRulerClient.GetAlertsForDashboard("some-uid")
if err != nil {
	panic(err)
}
if rawResponse.StatusCode() != 200 {
	panic(fmt.Sprintf("response code wasn't 200, but %d", rawResponse.StatusCode()))
}
for name, value := range alerts {
	fmt.Println("Alert named " + name + " was triggered. Details: " + string(value))
}
Troubleshooting
To enable debug mode for the underlying HTTP client set RESTY_DEBUG
environment variable to true
.
Kafka
This is a wrapper over an HTTP client that can only return a list of topics from a Kafka instance.
client, err := NewKafkaRestClient(&KafkaRestClient{URL: "my-kafka-url"})
if err != nil {
panic(err)
}
topics, err := client.GetTopics()
if err != nil {
panic(err)
}
for _, topic := range topics {
fmt.Println("topic: " + topic)
}
Loki
Loki client simplifies querying of Loki logs with LogQL
.
The way it's designed now implies that:
- you need to create a new client instance for each query
- query results are returned as
string
New instance
To create a new instance you need to provide at least the following:
- Loki URL
- query to execute
- time range
// scheme is required
lokiUrl := "http://loki-host.io"
// can be empty
tenantId := "promtail"
basicAuth := LokiBasicAuth{
Login: "admin",
Password: "oh-so-secret",
}
queryParams := LokiQueryParams{
	Query:     `quantile_over_time(0.5, {name="my awesome app"} | json | unwrap duration [10s]) by (name)`,
	StartTime: time.Now().Add(-1 * time.Hour),
	EndTime:   time.Now(),
	Limit:     1000,
}
lokiClient := client.NewLokiClient(lokiUrl, tenantId, basicAuth, queryParams)
If your instance doesn't have basic auth, use an empty struct:
basicAuth := LokiBasicAuth{}
Executing a query
Once you have the client instance created you can execute the query with:
ctx, cancelFn := context.WithTimeout(context.Background(), 3*time.Minute)
defer cancelFn()
results, err := lokiClient.QueryLogs(ctx)
if err != nil {
panic(err)
}
for _, logEntry := range results {
fmt.Println("At " + logEntry.Timestamp + " found the following log: " + logEntry.Log)
}
Log entry types
Loki can return various data types in responses to queries. We will try to convert the following ones to string
:
int
float64
If it's neither of these types nor a string, the client will return an error. The same will happen if nil is returned.
Troubleshooting
If you find yourself in trouble, these two environment variables might help you:
- RESTY_DEBUG set to true will enable debug mode for the underlying HTTP client
- LOKI_CLIENT_LOG_LEVEL controls the log level of the Loki client (for supported log levels check the logging package documentation)
MockServer
This client is responsible for simplifying interaction with MockServer during test execution (e.g. to dynamically create or modify mocked responses).
Initialising
There are three ways of initializing the client:
- providing the URL
- providing pointer to
Environment
(CTF's representation of a k8s environment) - providing pointer to raw
MockserverConfig
var myk8sEnv *environment.Environment
// ... create k8s environment
k8sMockserver := ConnectMockServer(myk8sEnv)
mockServerUrl := "http://my.mockserver.instance.io"
myMockServer := ConnectMockServerURL(mockServerUrl)
customConfig := &MockserverConfig{
LocalURL: mockServerUrl,
ClusterURL: mockServerUrl,
Headers: map[string]string{"x-secret-auth-header": "such-a-necessary-auth-header"},
}
myCustomMockServerClient := NewMockserverClient(customConfig)
In most cases you should initialise it with *environment.Environment
, unless you need to pass custom headers to make the connection possible.
Typical usages
Arbitrary mock
You can set any desired behaviour by using PutExpectations(body interface{}) error
function, where the body should be a JSON string conforming to
MockServer's format that consists of a request matcher and corresponding action. For example:
var myk8sEnv *environment.Environment
returnOk := `{
"httpRequest": {
"method": "GET",
"path": "/status"
},
"httpResponse": {
"statusCode": 200,
"body": {
"message": "Service is running"
}
}
}`
ms := ConnectMockServer(myk8sEnv)
err := ms.PutExpectations(returnOk)
if err != nil {
panic(err)
}
note
You can read more about expectations syntax, including OpenAPI v3 or dynamic expectations with JavaScript, here.
Returning a random integer
To return a random integer in the response body, together with a 200
status code for all requests with a given path, use:
err := ms.SetRandomValuePath("/api/v2/my_endpoint")
if err != nil {
panic(err)
}
Returning a static integer
To return a static integer in the response body, together with 200
status code for all requests with a given path, use:
err := ms.SetValuePath("/api/v2/my_endpoint", 10) // 10 is an example value
if err != nil {
panic(err)
}
Returning a static string
To return a static string in the response body, together with 200
status code for all requests with a given path, use:
err := ms.SetStringValuePath("/api/v2/my_endpoint", "oh-my")
if err != nil {
panic(err)
}
Returning arbitrary data
To return arbitrary data in the response body, together with 200
status code for all requests with a given path, use:
err := ms.SetAnyValueResponse("/api/v2/my_endpoint", []int{1, 2, 3})
if err != nil {
panic(err)
}
Returning an arbitrary response from an external adapter
To mock a response for Chainlink's external adapter for all requests with a given path, use:
var mockedResult interface{}
mockedResult = 5
err := ms.SetAnyValuePath("/api/v2/my_endpoint", mockedResult)
if err != nil {
panic(err)
}
It will return a response with the following structure:
{
"id": "",
"data": {
"result": 5
}
}
Troubleshooting
Enabling debug mode for the underlying HTTP client can be achieved by setting the RESTY_DEBUG
environment variable to true
.
Postgres Connector
Postgres Connector simplifies connecting to a Postgres DB, by either:
- providing it a full DB config
- providing it a pointer to
environment.Environment
Arbitrary DB
config := &PostgresConfig{
Host: "db-host.io",
Port: 5432,
User: "admin",
Password: "oh-so-secret",
DBName: "database",
//SSLMode:
}
connector, err := NewPostgresConnector(config)
if err != nil {
panic(err)
}
If no SSLMode
is supplied it will default to sslmode=disable
.
K8s
This code assumes it connects to a k8s environment created with the CTF, where each Chainlink node has an underlying Postgres DB instance using CTF's default configuration:
var myk8sEnv *environment.Environment
// ... create k8s environment
node0PgClient, node0err := ConnectDB(0, myk8sEnv)
if node0err != nil {
panic(node0err)
}
node1PgClient, node1err := ConnectDB(1, myk8sEnv)
if node1err != nil {
panic(node1err)
}
Prometheus
This client is basically a wrapper over the official Prometheus client that has three useful functions:
- new client creation
- fetching of all firing alerts
- executing an arbitrary query with time range of
(-infinity, now)
New instance
c, err := NewPrometheusClient("prometheus.io")
if err != nil {
panic(err)
}
Get all firing alerts
alerts, err := c.GetAlerts()
if err != nil {
panic(err)
}
for _, alert := range alerts {
fmt.Println("Found alert: " + alert.Value + " in state: " + alert.AlertState)
}
Execute a query
queryResult, err := c.GetQuery(`100 - (avg by (instance) (irate(node_cpu_seconds_total{mode="idle"}[2m])) * 100)`)
if err != nil {
panic(err)
}
if asV, ok := queryResult.(model.Vector); ok {
	for _, v := range asV {
		fmt.Println("Metric data: " + v.Metric.String())
		fmt.Println("Value: " + v.Value.String())
	}
} else {
	panic(fmt.Sprintf("Result wasn't a model.Vector, but %T", queryResult))
}
Kubernetes
warning
It is highly recommended to use CRIB for k8s
deployments.
Avoid running long tests that are impractical to execute locally or through CI pipelines.
Proceed at your own risk.
Overview
The CTFv1
tool builds k8s
environments programmatically using either Helm
or cdk8s
charts. This approach introduces significant complexity to the deployment process.
To manage long-running tests, CTFv1
utilizes a remote runner
, which is essentially a Docker container containing the test logic. This container is deployed as a cdk8s
-based chart, creating a k8s
resource of type job
that runs the test in a detached manner. This setup requires custom logic to integrate with the test framework.
What We’ll Cover
- Creating a simplified
k8s
environment. - Adding a basic test that:
- Deploys a smart contract.
- Supports the
remote runner
capability.
- Building a Docker image for the test and configuring the required environment variables.
Are you ready to get started?
Kubernetes - Environments
As mentioned earlier, CTFv1
creates k8s
environments programmatically from existing building blocks. These include:
anvil
blockscout
(cdk8s)chainlink node
geth
goc
(cdk8s)grafana
influxdb
kafka
mock-adapter
mockserver
reorg controller
schema registry
solana validator
starknet validator
wiremock
Unless noted otherwise, all components are based on Helm
charts.
note
The process of creating new environments or modifying existing ones is explained in detail here. This document focuses on a practical example of creating a new k8s
test environment with a basic setup.
It is highly recommended to read that tutorial before proceeding.
Example: Basic Testing Environment
We will create a simple testing environment consisting of:
- 6
Chainlink nodes
- 1 blockchain node (
go-ethereum
, akageth
)
Step 1: Create Chainlink Node TOML Config
In real-world scenarios, you should dynamically generate or load Chainlink node configurations to suit your needs. For simplicity, we will use a hardcoded configuration:
func TestSimpleDONWithLinkContract(t *testing.T) {
tomlConfig := `[Feature]
FeedsManager = true
LogPoller = true
UICSAKeys = true
[Database]
MaxIdleConns = 20
MaxOpenConns = 40
MigrateOnStartup = true
[Log]
Level = "debug"
JSONConsole = true
[Log.File]
MaxSize = "0b"
[WebServer]
AllowOrigins = "*"
HTTPWriteTimeout = "3m0s"
HTTPPort = 6688
SecureCookies = false
SessionTimeout = "999h0m0s"
[WebServer.RateLimit]
Authenticated = 2000
Unauthenticated = 1000
[WebServer.TLS]
HTTPSPort = 0
[OCR]
Enabled = true
[P2P]
[P2P.V2]
ListenAddresses = ["0.0.0.0:6690"]
[[EVM]]
ChainID = "1337"
AutoCreateKey = true
FinalityDepth = 1
MinContractPayment = "0"
[EVM.GasEstimator]
PriceMax = "200 gwei"
LimitDefault = 6000000
FeeCapDefault = "200 gwei"
[[EVM.Nodes]]
Name = "Simulated Geth-0"
WSURL = "ws://geth:8546"
HTTPURL = "http://geth:8544"`
This configuration enables the log poller and OCRv2 features while connecting to an EVM chain with ChainID
1337
. It uses the following RPC URLs:
- WebSocket:
ws://geth:8546
- HTTP:
http://geth:8544
These URLs correspond to the default ports for geth
and match the go-ethereum
service name in the k8s
cluster.
Step 2: Define the Chainlink Deployment
To define the Chainlink deployment, we configure the image, version, and other parameters such as replicas and database settings. Here's the detailed implementation:
chainlinkImageCfg := &ctf_config.ChainlinkImageConfig{
Image: ptr.Ptr("public.ecr.aws/chainlink/chainlink"),
Version: ptr.Ptr("2.19.0"),
}
var overrideFn = func(_ interface{}, target interface{}) {
ctf_config.MustConfigOverrideChainlinkVersion(chainlinkImageCfg, target)
}
cd := chainlink.NewWithOverride(0, map[string]any{
"replicas": 6, // Number of Chainlink nodes
"toml": tomlConfig, // TOML configuration defined earlier
"db": map[string]any{
"stateful": true, // Use stateful databases for tests
},
}, chainlinkImageCfg, overrideFn)
Key Details:
- Image and Version: These are hardcoded here for simplicity but should ideally be configurable for different environments.
- Replicas: We specify 6 Chainlink nodes to simulate a multi-node setup.
- Database Configuration: The database is stateful to allow for persistence during soak tests.
- Override Function: This ensures that the specified image and version are applied to all Chainlink node deployments.
Step 3: Label Resources
To track costs effectively, add required chain.link
labels to all k8s
resources:
productName := "data-feedsv1.0"
nsLabels, err := environment.GetRequiredChainLinkNamespaceLabels(productName, "soak")
if err != nil {
t.Fatal("Error creating namespace labels", err)
}
workloadPodLabels, err := environment.GetRequiredChainLinkWorkloadAndPodLabels(productName, "soak")
if err != nil {
t.Fatal("Error creating workload and pod labels", err)
}
Set the following environment variables:
CHAINLINK_ENV_USER
: Name of the person running the test.CHAINLINK_USER_TEAM
: Name of the team the test is for.
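For example (both values below are placeholders):

```shell
export CHAINLINK_ENV_USER="alice"
export CHAINLINK_USER_TEAM="test-tooling"
```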
Step 4: Create Environment Config
baseEnvironmentConfig := &environment.Config{
TTL: time.Hour * 2,
NamespacePrefix: "my-namespace-prefix",
Test: t,
PreventPodEviction: true,
Labels: nsLabels,
WorkloadLabels: workloadPodLabels,
PodLabels: workloadPodLabels,
}
Key Fields:
TTL
: Time-to-live for the namespace (auto-removal after this time).NamespacePrefix
: Ensures unique namespace names.PreventPodEviction
: Prevents pods from being evicted or restarted.
Step 5: Define Blockchain Network
To set up the blockchain network, we use predefined properties for a simulated EVM network. Here's the detailed implementation:
nodeNetwork := blockchain.SimulatedEVMNetwork
ethProps := &ethereum.Props{
NetworkName: nodeNetwork.Name, // Name of the network
Simulated: nodeNetwork.Simulated, // Indicates that the network is simulated
WsURLs: nodeNetwork.URLs, // WebSocket URLs for the network
HttpURLs: nodeNetwork.HTTPURLs, // HTTP URLs for the network
}
Details:
- Simulated Network: Represents a private, ephemeral blockchain used for testing.
- Dynamic Selection: In real scenarios, use helper functions to dynamically select networks (public, private, or simulated) based on test requirements.
- Custom URLs: The
ethereum
chart requires explicit settings for the network name and URLs.
Step 6: Build the Environment
testEnv := environment.New(baseEnvironmentConfig).
AddHelm(ethereum.New(ethProps)). // Blockchain node
AddHelm(cd) // Chainlink nodes
err = testEnv.Run()
if err != nil {
t.Fatal("Error running environment", err)
}
Step 7: Create Blockchain Client
if !testEnv.Cfg.InsideK8s {
wsURLs := testEnv.URLs[blockchain.SimulatedEVMNetwork.Name]
httpURLs := testEnv.URLs[blockchain.SimulatedEVMNetwork.Name+"_http"]
if len(wsURLs) == 0 || len(httpURLs) == 0 {
t.Fatal("Forwarded Geth URLs should not be empty")
}
nodeNetwork.URLs = wsURLs
nodeNetwork.HTTPURLs = httpURLs
}
sethClient, err := seth.NewClientBuilder().
WithRpcUrl(nodeNetwork.URLs[0]).
WithPrivateKeys([]string{nodeNetwork.PrivateKeys[0]}).
Build()
if err != nil {
t.Fatal("Error creating Seth client", err)
}
Details:
- Local vs. Cluster Environment: When running tests outside the k8s cluster, the service URLs (
ws://geth:8546
,http://geth:8544
) are not directly accessible. Port forwarding ensures local access to these services. - Automatic Port Forwarding: The
Environment
object manages forwarding for key services, including Geth in simulated mode, making these forwarded URLs available in theURLs
map. - Dynamic Rewriting: URLs are dynamically rewritten to switch between in-cluster and local connectivity.
Step 8: Deploy LINK Contract
linkTokenAbi, err := link_token_interface.LinkTokenMetaData.GetAbi()
if err != nil {
t.Fatal("Error getting LinkToken ABI", err)
}
linkDeploymentData, err := sethClient.DeployContract(sethClient.NewTXOpts(), "LinkToken", *linkTokenAbi, common.FromHex(link_token_interface.LinkTokenMetaData.Bin))
if err != nil {
t.Fatal("Error deploying LinkToken contract", err)
}
linkToken, err := link_token_interface.NewLinkToken(linkDeploymentData.Address, sethClient.Client)
if err != nil {
t.Fatal("Error creating LinkToken contract instance", err)
}
totalSupply, err := linkToken.TotalSupply(sethClient.NewCallOpts())
if err != nil {
t.Fatal("Error getting total supply of LinkToken", err)
}
if totalSupply.Cmp(big.NewInt(0)) <= 0 {
t.Fatal("Total supply of LinkToken should be greater than 0")
}
Details:
- Deploy Contract: Deploys the LINK token contract to the simulated blockchain.
- Verify Deployment: Ensures the total supply is greater than zero as a sanity check.
Next Steps
Learn how to run long-duration tests using a remote runner
in the next chapter.
note
This example can be found here.
Kubernetes - Using Remote Runner
In this chapter, we explain how to run a test in k8s
and the changes required in the test logic to support it.
Overview
The general process of running tests with a remote runner
involves:
- Creating a
k8s
environment from a local machine (either your local machine or a CI runner).- The environment launches a
remote runner
using theENV_JOB_IMAGE
environment variable.
- The environment launches a
- The
remote runner
re-executes the same test code from the beginning.- It detects that the environment is already deployed and skips redeploying it.
- After the
remote runner
completes its test execution, control returns to the local test execution.- The local test exits early to prevent duplicate execution.
- If running in
detached mode
, control returns to the local test as soon as the remote test starts.- The local test exits immediately.
- The
remote runner
continues running ink8s
until the test completes.
The following diagram explains it visually:
sequenceDiagram
actor User as User
participant LocalTest as Local Test
participant K8sEnv as Kubernetes Environment
participant RemoteRunner as Remote Runner
User->>LocalTest: Start test execution
LocalTest->>K8sEnv: Create environment
K8sEnv-->>LocalTest: Environment created
LocalTest->>K8sEnv: Start remote runner job
K8sEnv-->>RemoteRunner: Deploy remote runner
alt DETACHED_MODE=true
RemoteRunner-->>K8sEnv: Begin test logic execution
LocalTest-->>User: Exit after remote runner starts
Note right of LocalTest: Detaches after<br>remote runner starts
else DETACHED_MODE=false
RemoteRunner-->>K8sEnv: Begin test logic execution
RemoteRunner->>K8sEnv: Complete test logic execution
K8sEnv-->>LocalTest: Notify test completion
LocalTest-->>User: Exit after test completion
end
Note right of RemoteRunner: Executes the test<br>logic independently
Although this may seem complex, the necessary changes to the test logic ensure that tests execute correctly without duplications. In the following steps, we explain where and why you should add conditionals and early exits to prevent unwanted re-execution.
note
We are developing a new k8s
test runner that eliminates the need for k8s
-specific logic. This will allow tests to run seamlessly both locally and remotely. Documentation for this is available here.
Requirements
The remote runner
requires a Docker image containing your test code. There are multiple ways to build this image, some of which are automated in our repositories. For this documentation, we will build it from scratch.
note
The CTF repository builds a base testing image for each release using this action. This base image includes kubectl
, helm
, and other dependencies. Use it as a base image for your final test image. You only need to copy your compiled Go tests and set the entrypoint to execute them. An example from the Chainlink repository can be found here.
Step 1: Build a Docker Image with Your Tests
Define a Dockerfile
:
# Base image for all k8s test runs
FROM golang:1.23-bullseye
ARG GOARCH
ARG GOOS
ARG BASE_URL
ARG HELM_VERSION
ARG HOME
ARG KUBE_VERSION
ARG NODE_VERSION
# Compile Go binary targeting linux/amd64, used by k8s runners
ENV GOOS="linux"
ENV GOARCH="amd64"
ENV BASE_URL="https://get.helm.sh"
ENV HELM_VERSION="3.10.3"
ENV KUBE_VERSION="v1.25.5"
ENV NODE_VERSION=18
# Install dependencies
RUN apt-get update && apt-get install -y \
ca-certificates wget curl git gnupg zip && \
mkdir -p /etc/apt/keyrings && \
curl -fsSL https://deb.nodesource.com/gpgkey/nodesource-repo.gpg.key | gpg --dearmor -o /etc/apt/keyrings/nodesource.gpg && \
echo "deb [signed-by=/etc/apt/keyrings/nodesource.gpg] https://deb.nodesource.com/node_$NODE_VERSION.x nodistro main" | tee /etc/apt/sources.list.d/nodesource.list && \
apt-get update && apt-get install -y nodejs && \
curl -LO https://storage.googleapis.com/kubernetes-release/release/$(curl -s https://storage.googleapis.com/kubernetes-release/release/stable.txt)/bin/linux/amd64/kubectl && \
chmod +x ./kubectl && mv ./kubectl /usr/local/bin && \
ARCH=$(uname -m | sed 's/x86_64/amd64/;s/aarch64/arm64/') && \
wget ${BASE_URL}/helm-v${HELM_VERSION}-linux-${ARCH}.tar.gz -O - | tar -xz && \
mv linux-${ARCH}/helm /usr/bin/helm && chmod +x /usr/bin/helm && rm -rf linux-${ARCH} && \
npm install -g yarn && apt-get clean && \
helm repo add chainlink-qa https://raw.githubusercontent.com/smartcontractkit/qa-charts/gh-pages/ && \
helm repo add bitnami https://charts.bitnami.com/bitnami && helm repo update
# Install AWS CLI v2
RUN ARCH=$(uname -m | sed 's/x86_64/x86_64/;s/aarch64/aarch64/') && \
curl https://awscli.amazonaws.com/awscli-exe-linux-${ARCH}.zip -o "awscliv2.zip" && \
unzip awscliv2.zip && ./aws/install && rm -rf awscliv2.zip
COPY lib/ testdir/
# Compile Go tests
WORKDIR /go/testdir/k8s/examples/link
RUN go test -c . -o link
# Entrypoint to run tests
ENTRYPOINT ["./link"]
Installed Dependencies:
kubectl
(to interact withk8s
)nodejs
helm
(for environment setup)AWS CLI v2
(if interacting with AWSk8s
clusters)
note
If using local Helm charts instead of published ones, set LOCAL_CHARTS=true
. Local here means ones stored in the Docker image
in the lib/charts
folder.
Step 2: Build the Image
Build the Docker image from the root directory of the CTF repository:
docker build -f lib/k8s/examples/link/Dockerfile \
--platform=linux/amd64 \
-t link-test:latest .
note
The --platform=linux/amd64
parameter ensures compatibility with k8s runners.
Step 3: Test the Image
Before modifying the test logic for remote runner
compatibility, test the image locally. Ensure you have access to a k8s
cluster (local or remote). Configure kubectl
to use the cluster and authenticate if necessary.
Run the test image:
docker run \
--rm \
-v ~/.aws:/root/.aws:ro \
-v ~/.kube/config:/root/.kube/config:ro \
-e AWS_PROFILE=<your-profile> \
-e KUBECONFIG=/root/.kube/config \
link-test:latest
This mounts the local .aws
directory and kubectl
configuration to the container, allowing it to access the cluster. Verify the test completes successfully.
Step 4: Make the Test Remote Runner-Compatible
To adapt the test for remote runner
execution, you need to divide the test logic into:
- Local Execution Logic: Responsible for setting up the environment and initiating the
remote runner
. - Remote Execution Logic: Handles the actual test operations, such as deploying contracts or generating workload.
Key Adjustments
- Check for Remote Execution: Use the
testEnv.WillUseRemoteRunner()
function to determine if the test will run in theremote runner
. If it will, ensure any non-idempotent operations, like test logic execution, are skipped in the local context. - Prevent Test Logic Duplication: Exit the local test execution after initiating the remote runner to avoid running the test logic both locally and remotely.
Updated Example
Here is how you can make these changes:
err = testEnv.Run()
if err != nil {
t.Fatal("Error running environment: ", err)
}
if testEnv.WillUseRemoteRunner() {
log.Info().Msg("Stopping local execution as test will continue in the remote runner")
return // Exit early to allow the remote runner to take over
}
// Additional test logic (if applicable) runs only in the remote runner
Why These Changes Are Necessary
- Environment Duplication: Ensures that test logic is executed only once, within the remote runner, and not both locally and remotely.
Step 5: Rebuild the Image
Rebuild the image to include the remote runner-related changes.
Step 6: Push Image to Registry (Optional)
If using a remote cluster, push the image to a registry like AWS ECR:
aws ecr get-login-password --region <AWS_REGION> | docker login \
--username AWS --password-stdin <AWS_ACCOUNT>.dkr.ecr.<AWS_REGION>.amazonaws.com
docker tag link-test:latest \
<AWS_ACCOUNT>.dkr.ecr.<AWS_REGION>.amazonaws.com/<AWS_REPOSITORY>/link-remote-runner-test:latest
docker push <AWS_ACCOUNT>.dkr.ecr.<AWS_REGION>.amazonaws.com/<AWS_REPOSITORY>/link-remote-runner-test:latest
Step 7: Run with Remote Runner
Run the test in detached mode by setting these environment variables:
- DETACH_RUNNER=true
- ENV_JOB_IMAGE=<image-url> (the image we built in step 5)
Command:
docker run \
--rm \
-v ~/.aws:/root/.aws:ro \
-v ~/.kube/config:/root/.kube/config:ro \
-e DETACH_RUNNER=true \
-e ENV_JOB_IMAGE=<image-url> \
-e AWS_PROFILE=<your-profile> \
-e KUBECONFIG=/root/.kube/config \
<image-url>
note
You may also need to pass secrets to your test. Refer to the Test Secrets Documentation for guidance on securely managing and injecting secrets into your tests.
The local test detaches after starting the remote runner. Logs from the remote runner can be checked for test progress.
You can find the complete example here.
Kubernetes - Test Secrets
We all have our secrets, don't we? It's the same case with tests... and since some of our repositories are public, we need to take special precautions to protect them.
warning
Before continuing, you should read the test secrets section of the CTF configuration documentation.
Overview
In general, your remote runner
will need access to the same secrets as your local test. Fortunately, these secrets are forwarded automatically and securely as long as their names have the prefix E2E_TEST_
.
To make a secret available to the remote runner
, simply pass it to the docker run
command:
docker run \
--rm \
-v ~/.aws:/root/.aws:ro \
-v ~/.kube/config:/root/.kube/config:ro \
-e DETACH_RUNNER=true \
-e E2E_TEST_MY_SECRET=my-secret \
-e ENV_JOB_NAME="<image-url>" \
-e AWS_PROFILE=<your-profile> \
-e KUBECONFIG=/root/.kube/config \
<image-url>
The secret will then be available to the remote runner
during its execution.
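The prefix-based forwarding above can be illustrated with a short, framework-independent sketch: scan the process environment and keep only the entries that carry the E2E_TEST_ prefix. The helper name filterTestSecrets is illustrative, not part of the framework API.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// filterTestSecrets returns only the environment entries that carry the
// E2E_TEST_ prefix, mirroring the rule the framework uses to decide which
// variables are safe to forward to the remote runner.
// (Illustrative helper, not part of the framework API.)
func filterTestSecrets(environ []string) []string {
	var secrets []string
	for _, kv := range environ {
		if strings.HasPrefix(kv, "E2E_TEST_") {
			secrets = append(secrets, kv)
		}
	}
	return secrets
}

func main() {
	os.Setenv("E2E_TEST_MY_SECRET", "my-secret")
	for _, kv := range filterTestSecrets(os.Environ()) {
		fmt.Println(kv)
	}
}
```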
Important Considerations
warning
Do not use this method of passing secrets in CI environments. Exposing secrets in this way can compromise their security.
When running k8s
tests in CI pipelines, use dedicated actions or reusable workflows designed to handle secrets securely.
k8s chain.link
Labels
Purpose
Resource labeling has been introduced to better associate Kubernetes (k8s) costs with products and teams. This document describes the labels used in the k8s cluster.
Required Labels
Labels should be applied to all resources in the k8s cluster at three levels:
- Namespace
- Workload
- Pod
All three levels should include the following labels:
- chain.link/team - Name of the team that owns the resource.
- chain.link/product - Product that the resource belongs to.
- chain.link/cost-center - Product and framework name.
Additionally, pods should include the following label:
- chain.link/component - Name of the component.
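A check equivalent to the validation the framework performs on these labels can be sketched as follows (missingLabels and requiredNamespaceLabels are illustrative names, not the framework's API):

```go
package main

import "fmt"

// requiredNamespaceLabels are the keys every namespace, workload, and pod
// must carry (pods additionally need chain.link/component).
var requiredNamespaceLabels = []string{
	"chain.link/team",
	"chain.link/product",
	"chain.link/cost-center",
}

// missingLabels reports which required keys are absent from a label map.
// (Illustrative helper; the framework performs an equivalent validation
// internally and rejects environments without these labels.)
func missingLabels(labels map[string]string, required []string) []string {
	var missing []string
	for _, key := range required {
		if _, ok := labels[key]; !ok {
			missing = append(missing, key)
		}
	}
	return missing
}

func main() {
	labels := map[string]string{
		"chain.link/team":    "my-team",
		"chain.link/product": "my-product",
	}
	// chain.link/cost-center is missing here, so validation would fail.
	fmt.Println(missingLabels(labels, requiredNamespaceLabels))
}
```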
chain.link/team
This label represents the team responsible for the resource, but it might not be the team of the individual who created the resource. It should reflect the team the environment is created for.
For example, if you are a member of the Test Tooling team, but someone from the BIX team requests load tests, the namespace should be labeled as: chain.link/team: bix
.
chain.link/product
This label specifies the product the resource belongs to. Internally, some products may have alternative names (e.g., OCR instead of Data Feeds). To standardize data analysis, use the following names:
automation
bcm
ccip
data-feedsv1.0
data-feedsv2.0
data-feedsv3.0
data-streamsv0.3
data-streamsv1.0
deco
functions
proof-of-reserve
scale
staking
vrf
For example:
- OCR version 1:
data-feedsv1.0
- OCR version 2:
data-feedsv2.0
chain.link/cost-center
This label serves as an umbrella for specific test or environment types and should rarely change. For load or soak tests using solutions provided by the Test Tooling team, use the convention test-tooling-<test-type>-test, for example test-tooling-load-test.
This allows easy distinction from load tests run using other tools.
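The naming convention can be captured in a one-line helper (costCenter is a hypothetical function for illustration, not part of the framework):

```go
package main

import "fmt"

// costCenter builds a chain.link/cost-center value following the
// documented test-tooling-<test-type>-test convention.
// (Hypothetical helper for illustration only.)
func costCenter(testType string) string {
	return fmt.Sprintf("test-tooling-%s-test", testType)
}

func main() {
	fmt.Println(costCenter("load")) // test-tooling-load-test
}
```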
chain.link/component
This label identifies different components within the same product. Examples include:
- chainlink - Chainlink node.
- geth - Go-Ethereum blockchain node.
- test-runner - Remote test runner.
Adding Labels to New Components
Adding a new component to an existing framework is discouraged. The recommended approach is to add the component to CRIB and make these labels part of the deployment templates.
If you need to add a new component, refer to the following sections in the k8s Tutorial:
- Creating a new deployment part in Helm
- Creating a new deployment part in cdk8s
Kubernetes
Managing k8s is challenging, so we've decided to separate k8s
deployments here - CRIB
This documentation is outdated, and we are using it only internally to run our soak tests. For v2
tests please check this example and read CRIB docs
We run our software in Kubernetes.
Local k3d setup
- make install
- (Optional) Install Lens from here, or use k9s as a low resource consumption alternative from here or from source here
- Set up your Docker resources; 6 vCPU/10Gb RAM are enough for most CL-related tasks
- make create_cluster
- make install_monitoring
  Note: this will stay actively connected to the server; the final log when it is ready is Forwarding from [::1]:3000 -> 3000, and you can continue with the steps below in another terminal.
- Check your contexts with kubectl config get-contexts
- Switch context: kubectl config use-context k3d-local
- Read here and do some deployments
- Open Grafana on localhost:3000 with admin/sdkfh26!@bHasdZ2 login/password and check the default dashboard
- make stop_cluster
- make delete_cluster
Typical problems
- Not enough memory/CPU, or the cluster is slow. Recommended settings for Docker (Docker -> Preferences -> Resources):
  - 6 CPU
  - 10Gb MEM
  - 50-150Gb Disk
- NodeHasDiskPressure errors, pods get evicted: use make docker_prune to clean up all pods and volumes
How to run the same environment deployment inside k8s
Managing k8s is challenging, so we've decided to separate k8s
deployments here - CRIB
This documentation is outdated, and we are using it only internally to run our soak tests. For v2
tests please check this example and read CRIB docs
You can build a Dockerfile to run exactly the same environment interactions inside k8s, in case you need to run long-running tests. The base image is here:
FROM <account number>.dkr.ecr.us-west-2.amazonaws.com/test-base-image:latest
COPY . .
RUN env GOOS=linux GOARCH=amd64 go build -o test ./examples/remote-test-runner/env.go
RUN chmod +x ./test
ENTRYPOINT ["./test"]
Build and upload it using the "latest" tag for the test-base-image
build_test_image tag=someTag
or if you want to specify a test-base-image tag
build_test_image tag=someTag base_tag=latest
Then run it
# all environment variables with a prefix TEST_ would be provided for k8s job
export TEST_ENV_VAR=myTestVarForAJob
# your image to run as a k8s job
ACCOUNT=$(aws sts get-caller-identity | jq -r .Account)
export ENV_JOB_IMAGE="${ACCOUNT}.dkr.ecr.us-west-2.amazonaws.com/core-integration-tests:v1.1"
export DETACH_RUNNER=true # if you want the test job to run in the background after it has started
export CHAINLINK_ENV_USER=yourUser # user to run the tests
export CHAINLINK_USER_TEAM=yourTeam # team to run the tests for
# your example test file to run inside k8s
# if ENV_JOB_IMAGE is present it will create a job, wait until it finishes, and get logs
go run examples/remote-test-runner/env.go
How to create environments
Managing k8s is challenging, so we've decided to separate k8s
deployments here - CRIB
This documentation is outdated, and we are using it only internally to run our soak tests. For v2
tests please check this example and read CRIB docs
- Getting started
- Connect to environment
- Creating environments
- Modifying environments
- Configuring
- Utilities
- Chaos
- Coverage
- Remote run
Getting started
Read here about how to spin up a local cluster if you don't have one.
The following examples use hardcoded chain.link labels for the sake of satisfying validations. When using any remote cluster you should provide actual, valid values, for example using the following convenience functions:
nsLabels, err := GetRequiredChainLinkNamespaceLabels("my-product", "load")
require.NoError(t, err, "Error creating required chain.link labels for namespace")
workloadPodLabels, err := GetRequiredChainLinkWorkloadAndPodLabels("my-product", "load")
require.NoError(t, err, "Error creating required chain.link labels for workloads and pods")
And then setting them in the Environment
config:
envConfig := &environment.Config{
	Labels:          nsLabels,
	WorkloadLabels:  workloadPodLabels,
	PodLabels:       workloadPodLabels,
	NamespacePrefix: "new-environment",
}
Now, let's create a simple environment by combining different deployment parts.
Create examples/simple/env.go
package main
import (
	"github.com/smartcontractkit/chainlink-testing-framework/k8s/environment"
	"github.com/smartcontractkit/chainlink-testing-framework/k8s/pkg/helm/chainlink"
	"github.com/smartcontractkit/chainlink-testing-framework/k8s/pkg/helm/ethereum"
)
func addHardcodedLabelsToEnv(env *environment.Config) {
env.Labels = []string{"chain.link/product=myProduct", "chain.link/team=my-team", "chain.link/cost-center=test-tooling-load-test"}
env.WorkloadLabels = map[string]string{"chain.link/product": "myProduct", "chain.link/team": "my-team", "chain.link/cost-center": "test-tooling-load-test"}
env.PodLabels = map[string]string{"chain.link/product": "myProduct", "chain.link/team": "my-team", "chain.link/cost-center": "test-tooling-load-test"}
}
func main() {
env := &environment.Config{
NamespacePrefix: "new-environment",
KeepConnection: false,
RemoveOnInterrupt: false,
}
addHardcodedLabelsToEnv(env)
err := environment.New(env).
AddHelm(ethereum.New(nil)).
AddHelm(chainlink.New(0, nil)).
Run()
if err != nil {
panic(err)
}
}
Then run go run examples/simple/env.go
Now that your environment is running, you can connect to it later
note
chain.link/*
labels are used for internal reporting and cost allocation. They are strictly required and validated. You won't be able to create a new environment without them.
In this tutorial we create almost all of them manually, but there are convenience functions to do it for you.
You can read more about labels here
Connect to environment
We've already created an environment previously; now we can connect to it.
If you are planning to use the environment locally (not in tests) and keep the connection, modify KeepConnection in the environment.Config we used:
KeepConnection: true,
Add the ENV_NAMESPACE=${your_env_namespace} var and run go run examples/simple/env.go again.
You can get the namespace name from the logs at creation time.
Creating environments
Debugging a new integration environment
You can spin up an environment and block on the forwarder if you'd like to run some other code. Let's use convenience functions for creating chain.link labels.
package main
import (
"github.com/smartcontractkit/chainlink-testing-framework/k8s/environment"
"github.com/smartcontractkit/chainlink-testing-framework/k8s/pkg/helm/chainlink"
"github.com/smartcontractkit/chainlink-testing-framework/k8s/pkg/helm/ethereum"
)
func main() {
env := &environment.Config{
NamespacePrefix: "new-environment",
KeepConnection: true,
RemoveOnInterrupt: true,
}
addHardcodedLabelsToEnv(env)
err := environment.New(env).
AddHelm(ethereum.New(nil)).
AddHelm(chainlink.New(0, nil)).
Run()
if err != nil {
panic(err)
}
}
Then send any signal to remove the namespace, for example Ctrl+C (SIGINT).
Creating a new deployment part in Helm
Let's add a new deployment part; it should implement the following interface:
// ConnectedChart interface to interact both with cdk8s apps and helm charts
type ConnectedChart interface {
// IsDeploymentNeeded
// true - we deploy/connect and expose environment data
// false - we are using external environment, but still exposing data
IsDeploymentNeeded() bool
// GetName name of the deployed part
GetName() string
// GetPath get Helm chart path, repo or local path
GetPath() string
// GetProps get code props if it's typed environment
GetProps() any
// GetValues get values.yml props as map, if it's Helm
GetValues() *map[string]any
// ExportData export deployment part data in the env
ExportData(e *Environment) error
// GetLabels get labels for component, it must return `chain.link/component` label
GetLabels() map[string]string
}
When creating a new deployment part, you can use any public Helm chart or a local path in the Helm props:
func New(props *Props) environment.ConnectedChart {
if props == nil {
props = defaultProps()
}
return Chart{
HelmProps: &HelmProps{
Name: "sol",
Path: "chainlink-qa/solana-validator", // ./local_path/chartdir will work too
Values: &props.Values,
},
Props: props,
}
}
func (m Chart) GetLabels() map[string]string {
return map[string]string{
"chain.link/component": "new-deployment-part",
}
}
Now let's tie them together
package main
import (
"github.com/smartcontractkit/chainlink-testing-framework/k8s/environment"
"github.com/smartcontractkit/chainlink-testing-framework/k8s/examples/deployment_part"
"github.com/smartcontractkit/chainlink-testing-framework/k8s/pkg/helm/chainlink"
"time"
)
func main() {
env := &environment.Config{
NamespacePrefix: "adding-new-deployment-part",
TTL: 3 * time.Hour,
KeepConnection: true,
RemoveOnInterrupt: true,
}
addHardcodedLabelsToEnv(env)
e := environment.New(env).
AddHelm(deployment_part.New(nil)).
AddHelm(chainlink.New(0, map[string]any{
"replicas": 5,
"env": map[string]any{
"SOLANA_ENABLED": "true",
"EVM_ENABLED": "false",
"EVM_RPC_ENABLED": "false",
"CHAINLINK_DEV": "false",
"FEATURE_OFFCHAIN_REPORTING2": "true",
"feature_offchain_reporting": "false",
"P2P_NETWORKING_STACK": "V2",
"P2PV2_LISTEN_ADDRESSES": "0.0.0.0:6690",
"P2PV2_DELTA_DIAL": "5s",
"P2PV2_DELTA_RECONCILE": "5s",
"p2p_listen_port": "0",
},
}))
if err := e.Run(); err != nil {
panic(err)
}
}
Then run it examples/deployment_part/cmd/env.go
Creating a new deployment part in cdk8s
Let's add a new deployment part; it should implement the same interface:
// ConnectedChart interface to interact both with cdk8s apps and helm charts
type ConnectedChart interface {
// IsDeploymentNeeded
// true - we deploy/connect and expose environment data
// false - we are using external environment, but still exposing data
IsDeploymentNeeded() bool
// GetName name of the deployed part
GetName() string
// GetPath get Helm chart path, repo or local path
GetPath() string
// GetProps get code props if it's typed environment
GetProps() any
// GetValues get values.yml props as map, if it's Helm
GetValues() *map[string]any
// ExportData export deployment part data in the env
ExportData(e *Environment) error
// GetLabels get labels for component, it must return `chain.link/component` label
GetLabels() map[string]string
}
Now let's tie them together
package main
import (
	"time"

	"github.com/smartcontractkit/chainlink-testing-framework/k8s/environment"
	"github.com/smartcontractkit/chainlink-testing-framework/k8s/examples/deployment_part_cdk8s"
	"github.com/smartcontractkit/chainlink-testing-framework/k8s/pkg/helm/chainlink"
	"github.com/smartcontractkit/chainlink-testing-framework/k8s/pkg/helm/ethereum"
)
func main() {
env := &environment.Config{
NamespacePrefix: "adding-new-deployment-part",
TTL: 3 * time.Hour,
KeepConnection: true,
RemoveOnInterrupt: true,
}
addHardcodedLabelsToEnv(env)
e := environment.New(env).
AddChart(deployment_part_cdk8s.New(&deployment_part_cdk8s.Props{})).
AddHelm(ethereum.New(nil)).
AddHelm(chainlink.New(0, map[string]any{
"replicas": 2,
}))
if err := e.Run(); err != nil {
panic(err)
}
e.Shutdown()
}
Then run it examples/deployment_part_cdk8s/cmd/env.go
Using multi-stage environment
You can split the environment deployment into several parts if you need to first copy something into a pod or use connected clients.
package main
import (
"fmt"
"github.com/smartcontractkit/chainlink-testing-framework/k8s/environment"
"github.com/smartcontractkit/chainlink-testing-framework/k8s/pkg/cdk8s/blockscout"
"github.com/smartcontractkit/chainlink-testing-framework/k8s/pkg/helm/chainlink"
"github.com/smartcontractkit/chainlink-testing-framework/k8s/pkg/helm/ethereum"
"github.com/smartcontractkit/chainlink-testing-framework/k8s/pkg/helm/mockserver"
mockservercfg "github.com/smartcontractkit/chainlink-testing-framework/k8s/pkg/helm/mockserver-cfg"
)
func main() {
envConfig := &environment.Config{
Labels: []string{"chain.link/product=myProduct", "chain.link/team=my-team", "chain.link/cost-center=test-tooling-load-test"},
WorkloadLabels: map[string]string{"chain.link/product": "myProduct", "chain.link/team": "my-team", "chain.link/cost-center": "test-tooling-load-test"},
PodLabels: map[string]string{"chain.link/product": "myProduct", "chain.link/team": "my-team", "chain.link/cost-center": "test-tooling-load-test"},
}
e := environment.New(envConfig)
err := e.
AddChart(blockscout.New(&blockscout.Props{})). // you can also add cdk8s charts if you like Go code
AddHelm(ethereum.New(nil)).
AddHelm(chainlink.New(0, nil)).
Run()
if err != nil {
panic(err)
}
// do some other stuff with deployed charts
pl, err := e.Client.ListPods(e.Cfg.Namespace, "app=chainlink-0")
if err != nil {
panic(err)
}
dstPath := fmt.Sprintf("%s/%s:/", e.Cfg.Namespace, pl.Items[0].Name)
if _, _, _, err = e.Client.CopyToPod(e.Cfg.Namespace, "./examples/multistage/someData.txt", dstPath, "node"); err != nil {
panic(err)
}
// deploy another part
err = e.
AddHelm(mockservercfg.New(nil)).
AddHelm(mockserver.New(nil)).
Run()
defer func() {
	if serr := e.Shutdown(); serr != nil {
		panic(serr)
	}
}()
if err != nil {
panic(err)
}
}
Modifying environments
Modifying environment from code
In case you need to modify the environment in tests, you can always construct the manifest again and apply it. That works for cdk8s components too.
package main
import (
"fmt"
"github.com/smartcontractkit/chainlink-testing-framework/k8s/environment"
"github.com/smartcontractkit/chainlink-testing-framework/k8s/pkg/cdk8s/blockscout"
"github.com/smartcontractkit/chainlink-testing-framework/k8s/pkg/helm/chainlink"
"github.com/smartcontractkit/chainlink-testing-framework/k8s/pkg/helm/ethereum"
)
func main() {
modifiedEnvConfig := &environment.Config{
NamespacePrefix: "modified-env",
Labels: []string{"envType=Modified", "chain.link/product=myProduct", "chain.link/team=my-team", "chain.link/cost-center=test-tooling-load-test"},
WorkloadLabels: map[string]string{"chain.link/product": "myProduct", "chain.link/team": "my-team", "chain.link/cost-center": "test-tooling-load-test"},
PodLabels: map[string]string{"chain.link/product": "myProduct", "chain.link/team": "my-team", "chain.link/cost-center": "test-tooling-load-test"},
}
e := environment.New(modifiedEnvConfig).
AddChart(blockscout.New(&blockscout.Props{
WsURL: "ws://geth:8546",
HttpURL: "http://geth:8544",
})).
AddHelm(ethereum.New(nil)).
AddHelm(chainlink.New(0, map[string]any{
"replicas": 1,
}))
err := e.Run()
if err != nil {
panic(err)
}
e.ClearCharts()
err = e.
AddChart(blockscout.New(&blockscout.Props{
HttpURL: "http://geth:9000",
})).
AddHelm(ethereum.New(nil)).
AddHelm(chainlink.New(0, map[string]any{
"replicas": 1,
})).
Run()
if err != nil {
panic(err)
}
}
Modifying environment part from code
We can also modify only a part of the environment
package main
import (
"fmt"
"github.com/smartcontractkit/chainlink-testing-framework/k8s/environment"
"github.com/smartcontractkit/chainlink-testing-framework/k8s/pkg/helm/chainlink"
"github.com/smartcontractkit/chainlink-testing-framework/k8s/pkg/helm/ethereum"
"github.com/smartcontractkit/chainlink-testing-framework/k8s/pkg/helm/mockserver"
mockservercfg "github.com/smartcontractkit/chainlink-testing-framework/k8s/pkg/helm/mockserver-cfg"
)
func main() {
modifiedEnvConfig := &environment.Config{
NamespacePrefix: "modified-env",
}
addHardcodedLabelsToEnv(modifiedEnvConfig)
e := environment.New(modifiedEnvConfig).
AddHelm(mockservercfg.New(nil)).
AddHelm(mockserver.New(nil)).
AddHelm(ethereum.New(nil)).
AddHelm(chainlink.New(0, map[string]any{
"replicas": 1,
}))
err := e.Run()
if err != nil {
panic(err)
}
e.Cfg.KeepConnection = true
e.Cfg.RemoveOnInterrupt = true
e, err = e.
ReplaceHelm("chainlink-0", chainlink.New(0, map[string]any{
"replicas": 2,
}))
if err != nil {
panic(err)
}
err = e.Run()
if err != nil {
panic(err)
}
}
Configuring
Environment variables
List of environment variables available
const (
EnvVarNamespace = "ENV_NAMESPACE"
EnvVarNamespaceDescription = "Namespace name to connect to"
EnvVarNamespaceExample = "chainlink-test-epic"
// deprecated (for now left for backwards compatibility)
EnvVarCLImage = "CHAINLINK_IMAGE"
EnvVarCLImageDescription = "Chainlink image repository"
EnvVarCLImageExample = "public.ecr.aws/chainlink/chainlink"
// deprecated (for now left for backwards compatibility)
EnvVarCLTag = "CHAINLINK_VERSION"
EnvVarCLTagDescription = "Chainlink image tag"
EnvVarCLTagExample = "1.5.1-root"
EnvVarUser = "CHAINLINK_ENV_USER"
EnvVarUserDescription = "Owner of an environment"
EnvVarUserExample = "Satoshi"
EnvVarTeam = "CHAINLINK_USER_TEAM"
EnvVarTeamDescription = "Team to which the owner of the environment belongs"
EnvVarTeamExample = "BIX, CCIP, BCM"
EnvVarCLCommitSha = "CHAINLINK_COMMIT_SHA"
EnvVarCLCommitShaDescription = "The sha of the commit that you're running tests on. Mostly used for CI"
EnvVarCLCommitShaExample = "${{ github.sha }}"
EnvVarTestTrigger = "TEST_TRIGGERED_BY"
EnvVarTestTriggerDescription = "How the test was triggered, either manual or CI."
EnvVarTestTriggerExample = "CI"
EnvVarLogLevel = "TEST_LOG_LEVEL"
EnvVarLogLevelDescription = "Environment logging level"
EnvVarLogLevelExample = "info | debug | trace"
EnvVarSlackKey = "SLACK_API_KEY"
EnvVarSlackKeyDescription = "The OAuth Slack API key to report tests results with"
EnvVarSlackKeyExample = "xoxb-example-key"
EnvVarSlackChannel = "SLACK_CHANNEL"
EnvVarSlackChannelDescription = "The Slack code for the channel you want to send the notification to"
EnvVarSlackChannelExample = "C000000000"
EnvVarSlackUser = "SLACK_USER"
EnvVarSlackUserDescription = "The Slack code for the user you want to notify"
EnvVarSlackUserExample = "U000000000"
)
Environment config
// Config is an environment common configuration, labels, annotations, connection types, readiness check, etc.
type Config struct {
// TTL is time to live for the environment, used with kyverno
TTL time.Duration
// NamespacePrefix is a static namespace prefix
NamespacePrefix string
// Namespace is full namespace name
Namespace string
// Labels is a set of labels applied to the namespace in a format of "key=value"
Labels []string
// PodLabels is a set of labels applied to every pod in the namespace
PodLabels map[string]string
// WorkloadLabels is a set of labels applied to every workload in the namespace
WorkloadLabels map[string]string
// PreventPodEviction if true sets a k8s annotation safe-to-evict=false to prevent pods from being evicted
// Note: This should only be used if your test is completely incapable of handling things like K8s rebalances without failing.
// If that is the case, it's worth the effort to make your test fault-tolerant soon. The alternative is expensive and infuriating.
PreventPodEviction bool
// Allow deployment to nodes with these tolerances
Tolerations []map[string]string
// Restrict deployment to only nodes matching a particular node role
NodeSelector map[string]string
// ReadyCheckData is settings for readiness probes checks for all deployment components
// checking that all pods are ready by default with 8 minutes timeout
// &client.ReadyCheckData{
// ReadinessProbeCheckSelector: "",
// Timeout: 15 * time.Minute,
// }
ReadyCheckData *client.ReadyCheckData
// DryRun if true, app will just generate a manifest in local dir
DryRun bool
// InsideK8s used for long-running soak tests where you connect to env from the inside
InsideK8s bool
// SkipManifestUpdate will skip updating the manifest upon connecting to the environment. Should be false if you wish to update the manifest (e.g. upgrade pods)
SkipManifestUpdate bool
// KeepConnection keeps connection until interrupted with a signal, useful when prototyping and debugging a new env
KeepConnection bool
// RemoveOnInterrupt automatically removes an environment on interrupt
RemoveOnInterrupt bool
// UpdateWaitInterval an interval to wait for deployment update started
UpdateWaitInterval time.Duration
// Remote Runner Specific Variables //
// JobImage an image to run environment as a job inside k8s
JobImage string
// Specify only if you want remote-runner to start with a specific name
RunnerName string
// Specify only if you want to mount reports from test run in remote runner
ReportPath string
// JobLogFunction a function that will be run on each log
JobLogFunction func(*Environment, string)
// Test the testing library current Test struct
Test *testing.T
// jobDeployed used to limit us to 1 remote runner deploy
jobDeployed bool
// detachRunner should we detach the remote runner after starting the test
detachRunner bool
// fundReturnFailed the status of a fund return
fundReturnFailed bool
}
Utilities
Collecting logs
You can collect the logs while running tests, or if you have created an environment already
package main
import (
"github.com/smartcontractkit/chainlink-testing-framework/k8s/environment"
"github.com/smartcontractkit/chainlink-testing-framework/k8s/pkg/helm/chainlink"
"github.com/smartcontractkit/chainlink-testing-framework/k8s/pkg/helm/ethereum"
)
func main() {
env := &environment.Config{}
addHardcodedLabelsToEnv(env)
e := environment.New(env).
AddHelm(ethereum.New(nil)).
AddHelm(chainlink.New(0, nil))
if err := e.Run(); err != nil {
panic(err)
}
if err := e.DumpLogs("logs/mytest"); err != nil {
panic(err)
}
}
Resources summary
It can be useful to get a summary of the current environment's resources for test reporting
package main
import (
	"github.com/rs/zerolog/log"
	"github.com/smartcontractkit/chainlink-testing-framework/k8s/environment"
	"github.com/smartcontractkit/chainlink-testing-framework/k8s/pkg/helm/chainlink"
	"github.com/smartcontractkit/chainlink-testing-framework/k8s/pkg/helm/ethereum"
)
func main() {
env := &environment.Config{}
addHardcodedLabelsToEnv(env)
e := environment.New(env).
AddHelm(ethereum.New(nil)).
AddHelm(chainlink.New(0, nil))
err := e.Run()
if err != nil {
panic(err)
}
// default k8s selector
summ, err := e.ResourcesSummary("app in (chainlink-0, geth)")
if err != nil {
panic(err)
}
log.Warn().Interface("Resources", summ).Send()
e.Shutdown()
}
Chaos
Check our tests to see how we use Chaos Mesh
Coverage
Build your target image with these 2 steps to allow automatic coverage discovery
FROM ...
# add those 2 steps to instrument the code
RUN curl -s https://api.github.com/repos/qiniu/goc/releases/latest | grep "browser_download_url.*-linux-amd64.tar.gz" | cut -d : -f 2,3 | tr -d \" | xargs -n 1 curl -L | tar -zx && chmod +x goc && mv goc /usr/local/bin
# -o my_service means service will be called "my_service" in goc coverage service
# --center http://goc:7777 means that on deploy, your instrumented service will automatically register to a local goc node inside your deployment (namespace)
RUN goc build -o my_service . --center http://goc:7777
CMD ["./my_service"]
Add goc
to your deployment, check example with dummy
service deployment:
package main
import (
"time"
"github.com/smartcontractkit/chainlink-testing-framework/k8s/environment"
goc "github.com/smartcontractkit/chainlink-testing-framework/k8s/pkg/cdk8s/goc"
dummy "github.com/smartcontractkit/chainlink-testing-framework/k8s/pkg/cdk8s/http_dummy"
)
func main() {
envConfig := &environment.Config{}
addHardcodedLabelsToEnv(envConfig)
e := environment.New(envConfig).
AddChart(goc.New()).
AddChart(dummy.New())
if err := e.Run(); err != nil {
panic(err)
}
// run your test logic here
time.Sleep(1 * time.Minute)
if err := e.SaveCoverage(); err != nil {
panic(err)
}
// clear the coverage, rerun the tests again if needed
if err := e.ClearCoverage(); err != nil {
panic(err)
}
}
After tests are finished, coverage is collected for every service; check the cover directory
TOML Config
Keep in mind that configuring Chainlink image/version & Pyroscope via env vars is deprecated. The latter won't even work anymore. That means that this method should be avoided in new environments. Instead, use the TOML config method described below.
AddHelm(chainlink.New(0, nil))
It's recommended to use a TOML config file to configure Chainlink and Pyroscope:
// read the config file
config := testconfig.GetConfig("Load", "Automation")
var overrideFn = func(_ interface{}, target interface{}) {
ctf_config.MustConfigOverrideChainlinkVersion(&config.ChainlinkImage, target)
ctf_config.MightConfigOverridePyroscopeKey(&config.Pyroscope, target)
}
AddHelm(chainlink.NewWithOverride(0, map[string]interface{}{
"replicas": 1,
}, &config, overrideFn))
Using that will cause the override function to be executed on the default props, thus overriding the default values with the values from the config file. If config.ChainlinkImage is nil, or it's missing either Image or Version, the code will panic. If Pyroscope is disabled or the key is not set, it will be ignored.
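The override mechanics can be sketched with trimmed stand-in types (ChainlinkImageConfig and overrideImage below are simplified illustrations, not the framework's actual definitions):

```go
package main

import "fmt"

// ChainlinkImageConfig is a trimmed stand-in for the real config type,
// for illustration only.
type ChainlinkImageConfig struct {
	Image   *string
	Version *string
}

// overrideImage mirrors the documented behavior: it panics when the image
// config is nil or missing either Image or Version, and otherwise writes
// the image/version into the chart values.
// (Simplified sketch, not the framework's actual implementation.)
func overrideImage(cfg *ChainlinkImageConfig, target map[string]any) {
	if cfg == nil || cfg.Image == nil || cfg.Version == nil {
		panic("ChainlinkImage config must set both Image and Version")
	}
	target["chainlink"] = map[string]any{
		"image": map[string]any{"image": *cfg.Image, "version": *cfg.Version},
	}
}

func main() {
	img, ver := "public.ecr.aws/chainlink/chainlink", "2.9.0"
	values := map[string]any{"replicas": 1}
	overrideImage(&ChainlinkImageConfig{Image: &img, Version: &ver}, values)
	fmt.Println(values)
}
```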
TOML Config
This documentation is deprecated; we are using it only in Chainlink Integration Tests.
If you want to test our new products, use v2
These basic building blocks can be used to create a TOML config file. For example:
import (
ctf_config "github.com/smartcontractkit/chainlink-testing-framework/config"
ctf_test_env "github.com/smartcontractkit/chainlink-testing-framework/docker/test_env"
)
type TestConfig struct {
ChainlinkImage *ctf_config.ChainlinkImageConfig `toml:"ChainlinkImage"`
ChainlinkUpgradeImage *ctf_config.ChainlinkImageConfig `toml:"ChainlinkUpgradeImage"`
Logging *ctf_config.LoggingConfig `toml:"Logging"`
Network *ctf_config.NetworkConfig `toml:"Network"`
Pyroscope *ctf_config.PyroscopeConfig `toml:"Pyroscope"`
PrivateEthereumNetwork *ctf_test_env.EthereumNetwork `toml:"PrivateEthereumNetwork"`
}
It's up to the user to provide a way to read the config from a file and unmarshal it into the struct. You can check testconfig.go to see one way it could be done.
Validate() should be used to ensure that the config is valid. Some of the building blocks also have a Default() method that can be used to get default values.
Also, you might find the BytesToAnyTomlStruct(logger zerolog.Logger, filename, configurationName string, target any, content []byte) error utility method useful for unmarshalling TOMLs read from env vars or files into a struct.
Test Secrets
Test secrets are not stored directly within the TestConfig TOML for security reasons. Instead, they are passed into TestConfig via environment variables. Below is a list of all available secrets. Set only the secrets required for your specific tests, like so: E2E_TEST_CHAINLINK_IMAGE=qa_ecr_image_url.
Default Secret Loading
By default, secrets are loaded from the ~/.testsecrets
dotenv file. Example of a local ~/.testsecrets
file:
E2E_TEST_CHAINLINK_IMAGE=qa_ecr_image_url
E2E_TEST_CHAINLINK_UPGRADE_IMAGE=qa_ecr_image_url
E2E_TEST_ARBITRUM_SEPOLIA_WALLET_KEY=wallet_key
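Loading such a dotenv file can be sketched in a few lines of standard-library Go. parseDotenv below is a simplified illustration (it ignores quoting and multi-line values), not the loader the framework actually uses:

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// parseDotenv reads KEY=VALUE lines, skipping blanks and # comments.
// Simplified illustration: no quoting or multi-line value support.
func parseDotenv(content string) map[string]string {
	vars := map[string]string{}
	sc := bufio.NewScanner(strings.NewReader(content))
	for sc.Scan() {
		line := strings.TrimSpace(sc.Text())
		if line == "" || strings.HasPrefix(line, "#") {
			continue
		}
		if key, value, ok := strings.Cut(line, "="); ok {
			vars[strings.TrimSpace(key)] = strings.TrimSpace(value)
		}
	}
	return vars
}

func main() {
	content := `# local test secrets
E2E_TEST_CHAINLINK_IMAGE=qa_ecr_image_url
E2E_TEST_ARBITRUM_SEPOLIA_WALLET_KEY=wallet_key
`
	for k, v := range parseDotenv(content) {
		fmt.Println(k, "=", v)
	}
}
```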
All E2E Test Secrets
| Secret | Env Var | Example |
|---|---|---|
| Chainlink Image | E2E_TEST_CHAINLINK_IMAGE | E2E_TEST_CHAINLINK_IMAGE=qa_ecr_image_url |
| Chainlink Upgrade Image | E2E_TEST_CHAINLINK_UPGRADE_IMAGE | E2E_TEST_CHAINLINK_UPGRADE_IMAGE=qa_ecr_image_url |
| Wallet Key per network | E2E_TEST_(.+)_WALLET_KEY or E2E_TEST_(.+)_WALLET_KEY_(\d+)$ | E2E_TEST_ARBITRUM_SEPOLIA_WALLET_KEY=wallet_key or E2E_TEST_ARBITRUM_SEPOLIA_WALLET_KEY_1=wallet_key_1, E2E_TEST_ARBITRUM_SEPOLIA_WALLET_KEY_2=wallet_key_2 for multiple keys per network |
| RPC HTTP URL per network | E2E_TEST_(.+)_RPC_HTTP_URL or E2E_TEST_(.+)_RPC_HTTP_URL_(\d+)$ | E2E_TEST_ARBITRUM_SEPOLIA_RPC_HTTP_URL=url or E2E_TEST_ARBITRUM_SEPOLIA_RPC_HTTP_URL_1=url, E2E_TEST_ARBITRUM_SEPOLIA_RPC_HTTP_URL_2=url for multiple http urls per network |
| RPC WebSocket URL per network | E2E_TEST_(.+)_RPC_WS_URL or E2E_TEST_(.+)_RPC_WS_URL_(\d+)$ | E2E_TEST_ARBITRUM_RPC_WS_URL=ws_url or E2E_TEST_ARBITRUM_RPC_WS_URL_1=ws_url_1, E2E_TEST_ARBITRUM_RPC_WS_URL_2=ws_url_2 for multiple ws urls per network |
| Loki Tenant ID | E2E_TEST_LOKI_TENANT_ID | E2E_TEST_LOKI_TENANT_ID=tenant_id |
| Loki Endpoint | E2E_TEST_LOKI_ENDPOINT | E2E_TEST_LOKI_ENDPOINT=url |
| Loki Basic Auth | E2E_TEST_LOKI_BASIC_AUTH | E2E_TEST_LOKI_BASIC_AUTH=token |
| Loki Bearer Token | E2E_TEST_LOKI_BEARER_TOKEN | E2E_TEST_LOKI_BEARER_TOKEN=token |
| Grafana Bearer Token | E2E_TEST_GRAFANA_BEARER_TOKEN | E2E_TEST_GRAFANA_BEARER_TOKEN=token |
| Pyroscope Server URL | E2E_TEST_PYROSCOPE_SERVER_URL | E2E_TEST_PYROSCOPE_SERVER_URL=url |
| Pyroscope Key | E2E_TEST_PYROSCOPE_KEY | E2E_TEST_PYROSCOPE_KEY=key |
Run GitHub Workflow with Your Test Secrets
By default, GitHub workflows execute with a set of predefined secrets. However, you can use custom secrets by specifying a unique identifier for your secrets when running the gh workflow
command.
Steps to Use Custom Secrets
1. Upload local secrets to the GitHub Secrets vault:
   - Install the ghsecrets tool to manage GitHub Secrets more efficiently:
     go install github.com/smartcontractkit/chainlink-testing-framework/tools/ghsecrets@latest
     If you use asdf, run asdf reshim afterwards.
   - Upload secrets: run ghsecrets set from the local core repo to upload the content of your ~/.testsecrets file to the GitHub Secrets vault and generate a unique identifier (referred to as your_ghsecret_id):
     cd path-to-chainlink-core-repo
     ghsecrets set
     For more details about ghsecrets, visit https://github.com/smartcontractkit/chainlink-testing-framework/tree/main/tools/ghsecrets#faq
2. Execute the workflow with custom secrets:
   - Pass the -f test_secrets_override_key={your_ghsecret_id} flag when running the gh workflow command:
     gh workflow run <workflow_name> -f test_secrets_override_key={your_ghsecret_id}
Default Secrets Handling
If the test_secrets_override_key
is not provided, the workflow will default to using the secrets preconfigured in the CI environment.
Creating New Test Secrets in TestConfig
When adding a new secret to the TestConfig
, such as a token or other sensitive information, the method ReadConfigValuesFromEnvVars()
in config/testconfig.go
must be extended to include the new secret. Ensure that the new environment variable starts with the E2E_TEST_
prefix. This prefix is crucial for ensuring that the secret is correctly propagated to Kubernetes tests when using the Remote Runner.
Here’s a quick checklist for adding a new test secret:
- Add the secret to ~/.testsecrets with the E2E_TEST_ prefix to ensure proper handling.
- Extend the config/testconfig.go:ReadConfigValuesFromEnvVars() method to load the secret into TestConfig.
- Add the secret to the All E2E Test Secrets table.
Working example
For a full working example making use of all the building blocks, see testconfig.go. It provides methods for reading TOML, applying overrides, and validating non-empty config blocks. It supports four levels of overrides, in order of precedence:
BASE64_CONFIG_OVERRIDE
env varoverrides.toml
[product_name].toml
default.toml
All you need to do to get the config is execute func GetConfig(configurationName string, product string) (TestConfig, error)
. It will first look for the folder containing a .root_dir
file and from there search for config files in all subfolders, so you can place the config files in whatever folder(s) work for you. It assumes that all configuration versions for a single product are kept in [product_name].toml
under different configuration names (which can represent anything you want: a single test, a test type, a test group, etc.).
Overrides of config files are done in a very simple way: we try to unmarshal consecutive files into the same struct. Since it's all pointer-based, only non-nil values are overwritten.
IMPORTANT!
It is required to add overrides.toml
to .gitignore
in your project, so that you don't accidentally commit it, as it might contain secrets.
Network config (and default RPC endpoints)
Some more explanation is needed for the NetworkConfig
:
type NetworkConfig struct {
// list of networks that should be used for testing
SelectedNetworks []string `toml:"selected_networks"`
// map of network name to EVMNetworks where key is network name and value is a pointer to EVMNetwork
// if not set, it will try to find the network from defined networks in MappedNetworks under known_networks.go
// it doesn't matter if you use `arbitrum_sepolia` or `ARBITRUM_SEPOLIA` or even `arbitrum_SEPOLIA` as key
// as all keys will be uppercased when loading the Default config
EVMNetworks map[string]*blockchain.EVMNetwork `toml:"EVMNetworks,omitempty"`
// map of network name to ForkConfigs where key is network name and value is a pointer to ForkConfig
// only used if network fork is needed, if provided, the network will be forked with the given config
// networkname is fetched first from the EVMNetworks and
// if not defined with EVMNetworks, it will try to find the network from defined networks in MappedNetworks under known_networks.go
ForkConfigs map[string]*ForkConfig `toml:"ForkConfigs,omitempty"`
// map of network name to RPC endpoints where key is network name and value is a list of RPC HTTP endpoints
RpcHttpUrls map[string][]string `toml:"RpcHttpUrls"`
// map of network name to RPC endpoints where key is network name and value is a list of RPC WS endpoints
RpcWsUrls map[string][]string `toml:"RpcWsUrls"`
// map of network name to wallet keys where key is network name and value is a list of private keys (aka funding keys)
WalletKeys map[string][]string `toml:"WalletKeys"`
}
func (n *NetworkConfig) Default() error {
...
}
Sample TOML config:
selected_networks = ["arbitrum_goerli", "optimism_goerli", "new_network"]
[EVMNetworks.new_network]
evm_name = "new_test_network"
evm_chain_id = 100009
evm_simulated = true
evm_chainlink_transaction_limit = 5000
evm_minimum_confirmations = 1
evm_gas_estimation_buffer = 10000
client_implementation = "Ethereum"
evm_supports_eip1559 = true
evm_default_gas_limit = 6000000
[ForkConfigs.new_network]
url = "ws://localhost:8546"
block_number = 100
[RpcHttpUrls]
arbitrum_goerli = ["https://devnet-2.mt/ABC/rpc/"]
new_network = ["http://localhost:8545"]
[RpcWsUrls]
arbitrum_goerli = ["wss://devnet-2.mt/ABC/ws/"]
new_network = ["ws://localhost:8546"]
[WalletKeys]
arbitrum_goerli = ["1810868fc221b9f50b5b3e0186d8a5f343f892e51ce12a9e818f936ec0b651ed"]
optimism_goerli = ["1810868fc221b9f50b5b3e0186d8a5f343f892e51ce12a9e818f936ec0b651ed"]
new_network = ["ac0974bec39a17e36ba4a6b4d238ff944bacb478cbed5efcae784d7bf4f2ff80"]
Whenever you are adding a new EVMNetwork to the config, you can either:
- provide the RPCs and wallet keys in Rpc<Http/Ws>Urls and WalletKeys. In the example above, new_network is added to selected_networks and EVMNetworks, and its RPCs and wallet keys are provided in RpcHttpUrls, RpcWsUrls and WalletKeys respectively.
- provide the RPCs and wallet keys in the EVMNetworks entry itself. In the example below, new_network is added to selected_networks and EVMNetworks, and its RPCs and wallet keys are provided directly in EVMNetworks.
selected_networks = ["new_network"]
[EVMNetworks.new_network]
evm_name = "new_test_network"
evm_chain_id = 100009
evm_urls = ["ws://localhost:8546"]
evm_http_urls = ["http://localhost:8545"]
evm_keys = ["ac0974bec39a17e36ba4a6b4d238ff944bacb478cbed5efcae784d7bf4f2ff80"]
evm_simulated = true
evm_chainlink_transaction_limit = 5000
evm_minimum_confirmations = 1
evm_gas_estimation_buffer = 10000
client_implementation = "Ethereum"
evm_supports_eip1559 = true
evm_default_gas_limit = 6000000
If your config struct looks like this:
type TestConfig struct {
Network *ctf_config.NetworkConfig `toml:"Network"`
}
then your TOML file should look like this:
[Network]
selected_networks = ["arbitrum_goerli","new_network"]
[Network.EVMNetworks.new_network]
evm_name = "new_test_network"
evm_chain_id = 100009
evm_simulated = true
evm_chainlink_transaction_limit = 5000
evm_minimum_confirmations = 1
evm_gas_estimation_buffer = 10000
client_implementation = "Ethereum"
evm_supports_eip1559 = true
evm_default_gas_limit = 6000000
[Network.RpcHttpUrls]
arbitrum_goerli = ["https://devnet-2.mt/ABC/rpc/"]
new_network = ["http://localhost:8545"]
[Network.RpcWsUrls]
arbitrum_goerli = ["wss://devnet-2.mt/ABC/ws/"]
new_network = ["ws://localhost:8546"]
[Network.WalletKeys]
arbitrum_goerli = ["1810868fc221b9f50b5b3e0186d8a5f343f892e51ce12a9e818f936ec0b651ed"]
new_network = ["ac0974bec39a17e36ba4a6b4d238ff944bacb478cbed5efcae784d7bf4f2ff80"]
If you want your product config to support case-insensitive network names and map keys, remember to run NetworkConfig.UpperCaseNetworkNames()
on your config before using it.
Providing custom values in the CI
Up to this point, when we wanted to modify some dynamic test parameters in the CI, we would simply set env vars. That approach won't work anymore. Instead, build a TOML file, base64-encode it, mask it, and then set it as the BASE64_CONFIG_OVERRIDE
env var that will be read by tests. Here's a working snippet of how that could look:
convert_to_toml_array() {
local IFS=','
local input_array=($1)
local toml_array_format="["
for element in "${input_array[@]}"; do
toml_array_format+="\"$element\","
done
toml_array_format="${toml_array_format%,}]"
echo "$toml_array_format"
}
selected_networks=$(convert_to_toml_array "$SELECTED_NETWORKS")
if [ -n "$PYROSCOPE_SERVER" ]; then
pyroscope_enabled=true
else
pyroscope_enabled=false
fi
if [ -n "$ETH2_EL_CLIENT" ]; then
execution_layer="$ETH2_EL_CLIENT"
else
execution_layer="geth"
fi
cat << EOF > config.toml
[Network]
selected_networks=$selected_networks
[ChainlinkImage]
image="$CHAINLINK_IMAGE"
version="$CHAINLINK_VERSION"
[Pyroscope]
enabled=$pyroscope_enabled
server_url="$PYROSCOPE_SERVER"
environment="$PYROSCOPE_ENVIRONMENT"
key_secret="$PYROSCOPE_KEY"
[Logging.Loki]
tenant_id="$LOKI_TENANT_ID"
url="$LOKI_URL"
basic_auth_secret="$LOKI_BASIC_AUTH"
bearer_token_secret="$LOKI_BEARER_TOKEN"
[Logging.Grafana]
url="$GRAFANA_URL"
EOF
BASE64_CONFIG_OVERRIDE=$(cat config.toml | base64 -w 0)
echo ::add-mask::$BASE64_CONFIG_OVERRIDE
echo "BASE64_CONFIG_OVERRIDE=$BASE64_CONFIG_OVERRIDE" >> $GITHUB_ENV
These two lines, in exactly that order, are crucial:
BASE64_CONFIG_OVERRIDE=$(cat config.toml | base64 -w 0)
echo ::add-mask::$BASE64_CONFIG_OVERRIDE
::add-mask::
has to be called only after the env var has been set to its final value; otherwise it won't be recognized and masked properly, and secrets will be exposed in the logs.
Providing custom values for local execution
For local execution, it's best to put custom variables in the overrides.toml
file.
Providing custom values in k8s
It's easy. All you need to do is:
- Create TOML file with these values
- Base64 it:
cat your.toml | base64
- Set the base64 result as
BASE64_CONFIG_OVERRIDE
environment variable.
BASE64_CONFIG_OVERRIDE
will be automatically forwarded to k8s (as long as it is set and available to the test process), when creating the environment programmatically via environment.New()
.
Quick example:
BASE64_CONFIG_OVERRIDE=$(cat your.toml | base64) go test your-test-that-runs-in-k8s ./file/with/your/test
Not moved to TOML
The following environment variables were not moved to TOML:
SLACK_API_KEY
SLACK_USER
SLACK_CHANNEL
TEST_LOG_LEVEL
CHAINLINK_ENV_USER
DETACH_RUNNER
ENV_JOB_IMAGE
- most k8s-specific env variables were left untouched
CRIB Connector
GAPv1
won't be supported in the future. You can still use this example and CI run, but expect this to change.
Docker
The Docker package contains low-level wrappers for Docker containers built using the testcontainers-go library. It is strongly advised that you use CTFv2's Framework to build your environment instead of using this package directly. Consider yourself warned!
Supported Docker containers can be divided into a few categories:
- Chainlink ecosystem-related
- Blockchain nodes
- Third party apps
- Test helpers
- helper containers
The following Chainlink-related containers are available:
- Job Distributor
Blockchain nodes:
- Besu
- Erigon
- Geth
- Nethermind
- Reth
- Prysm Beacon client (PoS Ethereum)
Third party apps:
- Kafka
- Postgres
- Zookeeper
- Schema registry
Test helpers (mocking solutions):
- Killgrave
- Mockserver
Helper containers:
- Ethereum genesis generator
- Validator keys generator
Basic structure
All of our Docker container wrappers are composed of component-specific elements and a universal part called EnvComponent
, defined as:
type EnvComponent struct {
ContainerName string `json:"containerName"`
ContainerImage string `json:"containerImage"`
ContainerVersion string `json:"containerVersion"`
ContainerEnvs map[string]string `json:"containerEnvs"`
WasRecreated bool `json:"wasRecreated"`
Networks []string `json:"networks"`
Container tc.Container `json:"-"`
PostStartsHooks []tc.ContainerHook `json:"-"`
PostStopsHooks []tc.ContainerHook `json:"-"`
PreTerminatesHooks []tc.ContainerHook `json:"-"`
LogLevel string `json:"-"`
StartupTimeout time.Duration `json:"-"`
}
It comes with a set of functional options that allow you to:
- set container name
- set image name and tag
- set startup timeout
- set log level
- set post start/stop hooks
And the following chaos-testing-related functions:
- ChaosNetworkDelay() - introduces a network delay
- ChaosNetworkLoss() - simulates network packet loss
- ChaosPause() - pauses the container
Blockchain Nodes
If you plan to experiment with these Docker containers, use an existing Ethereum environment builder rather than assembling components manually. This is particularly important for Proof-of-Stake (PoS) networks, where the setup involves multi-stage operations and shared state across multiple containers.
Each supported client has a default image version defined here. Clients (except Reth) are available in two flavors:
- Proof-of-Stake (PoS): Ethereum 2.0.
- Proof-of-Work/Authority (PoW/PoA): Ethereum 1.0.
Reth supports only PoS networks.
These ephemeral networks use a simplified configuration with a single blockchain node. For PoS, three containers simulate this:
- Execution layer
- Consensus layer
- Validator
note
We use a fork of Ethereum Genesis Generator to create genesis files for PoS networks.
Execution Layer Clients
The following execution layer clients are available:
- Besu
- Erigon
- Geth
- Nethermind
- Reth
Consensus Layer Client
- Prysm (the only available consensus client).
Quick Start
The simplest way to start an Ethereum network is by specifying the execution layer only:
builder := NewEthereumNetworkBuilder()
cfg, err := builder.
WithExecutionLayer(types.ExecutionLayer_Nethermind).
Build()
if err != nil {
panic(err)
}
net, rpcs, err := cfg.Start()
if err != nil {
panic(err)
}
If no version is specified, Ethereum 1.0 (pre-Merge) will be used.
Starting Ethereum 2.0 Networks
To start an Ethereum 2.0 network, add the Ethereum version parameter:
builder := NewEthereumNetworkBuilder()
cfg, err := builder.
WithEthereumVersion(config_types.EthereumVersion_Eth2).
WithExecutionLayer(config_types.ExecutionLayer_Geth).
Build()
if err != nil {
panic(err)
}
net, rpcs, err := cfg.Start()
if err != nil {
panic(err)
}
note
Booting Ethereum 2.0 networks takes longer due to the additional containers. Wait times of up to 1 minute are common.
Custom Docker Images
To use custom Docker images instead of the defaults:
builder := NewEthereumNetworkBuilder()
cfg, err := builder.
WithCustomDockerImages(map[config.ContainerType]string{
config.ContainerType_ExecutionLayer: "ethereum/client-go:v1.15.0",
}).
Build()
if err != nil {
panic(err)
}
net, rpcs, err := cfg.Start()
if err != nil {
panic(err)
}
Available Container Types
const (
ContainerType_ExecutionLayer ContainerType = "execution_layer"
ContainerType_ConsensusLayer ContainerType = "consensus_layer"
ContainerType_ConsensusValidator ContainerType = "consensus_validator"
ContainerType_GenesisGenerator ContainerType = "genesis_generator"
ContainerType_ValKeysGenerator ContainerType = "val_keys_generator"
)
note
Use the latest_available
tag for the most recent release, including pre-releases, or the latest_stable
tag for the latest officially stable release. The latest_available
may include beta or development versions, whereas latest_stable
ensures compatibility with production environments.
Advanced Options
Connect to Existing Docker Networks
By default, a new random Docker network is created. To use an existing one:
builder := NewEthereumNetworkBuilder()
cfg, err := builder.
WithExecutionLayer(types.ExecutionLayer_Nethermind).
WithDockerNetworks([]string{"my-existing-network"}).
Build()
Chain Customization
Ethereum 2.0 Parameters
- Seconds per slot: This defines how many seconds validators have to propose and vote on blocks. Lower values accelerate block production and epoch finalization but can cause issues if validators fail to keep up. The minimum allowed value is
3
. - Slots per epoch: Determines the number of slots (voting rounds) per epoch. Lower values mean epochs finalize faster, but fewer voting rounds can impact network stability. The minimum value is
2
. - Genesis delay: The extra delay (in seconds) before the genesis block starts. This ensures all containers (execution, consensus, and validator) are up and running before block production begins.
- Validator count: Specifies the number of validators in the network. A higher count increases the robustness of block validation but requires more resources. The minimum allowed value is
4
.
General Parameters
- ChainID: Integer value for the chain.
- Addresses to fund: Pre-fund accounts in the genesis block.
Default Configuration
seconds_per_slot=12
slots_per_epoch=6
genesis_delay=15
validator_count=8
chain_id=1337
addresses_to_fund=["0xf39Fd6e51aad88F6F4ce6aB8827279cffFb92266"]
Log Level
Customize node log levels (default: info
):
WithNodeLogLevel("debug")
Supported values:
trace
,debug
,info
,warn
,error
Wait for Finalization (Ethereum 2.0)
Wait until the first epoch finalizes:
WithWaitingForFinalization()
Accessing Containers
The builder.Start()
function returns:
- Network configuration: Input to test clients.
- RPC endpoints:
- Public: Accessible externally.
- Private: Internal to the Docker network.
Endpoint Usage
- Use public endpoints for Ethereum clients deploying contracts, interacting with the chain, etc.
- Use private endpoints for Chainlink nodes within the same Docker network.
Ethereum 2.0 Container Creation Sequence
The following sequence explains Ethereum 2.0 container startup:
sequenceDiagram
participant Host
participant Validator Keys Generator
participant Genesis Generator
participant Execution Layer
participant Beacon Chain (Consensus Layer)
participant Validator
participant After Genesis Helper
participant User
Validator Keys Generator->>Host: Generate validator keys
Genesis Generator->>Host: Create genesis files
Host->>Execution Layer: Start with genesis
Host->>Beacon Chain (Consensus Layer): Start with genesis
Beacon Chain (Consensus Layer)->>Execution Layer: Connect and sync
Host->>Validator: Start with validator keys
Validator->>Beacon Chain (Consensus Layer): Begin block voting
loop new blocks
After Genesis Helper->>Execution Layer: Monitor new blocks
end
After Genesis Helper->>User: Ready to interact
Chainlink ecosystem
Currently, only one application has a wrapper that lives in this framework: the Job Distributor. The Chainlink Node wrapper can be found in the chainlink repository.
Job Distributor
JD is a component for centralized creation and management of jobs executed by Chainlink Nodes. It's a single point of entry that frees you from having to set up each job separately on each node in the DON.
It requires a Postgres DB instance, which can also be started using the CTF:
pg, err := test_env.NewPostgresDb(
[]string{network.Name},
test_env.WithPostgresDbName("jd-db"),
test_env.WithPostgresImageVersion("14.1"))
if err != nil {
panic(err)
}
err = pg.StartContainer()
if err != nil {
panic(err)
}
Then, all you need to do to start a new instance is:
jd := New([]string{network.Name},
//replace with actual image
WithImage("localhost:5001/jd"),
WithVersion("latest"),
WithDBURL(pg.InternalURL.String()),
)
err = jd.StartContainer()
if err != nil {
panic(err)
}
Once JD is started, you can create a new gRPC connection and start interacting with it to register DON nodes, create jobs, etc.:
conn, cErr := grpc.NewClient(jd.Grpc, []grpc.DialOption{grpc.WithTransportCredentials(insecure.NewCredentials())}...)
if cErr != nil {
panic(cErr)
}
// use the connection
Test helpers
There are two test helper containers:
- Killgrave
- Mockserver
Both are HTTP mocking solutions, with Mockserver used in k8s and Killgrave outside of it. Since both will soon be replaced by a single in-house solution, their usage is discouraged.
That work is being done in this PR and tracked in this ticket.
Logging
This small library was created to address two issues:
- mixed-up logging for parallel tests when using vanilla loggers
- conformity with the logging interface required by testcontainers-go (a Docker container library)
It uses "github.com/rs/zerolog"
for the logger.
Configuration
There's only one configuration option: the log level. You can set it via the TEST_LOG_LEVEL
environment variable to:
trace
debug
info
(default)warn
error
How to use
The main way to get a Logger instance is to call logging.GetTestLogger(*testing.T)
. The testing.T
instance can be nil
.
When using it together with testcontainers-go
, the library we use to interact with Docker containers, you should use GetTestContainersGoTestLogger(*testing.T)
instead.
And that's all there is to it :-)
Build info
This documentation was generated from:
- Branch
main
- Commit:
ed803554