The problem with running every test on every commit
Test suites grow with the system. New features add new tests, microservices multiply integration paths, and overlapping coverage creeps in across teams. What started as a fast feedback loop turns into a thirty-minute drag on every pull request.
The "run everything" default no longer works because:
| What you're paying for | What you're getting |
|---|---|
| Long CI cycles | Hundreds of tests running for a single-line change |
| Rising infrastructure cost | Compute spent on tests with no dependency on the modified component |
| Slower developer feedback | Engineers waiting on test results that were never relevant |
| Lost signal | Real failures buried in noise from unrelated suites |
The deeper problem is selection, not execution. Pipelines have no way to decide which tests actually matter for a given change, so they default to running them all.
A reasoning layer between the commit and the test suite
Testkube puts an AI agent between code change and test execution. The agent reads the diff, classifies the change, queries historical execution data, and decides which workflows to run. The decision is based on signals the agent can actually evaluate:
- Code change impact. Which files, modules, or services were modified.
- Historical failure patterns. Which tests fail for changes in similar paths.
- Test execution history. Pass rates, flakiness, and duration over time.
- Service dependency graphs. Which downstream tests are connected to the change.
A documentation change runs nothing. An API modification runs contract validation and the integration tests that touch that service. The full suite still runs on a schedule, just not on every commit.
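The selection logic above can be sketched as a scoring function over those signals. Everything in this sketch — the `WorkflowStats` fields, the weights, the threshold, the workflow names — is hypothetical and illustrative, not Testkube's actual implementation:

```python
from dataclasses import dataclass

@dataclass
class WorkflowStats:
    # Hypothetical per-workflow history the agent reasons over.
    covered_paths: tuple[str, ...]   # path prefixes this workflow covers
    failure_rate_on_similar: float   # how often it fails for similar diffs
    flakiness: float                 # fraction of non-deterministic failures

def impact_score(changed_files: list[str], stats: WorkflowStats) -> float:
    """Weight dependency overlap by historical failure signal (invented weights)."""
    overlap = any(f.startswith(p) for f in changed_files for p in stats.covered_paths)
    if not overlap:
        return 0.0
    # Favor tests that actually fail for changes like this; discount flaky noise.
    return 0.6 + 0.4 * stats.failure_rate_on_similar - 0.2 * stats.flakiness

def select_workflows(changed_files, workflows, threshold=0.5):
    return [name for name, stats in workflows.items()
            if impact_score(changed_files, stats) >= threshold]

workflows = {
    "api-contract-tests": WorkflowStats(("services/api/",), 0.4, 0.05),
    "checkout-e2e":       WorkflowStats(("services/checkout/",), 0.2, 0.30),
}

# A docs-only change selects nothing; an API change selects the contract suite.
print(select_workflows(["docs/intro.md"], workflows))             # []
print(select_workflows(["services/api/handlers.py"], workflows))  # ['api-contract-tests']
```

The point of the weighting is that dependency overlap alone is not enough: a workflow that covers the changed path but fails randomly adds noise, while one with a strong historical failure correlation adds signal.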
How it works in Testkube
Testkube provides four layers that make smart test selection possible inside containerized environments:
Execution layer
TestWorkflows run inside your cluster as independent, labeled units. Each workflow is scoped to a specific area, so the agent can map changes to the workflows that cover them.
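Because each workflow is labeled by the area it covers, mapping a change to candidate workflows can start as a label lookup. The label keys, workflow names, and the path-to-area convention below are all invented for illustration:

```python
# Hypothetical label metadata, mirroring how labeled workflows
# could be scoped to specific areas of the codebase.
workflow_labels = {
    "payments-integration": {"area": "payments"},
    "api-contract":         {"area": "api"},
    "frontend-smoke":       {"area": "frontend"},
}

# Invented convention: the path segment after services/ names the area.
def areas_touched(changed_files: list[str]) -> set[str]:
    return {f.split("/")[1] for f in changed_files if f.startswith("services/")}

def workflows_for(changed_files: list[str]) -> list[str]:
    touched = areas_touched(changed_files)
    return sorted(name for name, labels in workflow_labels.items()
                  if labels["area"] in touched)

print(workflows_for(["services/payments/charge.go", "README.md"]))
# ['payments-integration']
```

The lookup only works if the labeling discipline holds: a workflow with no area label is invisible to the agent, which is why scoping each workflow to a specific area matters.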
Data layer
Every execution captures detailed metadata, logs, results, and duration. This becomes the historical signal the agent reasons over when deciding what to run.
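The historical signal can be derived from raw execution records. The record shape here — a `(passed, duration, was_retried)` tuple and a retry-based flakiness proxy — is an assumption for illustration, not Testkube's actual metadata schema:

```python
from statistics import mean

# Hypothetical execution records: (passed, duration_seconds, was_retried)
history = [
    (True, 42.0, False), (False, 41.5, True), (True, 40.8, True),
    (True, 43.2, False), (False, 39.9, True),
]

pass_rate = mean(1.0 if passed else 0.0 for passed, _, _ in history)
avg_duration = mean(d for _, d, _ in history)
# Crude flakiness proxy: runs that needed a retry to settle.
flakiness = mean(1.0 if retried else 0.0 for _, _, retried in history)

print(f"pass rate {pass_rate:.0%}, avg {avg_duration:.1f}s, flakiness {flakiness:.0%}")
# pass rate 60%, avg 41.5s, flakiness 60%
```

Pass rate, duration, and flakiness computed this way feed directly into the selection decision: a slow, flaky workflow needs a stronger dependency link to the change before it earns a slot in the commit-time run.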
Control layer
AI Triggers watch for workflows carrying a specific label and status. When matched, control passes to the configured AI Agent instead of executing tests directly.
Decision layer
AI Agents pull external context through MCP servers (GitHub, observability tools, repo metadata), classify the change, and trigger the selected tests through the Testkube MCP server.
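End to end, the decision layer reduces to a classify-then-trigger loop. The classifier rules and routing table below are stubs standing in for the MCP-backed context and trigger calls, not real Testkube APIs:

```python
def classify_change(diff_files: list[str]) -> str:
    # Stand-in for the agent's classification step over the diff.
    if all(f.endswith((".md", ".txt")) for f in diff_files):
        return "docs-only"
    if any(f.startswith("services/api/") for f in diff_files):
        return "api-change"
    return "general"

# Invented routing table from change class to workflow names.
ROUTES = {
    "docs-only": [],
    "api-change": ["api-contract", "api-integration"],
    "general": ["smoke"],
}

def decide(diff_files: list[str]) -> list[str]:
    """Return the workflows the agent would trigger for this diff."""
    return ROUTES[classify_change(diff_files)]

print(decide(["docs/setup.md"]))           # []
print(decide(["services/api/routes.py"]))  # ['api-contract', 'api-integration']
```

In the real system the classification draws on the diff, repo metadata, and observability context pulled through MCP servers, and the trigger step hands the selected workflows back through the Testkube MCP server rather than returning a list.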
More on building the agent: a step-by-step walkthrough with screenshots showing how the AI Agent classifies commits and runs only the relevant tests.
Read the walkthrough →
What changes when selection is intelligent
| Before | After |
|---|---|
| Every commit runs the full suite | Only tests connected to the change run on each commit |
| Selection logic lives in CI YAML | Selection logic lives in an agent that reads the diff and historical data |
| Doc edits trigger end-to-end runs | Non-functional changes skip execution entirely |
| Full coverage relies on running everything, every time | Full coverage runs on a schedule, smart selection runs per commit |
Built for teams shipping at AI velocity
When AI-generated code increases the volume of changes, the volume of tests grows alongside it. Smart selection is what keeps that growth from breaking the pipeline.
- Faster CI. Cut wait time on every pull request by skipping tests with no connection to the change.
- Lower infrastructure cost. Reduce compute spend on tests that don't add signal.
- Better defect detection. Surface real failures faster by running the tests most likely to catch them.
- Safer over time. Schedule full-suite runs as a safety net so nothing slips through.
See Testkube in action
Start a free trial to explore how teams orchestrate tests across their containerized environments.
Start free trial