7 Ways CI/CD Pipelines Break With Kubernetes (and Fixes)

Oct 1, 2025
Katie Petriella
Senior Growth Manager
Testkube

Your CI tests pass but pods crash. Manifests drift from reality. Rollbacks only half-work. These aren't edge cases—they're structural gaps between your pipeline and your cluster.

Executive Summary

Most CI/CD pipelines weren't designed for Kubernetes. They were designed for a world where you push code to a server, run some tests, and deploy a binary. Kubernetes changed the deployment model (containers, declarative manifests, rolling updates, multi-cluster environments) but the pipelines feeding into it often haven't caught up.

The result is a specific category of failures that teams hit repeatedly. Not the obvious stuff like syntax errors in YAML (though that happens too). The harder problems: tests that pass in CI but fail in the cluster, deployments that drift from what's checked into Git, rollbacks that don't actually work when you need them.

If you're running tests and deployments through Jenkins, GitHub Actions, CircleCI, or any other general-purpose CI tool pointed at a Kubernetes cluster, you've probably seen some version of these failures. Here's where the breakdowns actually happen and why they keep recurring.

Your tests don't run where your code runs

This is the most common source of false confidence. Your CI pipeline spins up a container, runs your test suite, everything passes. Then the same code deploys to a Kubernetes cluster and breaks.

The gap is environmental. In CI, your tests typically run in an isolated container or VM with mocked dependencies. In production, your code runs inside a pod governed by resource limits, network policies, service mesh rules, and RBAC constraints that didn't exist in the test environment. A test that passes when it has unrestricted memory will fail when the pod's memory limit is 256Mi. An integration test that hits a mocked API endpoint won't catch the fact that your production cluster's network policy blocks egress to that service.
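To make the gap concrete, here's an illustrative pod spec fragment (names and values are hypothetical) showing the kind of constraints a workload runs under in the cluster but almost never faces in a CI test container:

```yaml
# Illustrative only: constraints that exist in the cluster but not in CI.
apiVersion: v1
kind: Pod
metadata:
  name: myapp
spec:
  containers:
    - name: myapp
      image: registry.example.com/myapp:v1.2.3
      resources:
        requests:
          memory: "128Mi"
          cpu: "250m"
        limits:
          memory: "256Mi"   # exceeding this gets the container OOMKilled;
          cpu: "500m"       # the CI test container had no such caps
```

A test suite that happily allocates a gigabyte of memory in CI will pass there and crash here, and nothing in the pipeline's output hints at why.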

The problem gets worse with microservices. If Service A depends on Service B, and you're testing Service A in isolation against a stub, you're not testing the actual contract between them. You're testing against your assumptions about what Service B returns, and those assumptions may have been true last month and aren't now.

Some teams try to solve this with a shared staging cluster that CI pipelines deploy to. That introduces its own problems: multiple pipelines deploying to the same namespace create race conditions, test data leaks between runs, and a single broken deployment can block every other pipeline waiting for its turn.

This is one of the problems Testkube was built to address. Because Testkube runs tests as native Kubernetes jobs inside your actual cluster, your tests execute under the same resource limits, network policies, and service configurations as your production workloads. Each test execution gets its own pod with granular resource control, so there's no interference between runs and no gap between "what CI tested" and "what the cluster actually does." You can run any testing framework (Postman, Cypress, JMeter, k6, Selenium, and others) without having to containerize them yourself or write custom pipeline glue.

Tools like KinD (Kubernetes in Docker) offer a lighter-weight alternative by spinning up a cluster inside the CI worker. It's not a perfect replica of production (you won't have the same node count, network topology, or cloud provider integrations) but it catches a category of failures that pure container-level testing misses. The tradeoff is time. Standing up a KinD cluster, deploying your manifests, running tests, and tearing it down adds minutes to every pipeline run.
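A minimal KinD cluster config might look like the following sketch, assuming a multi-node layout to get scheduling behavior somewhat closer to production than a single-node default:

```yaml
# kind-ci.yaml: a small multi-node cluster for CI runs.
# Still not production (no real cloud networking or node pools),
# but enough to catch manifest and scheduling errors.
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
nodes:
  - role: control-plane
  - role: worker
  - role: worker
```

A pipeline would typically run `kind create cluster --config kind-ci.yaml`, apply the manifests under test, execute the suite, and tear the cluster down afterward. Each of those steps is where the added minutes come from.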

Manifests drift from reality

In theory, your Kubernetes manifests in Git describe exactly what's running in the cluster. In practice, the drift starts almost immediately.

Someone runs kubectl edit to fix an urgent production issue and forgets to backport the change to the manifest file. A Helm chart gets deployed with values overridden on the command line that never make it into the values.yaml committed to the repo. An admission webhook mutates resources at deploy time, so what's in Git and what's actually applied are different by definition.

This drift is insidious because nothing fails loudly. The next CI/CD pipeline run deploys the manifests from Git, which may revert the hotfix, or may apply on top of the mutated state in a way that produces unpredictable results. Teams discover the drift when a deployment behaves differently than expected and someone starts comparing what's in the cluster with what's in the repo.

GitOps tools like ArgoCD and Flux are specifically designed to prevent this by continuously reconciling cluster state with a Git repository. If something changes in the cluster that doesn't match Git, the tool either alerts you or reverts it. This works well but introduces its own complexity. You need to commit every change through Git, including emergency fixes, which creates friction when the site is down and you need to act fast.
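As a sketch of what that reconciliation looks like in practice, here is an Argo CD Application with self-healing sync enabled (repo URL, paths, and names are placeholders):

```yaml
# Argo CD continuously compares this Git path with the live namespace
# and reverts anything that drifts.
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: myapp
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://github.com/example/deploy-repo
    targetRevision: main
    path: overlays/production
  destination:
    server: https://kubernetes.default.svc
    namespace: production
  syncPolicy:
    automated:
      prune: true      # delete resources that were removed from Git
      selfHeal: true   # revert manual kubectl edits back to Git state
```

With `selfHeal` on, the hotfix-via-kubectl-edit pattern stops silently persisting: the change either gets reverted within minutes or it gets committed.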

The deeper issue is that most CI/CD pipelines treat deployment as a one-directional push: build artifact, apply manifest, done. They don't verify that what they applied actually matches what's running five minutes later.

Image tags hide version mismatches

Using :latest as a container image tag is widely understood to be a bad practice, but the subtler version of this problem is more common: using mutable tags that CI pipelines assume are stable.

Here's the scenario. Your pipeline builds an image, tags it v1.2.3, pushes it to the registry, and deploys a manifest referencing myapp:v1.2.3. A week later, someone rebuilds the image from a slightly different branch, pushes it to the same tag, and now the same v1.2.3 resolves to a different image digest. Your next deployment pulls the overwritten image, and suddenly production is running code that was never tested by the pipeline that originally deployed it.

The fix is image digests. Referencing myapp@sha256:abc123 instead of a tag guarantees that the exact bits you tested are the exact bits you deploy. But most CI pipelines use tags by default, and making the switch requires updating both the pipeline logic and the deployment manifests to propagate digests instead of tags.
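In manifest terms, the switch looks like this (registry, image name, and digest are placeholders):

```yaml
# Deployment fragment pinning the image by digest rather than tag.
spec:
  template:
    spec:
      containers:
        - name: myapp
          # A tag like :v1.2.3 can be overwritten in the registry;
          # a digest always resolves to the exact image that was tested.
          image: registry.example.com/myapp@sha256:0000000000000000000000000000000000000000000000000000000000000000
```

The pipeline-side change is capturing the digest at build time (most registries and build tools report it on push) and templating it into the manifest, instead of templating in the tag.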

Google's CI/CD best practices documentation for GKE puts this bluntly: container images shouldn't be rebuilt as they pass through the pipeline. Build once, promote the same artifact through staging and production, and use digests to ensure nothing changes along the way.

Rollbacks don't work the way you think

Kubernetes has built-in rollback support. kubectl rollout undo will revert a deployment to its previous revision. In theory, this gives you a safety net for bad deploys.

In practice, rollbacks in Kubernetes only revert the pod template, meaning the container image and its configuration. They don't revert ConfigMaps, Secrets, PersistentVolumeClaims, or any other resources that your application depends on. If your bad deploy included a schema migration, a ConfigMap change, and a new container image, rolling back the deployment only undoes the image. The schema migration and config change stay.

Most CI/CD pipelines don't have a concept of "undo everything that was just applied." They know how to go forward (build, test, deploy) but going backward requires understanding the full set of resources that changed, which resources have side effects that can't be reversed (like database migrations), and what order to revert them in.

Teams that handle this well typically version their entire deployment as a Helm release or Kustomize overlay, with all resources (manifests, configs, secrets references) tracked together. Rolling back means deploying the previous version of the whole bundle, not just the container image. But this requires discipline in how you structure your deployment artifacts, and most pipelines aren't set up to manage rollback as a first-class operation.
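For a Helm-managed deployment, the bundle-level rollback is a release operation rather than a Deployment operation. A hedged sketch, with release and namespace names hypothetical:

```yaml
# CI job steps (GitHub Actions-style) reverting the whole release:
# every templated resource in the chart, not just the pod template.
steps:
  - name: Inspect release history
    run: helm history myapp --namespace production
  - name: Revert all release resources to the previous revision
    run: helm rollback myapp --namespace production --wait
```

Even then, anything with irreversible side effects, like an applied database migration, still needs its own explicit down-path; no release tool can undo it for you.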

Secrets get mishandled

Kubernetes secrets are base64-encoded, not encrypted. Anyone with read access to the namespace can decode them. This is well known, and most teams use something like Sealed Secrets, External Secrets Operator, or a Vault integration to manage secrets properly.

Where CI/CD pipelines introduce risk is in how they handle secrets during the build and deploy process. Jenkins, for example, can store credentials in its credential store, but if you reference them as environment variables in pipeline scripts, they can end up in build logs, cached layers, or artifact metadata. A pipeline that injects a database password as a build argument will bake that password into the Docker image layer history, where anyone who pulls the image can extract it.

The fix is to keep secrets out of the build process entirely. Inject them at runtime via Kubernetes secret mounts or sidecar containers, and never pass them as build arguments or environment variables in CI. But this requires the pipeline and the deployment to be designed together with secrets handling in mind, which is often an afterthought.
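A minimal sketch of runtime injection, with secret and key names hypothetical: the credential lives in a Kubernetes Secret and reaches the container only when the pod starts, never during the build.

```yaml
# Pod spec fragment: the password is injected at runtime from a Secret,
# so it never appears in build args, image layers, or CI logs.
spec:
  containers:
    - name: myapp
      image: registry.example.com/myapp:v1.2.3
      env:
        - name: DB_PASSWORD
          valueFrom:
            secretKeyRef:
              name: db-credentials
              key: password
```

Mounting the Secret as a file volume is a common alternative when you want to avoid exposing the value in the container's environment at all.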

No quality gates before production

A surprising number of Kubernetes CI/CD pipelines go straight from "tests passed" to "deploy to production" without intermediate verification. The pipeline builds, runs unit tests, maybe runs integration tests, and then applies manifests to the production cluster.

What's missing: manifest validation (does this YAML actually describe valid Kubernetes resources?), policy checks (does this deployment violate any org-wide policies like resource limits or security contexts?), image scanning (does this container image contain known vulnerabilities?), and smoke testing after deployment (did the pods actually start, pass health checks, and begin serving traffic?).

Each of these is a gate that can catch failures before they affect users. Manifest validation with tools like kubeval or kubeconform catches structural errors. Policy engines like OPA/Gatekeeper enforce organizational constraints. Image scanners like Trivy flag vulnerable dependencies. Post-deploy smoke tests verify that the application actually works in its real environment.
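Wired into a pipeline, those gates might look like the following sketch (GitHub Actions-style steps; image name and flags are illustrative, adjust to your tool versions):

```yaml
# Pre-deploy quality gates: each step fails the pipeline before
# a broken artifact reaches the cluster.
steps:
  - name: Validate manifests against the Kubernetes schema
    run: kubeconform -strict -summary manifests/
  - name: Scan the image for known vulnerabilities
    run: trivy image --exit-code 1 --severity HIGH,CRITICAL myapp:v1.2.3
  - name: Deploy only if every gate passed
    run: kubectl apply -f manifests/
```

The ordering matters: validation and scanning are cheap, so they run before anything touches the cluster.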

Without these gates, the pipeline's definition of "success" is narrow (the code compiled and the tests passed) while the definition of "deployment actually works" is much broader.

The cluster is a black box to the pipeline

Most CI/CD tools treat Kubernetes as an endpoint. They push manifests and move on. They don't monitor whether the rollout succeeded, whether pods are crash-looping, whether the new version is actually receiving traffic.

This means the pipeline can report success while the deployment is actively failing. A pod enters CrashLoopBackOff because of a missing environment variable. The rolling update stalls because new pods never pass their readiness probe. A canary deployment routes 5% of traffic to a broken version that's returning 500 errors.

Closing this gap requires the pipeline to wait for the deployment to stabilize and verify health. kubectl rollout status is the minimum. It'll tell you if the rollout completed or timed out. But real confidence requires checking pod logs, monitoring error rates, and ideally integrating with your observability stack to automatically abort if error rates spike after a deploy.
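A minimal version of that verification as pipeline steps (deployment name and health endpoint are hypothetical):

```yaml
# Deploy, then refuse to report success until the rollout stabilizes
# and the live service answers a smoke check.
steps:
  - name: Apply manifests
    run: kubectl apply -f manifests/
  - name: Wait for the rollout to complete or time out
    run: kubectl rollout status deployment/myapp --timeout=180s
  - name: Smoke test the running service
    run: curl --fail --max-time 10 https://myapp.example.com/healthz
```

This still only catches hard failures; aborting on a post-deploy error-rate spike needs a hook into your observability stack.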

This is where the architecture of traditional CI/CD tools shows its limits. Jenkins, GitHub Actions, and CircleCI are built around the concept of a pipeline: a linear sequence of steps that runs and completes. Kubernetes deployments are ongoing processes that unfold over time. Testkube approaches this differently by running inside the cluster itself, giving it direct access to pod logs, execution artifacts, and resource metrics for every test run. Instead of the pipeline hoping the deployment worked, you can trigger post-deployment smoke tests from Kubernetes events, schedule them, or kick them off via API, then surface results through a centralized dashboard with full execution history and AI-assisted log analysis.

What this points to

These aren't edge cases. They're structural mismatches between how general-purpose CI/CD tools work and how Kubernetes actually operates. The tools assume a push-and-forget deployment model. Kubernetes requires continuous reconciliation, environment-aware testing, and deployment verification that extends well beyond "the manifest was applied."

You can patch each of these problems individually. Add KinD clusters for testing, adopt GitOps for drift prevention, use digests instead of tags, build out quality gates. Many teams do exactly this and end up with a pipeline that's more duct tape than architecture: a Jenkins or GitHub Actions workflow with a dozen custom scripts and plugins holding it together.

The alternative is to move your test orchestration into Kubernetes itself, where it has native access to the environment, resources, and observability that external CI tools have to bolt on from the outside.

If any of these failure modes sound familiar, try Testkube for free and run your first test inside your cluster in minutes.

About Testkube

Testkube is a cloud-native continuous testing platform for Kubernetes. It runs tests directly in your clusters, works with any CI/CD system, and supports every testing tool your team uses. By removing CI/CD bottlenecks, Testkube helps teams ship faster with confidence.
Explore the sandbox to see Testkube in action.