

Executive Summary
Most CI/CD pipelines were not designed for Kubernetes. They were designed for a world where you push code to a server, run some tests, and deploy a binary. Kubernetes changed the deployment model (containers, declarative manifests, rolling updates, multi-cluster environments), but the pipelines feeding into it often have not caught up.
The result is a specific category of failures that teams hit repeatedly. Not the obvious stuff like syntax errors in YAML (though that happens too). The harder problems: tests that pass in CI but fail in the cluster, deployments that drift from what is checked into Git, rollbacks that do not actually work when you need them.
If you are running tests and deployments through Jenkins, GitHub Actions, CircleCI, or any other general-purpose CI tool pointed at a Kubernetes cluster, you have probably seen some version of these failures. Here is where the breakdowns actually happen and why they keep recurring.
Already trying to decouple testing from your CI/CD pipeline? See the broader case for why test execution belongs outside the pipeline. Read: Stop running tests with your CI/CD tool →
1. Your tests do not run where your code runs
Why do tests pass in CI but fail in Kubernetes? Tests in CI usually run in an isolated container with mocked dependencies and unrestricted resources. Production pods run under memory limits, network policies, service mesh rules, and RBAC constraints that did not exist in the test environment. The runtime environment is different in ways tests cannot see, so passing in CI does not mean working in the cluster.
This is the most common source of false confidence. Your CI pipeline spins up a container, runs your test suite, and everything passes. Then the same code deploys to a Kubernetes cluster and breaks. It is the familiar "tests pass locally, fail in CI" environment mismatch, playing out one level higher.
The gap is environmental. In CI, your tests typically run in an isolated container or VM with mocked dependencies. In production, your code runs inside a pod governed by resource limits, network policies, service mesh rules, and RBAC constraints that did not exist in the test environment. A test that passes when it has unrestricted memory will fail when the pod's memory limit is 256Mi. An integration test that hits a mocked API endpoint will not catch the fact that your production cluster's network policy blocks egress to that service.
The problem gets worse with microservices. If Service A depends on Service B, and you are testing Service A in isolation against a stub, you are not testing the actual contract between them. You are testing against your assumptions about what Service B returns, and assumptions that were true last month may no longer hold.
Some teams try to solve this with a shared staging cluster that CI pipelines deploy to. That introduces its own problems: multiple pipelines deploying to the same namespace create race conditions, test data leaks between runs, and a single broken deployment can block every other pipeline waiting for its turn.
The fix: Run tests inside the cluster itself. This is one of the problems Testkube was built to address. Because Testkube runs tests as native Kubernetes jobs inside your actual cluster, your tests execute under the same resource limits, network policies, and service configurations as your production workloads. Each test execution gets its own pod with granular resource control, so there is no interference between runs and no gap between "what CI tested" and "what the cluster actually does." You can run any testing framework (Postman, Cypress, JMeter, k6, Selenium, and others) without having to containerize them yourself or write custom pipeline glue.
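To make the in-cluster idea concrete, here is a minimal sketch of a test run expressed as a plain Kubernetes Job that carries the same memory limit as the workload it exercises. It illustrates the general technique only and is not how Testkube defines its own resources. The job name, image, and command are hypothetical placeholders.

```python
import json


def build_test_job(name, image, command, memory_limit="256Mi", namespace="default"):
    """Return a batch/v1 Job manifest (as a dict) that runs a test container
    under an explicit memory limit, mirroring the limit the real pod gets."""
    return {
        "apiVersion": "batch/v1",
        "kind": "Job",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "backoffLimit": 0,  # a failing test run should fail, not retry
            "template": {
                "spec": {
                    "restartPolicy": "Never",
                    "containers": [
                        {
                            "name": "tests",
                            "image": image,
                            "command": list(command),
                            "resources": {
                                "requests": {"memory": memory_limit},
                                "limits": {"memory": memory_limit},
                            },
                        }
                    ],
                }
            },
        },
    }


if __name__ == "__main__":
    # Hypothetical image and command, for illustration only.
    job = build_test_job("api-tests", "registry.example/api-tests:1.0", ["pytest", "-q"])
    print(json.dumps(job, indent=2))
```

The printed JSON could be piped to kubectl apply -f - in a namespace that already has the relevant network policies, so the tests hit the same egress rules as the application.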
Tools like KinD (Kubernetes in Docker) offer a lighter-weight alternative by spinning up a cluster inside the CI worker. It is not a perfect replica of production (you will not have the same node count, network topology, or cloud provider integrations) but it catches a category of failures that pure container-level testing misses. The tradeoff is time. Standing up a KinD cluster, deploying your manifests, running tests, and tearing it down adds minutes to every pipeline run.
2. Manifests drift from reality
What is manifest drift in Kubernetes? Manifest drift is when the resources running in your cluster no longer match the YAML in your Git repository. It happens when someone uses kubectl edit for a hotfix, Helm overrides values that never get committed, or admission webhooks mutate resources at deploy time.
In theory, your Kubernetes manifests in Git describe exactly what is running in the cluster. In practice, the drift starts almost immediately.
Someone runs kubectl edit to fix an urgent production issue and forgets to backport the change to the manifest file. A Helm chart gets deployed with values overridden on the command line that never make it into the values.yaml committed to the repo. An admission webhook mutates resources at deploy time, so what is in Git and what is actually applied are different by definition.
This drift is insidious because nothing fails loudly. The next CI/CD pipeline run deploys the manifests from Git, which may revert the hotfix, or may apply on top of the mutated state in a way that produces unpredictable results. Teams discover the drift when a deployment behaves differently than expected and someone starts comparing what is in the cluster with what is in the repo.
The fix: GitOps tools like Argo CD and Flux are specifically designed to prevent this by continuously reconciling cluster state with a Git repository. If something changes in the cluster that does not match Git, the tool either alerts you or reverts it. This works well but introduces its own complexity. You need to commit every change through Git, including emergency fixes, which creates friction when the site is down and you need to act fast.
The deeper issue is that most CI/CD pipelines treat deployment as a one-directional push: build artifact, apply manifest, done. They do not verify that what they applied actually matches what is running five minutes later.
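A post-apply verification step can be sketched as a comparison between the manifest in Git and the object the cluster reports. The function below is an illustrative assumption, not part of any tool named here. It compares only the fields declared in the desired manifest, so server-added fields such as status are ignored. In practice, kubectl diff performs a more complete server-side comparison.

```python
def find_drift(desired, live, path=""):
    """Return the paths where `live` differs from `desired`.

    Only fields present in `desired` are checked, so fields the API server
    adds on its own (status, defaulted values) are not reported as drift.
    """
    drift = []
    if isinstance(desired, dict):
        if not isinstance(live, dict):
            return [path or "<root>"]
        for key, value in desired.items():
            child = f"{path}.{key}" if path else key
            if key not in live:
                drift.append(child)
            else:
                drift.extend(find_drift(value, live[key], child))
    elif isinstance(desired, list):
        # Lists are compared positionally; a length change counts as drift.
        if not isinstance(live, list) or len(desired) != len(live):
            return [path or "<root>"]
        for index, (want, have) in enumerate(zip(desired, live)):
            drift.extend(find_drift(want, have, f"{path}[{index}]"))
    elif desired != live:
        drift.append(path or "<root>")
    return drift


if __name__ == "__main__":
    desired = {"spec": {"replicas": 2}}
    live = {"spec": {"replicas": 5}, "status": {"readyReplicas": 5}}
    print(find_drift(desired, live))
```

Running a check like this a few minutes after apply, rather than only at deploy time, is what turns the one-directional push into a verified one.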
3. Image tags hide version mismatches
Should you use image tags or digests in Kubernetes pipelines? Use digests. Image tags are mutable, so the same tag can point to different image bits over time. Referencing an image by its sha256 digest guarantees the exact bits you tested are the exact bits you deploy.
Using :latest as a container image tag is widely understood to be a bad practice, but the subtler version of this problem is more common: using mutable tags that CI pipelines assume are stable.
Here is the scenario. Your pipeline builds an image, tags it v1.2.3, pushes it to the registry, and deploys a manifest referencing myapp:v1.2.3. A week later, someone rebuilds the image from a slightly different branch, pushes it to the same tag, and now the same v1.2.3 resolves to a different image digest. Your next deployment pulls the overwritten image, and suddenly production is running code that was never tested by the pipeline that originally deployed it.
The fix: Image digests. Referencing myapp@sha256:abc123 instead of a tag guarantees that the exact bits you tested are the exact bits you deploy. But most CI pipelines use tags by default, and making the switch requires updating both the pipeline logic and the deployment manifests to propagate digests instead of tags.
Google's CI/CD best practices documentation for GKE puts this bluntly: container images should not be rebuilt as they pass through the pipeline. Build once, promote the same artifact through staging and production, and use digests to ensure nothing changes along the way.
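One way to enforce digest pinning is a small check that runs before deploy and rejects any manifest that still references a tag. The sketch below assumes the manifest has already been parsed into a dict and that every image appears under a key named image. The helper names are hypothetical. A full sha256 digest is 64 hexadecimal characters, so an abbreviated placeholder like sha256:abc123 would not pass.

```python
import re

# A digest reference ends with "@sha256:" followed by 64 hex characters.
_DIGEST_RE = re.compile(r"@sha256:[0-9a-f]{64}$")


def is_pinned_by_digest(image):
    """True if the image reference is pinned to an immutable sha256 digest."""
    return bool(_DIGEST_RE.search(image))


def unpinned_images(manifest):
    """Walk a parsed manifest and return every image string not pinned by digest."""
    found = []

    def walk(node):
        if isinstance(node, dict):
            for key, value in node.items():
                if key == "image" and isinstance(value, str):
                    if not is_pinned_by_digest(value):
                        found.append(value)
                else:
                    walk(value)
        elif isinstance(node, list):
            for item in node:
                walk(item)

    walk(manifest)
    return found
```

A pipeline could fail the deploy step whenever unpinned_images returns a non-empty list, which forces the digest produced by the build step to be propagated into the manifest.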
4. Rollbacks do not work the way you think
Why are Kubernetes rollbacks unreliable? kubectl rollout undo only reverts the pod template (container image and configuration). It does not revert ConfigMaps, Secrets, PersistentVolumeClaims, or any other resources that changed with the deployment. If a bad deploy included a schema migration and a ConfigMap change, rolling back the deployment only undoes the image.
Kubernetes has built-in rollback support. kubectl rollout undo will revert a deployment to its previous revision. In theory, this gives you a safety net for bad deploys.
In practice, rollbacks in Kubernetes only revert the pod template, meaning the container image and its configuration. They do not revert ConfigMaps, Secrets, PersistentVolumeClaims, or any other resources that your application depends on. If your bad deploy included a schema migration, a ConfigMap change, and a new container image, rolling back the deployment only undoes the image. The schema migration and config change stay.
Most CI/CD pipelines do not have a concept of "undo everything that was just applied." They know how to go forward (build, test, deploy) but going backward requires understanding the full set of resources that changed, which resources have side effects that cannot be reversed (like database migrations), and what order to revert them in.
The fix: Teams that handle this well typically version their entire deployment as a Helm release or Kustomize overlay, with all resources (manifests, configs, secrets references) tracked together. Rolling back means deploying the previous version of the whole bundle, not just the container image. But this requires discipline in how you structure your deployment artifacts, and most pipelines are not set up to manage rollback as a first-class operation.
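The gap between "roll back the Deployment" and "roll back the release" can be made visible with a small classification step. The sketch below is an illustration under stated assumptions: the pipeline already knows which resources a release changed, and a schema migration is represented by a made-up kind, since it is not a Kubernetes resource at all.

```python
# Workload kinds whose pod template `kubectl rollout undo` can revert.
ROLLOUT_UNDO_KINDS = {"Deployment", "DaemonSet", "StatefulSet"}


def rollback_coverage(changed_resources):
    """Split (kind, name) pairs into those reverted by `kubectl rollout undo`
    and those that stay applied and need their own rollback step."""
    covered, leftover = [], []
    for kind, name in changed_resources:
        target = covered if kind in ROLLOUT_UNDO_KINDS else leftover
        target.append(f"{kind}/{name}")
    return covered, leftover


if __name__ == "__main__":
    # Hypothetical change set for one release.
    changed = [
        ("Deployment", "myapp"),
        ("ConfigMap", "myapp-config"),
        ("SchemaMigration", "add-orders-index"),  # not a real Kubernetes kind
    ]
    covered, leftover = rollback_coverage(changed)
    print("reverted by rollout undo:", covered)
    print("still applied afterwards:", leftover)
```

Anything in the second list is what a bundle-level rollback (a previous Helm release revision, for example) has to account for explicitly.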
5. Secrets get mishandled
Where do CI/CD pipelines leak Kubernetes secrets? Secrets get exposed when pipelines pass them as build arguments or environment variables that end up baked into container image layers, build logs, or artifact metadata. The fix is to keep secrets out of the build entirely and inject them at runtime via Kubernetes secret mounts or sidecars.
Kubernetes Secrets are base64-encoded, which is an encoding and not encryption. Anyone with read access to Secrets in the namespace can decode them. This is well known, and most teams use something like Sealed Secrets, External Secrets Operator, or a Vault integration to manage secrets properly.
Where CI/CD pipelines introduce risk is in how they handle secrets during the build and deploy process. Jenkins, for example, can store credentials in its credential store, but if you reference them as environment variables in pipeline scripts, they can end up in build logs, cached layers, or artifact metadata. A pipeline that injects a database password as a build argument will bake that password into the Docker image layer history, where anyone who pulls the image can extract it.
The fix: Keep secrets out of the build process entirely. Inject them at runtime via Kubernetes secret mounts or sidecar containers, and never pass them as build arguments or environment variables in CI. But this requires the pipeline and the deployment to be designed together with secrets handling in mind, which is often an afterthought.
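A cheap guard against the build-argument mistake is to lint Dockerfiles for ARG or ENV names that look like credentials. The sketch below is a name-based heuristic only. It is an assumption for illustration, it will miss secrets with innocuous names, and it does not replace a real secret scanner.

```python
import re

# Heuristic: variable names that commonly indicate a credential.
_SECRET_HINT = re.compile(
    r"(PASSWORD|PASSWD|SECRET|TOKEN|API_?KEY|PRIVATE_?KEY)", re.IGNORECASE
)
# Matches "ARG NAME" or "ENV NAME" at the start of a Dockerfile line.
_ARG_OR_ENV = re.compile(r"^\s*(ARG|ENV)\s+([A-Za-z_][A-Za-z0-9_]*)", re.IGNORECASE)


def suspicious_build_vars(dockerfile_text):
    """Return (line_number, instruction, name) for ARG/ENV entries whose
    name suggests a secret that would end up in the image history."""
    hits = []
    for lineno, line in enumerate(dockerfile_text.splitlines(), start=1):
        match = _ARG_OR_ENV.match(line)
        if match and _SECRET_HINT.search(match.group(2)):
            hits.append((lineno, match.group(1).upper(), match.group(2)))
    return hits
```

Failing the build when this returns any hits pushes the credential toward a runtime mount instead of an image layer.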
6. No quality gates before production
What quality gates should run before a Kubernetes deployment? Four gates catch most failures before production: manifest validation (kubeval or kubeconform), policy checks (OPA/Gatekeeper, Kyverno), image scanning (Trivy), and post-deploy smoke tests. Each prevents a different category of failure from reaching users.
A surprising number of Kubernetes CI/CD pipelines go straight from "tests passed" to "deploy to production" without intermediate verification. The pipeline builds, runs unit tests, maybe runs integration tests, and then applies manifests to the production cluster.
What is missing: manifest validation (does this YAML actually describe valid Kubernetes resources?), policy checks (does this deployment violate any org-wide policies like resource limits or security contexts?), image scanning (does this container image contain known vulnerabilities?), and smoke testing after deployment (did the pods actually start, pass health checks, and begin serving traffic?).
The fix: Each of these is a gate that can catch failures before they affect users. Manifest validation with a tool like kubeconform (or the older kubeval, which is no longer maintained) catches structural errors. Policy engines like OPA/Gatekeeper enforce organizational constraints. Image scanners like Trivy flag vulnerable dependencies. Post-deploy smoke tests verify that the application actually works in its real environment. Building quality gates into the deployment process closes the gap between "the pipeline succeeded" and "the deployment actually works."
Without these gates, the pipeline's definition of "success" is narrow (the code compiled and the tests passed) while the definition of "deployment actually works" is much broader.
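The gates can be chained so that the first failure stops the deploy. The sketch below is one possible arrangement, not a prescribed pipeline. It assumes kubeconform and trivy are installed on the runner, and the manifest path and image name are placeholders. A policy-check gate would slot into the same list. The command runner is injectable so the ordering logic can be exercised without the tools present.

```python
import subprocess


def default_runner(cmd):
    """Run a command and return its exit code."""
    return subprocess.run(cmd, check=False).returncode


def pre_deploy_gates(manifest_path, image):
    """Build an ordered list of (gate_name, command) pairs."""
    return [
        # -strict rejects fields that are not in the resource schema.
        ("manifest-validation", ["kubeconform", "-strict", manifest_path]),
        # Exit non-zero only when HIGH or CRITICAL findings exist.
        (
            "image-scan",
            ["trivy", "image", "--exit-code", "1", "--severity", "HIGH,CRITICAL", image],
        ),
    ]


def run_gates(gates, runner=default_runner):
    """Run gates in order and stop at the first non-zero exit code.

    Returns (names_of_passed_gates, name_of_failed_gate_or_None).
    """
    passed = []
    for name, cmd in gates:
        if runner(cmd) != 0:
            return passed, name
        passed.append(name)
    return passed, None
```

Returning the name of the failed gate, rather than a bare boolean, makes it easier for the pipeline to report which category of failure blocked the release.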
7. The cluster is a black box to the pipeline
Why do CI/CD pipelines miss failed Kubernetes deployments? Most CI/CD tools treat Kubernetes as an endpoint. They push manifests and move on. They do not monitor whether the rollout succeeded, whether pods are crash-looping, or whether the new version is actually receiving traffic. The pipeline can report success while the deployment is actively failing.
The failure modes are concrete. A pod enters CrashLoopBackOff because of a missing environment variable. The rolling update stalls because new pods never pass their readiness probe. A canary deployment routes 5% of traffic to a broken version that is returning 500 errors. In each case the manifests applied cleanly, so a pipeline that stops at apply still reports success.
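The CrashLoopBackOff case can be detected from the Pod object itself. The sketch below assumes the pod has been fetched as JSON (for example with kubectl get pod NAME -o json) and parsed into a dict. The function name is hypothetical.

```python
def crash_looping_containers(pod):
    """Return the names of containers whose current state is waiting
    with reason CrashLoopBackOff, given a Pod object as a dict."""
    statuses = pod.get("status", {}).get("containerStatuses", [])
    return [
        status["name"]
        for status in statuses
        if status.get("state", {}).get("waiting", {}).get("reason") == "CrashLoopBackOff"
    ]


if __name__ == "__main__":
    # Trimmed-down example of what the API returns for a failing pod.
    pod = {
        "status": {
            "containerStatuses": [
                {"name": "app", "state": {"waiting": {"reason": "CrashLoopBackOff"}}},
                {"name": "proxy", "state": {"running": {}}},
            ]
        }
    }
    print(crash_looping_containers(pod))
```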
The fix: Closing this gap requires the pipeline to wait for the deployment to stabilize and verify health. kubectl rollout status is the minimum. It will tell you if the rollout completed or timed out. But real confidence requires checking pod logs, monitoring error rates, and ideally integrating with your observability stack to automatically abort if error rates spike after a deploy.
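The minimum check described above can be wired into a pipeline step as follows. This is a sketch that assumes kubectl is on the runner's PATH and already configured for the target cluster. The deployment name, namespace, and timeout are placeholders, and the runner is injectable so the logic can be tested without a cluster.

```python
import subprocess


def rollout_status_cmd(deployment, namespace="default", timeout_seconds=120):
    """Build the kubectl command that blocks until the rollout finishes
    or the timeout expires (kubectl exits non-zero on timeout)."""
    return [
        "kubectl",
        "rollout",
        "status",
        f"deployment/{deployment}",
        "--namespace",
        namespace,
        f"--timeout={timeout_seconds}s",
    ]


def wait_for_rollout(deployment, namespace="default", timeout_seconds=120, runner=None):
    """Return True if the rollout completed within the timeout."""
    run = runner or (lambda cmd: subprocess.run(cmd, check=False).returncode)
    return run(rollout_status_cmd(deployment, namespace, timeout_seconds)) == 0
```

A pipeline step would fail the job when wait_for_rollout returns False, which at least turns a stalled rollout into a red build instead of a silent success.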
This is where the architecture of traditional CI/CD tools shows its limits. Jenkins, GitHub Actions, and CircleCI are built around the concept of a pipeline: a linear sequence of steps that runs and completes. Kubernetes deployments are ongoing processes that unfold over time. Testkube approaches this differently by running inside the cluster itself, giving it direct access to pod logs, execution artifacts, and resource metrics for every test run. Instead of the pipeline hoping the deployment worked, you can trigger post-deployment smoke tests from Kubernetes events, schedule them, or kick them off via API, then surface results through a centralized dashboard with full execution history and AI-assisted log analysis.
See the bigger pattern. Why testing tightly coupled to CI/CD becomes a structural problem in Kubernetes environments. Read: The challenges of testing in your CI/CD pipeline →
What this points to
These are not edge cases. They are structural mismatches between how general-purpose CI/CD tools work and how Kubernetes actually operates. The tools assume a push-and-forget deployment model. Kubernetes requires continuous reconciliation, environment-aware testing, and deployment verification that extends well beyond "the manifest was applied."
You can patch each of these problems individually. Add KinD clusters for testing, adopt GitOps for drift prevention, use digests instead of tags, build out quality gates. Many teams do exactly this and end up with a pipeline that is more duct tape than architecture: a Jenkins or GitHub Actions workflow with a dozen custom scripts and plugins holding it together. This is the pipeline sprawl most platform teams know too well.
The alternative is to move your test orchestration into Kubernetes itself, where it has native access to the environment, resources, and observability that external CI tools have to bolt on from the outside.
Key takeaways
- The failures are structural, not edge cases. Traditional CI/CD tools assume a push-and-forget deployment model. Kubernetes requires continuous reconciliation, environment-aware testing, and post-deploy verification that pipelines were not designed to handle.
- Missing environment parity is the single biggest source of false confidence. Tests run with unrestricted resources and mocked dependencies in CI. Production pods run under memory limits, network policies, and RBAC. The gap is where failures hide.
- Image tags lie. Digests do not. Mutable tags can point to different image bits over time, so the same tag can deploy untested code. Use sha256 digests to guarantee the bits you tested are the bits you deploy.
- Kubernetes rollback is partial, not total. kubectl rollout undo only reverts the pod template. ConfigMaps, Secrets, migrations, and other side-effect resources stay applied. Version deployments as a whole bundle (Helm release, Kustomize overlay) to make rollback first-class.
- Patching each problem builds duct tape. Moving testing into the cluster builds architecture. Run tests as native Kubernetes jobs, validate manifests against real cluster state, and verify deployments after they apply. That is the structural fix.
Frequently asked questions
Why do CI/CD pipelines fail with Kubernetes?
Most CI/CD pipelines were designed for a push-test-deploy model that predates Kubernetes. They treat the cluster as a deployment endpoint and stop watching once manifests are applied. Kubernetes requires continuous reconciliation, environment-aware testing, and deployment verification beyond "the manifest was applied." That structural mismatch causes recurring failures across testing, drift, rollbacks, secrets, and observability.
Why do tests pass in CI but fail in Kubernetes?
Tests in CI usually run in an isolated container with mocked dependencies and unrestricted resources. Production pods run under memory limits, network policies, service mesh rules, and RBAC constraints that did not exist in the test environment. A test that passes in CI can fail in the cluster because the runtime environment is different at the layer tests cannot see.
What is manifest drift in Kubernetes?
Manifest drift is when the resources running in your cluster no longer match the YAML in your Git repository. It happens when someone uses kubectl edit for a hotfix, Helm overrides values that never get committed, or admission webhooks mutate resources at deploy time. The next pipeline run can revert the hotfix or apply on top of mutated state in unpredictable ways.
Why are Kubernetes rollbacks unreliable?
kubectl rollout undo only reverts the pod template (container image and configuration). It does not revert ConfigMaps, Secrets, PersistentVolumeClaims, or any other resources that changed with the deployment. If a bad deploy included a schema migration and a ConfigMap change, rolling back the deployment only undoes the image. The migration and config change stay applied.
Should I use image tags or digests in Kubernetes pipelines?
Use digests. Image tags like v1.2.3 are mutable, so the same tag can point to different image bits over time. Referencing an image by its sha256 digest guarantees the exact bits you tested are the exact bits you deploy. Build the image once, promote the same artifact through staging and production, and use digests to ensure nothing changes along the way.
What quality gates should run before Kubernetes deployments?
Four gates catch most failures before production. Manifest validation (kubeval or kubeconform) to catch structural errors. Policy checks (OPA/Gatekeeper, Kyverno) to enforce organizational constraints. Image scanning (Trivy) to flag known vulnerabilities. Post-deploy smoke tests to verify pods start, pass health checks, and serve traffic correctly. Each gate prevents a different category of failure from reaching users.
How do I run tests inside a Kubernetes cluster?
Tools like Testkube run tests as native Kubernetes jobs inside your cluster, executing under the same resource limits, network policies, and service configurations as your production workloads. You can run Postman, Cypress, JMeter, k6, Playwright, Selenium, and other frameworks without writing custom pipeline glue. Triggers include CI events, Kubernetes events, schedules, manual runs, and AI agent workflows.


About Testkube
Testkube is the open testing platform for AI-driven engineering teams. It runs tests directly in your Kubernetes clusters, works with any CI/CD system, and supports every testing tool your team uses. By removing CI/CD bottlenecks, Testkube helps teams ship faster with confidence.




