
Testing in Modern CI/CD Pipelines: The Good, The Bad, and The Ugly

October 23, 2024 · 28:18
Ole Lensmar, CTO, Testkube
Cortney Nickerson, Developer Advocate, Testkube
Transcript

Streamed at Kubernetes Community Day UK 2024

Ole: Let's jump straight in. Thanks for coming. Great to be here. Cortney, myself. You wanna start? Take it away.

Cortney: So today we're going to be talking about testing in CI/CD pipelines: the good, the bad, the ugly. My name is Cortney Nickerson. I know a little bit about good, very little, as you can see in this photo, and a lot about bad and ugly. It was my son's birthday party just yesterday, and I did survive. He just turned six. It was bad and ugly, but also good. So very appropriate to set the tone for today.

I'm a developer advocate at Kubeshop. I also do whatever else they ask me to do. They ask me to do a plethora of different things, and I do my best to do them.

I co-organize the CNCF group in Bilbao. This is not my Spanish accent, by the way. I was born in the US, but I live in northern Spain, and so I organize the CNCF group there in Bilbao.

I also help organize KCD Spain and KCD Islamabad, which is a semi-passion project of mine because we keep increasing the number of women who are participating as speakers. So that's a little bit about me. Oh, and in my spare time, I also do things as an ambassador for Civo, which is very fun and a very cool tool if you haven't tried it out.

And, yeah, what else? There's not much else to say about it.

Ole: Oh, there's a lot more to say, but only good things.

Cortney: Those are all the good things. We'll keep the bad and the ugly for after the talk. And I'll pass it over to Ole.

Ole: Thanks. So I'm, Ole Lensmar.

Great to be here. I'm CTO at Kubeshop and for the Testkube project. Also, if you've used something called SoapUI, I created that exactly twenty years ago now. It's been a while. It's still around. I was also very involved in the Swagger ecosystem. If you're familiar with Swagger, that's something I worked on a lot.

I spend my free time on my bike, playing guitar, or at the cinema. Usually not everything at the same time, but in this GPT-generated picture, that's where it ended up. I also have children, but they're at an age where they'd rather not be in a photo, especially not when I'm involved. So I'll just mention them and move on. Okay.

Audience Poll

So we're actually gonna do a short poll, because it'll help us a little bit later when we dive into things. First of all: which CI/CD solutions do you use? So maybe you're using Jenkins, ArgoCD, GitHub Actions, GitLab, Azure DevOps, Bitbucket, CircleCI. I mean, there's a lot. Tekton, there you go. Awesome.

Cortney: Very cool. Quite a spread.

Ole: It's very cool. I have never seen this before.

Cortney: Wow. Keep them coming. That's great. Even some Bamboo. UrbanCode Deploy. TeamCity. GitHub. So we've got a lot of Git practitioners with us today.

Ole: I guess what we can see here is if people use more than one. Maybe a hand in the air would be great just to see if you're using more than one.

Okay. A lot of you. Okay. That's great.

Thank you. Okay, let's move on, because we have three questions and we don't want the entire presentation to be about the poll, as exciting as it is. Since we're gonna talk about testing, it'd be great to hear what testing tools you use in your CI/CD, or outside it. So that could be k6, Playwright, Cypress, Selenium, JMeter, pytest, Postman, REST Assured, Cucumber, SoapUI.

Cortney: Clearly he's the CTO of the bunch.

Ole: Trivy Kurate. Oh, that's a nice one. Ginkgo, definitely JUnit. Gauss is a new one to me. I'll have to look that one up. Home unit Test.

Cortney: Okay. Five more people typing. That's great. Oh, six.

Ole: Lots of typing. This is actually a typing exercise, this presentation. Okay. So if I move to the next slide, what happens on...

Cortney: You'll cut them off.

Ole: I'll cut... I don't wanna do that.

Cortney: So ten more seconds for people to respond.

Ole: Okay. Nobody's typing. Here we go. And the last one is multiple choice, because it could have been a lot to type. What's the hardest thing when it comes to running tests in CI/CD?

Cortney: Oh, wow. Wow. That was fast.

Ole: So is it hard to define how to run your tests? Okay. Is it scaling? So running, you know, your load test at scale or running a lot of tests at the same time? Is it about troubleshooting? How do you get to logs and reports and artifacts that might be generated in your infrastructure or from your testing tool? Is it reporting and analytics kind of over time, pass fail ratios, that kind of stuff? Or is it just triggering your tests from your pipelines or from your GitOps reconciliation process or whatever else? Or maybe you don't have any struggles at all. There you go. At least some of you are the lucky lucky ones.

Okay. Thirty people answered.

Cortney: Forty percent: defining how to run tests. That's fantastic. Great.

Ole: Okay. Great. Thank you so much.

Cortney: In the right place.

Ole: So, yeah. Okay. Let's move on. So I know I'm gonna be preaching to the choir here, but what is a CI/CD pipeline? I think everyone knows, but I'll say it anyway. CI is about integrating your code changes regularly into your code base and then building artifacts from that. And CD is then deploying those artifacts into your infrastructure.

So I try to separate those two, and I'm sure you've seen some version of this picture before, showing the different steps in the CI/CD process.

The Need for Testing in CI/CD

And obviously testing is super important across all those steps, right? You wanna be doing both functional tests, which could be unit tests, API tests, UI tests, end-to-end tests, anything that tests the functionality of the components you're building or the code you're writing, and also non-functional testing, which is super important: performance testing, load testing, security testing, compliance testing, accessibility testing if you're building something for the public, or, well, for anyone to interact with. Or acceptance testing, which falls a little bit in the middle.

And these are all things that are great to automate as part of those CI/CD pipelines. I will say, because I have a testing background, that doesn't mean you should only do automated testing. You should definitely also invest some time into exploratory or manual testing, meaning get someone in there using your system, because people are much smarter than tools. At least for now. We'll see where we end up. Having someone test what you're building manually, trying to be creative and finding errors, as a complement to automated testing, is a well worth investment. But I'm not gonna go down that path anymore.

Challenges with Automated Testing in CI/CD

So there are a couple of challenges with testing and automated testing in CI/CD. I'm gonna talk a little bit to that, and then we're gonna show a little bit of things as well. One of them is what we see a lot: the proliferation of testing tools and scripts. If you go back twenty years in time, maybe the number of testing tools wasn't that big. There was Selenium for browser UI testing, there was an API testing tool which I've mentioned a couple of times, and there were others.

Now there's a whole bunch of tools, right? There's Cypress, there's Playwright, there's Selenium, there's Robot Framework, which runs on Playwright, for front-end testing. There's a bunch of load testing tools: we have JMeter, we have k6, we have Artillery, we have Gatling. There's a bunch of unit testing frameworks, and there's a bunch of tools in between. And we have this shift left, which some of you might know, where you wanna do testing earlier. We've also had these microservices architectures where everyone does their own thing, maybe even in a polyglot environment where everyone can choose their own language and technology stack. You're gonna end up with a bunch of testing tools.

And how do you make sure that all of those testing tools can work in your CI/CD pipelines? You might even be running multiple versions: one team is using Cypress 10 and the other team is using Cypress 11. How do you make sure that all of that works nicely in a, you know, proliferated environment? Oh, and something just happened which I can't account for, but I'll just continue talking. Wait a sec. Not sure why it did that.

Next one is the proliferation of CI/CD tools. Once again, twenty years ago it was Jenkins, or maybe not even Jenkins yet, but its precursor. There were some others, but now a lot of people have Jenkins, and then they added maybe a GitOps tool like ArgoCD, or they're moving to GitHub Actions or GitLab or any of the other tools that we saw here. So what does that mean for testing? Now suddenly you have five, six, seven, eight testing tools, and two or three different tools running your pipelines. How do you make sure all of those tests are still running in a consistent environment, so you get the same results wherever you're running them?

And then running tests whenever needed. Running tests in your CI/CD pipeline is great as part of your build process, but it's pretty common that you might want to run your tests out of band: I just want to go in and run the test, I don't want to run the entire build. Or I want to run my tests when resources are updated in my Kubernetes cluster, or when some other event happens which is not strictly CI/CD. If you can't do that, if you're binding your test execution to your CI/CD, it usually slows down your entire process, and you don't have the flexibility that you might need.

Running tests at scale, we saw that as one of the answers here. So a lot of people are running tests, that's great. What if you wanna generate massive loads for a k6 test or a JMeter test? Do you go to a cloud provider like BlazeMeter or k6 Cloud, or do you wanna run it inside your infrastructure?

What if you wanna scale out your functional tests? Say you have a thousand Cypress tests. Running those in sequence can take hours. You might wanna parallelize those tests to cut down your test execution time, so it's realistic to run all those tests at the same time.

And then once again, this multiplies if you take on the previous bullets I had. Obviously not a problem for a startup or an early project, but, you know, hopefully that startup will eventually be a huge company with hundreds of engineers and maybe even a QA team. And as they've evolved, they've adopted different tools, and they still have legacy tools in house.

Test troubleshooting and test result, which was supposed to say test result management. Once again, going back to earlier: let's say your tests are failing. How do you troubleshoot that? Let's say your test touches a microservices architecture and you have ten microservices involved. You're gonna have to collect logs from all of those, maybe, to see: can I find some stack traces?

Can I find some errors? You're gonna want the errors and logs from your testing tool. Maybe it recorded a video of what it did, or a trace, etcetera. How do you pull all of that together and make it easy for the tester or the developer to access, without giving them access to your infrastructure from a security point of view?

So there's a bunch of complexities there. And the last one: giving controlled access to QA. We see this also. The DevOps team doesn't always want to let QA into the Jenkins environment or into the GitHub Actions environment.

So the QA team says, hey, can you change the arguments for the Playwright test, can you do this and that, and that goes into a Jira ticket, and then, you know, a week later that gets updated. Which is obviously a barrier to having a fluid and more agile process.

Lots of good stuff there. So, Cortney?

Cortney: Yeah.

Traditional Test Orchestration with Scripts

Ole: I know what I talked about was maybe a little bit abstract and high level, so I'll hand it over to you for a little more down-to-earth description of these challenges.

Cortney: Yeah. Let's take a look at what these challenges actually look like in a pipeline or workflow that is a bit more familiar and not quite as abstract. A lot of people start all of this with the CI/CD build system. That's where you get started and where all of this action actually takes place, but these systems aren't actually built to run tests. They're built to do totally different things: deliver software, do continuous integration. And testing, while it is a component, isn't exactly what these tools are built for.

So we're already starting off with a block that isn't ideal. We start there, and then we trigger into the associated scripting systems where scripts are usually stored, in a lot of your cases, as we saw in the survey, in GitHub. That triggers GitHub Actions or Jenkins pipelines, which then go ahead and trigger your actual tests, which then give you some sort of output with artifacts and other results that you need access to in order to troubleshoot. But none of these things are actually in the same place, because there are different places for each type of test you've run, depending on how you've got that set up. And so this becomes a bit complex.

Let's take a look at all of the different places. This is just starting off with really basic stuff: where things can go wrong, become complicated, or start taking up time, because obviously this is not an ideal system. You've got a DevOps team that's going to be called upon by a lot of different people to maintain all of this and keep it up and running. That now becomes a human issue.

We all know those are the most difficult to manage. And that's cutting back the amount of time that people have to be productive, because they're managing all of these different things. Then you also have the fact that triggering these tests is bound to your CI/CD system, and a lot of times you don't need to trigger tests and a build; you should be able to trigger just the tests. Depending on your situation and what it is that you're testing, it's counterintuitive to have to trigger from a build process, so that adds another layer of complexity.

Then you have manual scripting. I'm sure for anybody who's run tests, manual scripting is your favorite thing next to YAML. It's where you love to spend your time, right? But obviously in this type of setup, you're now doing a lot of manual scripting. People's ability to do that also varies, and that leads to a lot of different problems with human error.

And then once you get past that, you're into scaling, because this is a Kubernetes event, and everybody here is interested in being able to scale their projects. Whether it's load testing or anything else, doing that at scale is already quite complicated to orchestrate. When you already have all these other things you're dealing with just to get started, scaling becomes a bit daunting in and of itself.

Then versioning, everybody wants to be Git-oriented and know what's going on and keep track of things, but how do you do that when you've got three, four, five, maybe six different tools running with all of their various components?

And then we're moving into security issues. Your artifacts are in different places, your security team doesn't want them in different places, and now you're having to deal with your security team, and nobody wants to talk to those people. At least I don't.

So you've got all of these different things that are happening, and then on top of it you have to report. How are you doing in your quality assurance? Well, it depends how you're going to pull those reports together, because apparently your Cypress report is in one place, another report is over here, and your JMeter report is somewhere else, and you have to find some way to pull all of those things together. This is definitely something that becomes quite difficult to do, and on top of it you're spending a lot of time trying to sort out all of these different things. And all of it starts with the fact that your CI/CD pipeline is not ideal: it hasn't been built for testing. It's built for something else, and we're just substituting in order to get this done.

Homegrown Test Execution Machines

So this is kind of what everybody's test execution machines look like. It's a hodgepodge of different things, all held together by a thread, and you're hoping they don't all fall apart. You can see all of the different components there, and how do you possibly orchestrate anything to scale out your tests and get all of that to your cluster? Because at the end of the day, in order to ship your software, in order to ship anything that you're building, you're dependent on how quickly you can get your tests executed, get results from those, and troubleshoot them.

And if it's not dependent on that, then basically you've decided that quality isn't really important. We all know we probably aren't running as many tests as we'd like, but we also all know that a huge part of that is because all of this is very difficult to manage and keep up.

So there are some people who are working on different things. There is some good that can come out at the end of all this bad and ugly.

And Ole is going to give a demo. So we've got Testkube. Testkube is a free and open source project. It is on the CNCF landscape. I promise it's there. I know that landscape is quite vast, but it is there; you can find it. Now that they have a search bar, you can find it even quicker, under app deployment. It does have a commercial component, but you can get started with the free and open source version of it. It is compatible with any testing tool. This is very important: we're all running a lot of different tests, so it's compatible with any testing tool. Tests are added to your cluster literally as Kubernetes components. They are all CRDs.

So yes, we did say nobody loves YAML and nobody loves scripting, but at least YAML is much easier to read. And when you have all of your scripts in one place, all of your tests running in one place, at least you're managing them all at the same time. That's the other big advantage of Testkube: it gives you a single pane of glass across all of these different things and components, because it has actually been built for testing. So you can trigger things from your CI/CD, but you're going to have all of it in one very usable UI. And we can go ahead and take a look at what that looks like.

Ole: Great. Thank you. So

Cortney: Back to you.

Testkube Demonstration

Ole: Yes, I'm going to do a short demo of Testkube, but obviously a lot of what we're talking about has been leading up to this. These are the problems we've been seeing. So once again, Testkube tries to solve a problem that relates to the proliferation of testing tools, CI/CD solutions, and just more complex architectures and workflows related to testing. It's maybe not for the project that just uses Postman tests in GitHub Actions; it's more for the project that does much more complex things. So I'm just going to show you a little bit of what it looks like. This is the Testkube dashboard.

And Testkube works with environments. These are different environments where the Testkube agent, which runs the actual tests inside your clusters, is installed. I have one running in kind on my local machine, but there are others. This other one here (oops, can we please click here?) is running in Google Cloud.

So you can see this has different configurations. Testkube works with the concept of workflows, and a workflow is YAML; I won't hide that. It's a YAML definition, stored as a CRD inside your cluster, which defines how to run your tests. And workflows are an abstraction above any testing tool. As you can see here, there's a bunch of different workflows: for pytest, for Postman, for whatever else, k6, JMeter, Gradle, Artillery, etcetera, etcetera, just to show you how that can be done. I'm gonna jump into one of the workflows I had in my environment.
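To make that concrete, here's a minimal sketch of what such a workflow definition can look like. The field names follow the TestWorkflow CRD as I understand it, but the exact schema varies across Testkube versions, and the repository URL, paths, and image tag are placeholders:

```yaml
# Hypothetical minimal TestWorkflow: runs a k6 script pulled from Git.
# Exact fields may differ across Testkube versions.
apiVersion: testworkflows.testkube.io/v1
kind: TestWorkflow
metadata:
  name: k6-smoke-test
  namespace: testkube
spec:
  content:
    git:
      uri: https://github.com/example-org/example-repo   # placeholder repo
      paths:
        - tests/k6/smoke.js
  steps:
    - name: run-k6
      run:
        image: grafana/k6:latest
        args: ["run", "/data/repo/tests/k6/smoke.js"]
```

Because the workflow lives in the cluster as a CRD, it can be applied with a plain `kubectl apply -f` and versioned in Git like any other manifest.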

There's a Playwright execution. One of the nice things with Testkube is that it also does the parallelization of tests for you. So this is a parallel Playwright test. Let's just run that right away. This is now running on my local kind cluster. You'll see that it's running these different steps, parallelizing the Playwright tests across two nodes, and at the end we're gonna get some results when this is done. We can have a look at those. There we go.

So first of all, we can see the logs from all of the individual steps that were done as part of that workflow. We can look at the two workers that we had and we can see the commands that were run on each worker. We can also look at the artifacts generated by Playwright. In this case we have a Playwright report. So this is generated by Playwright itself.

And this actually contains traces and all that kind of stuff. So Testkube grabs all the reports for you from any testing tool that you might be using. We can look at JMeter reports and others later on. It'll also show you a flowchart of how your test ran. You might wanna parallelize your test across many, many nodes. We can do other examples later on.
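As a sketch of how that fan-out can be expressed, a workflow step can declare a parallel block. The `parallel`, `count`, and templating expressions below reflect my understanding of the TestWorkflow schema and should be checked against the Testkube docs for your version:

```yaml
# Hypothetical parallel step: shards a Playwright suite across two workers.
steps:
  - name: playwright-sharded
    parallel:
      count: 2          # number of workers; could be 10 on a larger cluster
      run:
        image: mcr.microsoft.com/playwright:latest
        shell: npx playwright test --shard={{ index + 1 }}/{{ count }}
```

The point of pushing the sharding into the workflow definition is that the CI pipeline no longer needs to know how many workers exist; it just triggers one workflow.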

I do wanna show what the YAML looks like. It looks like YAML. Surprise. Yeah, it's YAML. These are CRDs stored in your cluster, so you can use ArgoCD or Flux, or any GitOps tool of course, to synchronize them into your cluster and then run them based on any kind of event that you want. Which takes us to triggering.
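Since the workflows are ordinary Kubernetes manifests, a standard ArgoCD Application can keep a directory of them in sync from Git. This is a plain ArgoCD sketch rather than anything Testkube-specific; the repo URL, path, and namespace are placeholders:

```yaml
# Hypothetical ArgoCD Application syncing a directory of TestWorkflow CRDs.
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: test-workflows
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://github.com/example-org/test-workflows   # placeholder repo
    targetRevision: main
    path: workflows          # directory containing the workflow manifests
  destination:
    server: https://kubernetes.default.svc
    namespace: testkube
  syncPolicy:
    automated:
      prune: true            # remove workflows deleted from Git
      selfHeal: true         # revert manual edits in the cluster
```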

So these workflows can of course be triggered from your CI/CD solutions, and we provide inline code that you can just copy-paste for GitHub Actions, GitLab, Jenkins, even for ArgoCD or Argo Rollouts. If you're doing progressive delivery, you can use these tests as an analysis template to help you with both blue-green and canary deployments, to validate that your deployments are working. So this decouples the test execution from CI/CD; you can run it from any of these tools.
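As an illustration, a pipeline step that kicks off a workflow via the Testkube CLI might look roughly like this GitHub Actions job. The `kubeshop/setup-testkube` action and the `testkube run testworkflow` command reflect my understanding of the tooling; the workflow name and secret names are placeholders:

```yaml
# Hypothetical GitHub Actions job triggering a Testkube workflow.
jobs:
  run-tests:
    runs-on: ubuntu-latest
    steps:
      - uses: kubeshop/setup-testkube@v1   # installs and configures the Testkube CLI
        with:
          organization: ${{ secrets.TESTKUBE_ORG }}    # placeholder secrets
          environment: ${{ secrets.TESTKUBE_ENV }}
          token: ${{ secrets.TESTKUBE_API_TOKEN }}
      - name: Run workflow and wait for the result
        run: testkube run testworkflow k6-smoke-test --watch
```

The same CLI call can be dropped into a Jenkins stage or a GitLab job, which is how the execution stays decoupled from any one CI system.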

There's an API, there's a CLI, and there's a bunch of other things. You can also connect it to triggers, a lower-level mechanism that listens to events in Kubernetes itself. So let's say you wanna run a test when a config map is updated, because that config map reconfigures your ingress controller, and you wanna make sure that when that gets reconfigured, your tests continue passing.

Or if you have a deployment that gets updated and you wanna make sure that you're running your end-to-end tests every time that deployment gets updated, regardless of whether it's updated manually through kubectl, through ArgoCD, or through some other pipeline.
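That deployment scenario can be sketched as a Testkube TestTrigger resource. The fields below reflect my understanding of the TestTrigger CRD and may differ by version; the deployment and workflow names are placeholders:

```yaml
# Hypothetical TestTrigger: run the e2e workflow whenever the app deployment changes.
apiVersion: tests.testkube.io/v1
kind: TestTrigger
metadata:
  name: run-e2e-on-deploy
  namespace: testkube
spec:
  resource: deployment
  resourceSelector:
    name: my-app            # placeholder deployment name
    namespace: default
  event: modified           # fires on updates from kubectl, ArgoCD, or pipelines alike
  action: run
  execution: testworkflow
  testSelector:
    name: e2e-tests         # placeholder workflow name
```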

You can hook into the triggering mechanism in Kubernetes itself. Webhooks: I don't have a good example here, but obviously you can connect this to Slack, to PagerDuty, etcetera, however you might want to notify on different things.
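Notifications like the Slack example can be wired up with a webhook resource. This is a sketch based on my reading of the Testkube webhook CRD; the URI is a placeholder and the event name may differ by version:

```yaml
# Hypothetical notification webhook: POST to Slack when a workflow fails.
apiVersion: executor.testkube.io/v1
kind: Webhook
metadata:
  name: notify-slack-on-failure
  namespace: testkube
spec:
  uri: https://hooks.slack.com/services/PLACEHOLDER   # placeholder incoming-webhook URL
  events:
    - end-testworkflow-failed   # event name may vary across versions
```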

It also has support for CDEvents, which is a CD Foundation standard that helps you integrate. Someone here was using Tekton, which is also a CDF project, and for Tekton specifically I think we do have CI/CD integrations; there's an example here of how to do that. So it's super easy to integrate into whatever you might be using to help you run your tests.

Lastly, I don't want to go too deep in the demo because we're also a little bit short on time.

We've talked about reporting. Testkube will give you insights into which tests you run and which tests take the longest, which could be interesting from a DevOps perspective: how do I find those tests that ran for twenty, thirty minutes? Maybe I should look into those and allocate more resources.

And then of course more traditional test reporting, where you can look at pass/fail ratios, that kind of stuff, to figure out the higher-level quality of all of your testing efforts. And this is of course across any of the tests that you run, right? So for any of the tests that we saw here. We have a pytest here, for example, a pretty simple test that was run. The nice thing here is that we capture the generated JUnit reports, so you can even see those directly from inside here, similar to what you can do in Jenkins and other tools. So we are being inspired by what's already out there, of course.

Lastly, I'll show you a JMeter test. JMeter is a load testing tool, as you probably know. We can look at the latest execution of this one; we've run this quite often, as you can see. And we do have, oh yes, we do have a report here. So this is the report generated, once again, by JMeter, and once again it's available in one place. We don't have to give access to any systems other than Testkube itself to work with results and do all that kind of stuff that we've talked about earlier. Okay.

And I'm actually gonna stop here. There's a lot more features, but that's the high-level rundown of what Testkube does. Once again, going back to what we talked about in the beginning, these are the problems we're trying to solve: the complexity and scale in your testing efforts, with the proliferation of tooling and processes.

Cortney.

Cortney: Yes.

Ole: How would you like to conclude?

Cortney: Yeah. I think the other big thing, because there was so much in our poll about defining tests: there's been a lot of work done within Testkube to make the definition of your tests much easier and more straightforward, especially when you're just getting started or you're not really sure exactly how you want to go about testing. You can build off of a wizard to set up your test and then kind of grow it from there.

That's a great place to start when you don't yet know how you want to test something, or when you've got some tests but you think you need to go a step further with them. It's a much easier, more intuitive way to see what's going on and read through what that test is actually doing.

Other than that, there's a lot of ways to get in touch with the Testkube team, or with Ole. He's a plethora of information. He's been testing for how many years now, Ole?

Ole: Oh, twenty-five. There you go. A long time. I hadn't asked that question. I don't know.

Cortney: And he's still going. So obviously, there's a lot that still needs to be done in the testing space.

Oh, here's a great example of one of the wizards that makes defining your tests much easier.

Ole: Yeah. This is a k6 test. Let's see if it shows up. Oh, it didn't, because I already had that one. Sorry. Okay. Oops. Where'd it go? Oh, here it is. So this is a k6 load test. I created that test earlier; I didn't actually show how to create a test. It's running across three workers here, but I could have set that number to ten, although my local kind cluster might have struggled a little bit.

Once again, as you can see, it was super easy, and I can get the individual worker outputs. Unfortunately, k6 in itself isn't great at reporting, but you do get a little bit of output here. There are nice alternative solutions you can look at for k6 specifically. Okay. I'll stop there.

Cortney: There we go. Thank you.