I recently joined Alan Shimel, editor-in-chief of DevOps.com, for a chat about what it means to be a Kubernetes-native security platform and why we believe it’s the most effective way to secure containers and Kubernetes. You can watch our conversation in the video below, or you can read through the transcript of our talk that follows, condensed and modified for clarity.
Alan: Hey everyone, it’s Alan Shimel, editor-in-chief of DevOps.com and Security Boulevard and we’re coming at you today on our Digital Anarchist platform. Happy to have sort of a surprise video interview with a friend of mine. I’d like to introduce you all to Ali Golshan, CTO of StackRox.
So Ali, we’re going to spend the next 20–30 minutes talking about Kubernetes, containers, and security, but before we do, I just wanted to get one thing out of the way: maybe there are some people in the audience who are not familiar with StackRox, so let’s take a minute or two and kind of put that to rest. Who is StackRox? What do you guys do? And we’ll go from there.
Ali: Sure. StackRox was founded in January 2015. Our main mission has been to build safety for developers, and security tooling for security engineers and operators with a specific focus on Kubernetes-native controls, securing containers, and microservices on top of it. So, rather than looking at the world through their lowest common denominator which is containers on top of any orchestrator, we decided to double down and take a look at it and say, “well we think orchestrators are becoming the operating systems of cloud for the future, which one of these do we think is going to be the dominant one?” So we bet on Kubernetes. We look at ourselves as a Kubernetes-native security platform for cloud-native solutions.
Alan: Well that was a good bet in hindsight right, and this is why I’m much better betting in hindsight than making bets beforehand, but you picked the right one. But seriously, when we look at Kubernetes, you know, I just had this discussion earlier today with someone. They asked me what is Kubernetes and why is it important? And, I tried to explain to them, you know VMware spent about five billion dollars last week on some acquisitions and so much of what they’ve spent and so much of the VMworld buzz coming out has been towards DevOps and Kubernetes. What powered the cloud for the last 15 years?
It was the hypervisor, and how quick, like the snap of a finger, sort of a Thanos snap, from the Avengers, that changed the course of history and now it’s not the hypervisor that is dominant, it’s the container, and the container orchestrator, that has become the dominant infrastructure in the cloud.
This sounds like a Marvel movie, but it’s true. Have you guys amassed any findings on just how prevalent Kubernetes has become or any kind of the growth around it?
Ali: Yeah, so I think a good way to think about it is maybe taking a step back. You mentioned virtualization was really what triggered this entire cloud movement. If you look at a few years ago (2013 – 2014), the majority of the drive around containerization, virtualization, and cloud was still very much focused around monolithic applications. What I mean by that is that you either wanted to run that monolith on-prem, run in a virtualized environment so you get some additional output out of your system, or they all float out to the public cloud. What has really driven this now, has been this push towards microservices.
So, Kubernetes’ emergence has really been due to the need for building distributed systems that have high availability and performance and are highly scalable and automated. This was a big realization and is separate from the move to containerization.
What I mean by that is, if you look at four or five years ago, developers were starting to use containers. And, I’m a big believer that everything in user space will eventually run in some isolated format, so containers happen to be the current form factor. But containers are a decision that developers, DevOps, or engineers generally made as a tool to use.
If you think about Kubernetes, Kubernetes is essentially a business decision. You make it as part of your business or digital transformation because it’s a huge bet you’re making. The reason for this is that it affects your developers, your Ops, all the way through your security folks. So, what we are seeing is really twofold.
We’re seeing a large class of companies that built on top of other types of orchestrators because of their scale requirements, Mesos for example, now migrating towards Kube. We’re seeing that any new company that is either building SaaS or building SaaS services for its customers is building on top of Kube. And the other thing we’re seeing is that a lot of customers use Kubernetes as part of their management for their edge-based computing and their distributed computing.
So, what I will say is, the adoption of Kubernetes is as high velocity as containers, but you’re now starting to see the day-to-day impact of it because people initially built containers and then once that got to a certain density level, they looked at it and said “holy crap how do we actually manage, orchestrate, and move all these things around?”
And, that’s where Kubernetes came a little bit later — a couple of years after the initial containerization movement.
Alan: Absolutely, you know you said a lot of things in there. I’d love to pick a few that we can go deep.
So, first of all, let’s put Kubernetes to the side for a moment and talk about containers and microservices. Containers have been around in the Linux world for as long as Linux has been around, virtually. No pun intended.
But it wasn’t until maybe Docker, in the last five – six years, that ushered in this container revolution. But I think part and parcel was that containers were really a great architecture for moving from monolithic to microservices-based applications, right?
But, just like everything else I’ve seen in technology, what we saw here was we ran ahead and did this stuff because there is a good reason for it and it brings some real benefits, and then all of a sudden someone said “gee, what about security?” and “oh yeah, there’s that security thing again.”
We saw it with the cloud. We’ve seen it at every step of the way from decentralized to centralized client-server, endpoints, and mobile. And so, with containers, the first response was “well we’re going to take our security that we already have and we’re just going to containerize it.”
And it really wasn’t that good. I’m trying to be kind, but it stunk. It was the wrong tool for the job, and as anyone will tell you, it’s all about the tools. And so, we saw a tremendous mismatch. I was dismayed where the conversation around container security seemed to start and end with doing vulnerability scanning, which you know is what I was doing in 2003.
You’re going to tell me we haven’t advanced the art since then and this is a different architecture? There’s more to be done here than scanning for vulnerabilities.
Kubernetes gave us some meat in which to do security with. It gave us some means; it gave us some infrastructure. You know much more than I. Tell us about it. What happened here?
Ali: Yeah that’s a great point. The way I think about it, is in the context of crawl, walk, run. When you’re talking about initially people using containers and using a lot of traditional tooling and security: that was the crawl stage where a lot of customers were experimenting with containers. What does Docker do? What does CRI-O do? What are all these container technologies enabling my developers to do?
I think what ended up happening is — especially if you kind of rewind five years ago — those containers were running with limited capacity. Even if they were running in production, they were very much nested in your traditional infrastructure. So, you still had firewalls and WAFs, IPS, IDS, and EDRs, and because of it, the amount of impact on your enterprise was relatively low because you weren’t running mission-critical applications in there.
And that presented two things. One, naturally people didn’t want to go invest specifically for some low-level risk thing. Second (part of which is quite frankly my issue with the larger security industry), as soon as there’s an emerging technology in security, the majority of security vendors just look at it as a marketing problem: “Oh yeah, of course we cover that,” or “Oh of course, our products can secure that,” which in the case of containers, and specifically moving forward, Kubernetes and microservices, is untrue.
You need an entirely different architecture because the way applications now run is immutable, ephemeral, and distributed, and the introduction of the entire DevOps lifecycle of CI/CD impacts that as well.
So, that’s kind of where I feel like we moved from crawl to walk, where companies realized, “OK initially we had some protection around these, and now we realize that traditional vendors don’t solve this. So, what is the lowest hanging fruit we have to go solve?”
Earlier, you touched on vulnerability scanning. I’m a huge advocate because I think understanding what vulnerabilities you’re introducing into your environment, doing proper image scanning so you understand packages, dependencies, and licenses, is all very important.
Now, the problem here again was that traditional models of vulnerability scanning (either runtime scanning or traditional static scanning) did not apply to this DevOps cycle. So, with containers came the acceleration of CI/CD tooling and the DevOps workflow. Being able to have vulnerability scanners that integrate into your CI/CD and registry, scan images, and fail builds shifted security further left.
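To make the "scan images and fail builds" idea concrete, here is a minimal sketch of a CI gate. It is only an illustration: the report format, the severity names, and the function names are assumptions made for this example, not the output of any particular scanner or the way StackRox implements it.

```python
# Hypothetical severity ordering; real scanners define their own scales.
SEVERITY_RANK = {"LOW": 1, "MEDIUM": 2, "HIGH": 3, "CRITICAL": 4}


def blocking_findings(findings, threshold="HIGH"):
    """Return the findings whose severity is at or above the threshold."""
    limit = SEVERITY_RANK[threshold]
    return [f for f in findings if SEVERITY_RANK[f["severity"]] >= limit]


def gate_exit_code(findings, threshold="HIGH"):
    """Exit code a CI step could return: 0 passes the build, 1 fails it."""
    return 1 if blocking_findings(findings, threshold) else 0


# Made-up scan report for an image; the IDs are placeholders, not real CVEs.
report = [
    {"id": "EXAMPLE-0001", "package": "libfoo", "severity": "LOW"},
    {"id": "EXAMPLE-0002", "package": "libbar", "severity": "CRITICAL"},
]

for finding in blocking_findings(report):
    print(f"blocking: {finding['id']} in {finding['package']} ({finding['severity']})")
print("exit code:", gate_exit_code(report))
```

In a pipeline, the step would pass the result of `gate_exit_code` to `sys.exit`, so a non-zero value stops the image from being promoted.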
It was a really great starting point, and a couple of companies have done really well around that, and that was what I call the “walk stage.”
And, now here we are where we’re entering that run stage over the next year.
Companies realize you can’t use traditional security. It’s not just about vulnerability scanning; the tooling you need has to be native to the infrastructure used. It has to be native to Kubernetes or other form factors you use. And it has to be full lifecycle. It can’t be runtime alone, and it can’t just be build. It has to integrate into your build; it has to integrate into your deployment; and it has to do things at runtime. And it’s that consistency, where the common language being used is Kubernetes, that has brought us to this stage.
I think that’s kind of the progression we are seeing. Companies went from experiment to run, and are now truly operationalizing, and that’s where Kube, containers, and the new security form factors are coming in.
Alan: I agree with you. What’s interesting too, though, is that in a perfect world, companies opt for fewer tools, not more tools.
When we look at something like security for microservices, containers, and Kubernetes, people don’t want one security tool for pre-deployment, another security tool at the time of deployment, and yet another security tool post deployment. And if they do, those three tools better have a pretty damn tight integration. I don’t want three different interfaces, three different languages, and three different paradigms.
I think it’s important that this next generation of tools, let’s call them Kubernetes-native, for lack of a better word right now, understand the different phases along that software development lifecycle and either can handle the whole lifecycle, or are very, very tightly integrated with tools that go along this.
Where’s StackRox in that?
Ali: Yeah, great question. So, the way I think about this is that traditionally the model was for developers to build it, and hand it over to Ops. Ops would then operationalize it, deploy it, and then security becomes a bunch of gatekeepers.
In that traditional world, every company wanted to become a horizontal platform — a thing you plug into from end-to-end. And integration was very tough; API driven models were not very common. Now, with the rise of APIs and RPC, that’s hugely advanced.
The way we think about it — and we actually tell our customers this — is you should think about Kubernetes as that horizontal platform that you should force all other vendors to plug into. So, nobody should be your workflow. Everybody should build their workflow as part of your Kube, your CI/CD, your DevOps, and your SDLC lifecycle.
We take a very similar approach. When we think about safety across the entire lifecycle of build, deploy, run, we think about different use cases, not different features at every stage. And I think that’s a very important part of it, because Kubernetes is about operationalization at scale, which requires automation.
What that means is that at the build process, we want to make sure we do image scanning and understand vulnerabilities. But if somebody has their own vulnerability scanner, we plug into the ones they have; we don’t have to provide our own.
At the deployment stage, we run checks for CIS benchmarks for Docker and Kubernetes, NIST, PCI, and HIPAA. At runtime, we do configuration management, networking, firewalling, and detection. But the key thing is the output of all this information is consumable by Kube itself, the way we produce it. That’s how we think about it.
When we want to make sure something happens at build if there are vulnerabilities, we fail the build so it’s not introduced into production. At deployment time, we use the Kube-native constructs, like admission controllers or scale-to-zero, if an image is not meant to go into deployment or if it’s violating certain policies.
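As a rough illustration of the admission-controller approach, the sketch below shows the decision logic a validating admission webhook might run. The `AdmissionReview` request and response shape follows the Kubernetes `admission.k8s.io/v1` API; the policy itself (a trusted-registry allow-list and a ban on untagged or `latest` images), the registry name, and the function names are assumptions made for this example. A real webhook would also need an HTTPS server and a `ValidatingWebhookConfiguration`, which are omitted here.

```python
# Hypothetical allow-list of registries that images may be pulled from.
TRUSTED_REGISTRIES = ("registry.example.com/",)


def image_violations(pod):
    """List policy violations for a Pod manifest given as a plain dict."""
    problems = []
    for container in pod.get("spec", {}).get("containers", []):
        image = container.get("image", "")
        name = container.get("name", "<unnamed>")
        if not image.startswith(TRUSTED_REGISTRIES):
            problems.append(f"{name}: image {image} is not from a trusted registry")
        # Only the last path segment can carry a tag (registries may have ports).
        last_segment = image.rsplit("/", 1)[-1]
        if ":" not in last_segment or last_segment.endswith(":latest"):
            problems.append(f"{name}: image {image} uses a mutable or missing tag")
    return problems


def review(admission_review):
    """Build the AdmissionReview response for an incoming AdmissionReview request."""
    request = admission_review["request"]
    problems = image_violations(request["object"])
    response = {"uid": request["uid"], "allowed": not problems}
    if problems:
        response["status"] = {"code": 403, "message": "; ".join(problems)}
    return {
        "apiVersion": "admission.k8s.io/v1",
        "kind": "AdmissionReview",
        "response": response,
    }
```

Because the decision is returned to the API server in its own format, the rejection shows up to the user as an ordinary Kubernetes error rather than in a separate security console.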
At runtime, rather than injecting ourselves in as a proxy or inline between the runtime engine and a host, which causes a lot of operational friction and makes you part of the critical path, we use things like pod egress/ingress policies to program layer 3, Istio and service mesh for layer 7, and we use the notion of killing pods and containers.
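The pod ingress/egress policies mentioned here correspond to the Kubernetes `NetworkPolicy` resource. The sketch below builds one as a plain Python dict and prints it as JSON, which `kubectl apply -f` accepts alongside YAML. The policy name, namespace, labels, and port are made up for the example, and this is not a description of how StackRox generates its policies.

```python
import json


def allow_ingress_policy(name, namespace, app, from_app, port):
    """Build a NetworkPolicy that lets only pods labelled `from_app` reach `app` on one TCP port."""
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "NetworkPolicy",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            # Pods this policy applies to.
            "podSelector": {"matchLabels": {"app": app}},
            # Only ingress is restricted; egress is left untouched.
            "policyTypes": ["Ingress"],
            "ingress": [
                {
                    "from": [{"podSelector": {"matchLabels": {"app": from_app}}}],
                    "ports": [{"protocol": "TCP", "port": port}],
                }
            ],
        },
    }


# Hypothetical workload names, used only for illustration.
policy = allow_ingress_policy("backend-from-frontend", "shop", "backend", "frontend", 8080)
print(json.dumps(policy, indent=2))
```

Once a pod is selected by a policy like this, ingress that is not explicitly allowed is dropped, provided the cluster's network plugin enforces `NetworkPolicy`. The rule lives in the cluster itself, so it keeps working even if the tool that generated it is removed.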
It’s one thing to integrate into Kubernetes, but in my view, being Kubernetes-native is more than just integration; it’s building your workflow in the same way the workflows in Kube and CI/CD work.
Alan: You’re 100% correct, and that’s really cloud-native versus just integrating into Kubernetes.
So, that brings up the whole cloud-native thing. So, for instance, you spoke about service mesh and Istio. We’ll hold that up as an example. When I think of the whole Cloud Native Computing Foundation, Kubernetes gets an inordinate amount of the press around it. But it’s kind of like an American naval fleet, where the aircraft carrier is the flagship. But there’s more to a fleet than an aircraft carrier. There are battleships, destroyers, cruisers, frigates, submarines, support ships, and amphibious landing ships.
We see all of these in the cloud-native world. Some of them directly support the Kubernetes mission; some of them exist on their own, independent of Kubernetes.
I got to imagine for StackRox here, you guys are saying, “wow you know it’s good because we have a blueprint going forward but my goodness we’ve got to keep our tails in this and keep moving as fast as we can to keep up with this.” Talk about some of these challenges.
Ali: Yeah. I think when you talk about the Cloud Native Computing Foundation, there is a lot more to it than Kube. You’re talking about things like Prometheus, Envoy, CoreDNS, containerd, and a lot of other tooling. The best way to think about this is that this ecosystem is meant to accelerate cloud-native adoption and tooling, from building to monitoring to servicing to managing across the board.
Each one is important from the standpoint of what is the use case and what is the value you are trying to add to your business? The reality of it is you may not need a lot of this or some of it may not be necessarily relevant for your business.
So, the way we look at it at StackRox is that we look at our long-term goal of securing distributed systems, whether they’re running containers or serverless, on-prem, multi-cloud, public cloud, it doesn’t matter.
However, our focus is Kubernetes, and a Kubernetes service mesh, because we want to secure this new cloud operating system. I’m a big believer that in enterprise there’s no such thing as being a visionary. There’s this understood set of problems, and if you solve those problems the right way, the customers will actually pay you for it, and then you can scale that, and they’ll ask you to solve additional problems.
You can’t suddenly come up with something that’s really interesting, that solves no particular problem. So, under that context, our general trajectory is relatively clear. We know what we want to do; we want to make the lives of developers easier; we want to make sure they use their native tools so if we alert on things, they consume it through things like Slack, Jira, or PagerDuty.
When we write policies, we’re writing them using Open Policy Agent (OPA), or using YAML that is native to Kubernetes. The biggest thing we talk about is avoiding vendor lock-in because if we design things that force a proprietary model, those proprietary components are the things that come in and end up being a dead end for you as a customer.
If we want to operationalize developers, give security folks the right tools, and scale this using the existing constructs, what are the use cases we want to center around? That naturally positions us as saying “okay we want to understand visibility, compliance, configuration management, detection, response, networking.”
If you break each one of those categories down, you can look at the CNCF and at the tools that contribute to each one of those use cases. Our approach, then, is to not reinvent the wheel. We don’t want to build layer 3 segmentation controls because they exist in Kubernetes network policies. We don’t want to create, for example, killing system calls that are unnatural, when we can run as a DaemonSet and kill pods using native Kubernetes capabilities.
We add value where we see there’s a gap, and we leverage the existing infrastructure and tools where they are available. We look at our company as a long-term build. If we know Kubernetes.io is releasing a feature set for networking, or Istio is coming out with something in nine months, rather than going and building that and trying to commercialize that in a shorter term, we wait for that and help operationalize that the right way, the safe way, and the secure way, once it’s introduced.
That way, we can also tell our customers, “listen, our interests are aligned. If you don’t see value in the product you can rip us out. And, all that logic, all those rules and heuristics, are written into your infrastructure, into Kube. There’s nothing proprietary, there’s no lock-in, so we are partnered to make sure we both get value out of this.”
Alan: That’s fantastic. That’s being a good community member.
I had this conversation yesterday with a c-level person at a large company that was just acquired actually, and you know, even in open source — which all this cloud-native stuff is — we’ve moved from let’s call them the “Richard Stallman years” where it was kind of anarchy and “the heck with big brother,” to a period where big brother ran open source. Every open source project had a benefactor manager who contributed 99% of the code and was exploiting it for commercial gain.
Now we’ve moved to the age of the foundation and it really represents democratization, if you will. It’s spread out among a larger group of vendors, individuals, and users. And, just like in any democracy, if you follow the norms of society within the rules, you can not only flourish, but you help make it a better world, and it’s a better foundation.
I’d be remiss not to mention that I’m pretty involved with the CDF, the Continuous Delivery Foundation, which is a sister organization to CNCF, and it’s a similar thing there. And by the way, I don’t know how involved or up to speed you are on CDF, Ali, but there are security challenges there too. They’re not necessarily Kubernetes specific, though more and more we’re seeing Kube dominate there as well, in the CI/CD tools.
We’re going to see the same sort of thing there. What do you see from StackRox’s point of view for that?
Ali: This is an area where we’re really interested and eager. If you think about our product offering, more than half of the features, values, and offerings are about integrations into the CI/CD and the build-deployment stage. And that’s really interesting for a couple of reasons, which I’ll touch on, as well as why I think CDF is very important.
First of all, I think security has always struggled to do things like enforcement and blocking. The reason is because security always lived in a probabilistic world. For example, I have this application running but I don’t have any context of who built it, when they built it, how they built it, and all the asset and inventory information needed. I’m running it in this environment; these are the users that have access to it; this user is the highest risk user; this user now touches these services. With every data point, you incrementally increase your probability about some action without reaching certainty.
And as a result, we got into this cycle where people were saying, “okay well you’re not high enough confidence, I can’t enforce this; if I can’t enforce it, I can’t take action on it; if I can’t automatically take action, I can’t scale it.”
This was a very traditional problem that existed with security. Today, if you look at what we’re talking about, especially as part of CI/CD, there are a number of things that I think are very interesting. Starting from the static code analysis side of things (integrations into things like GitHub and GitLab, integrations into your build processes and your image registries), before you’re actually running an application, whether it’s containerized, whether it’s serverless, or however you’re running it, you have the opportunity as a security operator to treat all of this as infrastructure as code.
Infrastructure as code is declarative information; it’s binary. You understand if something has privilege or if it doesn’t; if it has a dependency or it doesn’t.
Now, the reason this is very important to security, and why we decided to go full lifecycle and invest on the CI, the CD, and on the build side of things is that this is where you can go from probabilistic to deterministic security because I understand all the characteristics and all the attributions of my application now that it’s running. If I have deterministic security, I can take action with confidence. I can send that information back to the developer for hardening that application.
So, if I go from probabilistic to deterministic, I can automate. If I automate, then I can scale. This is what I think is the huge value coming out of the CI/CD process of integrations into GitLab, GitHub, and build tools, because you can collect all this asset and inventory information and progress security from probabilistic to deterministic, such that you can automate and scale it.
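One way to picture the move from probabilistic to deterministic: a declarative manifest either sets a field or it does not, so a check against it gives a yes/no answer that can safely drive automation. The sketch below reads a Deployment manifest (as a plain dict) and reports such facts. The field paths (`securityContext.privileged`, `hostNetwork`, `resources.limits`) are standard Kubernetes fields; the choice of which facts block a deployment, and the function names, are assumptions made for this example.

```python
def container_facts(deployment):
    """Return {container name: facts} read directly from a Deployment manifest."""
    pod_spec = deployment["spec"]["template"]["spec"]
    facts = {}
    for container in pod_spec.get("containers", []):
        security_context = container.get("securityContext", {})
        facts[container["name"]] = {
            "privileged": security_context.get("privileged", False) is True,
            "host_network": pod_spec.get("hostNetwork", False) is True,
            "has_resource_limits": "limits" in container.get("resources", {}),
        }
    return facts


def decision(deployment):
    """Return 'block' if any container is privileged or on the host network, else 'allow'."""
    for facts in container_facts(deployment).values():
        if facts["privileged"] or facts["host_network"]:
            return "block"
    return "allow"


# Hypothetical manifest used only for illustration.
deployment = {
    "kind": "Deployment",
    "spec": {"template": {"spec": {"containers": [
        {"name": "api", "image": "registry.example.com/api:2.1.0",
         "securityContext": {"privileged": True}},
    ]}}},
}

print(container_facts(deployment))
print(decision(deployment))
```

Since the answer does not depend on confidence scores, the same check can run unattended in a build step, at admission time, or as a report sent back to the developer.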
At its core, that’s what we’re trying to do at StackRox. We went Kube-native because we want to have access through Kube and all this declarative information so we can make security deterministic, to automate it for users, and to scale it. And at the same time, for developers, we embed into their regular workflows, build processes, tooling, and ticketing solutions, instead of forcing them into the traditional security way of always going to the single-pane-of-glass model.
Alan: Ali, we can talk all day, but I think we’re coming up on 30 minutes. I want to thank you for taking time out of your day. I know you guys are running as fast as you can. We’re going to call it a wrap on this one, but we’re going to try to record another one with you soon, Ali, and in that one I want to kind of peel back the layers of the onion a little bit about Kube-native and container security. And let’s really talk about what StackRox does in there, and not just specific to StackRox, but generically what the challenges are in this Kube-native security world and what we can do.