
Custom Kubernetes Controls with Open Policy Agent (OPA) - Part 1

As the adoption of Kubernetes spreads, users have begun to look for additional options to control and secure their Kubernetes clusters. Cluster administrators tend to focus on restricting what can run in a cluster. While Kubernetes Role-Based Access Control (RBAC) provides a strong permission system, its oversight ends at the resource level, and it lacks the ability to control the configurations of specific resources. In this post, we will discuss one option for finer-grained resource controls, the Open Policy Agent (OPA) Gatekeeper project, which can complement Kubernetes RBAC.

Resource Controls

Kubernetes RBAC supplies a permission system for the creation and manipulation of resource objects. With RBAC, cluster administrators can authorize cases such as “user X can create replica sets in the namespace example” or “the service account for the workload lookup can read config maps in the namespace data.”

While RBAC controls are a critical part of Kubernetes security, they do not provide the ability to control the settings, content, or configuration of cluster objects. For example, you may want to allow users in a database group to create and edit deployments in the database namespace, but any pod containers in that namespace should not run as the root user. You may also have a requirement that all resource objects in your cluster have a specific set of labels.

Pod Security Policies (PSPs) can enforce the first requirement, preventing containers from running as root, in addition to some other container controls, such as disabling access to host resources or requiring a read-only root file system for the container. If PSPs are enabled for a cluster, any attempt to create a pod which does not adhere to its associated PSP will be rejected by the PSP admission controller. As the name implies, PSPs apply only to pods and only to a subset of fields in the pod specification, which limits their coverage. PSPs can also be difficult and confusing to manage, adding overhead to every deployment.

Users who need to control other pod fields or any fields in other resource types have the option of writing their own validating admission controller. Kubernetes supports registering admission webhooks with a cluster, and the cluster API service sends matching requests to them for review. If the controller rejects a request, the API service will also reject it.
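To make the request and response shape concrete, here is a minimal sketch of the decision logic such a controller might contain. It assumes the admission.k8s.io/v1 AdmissionReview format. The function name review_pod is hypothetical, the root check is deliberately simplified (it looks only at runAsNonRoot and ignores init containers), and a real webhook would also need an HTTPS server and a webhook registration.

```python
def review_pod(admission_review: dict) -> dict:
    """Return an AdmissionReview response that rejects pods allowed to run as root."""
    request = admission_review["request"]
    pod_spec = request["object"].get("spec", {})
    pod_level = pod_spec.get("securityContext", {}).get("runAsNonRoot", False)

    offenders = []
    for container in pod_spec.get("containers", []):
        container_level = container.get("securityContext", {}).get("runAsNonRoot")
        # A container-level setting takes precedence over the pod-level one.
        non_root = pod_level if container_level is None else container_level
        if not non_root:
            offenders.append(container["name"])

    # The response must echo the uid of the request it answers.
    response = {"uid": request["uid"], "allowed": not offenders}
    if offenders:
        response["status"] = {
            "message": "containers may run as root: " + ", ".join(offenders)
        }
    return {
        "apiVersion": "admission.k8s.io/v1",
        "kind": "AdmissionReview",
        "response": response,
    }
```

Every new rule of this kind means more code to write, deploy, and maintain, which is the scaling problem discussed next.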

Writing an admission controller for each specific use case does not scale, though. A system that supports multiple configurations covering different resource types and fields would yield more portability and reusability. Open Policy Agent and OPA Gatekeeper provide just that.

The OPA policy engine evaluates requests to determine whether they conform to configured policies. OPA can integrate with a number of applications and tools, and it fits Kubernetes especially well. OPA takes input as JSON, is easy to containerize, and supports dynamic configuration, all of which make it well suited to provide policy evaluation for the Kubernetes API service. OPA Gatekeeper, another project under the OPA umbrella, delivers an OPA and Kubernetes integration by building a Kubernetes admission controller on top of the policy engine.

Next, we will cover what OPA is and what it can do, and afterward we will describe how OPA Gatekeeper works and how it can help with cluster management and policy enforcement.

Open Policy Agent

Open Policy Agent offers an open-source service that can evaluate inputs against user-defined policies and mark the input as passing or failing. Any application or service that can be configured to make an API request for determining authorization or other policy decisions can integrate with OPA. OPA evaluates only whether a request conforms to the required policies; acting on policy violations falls to the integrating application.

What makes OPA incredibly versatile is that it’s agnostic to API and data schemas. Because the OPA policies should programmatically handle any object traversal that they require, OPA can perform dynamic evaluation of any JSON-formatted data from any source. When integrated with the Kubernetes API, OPA can check any cluster resource type, standard or custom, and policies can refer to and test any field in those resource schemas. This versatility makes OPA very useful for Kubernetes cluster security compliance as well as for practical resource configuration management.

Rego

Users can write policies using the OPA custom programming language, Rego. Rego has a very simple syntax and small set of functions and operators, optimized for query evaluation.

A few points to keep in mind when reading and writing Rego policies:

  1. The statements in a rule body are evaluated as a logical AND: every statement must succeed for the rule to succeed.
  2. The order of statements within a rule body does not matter.
  3. true and defined are usually synonymous. false is also usually synonymous with undefined.
  4. Rule evaluation short-circuits on reaching a statement that evaluates to undefined or false.
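These points can be illustrated with a toy model in Python. This is only an analogy under simplifying assumptions, not how OPA is implemented: UNDEFINED, lookup, starts_with, and rule_succeeds are hypothetical names, and the model checks statements in list order, whereas OPA is free to reorder them.

```python
UNDEFINED = object()  # sentinel standing in for Rego's "undefined"


def lookup(document, *path):
    """Walk nested dicts; a missing key yields UNDEFINED instead of raising."""
    current = document
    for key in path:
        if not isinstance(current, dict) or key not in current:
            return UNDEFINED
        current = current[key]
    return current


def starts_with(value, prefix):
    """Like a Rego built-in: an undefined argument makes the result undefined."""
    if value is UNDEFINED:
        return UNDEFINED
    return value.startswith(prefix)


def rule_succeeds(statements, document):
    """A rule body succeeds only if every statement is defined and not false."""
    for statement in statements:
        value = statement(document)
        if value is UNDEFINED or value is False:
            return False  # short-circuit: remaining statements are skipped
    return True


if __name__ == "__main__":
    body = [
        lambda doc: lookup(doc, "request", "name"),
        lambda doc: starts_with(lookup(doc, "request", "name"), "a"),
    ]
    print(rule_succeeds(body, {"request": {"name": "apple"}}))   # True
    print(rule_succeeds(body, {"request": {"name": "banana"}}))  # False
    print(rule_succeeds(body, {"request": {}}))                  # False
```

Reversing the list of statements gives the same three results, which mirrors point 2. Note that the model compares against False by identity, because in Rego only false and undefined cause a statement to fail; values such as 0 or an empty string do not.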

As a simple example, let’s write a policy for a non-existent application to reject names that do not start with the letter ‘a.’

package startswitha
 
name := input.request.name
 
default allow = false
 
allow {
  startswith(name, "a")
}

If we query the allow rule on the OPA service with the input {"request": {"name": "apple"}}, OPA should return HTTP success code 200 with a “result” field set to true in the payload. If we send the input {"request": {"name": "banana"}} instead, OPA still returns 200, but the “result” field is false.
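A client could query this policy over OPA's REST Data API, which expects the document to evaluate under a top-level "input" key. The sketch below assumes an OPA server with the policy loaded and listening on its default address, localhost:8181; build_query, parse_decision, and is_allowed are hypothetical helper names.

```python
import json
import urllib.request

OPA_URL = "http://localhost:8181"  # assumption: OPA's default listen address


def build_query(name: str, base_url: str = OPA_URL) -> urllib.request.Request:
    """Build a Data API query for the allow rule in package startswitha."""
    payload = {"input": {"request": {"name": name}}}
    return urllib.request.Request(
        url=f"{base_url}/v1/data/startswitha/allow",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def parse_decision(response_body: bytes) -> bool:
    """Treat a missing "result" (an undefined decision) as a denial."""
    return json.loads(response_body).get("result", False) is True


def is_allowed(name: str) -> bool:
    """Send the query to a running OPA server and return its decision."""
    with urllib.request.urlopen(build_query(name)) as response:
        return parse_decision(response.read())
```

Treating a missing result as a denial is a design choice for this sketch: the integrating application, not OPA, decides what an undefined answer means.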

How does that work?

  1. package startswitha - We give our policy its own namespace. All policies require a package declaration.
  2. name := input.request.name - Assign the value of input.request.name, which we expect to get passed in the request payload, to a variable called name.
  3. default allow = false - Set the default value for allow to false.
  4. The final block is a rule. We evaluate the block containing the startswith function. If that statement succeeds, the block evaluates to true and allow is set to true. If the startswith condition is false, the rule evaluates to undefined, evaluation stops, and allow keeps its default value of false.

Be sure to read the full Rego documentation for more details and language capabilities.

Gatekeeper

Note: this section refers to Gatekeeper v3.1.

Gatekeeper provides a Kubernetes admission controller built around the OPA engine to integrate OPA and the Kubernetes API service. Although other methods for integrating OPA with Kubernetes exist, Gatekeeper adds useful functionality. The Gatekeeper controller constantly monitors existing cluster objects to detect policy violations. Its greatest value, though, comes from the ability to configure OPA policies dynamically using Gatekeeper’s Custom Resource Definitions (CRDs). Adding a library of parameterized ConstraintTemplate objects to a cluster simplifies and standardizes creation of tailored policies for specific resources and scopes.

As an example, we could make a ConstraintTemplate that can be used to check if a resource object has a label named “fruit.”

apiVersion: templates.gatekeeper.sh/v1beta1
kind: ConstraintTemplate
metadata:
  name: fruitlabel
spec:
  crd:
    spec:
      names:
        kind: FruitLabel
  targets:
    - target: admission.k8s.gatekeeper.sh
      rego: |
        package fruitlabel
 
        labels := input.review.object.metadata.labels
 
        has_fruit {
          labels.fruit
        }
 
        violation[{"msg": msg}] {
          not has_fruit
          msg := "You should eat more fruit"
        }

That ConstraintTemplate object doesn’t trigger policy enforcement on its own. It does, however, create a brand new custom resource in our cluster, of the type FruitLabel. If we want to enforce our FruitLabel policy, we create a constraint by using that new resource type.

apiVersion: constraints.gatekeeper.sh/v1beta1
kind: FruitLabel
metadata:
  name: fruitpods
spec:
  match:
    kinds:
    - apiGroups: [""]
      kinds: ["Pod"]
    excludedNamespaces:
    - kube-system
  parameters: {}

Now Gatekeeper will enforce our FruitLabel policy for all pods not in the kube-system namespace. If we decide we want to have a “fruit” label on all Service objects, but only in the namespace fruitservices, we could create another FruitLabel object with that scope. We could also parameterize our template to support configuration for labels other than “fruit.”
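As a sketch of the Service-scoped constraint described above, the helper below builds the manifest as a Python dict and prints it as JSON, which kubectl can apply just like YAML. The function name fruit_label_constraint is hypothetical, and the helper assumes every kind passed in belongs to the core API group, as Pod and Service do.

```python
import json


def fruit_label_constraint(name, kinds, namespaces=None, excluded_namespaces=None):
    """Build a FruitLabel constraint manifest as a plain dict."""
    # apiGroups [""] selects the core API group (assumed for all kinds here).
    match = {"kinds": [{"apiGroups": [""], "kinds": list(kinds)}]}
    if namespaces:
        match["namespaces"] = list(namespaces)
    if excluded_namespaces:
        match["excludedNamespaces"] = list(excluded_namespaces)
    return {
        "apiVersion": "constraints.gatekeeper.sh/v1beta1",
        "kind": "FruitLabel",
        "metadata": {"name": name},
        "spec": {"match": match, "parameters": {}},
    }


if __name__ == "__main__":
    manifest = fruit_label_constraint(
        "fruitservices", ["Service"], namespaces=["fruitservices"]
    )
    print(json.dumps(manifest, indent=2))
```

Both constraints reuse the same ConstraintTemplate; only the match block differs.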

The Gatekeeper service also continually audits existing objects to make sure they are not in violation of policies that may have been applied after the object’s creation. Gatekeeper currently offers no option for handling these violations itself, but users can poll a constraint object’s status field to get the list of offending resource objects and deal with them as needed.
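A script that polls for audit results might look like the following sketch. It assumes the constraint has been fetched as JSON (for example with kubectl get fruitlabel fruitpods -o json) and that each entry under status.violations carries kind, name, message, and optionally namespace fields; treat those field names as an assumption to verify against your Gatekeeper version. The function name list_violations is hypothetical.

```python
def list_violations(constraint: dict) -> list:
    """Return (kind, namespace, name, message) tuples from a constraint's status."""
    # The status block is absent until the audit has run at least once.
    violations = constraint.get("status", {}).get("violations") or []
    return [
        (v.get("kind"), v.get("namespace", ""), v.get("name"), v.get("message"))
        for v in violations
    ]
```

From there, a user could open tickets, notify owners, or delete the offending objects, depending on local policy.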

More to come

We just had a brief introduction to OPA and Gatekeeper. Note that while OPA Gatekeeper is a powerful and configurable tool, it does have limitations. Currently, its admission controller can be deployed only as a singleton pod. Without high availability (HA) options, the Gatekeeper webhook defaults to failing “open,” so the Kubernetes API server accepts all requests if the admission controller is unavailable. If Gatekeeper’s webhook is instead configured to fail “closed,” write requests made to the cluster API will fail whenever the controller is unresponsive. HA support is planned for a future release.
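In Kubernetes, this behavior is controlled by the failurePolicy field of each webhook in a ValidatingWebhookConfiguration: Ignore means fail open, and Fail means fail closed. The sketch below reports the mode of each webhook in a configuration fetched as JSON. The function name failure_modes is hypothetical, and an unset field is reported as "unspecified" rather than guessing the API version's default.

```python
def failure_modes(webhook_configuration: dict) -> dict:
    """Map each webhook name to "open", "closed", or "unspecified"."""
    modes = {}
    for webhook in webhook_configuration.get("webhooks", []):
        policy = webhook.get("failurePolicy")
        if policy == "Ignore":
            modes[webhook["name"]] = "open"    # requests pass if the webhook is down
        elif policy == "Fail":
            modes[webhook["name"]] = "closed"  # requests are rejected if it is down
        else:
            modes[webhook["name"]] = "unspecified"
    return modes
```

Checking this field before relying on a policy for security is worthwhile, since a fail-open webhook silently admits everything while the controller is down.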

Rego also introduces a learning curve. As more organizations adopt Gatekeeper, we can expect to see a growing library of shared ConstraintTemplates, but writing custom templates that work as intended, without unintended side effects, can take some time. Users looking for an alternative may want to check out Kyverno, another open-source project that offers some overlapping functionality. Both tools have strengths and weaknesses.

In Part 2 of this blog, we will look at a longer and more practical example policy for Gatekeeper, one that can restrict the taint tolerations that pods can use. We will also demonstrate the importance of comprehensive policy test coverage, how to write those tests, and how to troubleshoot issues.