
Jan 25, 2021

GKE Monitoring Best Practices for Better Security and Operability

By: Wei Lien Dang

This is the final installment of our four-part Google Kubernetes Engine (GKE) security blog series. Don’t forget to check out the previous posts in the series.

This post concludes the series by highlighting the routine maintenance and operational tasks required to keep your GKE clusters and infrastructure secure.

Cluster Upgrades

Why: Kubernetes and GKE regularly release updates to supported versions to fix critical security flaws and other bugs. Keeping your cluster’s Kubernetes version up to date with these patches is an important part of securing your cluster and its workloads.

What to do: GKE can manage both master (control plane) and node pool upgrades for you automatically. Make sure you have auto-upgrades enabled in your cluster and select a maintenance window.
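As a rough way to confirm these settings, the sketch below inspects a cluster description that has already been parsed into a Python dict, for example from the JSON printed by `gcloud container clusters describe ... --format=json`. The field names it reads (`nodePools[].management.autoUpgrade` and `maintenancePolicy.window`) are assumptions about that output and should be checked against your own cluster; the sample data is hypothetical.

```python
from typing import List


def audit_upgrade_settings(cluster: dict) -> List[str]:
    """Return findings about auto-upgrade and maintenance-window settings."""
    findings = []

    # Assumption: each node pool reports its auto-upgrade flag under "management".
    for pool in cluster.get("nodePools", []):
        management = pool.get("management", {})
        if not management.get("autoUpgrade", False):
            name = pool.get("name", "<unnamed>")
            findings.append(f"node pool {name}: auto-upgrade is disabled")

    # Assumption: a configured maintenance window appears under maintenancePolicy.window.
    window = cluster.get("maintenancePolicy", {}).get("window")
    if not window:
        findings.append("cluster: no maintenance window configured")

    return findings


# Hypothetical, hand-written sample; not real cluster output.
sample_cluster = {
    "nodePools": [
        {"name": "default-pool", "management": {"autoUpgrade": True}},
        {"name": "batch-pool", "management": {"autoRepair": True}},
    ],
    "maintenancePolicy": {
        "window": {"dailyMaintenanceWindow": {"startTime": "03:00"}}
    },
}

for finding in audit_upgrade_settings(sample_cluster):
    print(finding)
```

An empty list of findings means every node pool in the description has auto-upgrade turned on and some maintenance window is set; it says nothing about whether the window itself is well chosen.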

Upgrades for regional clusters and node pools should not cause availability issues for most workloads. However, for GKE clusters that have zonal control planes, the cluster’s API endpoint and some other control plane services may be offline during the upgrade.

Audit Logging

Why: Logging events and changes at both the Kubernetes level and the node level creates an important audit trail to use for evaluating your cluster’s security, especially in case of a breach or attack. GKE can collect your Kubernetes API audit logs and node system logs, in addition to your application container logs, and send them to the Stackdriver service for secure storage and analysis.

What to do: Log collection is enabled by default in new GKE clusters. You can configure which logs are collected, but you should collect the Kubernetes cluster logs at a minimum.
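A similar check can be sketched for log collection. The function below again works on a parsed cluster description and assumes two fields from that output: `loggingService`, which is `"none"` when collection is turned off, and `loggingConfig.componentConfig.enableComponents`, which lists the collected log sources. Treating `SYSTEM_COMPONENTS` as the "minimum" is an interpretation made for this example, and older clusters may not report `loggingConfig` at all.

```python
from typing import List


def audit_logging_settings(cluster: dict) -> List[str]:
    """Return findings about a cluster's log collection settings."""
    findings = []

    # Assumption: "none" (or a missing value) means no logs are exported.
    service = cluster.get("loggingService", "")
    if service in ("", "none"):
        findings.append("log collection is disabled or not reported")
        return findings

    # Assumption: the enabled log sources are listed under loggingConfig.
    components = (
        cluster.get("loggingConfig", {})
        .get("componentConfig", {})
        .get("enableComponents", [])
    )
    if "SYSTEM_COMPONENTS" not in components:
        findings.append("system component logs are not being collected")

    return findings


# Hypothetical, hand-written sample; not real cluster output.
sample_cluster = {
    "loggingService": "logging.googleapis.com/kubernetes",
    "loggingConfig": {"componentConfig": {"enableComponents": ["WORKLOADS"]}},
}

for finding in audit_logging_settings(sample_cluster):
    print(finding)
```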

You can also send each node’s Linux auditd logs to Stackdriver. Linux audit log collection is only supported for nodes running Container-Optimized OS.

Credential Rotation

Why: Kubernetes clusters rely on a number of certificate chains and credentials for security. If sensitive keys or certificates are compromised, the integrity and safety of the entire cluster and its workloads may be placed at risk. Additionally, many security policies and compliance certifications require regular rotation of encryption keys and credentials.

What to do: You can rotate your cluster’s root Certificate Authority and all cluster certificates signed by that CA.

However, credential rotation is not currently supported in regional clusters, and in zonal clusters it entails brief downtime and rotation of the cluster API IP address. A preferable solution may therefore be to create a new GKE cluster and migrate your old cluster’s workloads and resources to it.
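If you do rotate credentials in place on a zonal cluster, the rotation is a two-phase operation: one command starts it and a second completes it. The sketch below only builds the two `gcloud` command lines as argument lists and prints them; it does not run anything. The cluster name and zone are hypothetical placeholders, and the work between the two phases (getting nodes and API clients onto the new credentials) is deliberately left out.

```python
import shlex
from typing import List


def rotation_commands(cluster: str, zone: str) -> List[List[str]]:
    """Build the gcloud invocations that start and complete a credential rotation."""
    base = ["gcloud", "container", "clusters", "update", cluster, "--zone", zone]
    return [
        # Phase 1: begin the rotation and issue new credentials.
        base + ["--start-credential-rotation"],
        # Phase 2: finish the rotation once nodes and clients use the new credentials.
        base + ["--complete-credential-rotation"],
    ]


# "demo-cluster" and "us-central1-a" are placeholder values for illustration.
for argv in rotation_commands("demo-cluster", "us-central1-a"):
    print(shlex.join(argv))
```

Returning argument lists rather than shell strings keeps quoting out of the picture if the commands are later passed to `subprocess.run`.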