Check Kubernetes with Popeye! Security, configs, problems, scores and more with an open source and lightweight CLI tool

TL;DR

Tired of manually combing through your Kubernetes cluster for issues? Popeye is like a health check-up for your cluster, finding potential problems with your configurations and resource usage. It's a command-line tool that scans your live cluster, not just static files, and points out things like misconfigurations, unused resources, and even potential resource over-allocations. It's read-only, so it won't change anything in your cluster; it just gives you a friendly (or maybe not-so-friendly, depending on your cluster's health) report. You can even get fancy with different output formats (JSON, HTML, you name it), send reports to S3, and integrate it with Prometheus and Grafana for ongoing monitoring.

Why You Need a Kubernetes Linter Like Popeye

Let's face it, Kubernetes is awesome for orchestrating your containerized applications. However, as your deployments grow, so does the complexity. Suddenly, you're drowning in a sea of YAML files, wondering if that Service in the default namespace is actually talking to your Pod, or if that PersistentVolumeClaim from a deleted project is still hanging around like a bad smell.

That's where Popeye comes in, flexing its spinach-powered muscles to give your cluster a thorough checkup.

Popeye to the Rescue!

Popeye dives into your live cluster, inspecting your resources as they're actually running. This isn't just some dry-run static analysis tool. It's the real deal, looking for common problems that can trip you up:

  • Misconfigurations: Are your container port mappings correct? Do your Pod labels match your Service selectors?
  • Resource Usage: Popeye can even tap into your metrics server (if you're using one) and warn you about potential CPU or memory over-allocations before your cluster throws in the towel.
  • Stale Resources: Remember that Namespace you thought you deleted months ago? Popeye will find it. Those unused Secrets? Yep, it'll flag those too.
  • Security Best Practices: Popeye can help you catch things like Pods running as root, missing resource limits, and other security gotchas.

Installation

You've got options! (Yay! Fun!) Download a binary from the GitHub releases page, use brew install, or use go install if you're a Go aficionado.

brew install derailed/popeye/popeye
go install github.com/derailed/popeye@latest

Getting Started with Popeye

Interpreting the Report: Popeye color-codes its findings to give you a clear picture of your cluster's health:

    • ✅ OK: Everything looks good!
    • 🔊 Info: Just some FYI messages.
    • 😱 Warn: Potential issues that you might want to look into.
    • 💥 Error: Action required! These are problems that need fixing.

Level Up with Prometheus and Grafana: Popeye can push its scan results as metrics to a Prometheus Pushgateway, so you can visualize your cluster's health over time in Grafana. You can even set up alerts so you're notified when Popeye finds something fishy.

Customizing Scans (Spinach, Anyone?)

You can fine-tune Popeye's behavior using a spinach.yaml configuration file. Want to adjust resource utilization thresholds, exclude specific resources, or even override the severity of certain checks? Spinach has got you covered!

popeye:
  allocations:
    cpu:
      overPercUtilization: 70 # Trigger a warning if CPU utilization goes above 70%
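
To try the threshold above, here is a minimal sketch that writes it to a spinach file and passes the file to Popeye with -f. The file name spinach.yaml and the guard around the scan are choices made for this example; the scan itself needs access to a cluster.

```shell
# Write the CPU threshold from the snippet above to a spinach file.
cat > spinach.yaml <<'EOF'
popeye:
  allocations:
    cpu:
      overPercUtilization: 70
EOF

# -f tells Popeye which spinach file to load. The scan is skipped
# when the popeye binary is not installed on this machine.
if command -v popeye >/dev/null 2>&1; then
  popeye -f spinach.yaml
fi
```

Keeping the spinach file in version control next to your manifests makes it easier to review threshold changes.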

Run the Scan: Popeye works right out of the box. Just point it at your cluster:

popeye

Want to scan a specific namespace? No problem:

popeye -n my-awesome-app
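
If you have several namespaces to check, a small wrapper can loop over them. This is only a sketch: the function name scan_namespaces and the popeye-<namespace>.json file names are made up for this example, while -n and -o json are Popeye's namespace and output-format flags.

```shell
# scan_namespaces NS...: run one Popeye scan per namespace and keep
# each report as a JSON file in the current directory.
scan_namespaces() {
  for ns in "$@"; do
    # -n scopes the scan to one namespace, -o json switches the report format.
    popeye -n "$ns" -o json > "popeye-$ns.json" \
      || echo "popeye exited non-zero for namespace $ns" >&2
  done
}
```

Call it with the namespaces you care about, for example: scan_namespaces my-awesome-app kube-system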

Popeye in Action: A Practical Example

Let's say you're running a web app in your cluster. You've got a Deployment, a Service, and a few other resources. You run Popeye, and it spits out the following:

😱  WARN  po  Pods  default/my-awesome-app-7c94985768-x5fzk  Container 'my-awesome-app' has no resource requests or limits defined!

Uh oh! Looks like you forgot to set resource requests and limits on your container. Without limits, your app could consume all the resources on its node, starving out other applications. Time to update that YAML file!
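
If you want to try the fix on the live Deployment before editing the manifest, here is a rough sketch using kubectl set resources. The Deployment name my-awesome-app is inferred from the Pod name in the report, the function name set_app_resources is made up for this example, and the CPU and memory values are placeholders, not recommendations.

```shell
# set_app_resources: add requests and limits to the container that
# Popeye flagged. All values here are hypothetical; size them from
# your app's real usage.
set_app_resources() {
  kubectl set resources deployment my-awesome-app \
    --namespace default \
    --containers my-awesome-app \
    --requests=cpu=100m,memory=128Mi \
    --limits=cpu=500m,memory=256Mi
}
```

Run set_app_resources, then mirror the same values under resources.requests and resources.limits in your manifest so the next deploy does not undo the change. A fresh Popeye scan of the namespace should then stop reporting this warning.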

Keep Your Cluster Healthy with Popeye

Popeye is an essential tool for anyone running Kubernetes. It's like having a Kubernetes expert constantly looking over your shoulder, pointing out potential issues before they turn into major headaches. So, add Popeye to your toolbox and start giving your cluster the health checks it deserves!

GitHub - derailed/popeye: 👀 A Kubernetes cluster resource sanitizer

Some of the available linters

Each entry lists the K8s resource, its aliases in parentheses, and its linters:

🛀 Node (no)
  • Conditions, i.e. not ready, out of mem/disk, network, pids, etc
  • Pod tolerations referencing node taints
  • CPU/MEM utilization metrics, trips if over limits (default 80% CPU/MEM)

🛀 Namespace (ns)
  • Inactive
  • Dead namespaces

🛀 Pod (po)
  • Pod status
  • Containers statuses
  • ServiceAccount presence
  • CPU/MEM on containers over a set CPU/MEM limit (default 80% CPU/MEM)
  • Container image with no tags
  • Container image using latest tag
  • Resources request/limits presence
  • Probes liveness/readiness presence
  • Named ports and their references

🛀 Service (svc)
  • Endpoints presence
  • Matching pods labels
  • Named ports and their references

🛀 ServiceAccount (sa)
  • Unused, detects potentially unused SAs

🛀 Secrets (sec)
  • Unused, detects potentially unused secrets or associated keys

🛀 ConfigMap (cm)
  • Unused, detects potentially unused cm or associated keys

🛀 Deployment (dp, deploy)
  • Unused, pod template validation, resource utilization

🛀 StatefulSet (sts)
  • Unused, pod template validation, resource utilization

🛀 DaemonSet (ds)
  • Unused, pod template validation, resource utilization

🛀 PersistentVolume (pv)
  • Unused, check volume bound or volume error

🛀 PersistentVolumeClaim (pvc)
  • Unused, check bounded or volume mount error

🛀 HorizontalPodAutoscaler (hpa)
  • Unused, Utilization, Max burst checks

🛀 PodDisruptionBudget (pdb)
  • Unused, Check minAvailable configuration

🛀 ClusterRole (cr)
  • Unused

🛀 ClusterRoleBinding (crb)
  • Unused

🛀 Role (ro)
  • Unused

🛀 RoleBinding (rb)
  • Unused

🛀 Ingress (ing)
  • Valid

🛀 NetworkPolicy (np)
  • Valid, Stale, Guarded

🛀 PodSecurityPolicy (psp)
  • Valid

🛀 Cronjob (cj)
  • Valid, Suspended, Runs

🛀 Job (job)
  • Pod checks

🛀 GatewayClass (gwc)
  • Valid, Unused

🛀 Gateway (gw)
  • Valid, Unused

🛀 HTTPRoute (gwr)
  • Valid, Unused

Nicolás Georger

Self-taught IT professional driving innovation & social impact with cybernetics, open source (Linux, Kubernetes), AI & ML. Building a thriving SRE/DevOps community at SREDevOps.org. I specialize in simplifying solutions through cloud native technologies and DevOps practices.