Container Security on Kubernetes: A Practical Guide with Trivy, Falco, and Kyverno

Most Kubernetes clusters are less secure than the teams running them believe. The container images have not been scanned in months — if ever. There is no runtime monitoring to detect anomalous behavior inside running pods. And there are no guardrails preventing someone from deploying a container running as root with access to the host network.

This is not a theoretical risk. A single compromised container image with a known CVE can give an attacker a foothold in your cluster. From there, lateral movement is straightforward if your pods run with elevated privileges and your network policies are wide open.

The fix is not complicated. Three open-source tools — Trivy, Falco, and Kyverno — cover the three pillars of container security: image scanning, runtime threat detection, and policy enforcement. This guide walks through setting up all three on a Kubernetes cluster with practical configurations you can deploy today.

The Three Pillars of Container Security

Before diving into tools, it helps to understand what you are defending against and where the attack surface lives.

Image vulnerabilities are the most common entry point. Public base images from Docker Hub frequently contain known CVEs in system libraries. If you build your application on top of node:18 or python:3.11 without scanning, you are inheriting every vulnerability in the base OS packages. The fix is to scan images before they run in your cluster and block any image with critical or high-severity CVEs.

Runtime threats are what happens after a compromised or misconfigured container starts running. A container that spawns a shell, opens an unexpected network connection, or reads sensitive files from the filesystem is exhibiting behavior that legitimate application code should not. Detecting this in real time requires kernel-level visibility into container system calls.

Policy gaps are the configuration mistakes that make the first two problems worse. Running as root, using latest tags, mounting the Docker socket, disabling seccomp profiles — these are all common misconfigurations that widen the blast radius of any incident. Preventing them requires admission control that rejects non-compliant workloads before they reach the cluster.

A secure cluster addresses all three. Scanning without runtime monitoring misses zero-days and behavioral exploits. Runtime monitoring without policy enforcement means you are alerting on problems you could have prevented. Policy enforcement without scanning means you trust image contents you have never inspected.

Image Scanning with Trivy

Trivy is an open-source vulnerability scanner from Aqua Security that scans container images, filesystems, IaC templates, and Kubernetes manifests. For container security, we focus on two use cases: scanning images in CI before they are pushed to a registry, and continuously scanning images running in the cluster.

Scanning in CI

The highest-leverage place to catch vulnerabilities is in your CI pipeline, before the image reaches your registry. Add Trivy as a step after your image build:

# GitHub Actions example
- name: Scan container image
  uses: aquasecurity/trivy-action@master
  with:
    image-ref: ${{ env.IMAGE_NAME }}:${{ env.IMAGE_TAG }}
    format: table
    exit-code: 1
    severity: CRITICAL,HIGH
    ignore-unfixed: true

The key flags:

  • exit-code: 1 fails the pipeline if vulnerabilities are found, preventing the image from being pushed.
  • severity: CRITICAL,HIGH limits results to actionable findings. Fixing medium and low findings rarely reduces risk enough to justify blocking builds on them.
  • ignore-unfixed excludes CVEs that have no available patch. You cannot fix what upstream has not fixed yet, and failing builds on unfixable CVEs blocks your deployments without improving security.
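The same gate can be reproduced locally with the Trivy CLI, which helps when debugging a pipeline failure before pushing a fix. The image reference below is a placeholder:

```shell
# Same behavior as the CI step: exit non-zero when an unpatched
# CRITICAL or HIGH severity CVE is found in the image.
trivy image \
  --exit-code 1 \
  --severity CRITICAL,HIGH \
  --ignore-unfixed \
  registry.example.com/myapp:1.4.2
```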

For more granular control, create a .trivyignore file in your repository to suppress specific CVEs you have evaluated and accepted:

# Accepted: no exploit path in our usage of this library
CVE-2024-1234
# Accepted: mitigated by network policy restricting egress
CVE-2024-5678

Continuous Scanning with Trivy Operator

CI scanning catches vulnerabilities at build time, but images that were clean when deployed can become vulnerable as new CVEs are disclosed. The Trivy Operator runs inside your cluster and continuously scans running workloads.

Install it via Helm:

helm repo add aqua https://aquasecurity.github.io/helm-charts/
helm repo update

helm install trivy-operator aqua/trivy-operator \
  --namespace trivy-system \
  --create-namespace \
  --set trivy.severity="CRITICAL,HIGH" \
  --set operator.scanJobTimeout=10m

The operator creates VulnerabilityReport custom resources for every container running in the cluster:

kubectl get vulnerabilityreports -A -o wide

This gives you a cluster-wide view of which workloads have known vulnerabilities, even if they were deployed months ago. You can build alerting on top of these CRDs — a Prometheus exporter or a simple CronJob that checks for critical findings and sends a notification.
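As a sketch of that alerting idea, a small script can total the critical findings and exit non-zero when any exist. It assumes jq is installed and that the report summary exposes a criticalCount field; verify the path against the CRD schema of your installed operator version:

```shell
#!/bin/sh
# Sum criticalCount across all VulnerabilityReports in the cluster.
critical=$(kubectl get vulnerabilityreports -A -o json \
  | jq '[.items[].report.summary.criticalCount] | add // 0')

if [ "$critical" -gt 0 ]; then
  echo "ALERT: $critical critical CVEs in running workloads"
  exit 1   # non-zero exit lets a CronJob surface the failure
fi
```

Run it on a schedule as a CronJob, or swap the echo for a curl to your team's webhook.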

A practical configuration for the operator:

apiVersion: v1
kind: ConfigMap
metadata:
  name: trivy-operator-config
  namespace: trivy-system
data:
  # Label scan job pods so they are easy to identify
  scanJob.podTemplateLabels: "app=trivy-scan"
  # Only report critical and high
  trivy.severity: "CRITICAL,HIGH"
  # Use the image digest, not the tag, for caching
  trivy.imageRef: "digest"
  # Resource limits for scan jobs
  scanJob.resources.requests.cpu: "100m"
  scanJob.resources.requests.memory: "256Mi"
  scanJob.resources.limits.cpu: "500m"
  scanJob.resources.limits.memory: "512Mi"

Resource limits on scan jobs are important. Without them, Trivy scans can consume significant memory when analyzing large images, potentially impacting workloads sharing the same nodes.

Runtime Threat Detection with Falco

Trivy tells you what vulnerabilities exist in your images. Falco tells you when something suspicious is actually happening at runtime. Falco hooks into the Linux kernel via eBPF (or a kernel module) and monitors system calls made by every container. When a system call matches a rule — like a container spawning a shell, reading /etc/shadow, or opening an outbound connection to an unusual port — Falco generates an alert.

Installing Falco

Deploy Falco as a DaemonSet so it runs on every node:

helm repo add falcosecurity https://falcosecurity.github.io/charts
helm repo update

helm install falco falcosecurity/falco \
  --namespace falco \
  --create-namespace \
  --set driver.kind=ebpf \
  --set falcosidekick.enabled=true \
  --set falcosidekick.config.slack.webhookurl="https://hooks.slack.com/services/YOUR/WEBHOOK/URL"

The driver.kind=ebpf flag uses eBPF instead of a kernel module, which is safer and does not require kernel headers on the node. Falcosidekick is a companion service that routes Falco alerts to external systems — Slack, PagerDuty, Loki, Elasticsearch, or any webhook endpoint.

Understanding Falco Rules

Falco ships with a comprehensive default ruleset that covers common attack patterns. Some of the most valuable default rules:

  • Terminal shell in container — Detects when someone opens an interactive shell inside a running container. Legitimate application containers should never spawn shells.
  • Read sensitive file untrusted — Detects reads of /etc/shadow, /etc/pam.d/, and other credential files.
  • Contact K8s API server from container — Detects containers making direct API calls to the Kubernetes API server, which can indicate an attacker attempting to enumerate cluster resources.
  • Unexpected outbound connection — Detects network connections to unusual destinations, which can indicate data exfiltration or command-and-control communication.
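A quick way to confirm the pipeline works end to end is to trigger the shell rule deliberately and watch for the alert. The deployment name and label selector below are placeholders; check the labels your chart release applies:

```shell
# Open a shell in any running pod -- this should fire the
# "Terminal shell in container" rule within seconds.
kubectl exec -it deploy/myapp -- /bin/sh -c "id"

# Then look for the alert in the Falco pod logs.
kubectl logs -n falco -l app.kubernetes.io/name=falco --tail=50 \
  | grep "Terminal shell"
```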

Writing Custom Rules

The default rules are a strong starting point, but custom rules tuned to your workloads catch threats the defaults miss. Here is a rule that detects a container writing to directories it should not touch:

- rule: Write to non-application directories
  desc: >
    Detect writes to system directories from application containers.
    Application containers should only write to /app, /tmp, and /var/log.
  condition: >
    container and
    evt.type in (open, openat, openat2) and
    evt.is_open_write=true and
    not fd.name startswith /app and
    not fd.name startswith /tmp and
    not fd.name startswith /var/log and
    not fd.name startswith /proc and
    container.image.repository != "falcosecurity/falco" and
    k8s.ns.name != "kube-system"
  output: >
    Unexpected file write in container
    (file=%fd.name user=%user.name command=%proc.cmdline
     container=%container.name image=%container.image.repository
     namespace=%k8s.ns.name pod=%k8s.pod.name)
  priority: WARNING
  tags: [filesystem, container]

Another practical rule — detect containers that download and execute binaries, a common pattern in cryptomining and reverse shell attacks:

- rule: Download and execute in container
  desc: >
    Detect a container downloading a file and then executing it.
    This is a common pattern for post-exploitation payloads.
  condition: >
    container and
    spawned_process and
    proc.name in (curl, wget) and
    proc.args contains "http" and
    proc.pname in (bash, sh, dash)
  output: >
    Download attempt detected in container
    (command=%proc.cmdline container=%container.name
     image=%container.image.repository namespace=%k8s.ns.name
     pod=%k8s.pod.name)
  priority: CRITICAL
  tags: [network, process, mitre_execution]

Deploy custom rules as a ConfigMap mounted into the Falco pods, or use the chart's customRules Helm value to include them alongside the defaults.
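With the Helm route, one approach is to wrap the rule files in the chart's customRules value so they load next to the default ruleset. The value name is taken from the falco chart; confirm it against your chart version. The rule body here is a trivial placeholder:

```shell
# custom-rules.yaml: each key under customRules becomes a rule file
# mounted alongside the defaults.
cat > custom-rules.yaml <<'EOF'
customRules:
  company-rules.yaml: |-
    - rule: Example placeholder rule
      desc: Replace with your own custom rules.
      condition: container and proc.name = nmap
      output: Network scanner launched (command=%proc.cmdline)
      priority: WARNING
      tags: [process]
EOF

helm upgrade falco falcosecurity/falco \
  --namespace falco \
  --reuse-values \
  -f custom-rules.yaml
```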

Reducing Noise

The biggest operational challenge with Falco is alert fatigue. The default rules generate alerts for legitimate behavior in some workloads — for example, an init container that runs shell commands, or a debugging sidecar that reads system files.

Address this with exceptions rather than disabling rules:

- rule: Terminal shell in container
  append: true
  condition: and not k8s.ns.name in (debug-tools, ci-runners)

The append: true directive adds conditions to the existing rule rather than replacing it. This keeps the rule active for all workloads except the specific namespaces where shell access is expected.

Review your Falco alerts weekly for the first month. Tune aggressively. A Falco deployment that generates 500 alerts per day gets ignored. One that generates 5 meaningful alerts per week gets investigated.

Policy Enforcement with Kyverno

Trivy and Falco are detective controls — they find problems and alert you. Kyverno is a preventive control — it stops problems from entering the cluster in the first place.

Kyverno is a Kubernetes-native policy engine that runs as an admission controller. When a resource is created or updated, Kyverno evaluates it against your policies and either allows, denies, or mutates the resource. Policies are written as Kubernetes custom resources, so there is no new language to learn.

Installing Kyverno

helm repo add kyverno https://kyverno.github.io/kyverno/
helm repo update

helm install kyverno kyverno/kyverno \
  --namespace kyverno \
  --create-namespace \
  --set replicaCount=3

Running three replicas ensures the admission controller remains available during node failures or upgrades. A single-replica Kyverno can block all deployments if its pod goes down.

Essential Policies

Start with a small set of policies that address the most impactful misconfigurations. Here are the ones we deploy on every cluster.

Require non-root containers:

apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: require-non-root
  annotations:
    policies.kyverno.io/title: Require Non-Root Containers
    policies.kyverno.io/description: >
      Containers must run as a non-root user. Running as root
      gives the process full access to the host if a container
      breakout occurs.
spec:
  validationFailureAction: Enforce
  background: true
  rules:
    - name: check-runAsNonRoot
      match:
        any:
          - resources:
              kinds:
                - Pod
      exclude:
        any:
          - resources:
              namespaces:
                - kube-system
                - trivy-system
                - falco
      validate:
        message: "Containers must set securityContext.runAsNonRoot to true."
        pattern:
          spec:
            containers:
              - securityContext:
                  runAsNonRoot: true

Disallow privilege escalation:

apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: disallow-privilege-escalation
spec:
  validationFailureAction: Enforce
  background: true
  rules:
    - name: check-privilege-escalation
      match:
        any:
          - resources:
              kinds:
                - Pod
      exclude:
        any:
          - resources:
              namespaces:
                - kube-system
      validate:
        message: "Privilege escalation is not allowed. Set allowPrivilegeEscalation to false."
        pattern:
          spec:
            containers:
              - securityContext:
                  allowPrivilegeEscalation: false

Require image digests instead of tags:

apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: require-image-digest
spec:
  validationFailureAction: Enforce
  background: true
  rules:
    - name: check-image-digest
      match:
        any:
          - resources:
              kinds:
                - Pod
      exclude:
        any:
          - resources:
              namespaces:
                - kube-system
                - kyverno
      validate:
        message: >
          Images must use a digest (sha256) reference, not a mutable tag.
          Use image@sha256:abc123 format instead of image:latest.
        pattern:
          spec:
            containers:
              - image: "*@sha256:*"

This prevents tag-based attacks where an attacker pushes a malicious image to the same tag in a compromised registry. Digest references are immutable — the exact image bytes are pinned.
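To satisfy this policy, resolve a tag to its digest at build or deploy time. Two common options, sketched with a placeholder image name:

```shell
# Option 1: crane (from go-containerregistry) queries the
# registry directly without pulling the image.
crane digest registry.example.com/myapp:1.4.2

# Option 2: docker records the digest after a pull.
docker pull registry.example.com/myapp:1.4.2
docker inspect --format '{{index .RepoDigests 0}}' \
  registry.example.com/myapp:1.4.2
```

Most CI systems can capture the digest from the push step itself and template it into the deployment manifests.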

Block latest tag:

apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: disallow-latest-tag
spec:
  validationFailureAction: Enforce
  background: true
  rules:
    - name: check-image-tag
      match:
        any:
          - resources:
              kinds:
                - Pod
      validate:
        message: "Using the 'latest' tag is not allowed. Specify an explicit version tag."
        pattern:
          spec:
            containers:
              - image: "!*:latest"

Require resource limits:

apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: require-resource-limits
spec:
  validationFailureAction: Enforce
  background: true
  rules:
    - name: check-limits
      match:
        any:
          - resources:
              kinds:
                - Pod
      validate:
        message: "All containers must specify resource requests and limits."
        pattern:
          spec:
            containers:
              - resources:
                  requests:
                    cpu: "?*"
                    memory: "?*"
                  limits:
                    memory: "?*"

Notice this requires memory limits but not CPU limits. CPU limits can cause throttling that degrades application performance in ways that are difficult to diagnose. Memory limits are essential because exceeding them gets the container OOM-killed. This matches common guidance in the Kubernetes community: set memory requests and limits, set CPU requests, and avoid CPU limits unless you have a specific reason for them.

Audit Mode vs. Enforce Mode

Every policy has a validationFailureAction that is either Enforce (block non-compliant resources) or Audit (allow the resource but log a policy violation).

Start every new policy in Audit mode. Run it for a week and review the violations. This tells you which existing workloads would break if you flipped to Enforce. Fix those workloads first, then switch the policy to Enforce.

# Start here
validationFailureAction: Audit

# Move here after confirming no legitimate workloads are affected
validationFailureAction: Enforce

Kyverno generates PolicyReport and ClusterPolicyReport resources that list all violations:

kubectl get policyreport -A
kubectl get clusterpolicyreport -o yaml

These reports integrate with monitoring tools — you can export them to Prometheus via the Policy Reporter project and build Grafana dashboards that show compliance trends over time.
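A sketch of that integration using Policy Reporter. The chart location and the metrics value name are assumptions, so check the project's documentation before deploying:

```shell
helm repo add policy-reporter https://kyverno.github.io/policy-reporter
helm repo update

# metrics.enabled exposes violation counts as Prometheus metrics.
helm install policy-reporter policy-reporter/policy-reporter \
  --namespace policy-reporter \
  --create-namespace \
  --set metrics.enabled=true
```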

Putting It All Together

Here is how these three tools work together in practice:

  1. Before deployment: Trivy scans the image in CI and blocks any build with critical or high CVEs from being pushed to the registry.
  2. At deployment time: Kyverno validates the pod spec against your policies. If the container runs as root, uses a latest tag, or lacks resource limits, the deployment is rejected before it starts.
  3. After deployment: Falco monitors system calls from every running container on every node. If a container spawns a shell, makes an unexpected network connection, or writes to a system directory, an alert fires.
  4. Continuously: The Trivy Operator rescans running images daily, catching newly disclosed CVEs in images that were clean when first deployed.

A Minimal Security Baseline

If you are starting from zero, deploy these three tools in this order:

Week 1: Kyverno in Audit mode. Install Kyverno and deploy the five policies above in Audit mode. Review the policy reports to understand your current compliance posture. Fix the most critical violations — pods running as root and containers with no resource limits.

Week 2: Trivy in CI. Add Trivy scanning to your CI pipeline. Set it to warn (exit code 0) initially so you can see the findings without blocking deployments. Once you have addressed the critical CVEs in your base images, switch to fail (exit code 1).

Week 3: Kyverno to Enforce. After fixing existing workloads, flip your policies from Audit to Enforce. Start with the most impactful policies — require-non-root and disallow-privilege-escalation — and work through the rest over the following days.

Week 4: Falco. Deploy Falco with the default ruleset and Falcosidekick routing alerts to your team's Slack channel. Spend the first week tuning — adding exceptions for legitimate behavior and reducing noise to a level where every alert gets investigated.

Ongoing: Trivy Operator. Install the operator for continuous scanning. Set up a weekly review process for new vulnerability findings. Integrate the policy reports and vulnerability reports into your monitoring dashboards.

Common Mistakes

Scanning images but not blocking on results. A scanner that produces reports nobody reads provides zero security value. The scanner must fail the pipeline. If you are not ready to block deployments, set a date when you will be and commit to it.

Running Falco with default rules and no tuning. The default rules are comprehensive but noisy for most workloads. An untuned Falco deployment becomes background noise within a week. Budget time for tuning during the initial rollout.

Writing policies that break existing workloads. Always start in Audit mode. Deploying a require-non-root policy in Enforce mode on a cluster full of root containers will break every deployment pipeline simultaneously.

Exempting too many namespaces. It is tempting to exclude kube-system and every infrastructure namespace from policies. Each exclusion is an attack surface. Exempt only what is genuinely required and document why.

Treating security as a one-time project. New CVEs are disclosed daily. New workloads are deployed weekly. Policies need updating as your architecture evolves. Continuous scanning, monitoring, and policy review is the only approach that maintains security over time.

The Bottom Line

Container security on Kubernetes is not about buying an expensive platform or hiring a dedicated security team. Three open-source tools — Trivy for image scanning, Falco for runtime detection, and Kyverno for policy enforcement — cover the critical attack surface for most organizations.

The pattern is straightforward: scan before deployment, enforce at admission, detect at runtime. Each layer catches what the others miss. The total setup time is measured in days, not months, and the ongoing operational overhead is a few hours per week of alert review and policy tuning.

If your cluster is running containers you have never scanned, with no policies preventing root access, and no runtime monitoring — you have a security gap that is straightforward to close. These tools are free, battle-tested, and designed for Kubernetes. The only cost is the time to set them up.