
Deploy Falco and Falcosidekick for Runtime Threat Detection on Kubernetes

A fintech’s platform team ships to a 60-node EKS cluster and has the usual pre-deploy controls dialled in — image scanning in CI, admission policies, network policies — but the CISO keeps asking the one question none of it answers: if an attacker is already inside a running container right now, what tells us? The pre-deploy gates are blind the instant a pod starts. The wake-up call was a pen-test finding where a tester popped a shell in a payments pod through a deserialization bug, read a mounted service-account token, and pivoted — and nothing alerted, because nothing was watching the kernel. This guide stands up the control that closes that gap: Falco watching every syscall on every node, and Falcosidekick fanning its alerts out to the places humans and machines actually look — Slack for the on-call, Microsoft Sentinel for correlation, and a generic webhook for everything else. By the end you will have runtime detections firing within seconds of a suspicious execve, routed and deduplicated, with a clean teardown path.

Prerequisites

Target topology


The shape is deliberately simple, which is the point of a runtime sensor: it must be cheap to run on every node. Falco runs as a privileged DaemonSet: one pod per worker node, each loading a modern eBPF probe into the kernel and subscribing to the syscall stream for every container on that node. The Kubernetes audit log can serve as a second event source, but only once the k8saudit plugin is enabled and audit events are delivered to it; the values in this guide configure the syscall source only. When a syscall sequence matches a rule (a shell spawned in a container, a read of /etc/shadow, an outbound connect to a non-allowlisted IP, a write below a read-only path), Falco emits a structured JSON alert.

Those alerts do not go to stdout and die. Each Falco pod is configured with an HTTP output pointing at Falcosidekick, a small stateless Deployment (2 replicas behind a ClusterIP Service) whose entire job is fan-out and routing. Falcosidekick receives every alert once and, based on priority and rule tags, forwards it in parallel to multiple sinks: Slack (formatted message to the on-call channel, gated to warning and above to avoid noise), Microsoft Sentinel (every event, via the Log Analytics HTTP Data Collector API, for SIEM correlation and long retention), and a generic webhook (here, the front door to an automation layer that opens a ServiceNow incident on critical and cross-references the host against CrowdStrike Falcon telemetry). Secrets for all three sinks are pulled from Vault into the Falcosidekick pod, never committed. Delivery is Falco → Falcosidekick → sinks, so adding or removing a destination is a Falcosidekick config change, not a fleet-wide Falco reconfiguration.

1. Create the namespace and pull secrets from Vault

Keep the sensors in their own namespace and label it so your admission/network policies can special-case the privileged DaemonSet.

kubectl create namespace falco
kubectl label namespace falco pod-security.kubernetes.io/enforce=privileged \
  security.kloudvin.io/component=runtime-sensor

Falcosidekick needs four secret values across its three sinks: the Slack incoming-webhook URL, the Sentinel workspace ID and shared key, and the webhook bearer token. Pull them from HashiCorp Vault rather than typing them into values.yaml. The clean pattern is the Vault Secrets Operator (or External Secrets) syncing a Vault KV path into a native Secret; the imperative equivalent for a first stand-up:

# KV v2 mount "secret": the HTTP API path is secret/data/falco/falcosidekick; the vault kv CLI takes it without data/
SLACK_WEBHOOK=$(vault kv get -field=slack_webhook_url secret/falco/falcosidekick)
SENTINEL_WSID=$(vault kv get -field=sentinel_workspace_id secret/falco/falcosidekick)
SENTINEL_KEY=$(vault kv get -field=sentinel_shared_key  secret/falco/falcosidekick)
WEBHOOK_TOKEN=$(vault kv get -field=webhook_bearer_token secret/falco/falcosidekick)

kubectl -n falco create secret generic falcosidekick-secrets \
  --from-literal=SLACK_WEBHOOKURL="$SLACK_WEBHOOK" \
  --from-literal=AZURESENTINEL_WORKSPACEID="$SENTINEL_WSID" \
  --from-literal=AZURESENTINEL_SHAREDKEY="$SENTINEL_KEY" \
  --from-literal=WEBHOOK_CUSTOMHEADERS="Authorization:Bearer $WEBHOOK_TOKEN"

The environment-variable names are intended to be exactly the keys Falcosidekick reads (it maps SLACK_WEBHOOKURL, WEBHOOK_CUSTOMHEADERS, etc.), so we can mount this Secret straight into the pod with envFrom and keep the Helm values free of any sensitive string. Check each name, the Sentinel ones in particular, against the output list in the configuration reference for your Falcosidekick version, because a variable the binary does not recognize is simply never read. In production you would let the Vault Secrets Operator own this Secret so rotation in Vault propagates automatically; the imperative form above is fine for the initial bring-up and for understanding what is in the box.
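A quick completeness check on the Secret can catch a missing key before the pods start. The sketch below is one way to do it, assuming the four key names created above; the function name missing_secret_keys is hypothetical, and it checks only that each key exists, not that its value is non-empty or valid.

```shell
#!/bin/sh
# Required keys, matching the --from-literal names used in the create command above.
REQUIRED_KEYS="SLACK_WEBHOOKURL AZURESENTINEL_WORKSPACEID AZURESENTINEL_SHAREDKEY WEBHOOK_CUSTOMHEADERS"

# Read key names (one per line) on stdin; print each required key that is absent.
missing_secret_keys() {
  present=$(cat)
  for key in $REQUIRED_KEYS; do
    printf '%s\n' "$present" | grep -qxF "$key" || echo "$key"
  done
}

# Against a live cluster (not run here), list the Secret's keys without printing values:
#   kubectl -n falco get secret falcosidekick-secrets \
#     -o go-template='{{range $k, $v := .data}}{{$k}}{{"\n"}}{{end}}' | missing_secret_keys
```

Empty output means every expected key is present; any printed name is a key to add before installing the release.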

2. Add the Helm repo and template Falcosidekick’s routing

Falco and Falcosidekick ship from the same chart, so one Helm release installs both.

helm repo add falcosecurity https://falcosecurity.github.io/charts
helm repo update
helm search repo falcosecurity/falco --versions | head -5

Now write a values.yaml that (a) selects the modern eBPF driver, (b) enables and points Falco at Falcosidekick, and (c) configures Falcosidekick’s three sinks with per-sink minimum priorities so each destination gets the right signal-to-noise. Drop secrets in via the Secret from step 1; note there are no tokens in this file:

# values.yaml
driver:
  kind: modern_ebpf          # CO-RE eBPF probe; no kernel headers, no module build

collectors:
  kubernetes:
    enabled: true            # enrich alerts with pod/namespace/labels

falco:
  json_output: true          # structured alerts (required for clean routing)
  json_include_output_property: true
  priority: notice           # Falco emits notice+; sinks filter further
  http_output:
    enabled: true
    url: "http://falco-falcosidekick:2801/"   # in-cluster Falcosidekick Service

falcosidekick:
  enabled: true
  replicaCount: 2
  config:
    debug: false
    customfields: "cluster:prod-eks-payments,env:prod"   # tag every alert
    slack:
      minimumpriority: "warning"     # page humans only on warning+
      messageformat: "Falco rule *{{ .Rule }}* on `{{ .Hostname }}` ({{ .Priority }})"
    azuresentinel:
      minimumpriority: ""            # send EVERYTHING to the SIEM
    webhook:
      address: "https://soc-bridge.internal.kloudvin.io/falco"
      minimumpriority: "critical"    # only criticals trigger automation
      mutualtls: false
  # pull Slack URL, Sentinel id/key, webhook auth header from the Vault-synced Secret
  extraEnv: []
  podSecurityContext: {}

Wire the Secret into Falcosidekick’s pod. Because the chart names the Falcosidekick env vars exactly as the Secret keys, an envFrom reference is all it takes — add this to the release via a second values fragment so the chart pods inherit it:

# values-secrets.yaml  (kept separate so the routing values can live in Git)
falcosidekick:
  extraEnvFrom:
    - secretRef:
        name: falcosidekick-secrets

If your chart version predates extraEnvFrom, the equivalent is to set the same keys under falcosidekick.config.* from the Secret using existingSecret, or patch the Deployment after install — but extraEnvFrom is the clean path on current charts.

3. Install the release

Install Falco + Falcosidekick + the web UI into the falco namespace, pinning the chart version for reproducibility (resolve the exact version from step 2’s helm search).

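# Set FALCO_CHART_VERSION to the exact chart version resolved in step 2.
# A wildcard such as 4.* is a version range, not a pin, and unquoted it is also open to shell globbing.
helm install falco falcosecurity/falco \
  --namespace falco \
  --version "$FALCO_CHART_VERSION" \
  -f values.yaml \
  -f values-secrets.yaml \
  --set falcosidekick.webui.enabled=true \
  --wait --timeout 5m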

Confirm the DaemonSet landed one pod per node and Falcosidekick is up:

kubectl -n falco get daemonset,deploy,pods -o wide
kubectl -n falco rollout status daemonset/falco

You should see falco pods equal to your node count (all Running, each with the falco and falcoctl-artifact-follow containers ready) and two falco-falcosidekick pods. If a Falco pod is CrashLoopBackOff, jump to Common pitfalls — it is almost always the driver.
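The "one pod per node" check can be made scriptable rather than eyeballed. This is a minimal sketch, assuming the DaemonSet is named falco as in the commands above; daemonset_coverage is a hypothetical helper that compares the two counts the DaemonSet status reports.

```shell
#!/bin/sh
# Compare desired vs ready pod counts for a DaemonSet.
# Arguments: <desiredNumberScheduled> <numberReady>
daemonset_coverage() {
  desired=$1
  ready=${2:-0}
  if [ "$desired" -gt 0 ] && [ "$ready" -eq "$desired" ]; then
    echo "ok: $ready/$desired nodes covered"
  else
    echo "degraded: $ready/$desired nodes covered"
    return 1
  fi
}

# Against a live cluster (not run here):
#   daemonset_coverage $(kubectl -n falco get ds falco \
#     -o jsonpath='{.status.desiredNumberScheduled} {.status.numberReady}')
```

Note that desiredNumberScheduled counts only the nodes the DaemonSet is allowed to schedule onto, so it is worth comparing it once against kubectl get nodes to make sure a taint is not silently excluding part of the fleet.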

For GitOps shops, do not run helm install by hand in prod. Commit values.yaml (the routing, no secrets) to the platform repo and let Argo CD reconcile it as an Application pointing at the falcosecurity/falco chart; the values-secrets.yaml reference stays a pointer to the Vault-synced Secret, so the Git repo never holds a token. The same manifests can be applied from a Jenkins or GitHub Actions job if you are not yet on Argo CD — the chart is identical, only the delivery mechanism differs. Infra around the cluster (the EKS node group, the IAM/OIDC trust for the SOC bridge) is Terraform; node-level hardening that has to happen outside Kubernetes — sysctl baselines, the CrowdStrike Falcon sensor install on the AMI — is Ansible. Falco is the in-cluster piece; keep its config in Git and its delivery in Argo CD.

4. Verify the eBPF driver actually loaded

A Falco pod can be Running while the probe silently failed to attach — then you have a sensor that sees nothing. Check the logs for the driver line explicitly:

kubectl -n falco logs ds/falco -c falco | grep -iE "ebpf|driver|engine|source" | head

A healthy modern-eBPF start prints something like Loading rules from..., Starting health webserver, and crucially a line confirming the modern bpf engine is the event source with the syscall source open. If you instead see fallbacks to the kernel module or errors about probe loading, the node kernel likely lacks CO-RE BTF (see pitfalls). You can also print the Falco and engine versions and list the event types the engine supports from inside the pod:

kubectl -n falco exec ds/falco -c falco -- falco --version
kubectl -n falco exec ds/falco -c falco -- falco --list-events | head
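Because a missing BTF is the usual reason the modern eBPF probe fails to load, it helps to test for it directly on a node. The sketch below assumes the kernel exposes its BTF at the standard /sys/kernel/btf/vmlinux location; check_btf is a hypothetical helper, and the optional path argument exists only so the check can be exercised against another file.

```shell
#!/bin/sh
# CO-RE programs need the kernel's BTF type information, normally exposed here.
BTF_PATH_DEFAULT=/sys/kernel/btf/vmlinux

# Report whether BTF is readable at the given path (default: the standard location).
check_btf() {
  path=${1:-$BTF_PATH_DEFAULT}
  if [ -r "$path" ]; then
    echo "btf: present ($path)"
  else
    echo "btf: missing ($path)"
    return 1
  fi
}

# Run check_btf with no argument from a shell on the node itself
# (for example over SSM/SSH or a node debug session); it is not invoked here.
```

A "missing" result on a node points at the kernel or AMI rather than at the Falco configuration, which narrows the troubleshooting considerably.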

5. Tune rules and silence the expected noise

Out of the box Falco ships a solid default rule set, but in a real cluster a handful of rules will fire constantly on legitimate behavior (your CI runner exec-ing into pods, a sidecar reading a sensitive file by design). Untuned, the on-call mutes the channel within a day — which is the actual failure mode of runtime detection. Add a custom rules file that appends exceptions rather than editing the shipped rules, so chart upgrades do not clobber your tuning:

# custom-rules.yaml  (delivered via the chart's customRules map)
customRules:
  kloudvin-tuning.yaml: |-
    # Allow our known debug image to spawn shells without alerting
    - macro: trusted_debug_containers
      condition: (container.image.repository in (registry.kloudvin.io/sre/debug))

    - rule: Terminal shell in container
      append: true
      condition: and not trusted_debug_containers

    # Stop kube-system workloads from tripping the sensitive-file rule
    - rule: Read sensitive file untrusted
      append: true
      condition: and not (k8s.ns.name = "kube-system")

    # A bespoke detection: outbound connection from a payments pod to a non-RFC1918 IP
    - rule: Payments pod egress to public IP
      desc: A pod in payments connected outbound to a public address
      condition: >
        evt.type in (connect) and evt.dir = < and k8s.ns.name = "payments"
        and fd.sip exists and not fd.snet in ("10.0.0.0/8","172.16.0.0/12","192.168.0.0/16")
      output: >
        Payments egress to public IP (pod=%k8s.pod.name dest=%fd.sip:%fd.sport
        proc=%proc.cmdline image=%container.image.repository)
      priority: CRITICAL
      tags: [network, payments, mitre_exfiltration]

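Apply it by passing custom-rules.yaml, which carries the customRules block, as an additional values file on upgrade (or merge the block into values.yaml if you prefer a single file):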

helm upgrade falco falcosecurity/falco \
  --namespace falco --reuse-values \
  -f custom-rules.yaml

The append: true form is the single most important habit in operating Falco — it keeps your tuning separate from the vendor rules so you can helm upgrade the chart and pick up new detections without re-litigating every exception. The new CRITICAL payments-egress rule will, by the routing in step 2, reach Slack and Sentinel and the webhook (which opens the ServiceNow ticket), because critical >= warning >= "".
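The threshold comparison behind that routing can be sketched in a few lines of shell. This is a minimal model of the minimumpriority semantics, not Falcosidekick's own code; priority_rank and sinks_for_priority are hypothetical names, and the sink thresholds are copied from the step 2 values.

```shell
#!/bin/sh
# Sketch of threshold routing: a sink receives an alert when the alert's
# priority ranks at or above the sink's minimumpriority ("" means no floor).

# Map a Falco priority name to a rank; higher is more severe.
priority_rank() {
  case "$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')" in
    emergency)     echo 8 ;;
    alert)         echo 7 ;;
    critical)      echo 6 ;;
    error)         echo 5 ;;
    warning)       echo 4 ;;
    notice)        echo 3 ;;
    informational) echo 2 ;;
    debug)         echo 1 ;;
    "")            echo 0 ;;   # empty threshold: accept everything
    *)             echo "unknown priority: $1" >&2; return 1 ;;
  esac
}

# Print the sinks that would receive an alert of the given priority.
sinks_for_priority() {
  alert_rank=$(priority_rank "$1") || return 1
  out=""
  # sink=minimumpriority pairs, as set in values.yaml
  for pair in "slack=warning" "sentinel=" "webhook=critical"; do
    sink=${pair%%=*}
    floor=${pair#*=}
    if [ "$alert_rank" -ge "$(priority_rank "$floor")" ]; then
      out="$out $sink"
    fi
  done
  echo "${out# }"
}

sinks_for_priority critical   # prints: slack sentinel webhook
sinks_for_priority notice     # prints: sentinel
```

Running a proposed rule's priority through a model like this before shipping it makes it obvious who gets paged and whether a ticket is opened.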

Validation

Prove the whole chain end to end by deliberately tripping a rule and watching it surface in each sink. The canonical test is spawning a shell inside a running container — the default “Terminal shell in container” rule.

# Pick any running app pod (NOT a falco pod) and exec a shell into it
TARGET=$(kubectl -n payments get pod -l app=checkout -o jsonpath='{.items[0].metadata.name}')
kubectl -n payments exec -it "$TARGET" -- /bin/sh -c "cat /etc/shadow; id"

Within a second or two:

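# 1) Falco saw the syscall. logs ds/falco reads a single pod, so select every Falco pod by label
#    (app.kubernetes.io/name=falco is assumed here; check your pods' labels) and match the rule name
kubectl -n falco logs -l app.kubernetes.io/name=falco -c falco --tail=200 | grep -i "terminal shell in container" | tail -1

# 2) Falcosidekick received and routed it; check its logs for each output
kubectl -n falco logs deploy/falco-falcosidekick | grep -iE "slack|sentinel|webhook" | tail -3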

# 3) Falcosidekick exposes Prometheus counters per output — confirm non-zero sends
kubectl -n falco exec deploy/falco-falcosidekick -- \
  wget -qO- http://localhost:2801/metrics | grep -E 'falcosidekick_outputs_total'

Then confirm the human-facing destinations: a formatted alert in your Slack on-call channel, and a record in Microsoft Sentinel under the custom log table Falcosidekick writes to (query Falcosidekick_CL in the Logs blade — allow a few minutes for the first ingestion, as the Data Collector API batches). If the critical webhook path is wired, you should also see a fresh ServiceNow incident referencing the host and rule. Seeing the same event arrive in all configured sinks is the proof that Falco detection, Falcosidekick routing, and the Vault-fed credentials are all correct.
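To turn "confirm non-zero sends" into something a script can assert, the counters can be summed before and after tripping the rule. The sketch below is a hedged example: sum_metric is a hypothetical helper, it assumes samples carry no trailing timestamp, and the metric prefix is passed in because the exact counter names depend on your Falcosidekick version.

```shell
#!/bin/sh
# Sum the values of all Prometheus samples whose metric name starts with a prefix.
# Reads exposition-format text on stdin; skips "# HELP" / "# TYPE" comment lines.
# Assumes each sample line ends with its value (no trailing timestamp).
sum_metric() {
  awk -v prefix="$1" '
    /^#/ { next }
    index($0, prefix) == 1 { total += $NF }
    END { printf "%d\n", total }
  '
}

# Against a live cluster (not run here), capture the total before and after the test exec:
#   kubectl -n falco exec deploy/falco-falcosidekick -- \
#     wget -qO- http://localhost:2801/metrics | sum_metric falcosidekick_outputs
```

If the total does not rise after the test shell, the alert never reached Falcosidekick or no output accepted it, which separates a Falco-side problem from a sink-side one.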

Rollback / teardown

Falco’s footprint is a single Helm release, so removal is clean and complete — nothing persists in the kernel after the pods stop (the eBPF probe is detached on pod exit).

# Remove the entire Falco + Falcosidekick release
helm uninstall falco --namespace falco

# Confirm DaemonSet and pods are gone (probe is unloaded with the pod)
kubectl -n falco get all

# Remove the Vault-sourced secret and the namespace
kubectl -n falco delete secret falcosidekick-secrets --ignore-not-found
kubectl delete namespace falco

If you are under Argo CD, do not helm uninstall; delete or disable the Application (or set it to non-auto-sync first), or Argo will immediately re-sync the release back. For a partial rollback, say a bad custom-rules.yaml is flooding alerts, return the release to its prior revision with helm rollback falco <previous-revision> -n falco (find the revision with helm history falco -n falco). This restores the whole previous configuration, not only the rules, and the Falco pods may restart to pick up the reverted rules, so expect a brief gap in coverage on each node while they do.

Common pitfalls

Security notes

Falco is a detection control, not a prevention one — it tells you an attacker is acting, fast, but it does not stop the syscall. Treat it as the runtime tripwire that complements your preventive layers (admission policy, network policy, image scanning) and your EDR. On these nodes CrowdStrike Falcon is the EDR doing host-level prevention and its own behavioral detection; Falco adds Kubernetes-aware, container-context syscall visibility that an EDR built for VMs does not natively express — the two are layered, and the webhook path lets a Falco critical cross-reference the same host in Falcon’s console. Lock down Falco itself: it runs privileged by necessity, so restrict who can exec into the falco namespace via RBAC, keep all sink credentials in Vault (never in values.yaml, which lives in Git), and forward Falco’s own audit and the Falcosidekick delivery logs to Sentinel so a tampering attempt on the sensor is itself an alert. Route critical detections to a ServiceNow incident automatically so a real intrusion creates a ticket with an owner and an SLA, not just a Slack message that scrolls away.

Cost notes

The sensor itself is nearly free in licensing — Falco and Falcosidekick are open-source (CNCF) — so the real cost is node overhead and SIEM ingestion. Per node, budget roughly 150-250m CPU and 256-512Mi for the Falco pod under normal syscall volume; set resources.requests/limits in the chart accordingly and watch for noisy workloads (build agents, anything doing heavy file I/O) that spike Falco CPU. The line item that surprises teams is Microsoft Sentinel ingestion: sending every event there is correct for correlation but you pay per GB, so use Falco’s priority floor and Falcosidekick’s azuresentinel.minimumpriority to keep debug/info chatter out of the SIEM while still capturing notice and above. A practical split: Slack at warning+ (low volume, free), Sentinel at notice+ (moderate, metered), the automation webhook at critical only (tiny, but each one may spawn a ServiceNow ticket and a Falcon lookup). Tune the rules (step 5) before you tune the budget — most Sentinel cost on an untuned Falco is noise from a handful of chatty rules you should have excepted anyway.
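The per-node figures add up quickly across a fleet, so it is worth doing the arithmetic once. The sketch below multiplies the per-node range above by the 60 nodes in this cluster and adds a rough ingestion estimate; fleet_overhead and daily_ingest_mib are hypothetical helpers, and the event rate and event size passed to the second one are illustrative assumptions, not measurements.

```shell
#!/bin/sh
# Fleet-wide Falco overhead: per-node request multiplied by node count.
# Arguments: <nodes> <cpu_millicores_per_node> <memory_MiB_per_node>
fleet_overhead() {
  nodes=$1
  cpu_m=$2
  mem_mi=$3
  echo "cpu=$((nodes * cpu_m))m mem=$((nodes * mem_mi))Mi"
}

# Rough daily SIEM ingestion in MiB (integer division, rounds down).
# Arguments: <cluster_wide_events_per_second> <average_bytes_per_event>
daily_ingest_mib() {
  echo $(( $1 * $2 * 86400 / 1048576 ))
}

fleet_overhead 60 150 256    # low end:  cpu=9000m mem=15360Mi
fleet_overhead 60 250 512    # high end: cpu=15000m mem=30720Mi
daily_ingest_mib 5 1500      # assumed 5 events/s at 1500 bytes each: 617
```

So the sensor reserves roughly 9 to 15 cores and 15 to 30 GiB across the cluster, and even a modest steady event rate lands in the hundreds of MiB per day of metered ingestion, which is why rule tuning comes before budget tuning.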
