A retail-logistics company is drowning in point-to-point integrations: the warehouse management system pokes the order service over a brittle REST call, the fraud team scrapes a read replica every five minutes, and the analytics team’s nightly batch is always six hours stale by the time anyone looks at it. The platform team’s mandate is to put a real event backbone in the middle — every parcel scan, every order state change, every inventory delta published once to Apache Kafka and consumed by anyone who needs it, in order, durably, with a schema contract that stops a producer from silently breaking ten downstream consumers. They already run everything on Kubernetes, so a managed cloud Kafka is off the table for the regulated on-prem workloads; they need Kafka in their own cluster, operated like a first-class platform service rather than a pet. This guide walks through deploying that backbone with Confluent for Kubernetes (CFK) — the official Confluent operator — running a three-broker (KRaft) Kafka cluster, Schema Registry, and Kafka Connect, all secured with TLS and RBAC, and slotted into the enterprise’s existing identity, secrets, GitOps, and observability tooling.
Prerequisites
- A Kubernetes cluster, v1.27+, with at least three worker nodes spread across availability zones (so the three KRaft controllers and three brokers land in different failure domains). 16 GiB RAM and 4 vCPU per node is a sane floor for a non-toy cluster.
- A default StorageClass that provisions block storage with
volumeBindingMode: WaitForFirstConsumerandallowVolumeExpansion: true(e.g.gp3on EKS,managed-csion AKS,pd-ssdon GKE). Kafka wants low-latency persistent disks, not NFS. kubectlv1.27+,helmv3.12+, and theconfluentCLI v3+ on your workstation.- A Confluent Platform license (a 30-day trial license ships built in; production needs a license secret). Confluent Platform 7.7 / CFK 2.9 is assumed below.
- A CA you can issue certificates from — here, HashiCorp Vault’s PKI secrets engine acts as the internal CA so every Kafka listener gets a short-lived, auto-renewed TLS cert instead of a hand-rolled self-signed chain.
- An OIDC identity provider for human and CI access — Microsoft Entra ID is the workforce IdP, federated from Okta as the upstream SSO so engineers log into Kafka tooling with the same corporate identity they use everywhere else.
- Cluster-admin on the target namespace, and a Git repository the platform team controls (the operator and all
CRs are delivered through Argo CD, neverkubectl applyby hand in production).
Target topology
Everything lives in a single namespace, confluent. CFK — the operator — runs as a Deployment and watches a set of custom resources (Kafka, KRaftController, SchemaRegistry, Connect, KafkaTopic, KafkaRestClass, ConfluentRolebinding). You declare what you want as YAML; the operator reconciles brokers, controllers, StatefulSets, Services, and certificates to match. The three KRaft controllers form the metadata quorum (replacing the old ZooKeeper ensemble), the three brokers form the data plane, Schema Registry enforces schema contracts on the wire, and Kafka Connect runs source/sink connectors. Inter-component traffic is mTLS; client traffic is TLS with RBAC. Producers and consumers reach the cluster through a bootstrap Service; external systems reach it through a dedicated external listener.
The supporting cast around the cluster is what turns it from a demo into a platform:
- HashiCorp Vault is the internal CA (PKI engine) and the store for the Connect connector credentials and the license secret — pulled in by the Vault Agent injector so nothing sensitive sits in a plain Kubernetes Secret.
- Microsoft Entra ID (federated from Okta) is the OIDC provider behind RBAC, so a human’s group membership maps to Kafka principals and role bindings.
- Argo CD delivers the operator and every custom resource via GitOps; GitHub Actions runs the schema-compatibility and config-lint checks before a change is allowed to merge.
- Terraform provisions the cluster, node pools, StorageClasses, and Vault PKI mount; Ansible handles any node-level tuning (file-descriptor limits,
vm.max_map_count) the managed node image does not already set. - Dynatrace (or Datadog) scrapes the JMX/Prometheus metrics and the OneAgent gives per-broker and per-topic telemetry; CrowdStrike Falcon runs as a node sensor for runtime threat detection on the Kafka nodes.
- Wiz (and Wiz Code in the pipeline) does posture and IaC scanning so a broker listener never drifts to plaintext or a public LoadBalancer without an alert.
- ServiceNow is the change gate — a new external listener or an RBAC role binding for a third party goes through a change request before Argo CD is allowed to sync it.
- Akamai sits at the edge only for the operator-adjacent HTTP surfaces (the Control Center UI, the Schema Registry REST endpoint when exposed to partners), terminating TLS and providing WAF in front of those.
1. Provision the cluster prerequisites with Terraform
Stand up (or reuse) the Kubernetes cluster, a fast StorageClass, and the Vault PKI mount with Terraform so the substrate is reproducible. The Kafka-specific pieces are the StorageClass and the Vault PKI role; the cluster itself is whatever your platform already uses.
# storageclass.tf — fast block storage, expandable, late-bound to the AZ the pod lands in
resource "kubernetes_storage_class" "kafka_fast" {
metadata { name = "kafka-fast" }
storage_provisioner = "ebs.csi.aws.com"
reclaim_policy = "Retain" # never auto-delete broker data
volume_binding_mode = "WaitForFirstConsumer"
allow_volume_expansion = true
parameters = {
type = "gp3"
iops = "6000"
throughput = "250"
encrypted = "true"
}
}
# vault-pki.tf — internal CA that issues short-lived Kafka listener certs
resource "vault_mount" "pki_kafka" {
path = "pki_kafka"
type = "pki"
max_lease_ttl_seconds = 7776000 # 90d issuing-CA lifetime
}
resource "vault_pki_secret_backend_role" "kafka" {
backend = vault_mount.pki_kafka.path
name = "kafka-internal"
allowed_domains = ["confluent.svc.cluster.local", "kafka.internal.acme.com"]
allow_subdomains = true
max_ttl = "720h" # 30d leaf certs, auto-renewed
key_type = "rsa"
key_bits = 2048
}
terraform init && terraform apply -auto-approve
kubectl get storageclass kafka-fast # confirm it exists and is the intended class
Run wiz-code iac scan . in the pipeline at this point — Wiz Code flags an unencrypted volume, a reclaim_policy of Delete on stateful storage, or a public endpoint before any of it reaches the cluster.
2. Install the Confluent for Kubernetes operator
Add the Confluent Helm repo and install CFK into the confluent namespace. The operator is cluster-scoped here so it can manage future namespaces, but the workload runs in confluent.
kubectl create namespace confluent
helm repo add confluentinc https://packages.confluent.io/helm
helm repo update
helm upgrade --install confluent-operator \
confluentinc/confluent-for-kubernetes \
--namespace confluent \
--set namespaced=false \
--version 0.1149.x # CFK chart matching CP 7.7
kubectl -n confluent get pods -l app.kubernetes.io/name=confluent-operator
In production you do not run that helm command by hand. The Helm release is declared as an Argo CD Application, and Argo CD reconciles it from Git. The block below is the GitOps source of truth:
# argocd/confluent-operator.yaml
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
name: confluent-operator
namespace: argocd
spec:
project: platform
source:
repoURL: https://packages.confluent.io/helm
chart: confluent-for-kubernetes
targetRevision: 0.1149.x
helm:
parameters:
- { name: namespaced, value: "false" }
destination:
server: https://kubernetes.default.svc
namespace: confluent
syncPolicy:
automated: { prune: true, selfHeal: true }
3. Wire TLS certificates from Vault
CFK supports auto-generated certs, but here Vault is the CA so leaf certs are short-lived and centrally revocable. Issue the CA bundle and a server keypair from the Vault PKI role provisioned in step 1, and load them as the cluster’s certificate authority. The confluent CLI’s helper generates the right Secret shape, or use Vault directly:
# Pull the CA chain Vault will sign with
vault read -field=certificate pki_kafka/cert/ca > cacerts.pem
# Create the CA secret CFK uses to auto-generate per-component server certs
kubectl -n confluent create secret generic ca-pair-sslcerts \
--from-file=ca.crt=cacerts.pem \
--from-file=ca.key=ca.key # the issuing CA key, injected via Vault Agent, not on disk
For dynamic per-broker leaf certs, annotate the namespace so the Vault Agent injector mounts a freshly issued cert into each broker pod, and point CFK at that path. The practical payoff: when a cert is 30 days from expiry Vault re-issues it and the agent rotates it in place, so nobody is paged at 2 a.m. for an expired Kafka listener.
4. Deploy the KRaft controllers and the Kafka brokers
This is the core. One KRaftController CR (three replicas) forms the metadata quorum; one Kafka CR (three replicas) is the data plane. Note the explicit TLS config, the storage size bound to the kafka-fast class, anti-affinity across zones, and the resource requests.
# kafka-cluster.yaml
apiVersion: platform.confluent.io/v1beta1
kind: KRaftController
metadata:
name: kraftcontroller
namespace: confluent
spec:
replicas: 3
image:
application: confluentinc/cp-server:7.7.0
init: confluentinc/confluent-init-container:2.9.0
dataVolumeCapacity: 10Gi
storageClass: { name: kafka-fast }
tls:
secretRef: ca-pair-sslcerts # signed by Vault PKI
---
apiVersion: platform.confluent.io/v1beta1
kind: Kafka
metadata:
name: kafka
namespace: confluent
spec:
replicas: 3
image:
application: confluentinc/cp-server:7.7.0
init: confluentinc/confluent-init-container:2.9.0
dataVolumeCapacity: 100Gi
storageClass: { name: kafka-fast }
dependencies:
kRaftController:
controllerListener:
tls: { enabled: true }
clusterRef: { name: kraftcontroller }
podTemplate:
resources:
requests: { cpu: "2", memory: 8Gi }
limits: { memory: 8Gi }
affinity:
podAntiAffinity:
requiredDuringSchedulingIgnoredDuringExecution:
- topologyKey: topology.kubernetes.io/zone
labelSelector:
matchLabels: { app: kafka }
listeners:
internal:
authentication: { type: mtls }
tls: { enabled: true }
external:
externalAccess:
type: loadBalancer
loadBalancer:
domain: kafka.internal.acme.com
bootstrapPrefix: bootstrap
authentication: { type: mtls }
tls: { enabled: true }
configOverrides:
server:
- "min.insync.replicas=2" # with RF=3, tolerate one broker loss
- "default.replication.factor=3"
- "auto.create.topics.enable=false" # topics are declared, never accidental
Apply via Git (kubectl apply -f kafka-cluster.yaml only for a throwaway dev cluster; otherwise commit it and let Argo CD sync):
kubectl apply -f kafka-cluster.yaml
kubectl -n confluent rollout status statefulset/kafka --timeout=600s
kubectl -n confluent get kafka kafka -o jsonpath='{.status.phase}' # want: RUNNING
The min.insync.replicas=2 with default.replication.factor=3 is the durability contract: a produce with acks=all only succeeds when the leader plus one follower have the record, so a single broker failure never loses an acknowledged write — and auto.create.topics.enable=false means a typo’d topic name fails loudly instead of silently spawning an unmanaged topic.
5. Enable RBAC backed by Entra ID
Kafka RBAC binds principals to roles on resources. The principals come from OIDC — engineers authenticate through Entra ID (federated upstream from Okta), CI/connectors use mTLS principals. CFK configures the Metadata Service (MDS) that issues and validates RBAC tokens. Enable it on the Kafka CR and declare role bindings as ConfluentRolebinding CRs:
# rbac.yaml
apiVersion: platform.confluent.io/v1beta1
kind: ConfluentRolebinding
metadata:
name: orders-team-developerwrite
namespace: confluent
spec:
principal:
type: group
name: "kafka-orders-producers" # an Entra ID security group, surfaced via OIDC
role: DeveloperWrite
resourcePatterns:
- { resourceType: Topic, name: "orders.", patternType: PREFIXED }
---
apiVersion: platform.confluent.io/v1beta1
kind: ConfluentRolebinding
metadata:
name: analytics-readonly
namespace: confluent
spec:
principal: { type: group, name: "kafka-analytics-consumers" }
role: DeveloperRead
resourcePatterns:
- { resourceType: Topic, name: "orders.", patternType: PREFIXED }
- { resourceType: Group, name: "analytics-", patternType: PREFIXED }
The win here is least privilege by default: the orders team can write to orders.* and nothing else, the analytics team can only read, and because the principal is an Entra ID group, access is granted and revoked by HR/IT group membership — a leaver loses Kafka access the moment they leave the directory group, with no Kafka-side change. Every new third-party role binding routes through a ServiceNow change request before Argo CD is permitted to sync it.
6. Deploy Schema Registry
Schema Registry enforces the schema contract so a producer cannot ship a breaking change that silently corrupts every consumer. It runs as its own CR, depends on Kafka over TLS, and is set to reject incompatible schemas.
# schema-registry.yaml
apiVersion: platform.confluent.io/v1beta1
kind: SchemaRegistry
metadata:
name: schemaregistry
namespace: confluent
spec:
replicas: 2
image: { application: confluentinc/cp-schema-registry:7.7.0, init: confluentinc/confluent-init-container:2.9.0 }
dependencies:
kafka:
bootstrapEndpoint: kafka.confluent.svc.cluster.local:9071
authentication: { type: mtls }
tls: { enabled: true }
tls: { secretRef: ca-pair-sslcerts }
Set the global compatibility level to BACKWARD (new schema can read old data) so consumers never break, and enforce it in CI — GitHub Actions runs schema-registry-maven-plugin:test-compatibility against the live registry on every PR, so a breaking Avro change fails the build, not production:
kubectl apply -f schema-registry.yaml
kubectl -n confluent rollout status statefulset/schemaregistry --timeout=300s
# Pin global compatibility to BACKWARD via the REST API (through the in-cluster service)
curl -s --cacert cacerts.pem \
-X PUT https://schemaregistry.confluent.svc.cluster.local:8081/config \
-H "Content-Type: application/json" \
-d '{"compatibility": "BACKWARD"}'
7. Deploy Kafka Connect
Kafka Connect runs source and sink connectors — here, a sink that streams orders.* into the analytics warehouse. The connector’s credentials come from Vault (injected, never inline). Deploy the Connect cluster, then register a connector with a Connector CR.
# connect.yaml
apiVersion: platform.confluent.io/v1beta1
kind: Connect
metadata:
name: connect
namespace: confluent
spec:
replicas: 2
image: { application: confluentinc/cp-server-connect:7.7.0, init: confluentinc/confluent-init-container:2.9.0 }
dependencies:
kafka:
bootstrapEndpoint: kafka.confluent.svc.cluster.local:9071
authentication: { type: mtls }
tls: { enabled: true }
podTemplate:
podSecurityContext: { fsGroup: 1000, runAsUser: 1000 }
---
apiVersion: platform.confluent.io/v1beta1
kind: Connector
metadata:
name: orders-to-warehouse
namespace: confluent
spec:
class: "io.confluent.connect.jdbc.JdbcSinkConnector"
taskMax: 3
connectClusterRef: { name: connect }
configs:
topics: "orders.created,orders.shipped"
connection.url: "${vault:secret/data/connect/warehouse#jdbc_url}" # resolved by Vault
insert.mode: "upsert"
pk.mode: "record_key"
kubectl apply -f connect.yaml
kubectl -n confluent rollout status statefulset/connect --timeout=300s
kubectl -n confluent get connector orders-to-warehouse -o jsonpath='{.status.connectorState}' # RUNNING
8. Hand the cluster to observability and runtime security
Point Dynatrace (OneAgent on the node pool, plus the JMX/Prometheus extension) or Datadog (the Datadog Agent with the Kafka and Kafka Connect integrations) at the brokers so you get per-broker request latency, under-replicated partition counts, consumer-group lag, and disk-usage trends — the four metrics that predict a Kafka incident before it happens. Deploy CrowdStrike Falcon as a node-level DaemonSet sensor so runtime threats on the Kafka nodes feed the SOC, and let Wiz run continuous posture scanning so an accidental plaintext listener or a public LoadBalancer drift raises an alert immediately.
# Expose broker JMX as Prometheus for the agent to scrape (CFK ships the exporter)
kubectl -n confluent get svc kafka-0-internal -o yaml | grep -A2 metrics
# Confirm Datadog/Dynatrace is reading consumer lag:
kubectl -n confluent exec kafka-0 -- kafka-consumer-groups \
--bootstrap-server localhost:9071 --describe --group analytics-warehouse \
--command-config /mnt/sslcerts/client.properties
Validation
Walk the data path end to end before declaring success. Use the in-cluster Kafka tooling with a client config that presents an mTLS cert.
# 1. Cluster + components are RUNNING
kubectl -n confluent get kafka,kraftcontroller,schemaregistry,connect
# 2. Create a managed topic (declarative, not auto-created) and confirm RF=3
kubectl apply -f - <<'EOF'
apiVersion: platform.confluent.io/v1beta1
kind: KafkaTopic
metadata: { name: orders.created, namespace: confluent }
spec:
replicas: 3
partitions: 12
configs: { "min.insync.replicas": "2" }
EOF
kubectl -n confluent exec kafka-0 -- kafka-topics \
--bootstrap-server localhost:9071 --describe --topic orders.created \
--command-config /mnt/sslcerts/client.properties
# 3. Produce and consume a record over TLS
kubectl -n confluent exec -it kafka-0 -- bash -c \
'echo "{\"orderId\":\"A-1\"}" | kafka-console-producer \
--bootstrap-server localhost:9071 --topic orders.created \
--producer.config /mnt/sslcerts/client.properties'
kubectl -n confluent exec kafka-0 -- kafka-console-consumer \
--bootstrap-server localhost:9071 --topic orders.created --from-beginning --max-messages 1 \
--consumer.config /mnt/sslcerts/client.properties
# 4. Register a schema and confirm BACKWARD compatibility is enforced
curl -s --cacert cacerts.pem https://schemaregistry.confluent.svc.cluster.local:8081/config
# 5. Verify RBAC denies an unauthorized principal (should return AuthorizationException)
kafka-topics --bootstrap-server kafka.internal.acme.com:9092 --list \
--command-config /tmp/unauthorized-client.properties
A green run is: all four CRs RUNNING, orders.created showing ReplicationFactor: 3 and Isr: 3, a record round-tripping, the registry returning BACKWARD, and the unauthorized client being denied.
Rollback and teardown
Because everything is declarative, rollback is a Git revert that Argo CD reconciles. To tear a cluster down by hand, delete the workload CRs first (so the operator drains and removes StatefulSets cleanly), then the operator, then — deliberately and last — the PersistentVolumeClaims, because the Retain reclaim policy keeps broker data even after the PVCs are gone.
# 1. Remove workloads (operator drains brokers gracefully)
kubectl -n confluent delete connector --all
kubectl -n confluent delete connect connect
kubectl -n confluent delete schemaregistry schemaregistry
kubectl -n confluent delete kafka kafka
kubectl -n confluent delete kraftcontroller kraftcontroller
# 2. Remove the operator
helm -n confluent uninstall confluent-operator
# 3. DELIBERATELY remove data last (irreversible)
kubectl -n confluent delete pvc -l app=kafka
kubectl -n confluent delete pvc -l app=kraftcontroller
For a rollback rather than a teardown, git revert the offending commit; Argo CD’s selfHeal re-applies the previous known-good CR set. Never edit a live CR with kubectl edit in production — the next Argo sync will overwrite it and you will have lost the change.
Common pitfalls
- Using
volumeBindingMode: Immediatestorage. The PVC binds to a zone before the pod is scheduled, so the broker pod and its disk can land in different zones and the pod never starts. Always useWaitForFirstConsumerfor the broker StorageClass. - Setting
reclaim_policy: Deleteon broker storage. A namespace delete or an over-eager Argo prune then wipes Kafka data permanently. UseRetainand delete PVCs manually as a separate, deliberate step. - Leaving
auto.create.topics.enable=true. A consumer subscribing to a misspelled topic silently creates an unmanaged, single-partition, RF-1 topic that quietly drops data on a broker loss. Disable it and declare topics withKafkaTopicCRs. min.insync.replicasequal to the replication factor. With RF=3 andmin.insync.replicas=3, a single broker reboot blocks allacks=allproduces. Keep it at 2 so you can tolerate one broker down.- Forgetting pod anti-affinity. Without it, the scheduler can stack all three brokers on one node, and a single node failure takes the whole cluster offline. Pin anti-affinity to
topology.kubernetes.io/zone. - Self-signed certs with no rotation. A hand-rolled cert expires unnoticed and the entire cluster’s TLS listeners go dark at once. Letting Vault PKI issue and the agent rotate removes that whole class of 2 a.m. page.
Security notes
The cluster is TLS-everywhere by construction: mTLS on the internal and controller listeners, TLS on the external listener, all signed by Vault’s PKI engine so leaf certs are short-lived and revocable. RBAC binds Entra ID groups (federated from Okta) to least-privilege roles, so access follows directory membership and a leaver is de-authorized automatically. Connector secrets and the license live in Vault, injected at runtime, never in a plain Kubernetes Secret or a Git-committed config. Wiz continuously checks posture so a plaintext-listener or public-LoadBalancer drift alarms immediately, and Wiz Code blocks the same misconfigurations in IaC before merge. CrowdStrike Falcon node sensors feed runtime detections to the SOC, and ServiceNow is the documented change gate for any new external listener or third-party role binding. Where a Schema Registry or Control Center surface is exposed to partners, Akamai terminates TLS and provides WAF at the edge.
Cost notes
The dominant cost is persistent disk and the always-on broker compute, not the Kafka software. Size dataVolumeCapacity to real retention — set per-topic retention.ms aggressively (orders events rarely need 7 days) so you are not paying to store data nobody reads, and use gp3 with provisioned IOPS rather than the pricier io2 unless a topic genuinely needs it. Three brokers and three KRaft controllers is the floor for production durability; do not over-provision replicas chasing headroom you can add later with allowVolumeExpansion and broker scale-out. Right-size the podTemplate resource requests to observed usage in Dynatrace/Datadog rather than guessing high, and let the Connect cluster scale taskMax to load instead of running idle workers. Topic compaction (cleanup.policy=compact) on changelog-style topics keeps only the latest value per key and can cut storage for those topics by an order of magnitude.