Flux is deceptively simple to bootstrap and deceptively easy to turn into a tangle of cross-referencing Kustomizations that nobody can reason about. The hard part is never flux bootstrap; it is the repo topology, the overlay strategy, and the isolation model that keep ten teams shipping into shared clusters without stepping on each other. This is the structure I reach for when a platform has to run real multi-tenancy on Flux v2 and survive an audit.
1. The controller architecture you are actually operating
Flux is not one binary. It is the GitOps Toolkit: a set of single-responsibility controllers that watch CRDs and reconcile. You operate all of them, so know what each owns.
- source-controller: fetches and verifies artifacts. Owns GitRepository, OCIRepository, HelmRepository, Bucket. It produces a checksummed tarball that everything downstream consumes.
- kustomize-controller: reconciles Kustomization objects. It builds the overlay, applies it server-side, prunes, and runs health checks.
- helm-controller: reconciles HelmRelease objects by driving the Helm SDK against a chart sourced by source-controller.
- notification-controller: handles both inbound webhooks (Receiver) and outbound events/alerts (Provider, Alert).
- image-reflector-controller and image-automation-controller: scan registries (ImageRepository, ImagePolicy) and write image bumps back to Git (ImageUpdateAutomation).
The mental model: source-controller answers “what is the desired state in Git/OCI,” kustomize- and helm-controllers answer “make the cluster match it,” and the rest is plumbing. Every CRD reconciles on its own interval, independently. There is no central scheduler.
# See the controllers and their toolkit version
flux check
kubectl -n flux-system get deploy -l app.kubernetes.io/part-of=flux
2. Designing the monorepo: clusters, infrastructure, tenants
A monorepo wins for a platform team: atomic cross-cutting changes, one place to grep, and directory-scoped CODEOWNERS to recover most of the isolation a polyrepo would give you. The layout that has held up for me separates cluster entrypoints from what they reference:
fleet-infra/
  clusters/
    prod-eu/
      flux-system/          # bootstrap output: gotk-components + gotk-sync
      infrastructure.yaml   # Kustomization -> ../../infrastructure/prod
      tenants.yaml          # Kustomization -> ../../tenants (prod overlay)
    staging/
      flux-system/
      infrastructure.yaml
      tenants.yaml
  infrastructure/
    base/                   # ingress-nginx, cert-manager, kyverno, ...
    prod/                   # overlay of base
    staging/
  tenants/
    base/                   # per-tenant namespace, RBAC, GitRepository, Kustomization
      team-payments/
      team-search/
    prod/                   # overlay: prod GitRepository revisions, quotas
    staging/
The rule that keeps this sane: a cluster directory only ever contains Kustomization objects that point elsewhere in the repo. It is a manifest of “what runs here,” never the workloads themselves. infrastructure/ is platform-owned and applied with cluster-admin authority. tenants/ is where isolation gets enforced, covered in step 5.
Keep dependsOn edges flowing one direction: tenants depend on infrastructure, never the reverse. If a platform Kustomization waits on something a tenant can break or delete, you have built a cross-tenant denial-of-service into your reconciliation graph.
3. Bootstrapping: CLI vs. Terraform, and pinning the toolkit
flux bootstrap is idempotent. It commits gotk-components.yaml (the controller manifests) and gotk-sync.yaml (the GitRepository + Kustomization that makes Flux manage itself) into the cluster path, installs them, and configures deploy-key or token access.
export GITHUB_TOKEN=ghp_xxx
flux bootstrap github \
  --owner=acme \
  --repository=fleet-infra \
  --branch=main \
  --path=clusters/prod-eu \
  --components-extra=image-reflector-controller,image-automation-controller \
  --version=v2.7.4
Two things matter here. Pin --version to an exact toolkit release; never let bootstrap float to latest, or a controller CRD will change shape under you on the next run. And add the image controllers at bootstrap via --components-extra if you intend to use image automation; they are not installed by default.
For fleets, drive bootstrap through the official Terraform provider so cluster onboarding is reviewable infrastructure, not a laptop command:
provider "flux" {
  kubernetes = {
    host                   = var.cluster_endpoint
    cluster_ca_certificate = base64decode(var.cluster_ca)
    token                  = var.cluster_token
  }
  git = {
    url    = "ssh://git@github.com/acme/fleet-infra.git"
    branch = "main"
    ssh    = { username = "git", private_key = var.deploy_key }
  }
}

resource "flux_bootstrap_git" "this" {
  version          = "v2.7.4"
  path             = "clusters/prod-eu"
  components_extra = ["image-reflector-controller", "image-automation-controller"]
  cluster_domain   = "cluster.local"
  network_policy   = true
}
network_policy = true (the provider's default) makes bootstrap install NetworkPolicies in flux-system that deny ingress to the controllers from other namespaces. Setting it explicitly documents the intent. Pair the Terraform version with a Renovate or Dependabot rule so toolkit upgrades arrive as PRs.
4. Kustomize overlays for dev/staging/prod
Flux runs the same Kustomize engine as kubectl kustomize; if it builds locally, it builds in-cluster. Keep a thin base/ and put environment deltas in overlays via strategic-merge and JSON6902 patches.
# tenants/base/team-payments/kustomization.yaml
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  - namespace.yaml
  - rbac.yaml
  - sync.yaml   # the tenant's own GitRepository + Kustomization

# tenants/prod/kustomization.yaml
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  - ../base/team-payments
  - ../base/team-search
patches:
  - target:
      kind: Kustomization
      group: kustomize.toolkit.fluxcd.io
    patch: |
      - op: replace
        path: /spec/interval
        value: 5m
components:
  - ../components/prod-quotas
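The prod-quotas component referenced above is not shown in the original layout, so here is a minimal sketch of what it could contain. The path, object names, and every limit value are assumptions for illustration. The comments mark three separate files, shown in one block for brevity.

```yaml
# tenants/components/prod-quotas/kustomization.yaml (hypothetical path)
apiVersion: kustomize.config.k8s.io/v1alpha1
kind: Component
resources:
  - quota.yaml
  - limits.yaml
---
# tenants/components/prod-quotas/quota.yaml (illustrative limits)
apiVersion: v1
kind: ResourceQuota
metadata:
  name: tenant-quota        # no namespace: the including overlay decides it
spec:
  hard:
    requests.cpu: "8"
    requests.memory: 16Gi
    limits.cpu: "16"
    limits.memory: 32Gi
    pods: "50"
---
# tenants/components/prod-quotas/limits.yaml (illustrative defaults)
apiVersion: v1
kind: LimitRange
metadata:
  name: tenant-defaults
spec:
  limits:
    - type: Container
      default:              # applied when a container sets no limits
        cpu: 500m
        memory: 512Mi
      defaultRequest:       # applied when a container sets no requests
        cpu: 100m
        memory: 128Mi
```

Because the objects carry no namespace, a component like this probably works best when included from a per-tenant kustomization that sets namespace:, so each tenant gets its own copy rather than one shared object.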
Two overlay features earn their keep at scale:
- Components (kind: Component) are reusable, composable overlay fragments. A prod-quotas component that adds a ResourceQuota and LimitRange can be pulled into every prod tenant without copy-paste.
- Variable substitution is a Flux feature, not Kustomize. The Kustomization.spec.postBuild.substituteFrom block injects values from a ConfigMap or Secret after the build, so you keep one base and vary region, domain, or replica count per cluster:
# clusters/prod-eu/infrastructure.yaml
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: infrastructure
  namespace: flux-system
spec:
  interval: 10m
  path: ./infrastructure/prod
  prune: true
  sourceRef:
    kind: GitRepository
    name: flux-system
  postBuild:
    substituteFrom:
      - kind: ConfigMap
        name: cluster-vars   # contains region=eu-west-1, domain=eu.acme.io
In manifests you reference ${region} and ${domain:=default}. Use substituteFrom over inline substitute so values live in versioned ConfigMaps, not in the Kustomization spec. Be deliberate: any literal ${...} in your YAML is now a substitution target, which can bite you in shell scripts embedded in manifests.
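To make the substitution behaviour concrete, here is a sketch of the cluster-vars ConfigMap, a manifest that consumes it, and a manifest that opts out. The Ingress, the backup ConfigMap, and their names are hypothetical. The opt-out annotation and the $${...} escape are the mechanisms described in the Flux Kustomization documentation, so check them against the toolkit version you pin.

```yaml
# The values source; it must live in the same namespace as the Flux Kustomization
apiVersion: v1
kind: ConfigMap
metadata:
  name: cluster-vars
  namespace: flux-system
data:
  region: eu-west-1
  domain: eu.acme.io
---
# Hypothetical manifest under ./infrastructure/prod that consumes the variables
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: status-page
  namespace: ingress-nginx
  labels:
    region: ${region}                          # becomes eu-west-1
spec:
  rules:
    - host: status.${domain:=example.internal} # default used only if domain is unset
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: status-page
                port:
                  number: 80
---
# Hypothetical manifest with an embedded shell script: opt out of substitution
apiVersion: v1
kind: ConfigMap
metadata:
  name: backup-script
  namespace: ingress-nginx
  annotations:
    kustomize.toolkit.fluxcd.io/substitute: disabled
data:
  backup.sh: |
    #!/bin/sh
    # ${HOSTNAME} is left for the shell because substitution is disabled here
    tar czf "/backups/${HOSTNAME}.tgz" /data
```

If only one or two expressions in a file need protecting, writing them as $${HOSTNAME} should leave a literal ${HOSTNAME} in the applied object while the rest of the file is still substituted.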
5. Enforcing multi-tenancy: source boundaries + impersonation
This is the part most teams get wrong. Multi-tenancy in Flux is not RBAC alone; it is the combination of per-tenant sources and Kustomization impersonation. Without impersonation, every tenant Kustomization applies with the kustomize-controller’s service account, which is cluster-admin. That means any tenant can write any manifest anywhere.
The fix is spec.serviceAccountName on the tenant Kustomization. The controller then impersonates that ServiceAccount and applies under its RBAC. Bind it narrowly to the tenant namespace.
# tenants/base/team-payments/rbac.yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: reconciler
  namespace: team-payments
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: reconciler
  namespace: team-payments
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: admin   # namespace-admin, NOT cluster-admin
subjects:
  - kind: ServiceAccount
    name: reconciler
    namespace: team-payments

# tenants/base/team-payments/sync.yaml
apiVersion: source.toolkit.fluxcd.io/v1
kind: GitRepository
metadata:
  name: team-payments
  namespace: team-payments
spec:
  interval: 1m
  url: https://github.com/acme/team-payments-config
  ref:
    branch: main
  secretRef:
    name: git-auth
---
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: team-payments
  namespace: team-payments
spec:
  interval: 5m
  path: ./deploy
  prune: true
  serviceAccountName: reconciler   # <-- impersonation: the isolation boundary
  sourceRef:
    kind: GitRepository
    name: team-payments
  targetNamespace: team-payments
Three boundaries are now closed at once:
- Source isolation: the tenant’s GitRepository lives in their namespace and points at their repo. Flux allows cross-namespace sourceRef by default, so set --no-cross-namespace-refs=true on the controllers; with it, a tenant cannot reference the platform repo or another tenant’s source across namespaces.
- Apply isolation: serviceAccountName: reconciler means even if a tenant commits a ClusterRoleBinding, the apply fails because their SA cannot create cluster-scoped objects.
- Namespace pinning: targetNamespace forces everything into their namespace regardless of what their manifests claim.
Set --default-service-account=default on the kustomize- and helm-controllers cluster-wide. Then any Kustomization that forgets serviceAccountName falls back to the (powerless) default SA in its own namespace rather than silently inheriting cluster-admin. This single flag turns “secure by configuration” into “secure by default” and is the most important hardening switch in a Flux multi-tenant install.
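One way to set these flags without hand-editing gotk-components.yaml is to patch the bootstrap kustomization. The sketch below follows the lockdown pattern in the Flux multi-tenancy documentation and assumes the standard clusters/prod-eu/flux-system/kustomization.yaml that bootstrap generates. Trim the Deployment name list to the controllers you actually run.

```yaml
# clusters/prod-eu/flux-system/kustomization.yaml
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  - gotk-components.yaml
  - gotk-sync.yaml
patches:
  # Block cross-namespace references on every controller that resolves them
  - patch: |
      - op: add
        path: /spec/template/spec/containers/0/args/-
        value: --no-cross-namespace-refs=true
    target:
      kind: Deployment
      name: "(kustomize-controller|helm-controller|notification-controller|image-reflector-controller|image-automation-controller)"
  # Fall back to the namespace's default SA when serviceAccountName is omitted
  - patch: |
      - op: add
        path: /spec/template/spec/containers/0/args/-
        value: --default-service-account=default
    target:
      kind: Deployment
      name: "(kustomize-controller|helm-controller)"
  # The self-managing Kustomization must keep its privileged identity
  - patch: |
      - op: add
        path: /spec/serviceAccountName
        value: kustomize-controller
    target:
      kind: Kustomization
      name: flux-system
```

Once the default is powerless, platform-owned Kustomizations in flux-system (infrastructure, tenants) would also need an explicit serviceAccountName: kustomize-controller, otherwise they fall back to the default SA like everything else.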
6. Dependency ordering, health checks, and wait semantics
Order is explicit, not inferred. dependsOn makes a Kustomization wait until its dependency is Ready.
# clusters/prod-eu/tenants.yaml
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: tenants
  namespace: flux-system
spec:
  interval: 10m
  path: ./tenants/prod
  prune: true
  dependsOn:
    - name: infrastructure   # CRDs + controllers land first
  sourceRef:
    kind: GitRepository
    name: flux-system
  wait: true                 # block until all applied objects are healthy
  timeout: 5m
The semantics that trip people up:
- wait: true blocks the Kustomization’s own readiness on all its objects passing health checks. Combined with a downstream dependsOn, this gives you ordered rollout. Without wait, dependsOn only waits for the Kustomization to report Ready (i.e. applied), not for the workloads to be healthy.
- For object-specific gates, list healthChecks with explicit GVK + name. Flux ships built-in health evaluation for Deployments, StatefulSets, DaemonSets, and any Kustomization/HelmRelease; for custom resources, use healthCheckExprs (CEL) to define readiness.
- timeout bounds both the apply and the wait. Set it generously above your slowest rollout, or a slow image pull will mark the Kustomization failed and stall everything that depends on it.
healthChecks:
  - apiVersion: apps/v1
    kind: Deployment
    name: payments-api
    namespace: team-payments
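For custom resources, a healthCheckExprs entry might look like the sketch below. The db.example.com/v1 Database kind and the payments-db object are hypothetical stand-ins for whatever CRD your tenants use. The CEL expressions follow the condition-filter pattern from the Flux Kustomization documentation and assume the resource reports a standard Ready condition.

```yaml
# Fragment of the tenant Kustomization; Database is a hypothetical CRD
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: team-payments
  namespace: team-payments
spec:
  # interval, path, prune, sourceRef, serviceAccountName as in sync.yaml
  timeout: 10m
  healthChecks:
    - apiVersion: db.example.com/v1
      kind: Database
      name: payments-db
      namespace: team-payments
  healthCheckExprs:
    - apiVersion: db.example.com/v1
      kind: Database
      # Healthy once the Ready condition reports True
      current: "status.conditions.filter(e, e.type == 'Ready').all(e, e.status == 'True')"
      # Failed as soon as the Ready condition reports False
      failed: "status.conditions.filter(e, e.type == 'Ready').all(e, e.status == 'False')"
```

The expressions are matched by apiVersion and kind, so a single entry should cover every Database the Kustomization checks.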
7. Automated image updates with write-back commits
Image automation is three objects working together. ImageRepository scans tags, ImagePolicy selects the one you want, and ImageUpdateAutomation writes the chosen tag back to Git.
apiVersion: image.toolkit.fluxcd.io/v1beta2
kind: ImageRepository
metadata:
  name: payments-api
  namespace: flux-system
spec:
  image: ghcr.io/acme/payments-api
  interval: 5m
---
apiVersion: image.toolkit.fluxcd.io/v1beta2
kind: ImagePolicy
metadata:
  name: payments-api
  namespace: flux-system
spec:
  imageRepositoryRef:
    name: payments-api
  policy:
    semver:
      range: ">=1.4.0 <2.0.0"   # never auto-cross a major
---
apiVersion: image.toolkit.fluxcd.io/v1beta2
kind: ImageUpdateAutomation
metadata:
  name: payments-api
  namespace: flux-system
spec:
  interval: 10m
  sourceRef:
    kind: GitRepository
    name: flux-system
  git:
    checkout:
      ref:
        branch: main
    commit:
      author:
        name: fluxcdbot
        email: fluxcdbot@acme.io
      messageTemplate: "chore: bump {{range .Changed.Changes}}{{.NewValue}}{{end}}"
    push:
      branch: flux-image-updates   # PR source branch; never push straight to main
  update:
    path: ./tenants/prod
    strategy: Setters
The controller edits only lines you mark with a setter comment, so it can never rewrite arbitrary YAML:
image: ghcr.io/acme/payments-api:1.4.2 # {"$imagepolicy": "flux-system:payments-api"}
Push to a dedicated branch (push.branch) and gate it with a PR + required checks rather than committing straight to main. That keeps a human (or a policy bot) in the loop for production while the bump itself is fully automated. Constrain the semver.range so automation never crosses a major version on its own.
8. Drift detection, pruning, and recovering a stuck reconciliation
Flux applies server-side and continuously corrects drift: edit a managed Deployment by hand and the next reconcile reverts it. prune: true garbage-collects objects you deleted from Git, tracked by an inventory the controller maintains. Turning prune off is how orphaned resources accumulate; leave it on everywhere except during a deliberate migration.
Operational moves you will reach for:
# Force an immediate reconcile, pulling the latest from Git first
flux reconcile kustomization tenants --with-source
# Suspend during an incident so Flux stops fighting your manual changes
flux suspend kustomization team-payments -n team-payments
flux resume kustomization team-payments -n team-payments
# See why something is stuck
flux get kustomizations -A --status-selector ready=false
kubectl -n team-payments describe kustomization team-payments
For a Kustomization wedged on a single bad object (a finalizer hang, or an immutable-field conflict on an apply), suspend, fix or delete the offending object directly, then resume. If the inventory itself has diverged after a botched cutover, deleting and recreating the Kustomization rebuilds the inventory from Git cleanly, but set spec.prune: false on it first: deleting a Kustomization that still has prune: true garbage-collects everything in its inventory. When an apply keeps failing because a change touches an immutable field, spec.force: true lets Flux delete and recreate the object instead of erroring forever. Treat it as a temporary switch, because recreation means downtime for that object.
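spec.force applies to every object a Kustomization manages. The Flux Kustomization documentation also describes per-object annotations that scope the same behaviour more narrowly, which is usually the safer tool. A sketch follows; the Job, the PVC, and their names are hypothetical.

```yaml
# A Job's pod template is immutable; let Flux recreate only this object on change
apiVersion: batch/v1
kind: Job
metadata:
  name: payments-migrate
  namespace: team-payments
  annotations:
    kustomize.toolkit.fluxcd.io/force: enabled
spec:
  template:
    spec:
      restartPolicy: Never
      containers:
        - name: migrate
          image: ghcr.io/acme/payments-api:1.4.2
          args: ["migrate"]
---
# Stateful data: keep this object even if it disappears from Git
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: payments-ledger
  namespace: team-payments
  annotations:
    kustomize.toolkit.fluxcd.io/prune: disabled
spec:
  accessModes: ["ReadWriteOnce"]
  resources:
    requests:
      storage: 20Gi
```

Scoping force and prune per object leaves prune: true on at the Kustomization level while still protecting the few objects where recreation or deletion would be destructive.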
Verify
Confirm the platform is healthy and the isolation actually holds:
# 1. Toolkit healthy and version-pinned
flux check
flux version
# 2. Everything Ready across all namespaces
flux get all -A
# 3. Sources reconciling and revisions current
flux get sources git -A
# 4. Prove impersonation: a tenant SA CANNOT touch cluster scope
kubectl auth can-i create clusterrolebindings \
--as=system:serviceaccount:team-payments:reconciler
# expected: no
# 5. Prove namespace isolation: tenant SA cannot read another namespace
kubectl auth can-i get secrets -n team-search \
--as=system:serviceaccount:team-payments:reconciler
# expected: no
# 6. Image automation is selecting the tag you expect
flux get image policy -A
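If step 4 returns yes, the reconciler ServiceAccount is bound to cluster-admin or another cluster-wide role. Stop and fix it before onboarding tenants; nothing downstream is isolated until that returns no. Note that can-i tests only the ServiceAccount’s RBAC. Confirm separately that every tenant Kustomization sets serviceAccountName, or that --default-service-account is in place, because a Kustomization without it never uses that SA.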
Enterprise scenario
A payments platform team ran a single shared “prod” cluster for eight product squads on Flux v2. They had namespaces and RBAC, and assumed they had multi-tenancy. During a routine review, a security engineer committed a ClusterRoleBinding granting cluster-admin to a test ServiceAccount into one squad’s application repo, expecting Flux to reject it. Flux applied it successfully. The root cause: every tenant Kustomization omitted serviceAccountName, so all of them reconciled with the kustomize-controller’s cluster-admin identity. Namespace RBAC was decorative; the reconciler ignored it entirely.
The constraint was that they could not stop deployments while fixing this, and could not trust eight teams to add the right field to every Kustomization. So they made the platform secure-by-default instead of per-object. They set the controller-wide fallback to a powerless account, then patched each tenant Kustomization to impersonate a namespace-scoped reconciler:
# Flags added to the kustomize-controller Deployment via the bootstrap kustomization
# (the controller's other default args are unchanged and omitted here)
spec:
  template:
    spec:
      containers:
        - name: manager
          args:
            - --default-service-account=default
            - --no-cross-namespace-refs=true
            - --watch-all-namespaces=true
With --default-service-account=default, the next reconcile of any Kustomization that lacked serviceAccountName ran as the powerless default SA and failed loudly, surfacing every place isolation had been missing. They worked the resulting failure list namespace by namespace, adding a reconciler SA bound to the namespaced admin ClusterRole and setting serviceAccountName on each Kustomization. There was no outage: a failed reconcile leaves running workloads untouched, so only the privilege each tenant reconciled with changed. The rogue ClusterRoleBinding was pruned on the first reconcile after its tenant gained a scoped SA. The lasting fix was the one flag: isolation became the default state of the platform, not a property each team had to remember to opt into.