Deploy Argo CD on Kubernetes with OIDC SSO, RBAC, and ApplicationSets for Multi-Cluster GitOps

A platform team running 14 Kubernetes clusters (prod and non-prod, across three regions and two clouds) has hit the wall that every growing fleet hits. Every team runs kubectl apply into its own namespaces by hand, nobody can say with confidence what is deployed where, and a config drift on one prod cluster took four hours to find because the “source of truth” was three engineers’ laptops. The mandate from the head of platform is blunt: one declarative control plane, Git as the only source of truth, SSO so nobody shares a kubeconfig, and a way to roll the same workload onto a new cluster without writing it 14 times. This guide builds exactly that: a single hub Argo CD that manages the whole fleet over GitOps, federates login through your corporate IdP, scopes every team to its own projects with RBAC, and uses ApplicationSets to fan one template across every registered cluster. Every command here is one you can run today once you substitute your own hostnames, group names, and repo URLs.

Prerequisites
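
The steps in this guide call kubectl, helm, argocd, and vault from your workstation, plus yq in CI. A minimal preflight sketch, assuming a POSIX shell; the function name missing_tools is hypothetical and not part of any Argo CD tooling:

```shell
# missing_tools: print each named command that is NOT on PATH, one per line.
# Hypothetical helper for this guide; prints nothing when everything is present.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# Usage (commented out so the snippet has no side effects when sourced):
# missing_tools kubectl helm argocd vault yq
```

An empty result means every CLI resolved. It does not check versions, cluster access, or IdP admin rights, which you still need to confirm by hand.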

Target topology

The model is hub-and-spoke. One Argo CD install on the hub cluster holds all configuration, talks to your Git repos, and reconciles desired state onto every registered spoke (workload) cluster by calling each spoke’s Kubernetes API with a stored service-account credential. Humans never kubectl into a spoke for app changes: they open a pull request, Argo CD detects the new Git commit, and syncs. Login is federated. The argocd-server delegates authentication to bundled Dex, which brokers OIDC to Okta or Entra ID; the returned token’s group claims drive Argo CD RBAC, so the payments team can sync and modify only the payments project. ApplicationSets sit on the hub and generate one Argo CD Application per cluster (or per cluster × per app) from a single template, so onboarding cluster #15 is one label, not 200 lines of YAML.

1. Install Argo CD on the hub cluster

Install via the official Helm chart so configuration is declarative and upgradeable. Create the namespace and a values file first.

kubectl create namespace argocd

helm repo add argo https://argoproj.github.io/argo-helm
helm repo update

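A minimal but production-shaped values.yaml: HA Redis, two server replicas behind your ingress, and insecure mode off. With server.insecure: false, argocd-server serves TLS itself, so the ingress (and Akamai’s origin connection) must either pass TLS through or re-encrypt to an HTTPS backend. Set it to true only if you terminate TLS at the ingress and forward plain HTTP to the pod:

# argocd-values.yaml
global:
  domain: argocd.kloudvin.io

configs:
  params:
    server.insecure: false        # argocd-server serves TLS; ingress must pass through or re-encrypt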
  cm:
    # OIDC + RBAC config is added in steps 2 and 4
    admin.enabled: "true"         # we disable this in step 4 after SSO works

redis-ha:
  enabled: true                   # HA Redis; the default anti-affinity expects 3+ worker nodes
controller:
  replicas: 1                     # one app-controller; shard later if needed
server:
  replicas: 2
  autoscaling:
    enabled: true
    minReplicas: 2
repoServer:
  replicas: 2
applicationSet:
  replicas: 2                     # the ApplicationSet controller (step 5)
dex:
  enabled: true                   # bundled Dex for OIDC brokering (step 2)

Install it:

helm install argocd argo/argo-cd \
  --namespace argocd \
  --version 7.7.0 \
  -f argocd-values.yaml

kubectl -n argocd rollout status deploy/argocd-server --timeout=180s

Grab the bootstrap admin password (we retire this account in step 4) and log in once to confirm the install:

ARGO_PWD=$(kubectl -n argocd get secret argocd-initial-admin-secret \
  -o jsonpath='{.data.password}' | base64 -d)

argocd login argocd.kloudvin.io --username admin --password "$ARGO_PWD" --grpc-web
argocd version --short

2. Wire OIDC SSO through Dex to Okta or Entra ID

Argo CD ships Dex as an identity broker. You point Dex at your corporate IdP over OIDC; Dex handles the dance and hands Argo CD a token whose groups claim you will use for RBAC. First, create the app on the IdP side.

Okta — create an OIDC Web app, set the sign-in redirect URI to https://argocd.kloudvin.io/api/dex/callback, and add a groups claim (filter: matches regex .*) to the ID token. Note the client ID and client secret.

Entra ID — register an application, add a Web redirect URI of https://argocd.kloudvin.io/api/dex/callback, create a client secret, and under Token configuration add the groups optional claim so the token carries the user’s group object IDs.

Never put the client secret in values.yaml. Keep it in HashiCorp Vault alongside the Git repo credentials, so nothing sensitive lives in the chart or in Git. Create the Kubernetes Secret that Argo CD reads, sourcing the value from Vault, and label it app.kubernetes.io/part-of=argocd, because Argo CD only resolves $secret:key references against Secrets that carry that label:

# value fetched from Vault at deploy time, never echoed into shell history in CI
OIDC_SECRET=$(vault kv get -field=dex_client_secret secret/argocd/oidc)

kubectl -n argocd create secret generic argocd-dex-oidc \
  --from-literal=dex.okta.clientSecret="$OIDC_SECRET"

# without this label the $argocd-dex-oidc:... reference in dex.config is not resolved
kubectl -n argocd label secret argocd-dex-oidc app.kubernetes.io/part-of=argocd

Now add the Dex connector to the argocd-cm ConfigMap. For Okta (Entra notes follow):

# add under configs.cm in argocd-values.yaml, then helm upgrade
url: https://argocd.kloudvin.io
dex.config: |
  connectors:
    - type: oidc
      id: okta
      name: Okta
      config:
        issuer: https://kloudvin.okta.com
        clientID: 0oa1exampleClientId
        clientSecret: $argocd-dex-oidc:dex.okta.clientSecret   # ref to the secret above
        insecureEnableGroups: true
        scopes: ["openid", "profile", "email", "groups"]

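For Entra ID, the connector instead uses the Microsoft issuer and the app registration’s IDs. Store the Entra client secret under its own key in the same Secret (dex.entra.clientSecret here) rather than reusing the Okta key:

    - type: oidc
      id: entra
      name: Entra ID
      config:
        issuer: https://login.microsoftonline.com/<TENANT_ID>/v2.0
        clientID: <APP_CLIENT_ID>
        clientSecret: $argocd-dex-oidc:dex.entra.clientSecret
        insecureEnableGroups: true   # Dex drops the groups claim unless this is set
        scopes: ["openid", "profile", "email"]
        getUserInfo: true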

Apply with helm upgrade argocd argo/argo-cd -n argocd -f argocd-values.yaml, then restart Dex and the server so they reload:

kubectl -n argocd rollout restart deploy/argocd-dex-server deploy/argocd-server

Browse to https://argocd.kloudvin.io and you should now see a “LOG IN VIA OKTA” (or Entra) button. Log in with a corporate account; you are authenticated but not yet authorized — that is step 4.
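
When RBAC later misbehaves, the first question is usually whether the groups claim actually reached Argo CD. A minimal sketch for inspecting an ID token’s payload, assuming a POSIX shell and a base64 that accepts -d; jwt_payload is a hypothetical helper, and it only decodes the token without verifying the signature:

```shell
# jwt_payload: print the decoded JSON payload of a JWT (header.payload.signature).
# Hypothetical debugging helper; it does NOT validate the token.
jwt_payload() {
  # Take the middle segment and map base64url characters back to base64.
  seg=$(printf '%s' "$1" | cut -d. -f2 | tr '_-' '/+')
  # base64url omits '=' padding; restore it to a multiple of four characters.
  case $(( ${#seg} % 4 )) in
    2) seg="${seg}==" ;;
    3) seg="${seg}=" ;;
  esac
  printf '%s' "$seg" | base64 -d
}

# Usage (ID_TOKEN is whatever token your IdP or browser dev tools handed you):
# jwt_payload "$ID_TOKEN"
```

Look for a groups array containing the names (Okta) or object IDs (Entra) you plan to reference in policy.csv. Once logged in through the CLI, argocd account get-user-info reports the groups Argo CD itself sees for the session.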

3. Register workload (spoke) clusters

The hub reconciles onto each spoke by calling that spoke’s API server with a stored credential. The CLI automates the whole handshake: it creates an argocd-manager ServiceAccount, a ClusterRole, and a ClusterRoleBinding on the spoke, then stores the resulting bearer token as a cluster Secret on the hub.

Make sure your local kubeconfig has a context per spoke, then register each:

kubectl config get-contexts        # confirm you have spoke contexts

# Register three spokes (two prod, one staging). The context name is what your kubeconfig calls them.
argocd cluster add prod-eu-west-1   --name prod-eu-west-1   --grpc-web
argocd cluster add prod-ap-south-1  --name prod-ap-south-1  --grpc-web
argocd cluster add staging-eu-west-1 --name staging-eu-west-1 --grpc-web

For production, label clusters at registration so ApplicationSets can target them by selector rather than by name. To add labels to a cluster that is already registered, re-run cluster add with --upsert, or patch the stored secret:

argocd cluster add prod-eu-west-1 --name prod-eu-west-1 \
  --label env=prod --label region=eu-west-1 --label tier=gold \
  --upsert --grpc-web

Verify the fleet is connected (the hub’s own cluster shows as https://kubernetes.default.svc):

argocd cluster list
# SERVER                          NAME               VERSION  STATUS      MESSAGE
# https://10.0.4.10               prod-eu-west-1     1.29     Successful
# https://10.2.4.10               prod-ap-south-1    1.29     Successful
# https://kubernetes.default.svc  in-cluster         1.29     Successful

In a GitOps-pure setup you would instead commit each cluster Secret (with the credential sourced from Vault via the External Secrets Operator) to the bootstrap repo so cluster registration itself is declarative — but argocd cluster add is the fastest correct path to get running.
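
As a rough illustration of that declarative path, the sketch below renders a cluster Secret in the shape Argo CD documents for declarative setup. It assumes a POSIX shell; render_cluster_secret, the cluster- name prefix, and the two placeholder values are hypothetical, and in practice the token and CA would be filled in by the External Secrets Operator rather than typed by hand:

```shell
# render_cluster_secret NAME SERVER ENV REGION
# Prints a declarative Argo CD cluster Secret manifest to stdout.
# The bearer token and CA data are placeholders, never real credentials.
render_cluster_secret() {
  cat <<EOF
apiVersion: v1
kind: Secret
metadata:
  name: cluster-$1
  namespace: argocd
  labels:
    argocd.argoproj.io/secret-type: cluster
    env: $3
    region: $4
type: Opaque
stringData:
  name: $1
  server: $2
  config: |
    {
      "bearerToken": "<TOKEN_FROM_VAULT>",
      "tlsClientConfig": { "insecure": false, "caData": "<BASE64_CA>" }
    }
EOF
}

# Usage:
# render_cluster_secret prod-eu-west-1 https://10.0.4.10 prod eu-west-1
```

The argocd.argoproj.io/secret-type: cluster label is what makes Argo CD treat the Secret as a cluster, and the remaining labels (env, region) are the ones an ApplicationSet cluster generator can select on.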

4. Define Projects and lock down RBAC

This is the step that turns a shared toy into a multi-tenant platform. AppProjects are the security boundary: each project restricts which Git repos, which destination clusters/namespaces, and which resource kinds its apps may use. RBAC then maps IdP groups to what they can do within those projects.

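Create an AppProject per team. The payments team may deploy only from its repo and only to payments-* namespaces. The destination below allows any registered cluster; to restrict the project to prod and staging, replace server: '*' with one entry per allowed cluster URL: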

# projects/payments.yaml
apiVersion: argoproj.io/v1alpha1
kind: AppProject
metadata:
  name: payments
  namespace: argocd
spec:
  description: Payments squad workloads
  sourceRepos:
    - https://github.com/kloudvin/payments-manifests.git
  destinations:
    - server: '*'
      namespace: 'payments-*'        # only payments-* namespaces, on any cluster
  clusterResourceWhitelist:
    - group: ''
      kind: Namespace
  namespaceResourceBlacklist:
    - group: ''
      kind: ResourceQuota            # platform team owns quotas, not the squad
  roles:
    - name: deployer
      description: CI identity that can sync payments apps
      policies:
        - p, proj:payments:deployer, applications, sync, payments/*, allow

Now the RBAC policy that maps your IdP groups to roles. Argo CD’s built-in roles are role:admin and role:readonly; you define the rest. Map the Okta/Entra group (by group name for Okta, or group object ID for Entra) to a role. Add this to argocd-rbac-cm:

# add under configs.rbac in argocd-values.yaml
configs:
  rbac:
    policy.default: role:readonly          # everyone logged in can view; nothing more
    scopes: '[groups]'
    policy.csv: |
      # Platform admins: full control
      g, kloudvin-platform-admins, role:admin

      # Payments squad: full control of ONLY the payments project
      p, role:payments-admin, applications, *, payments/*, allow
      p, role:payments-admin, logs, get, payments/*, allow
      p, role:payments-admin, exec, create, payments/*, deny
      g, kloudvin-payments-team, role:payments-admin

      # SRE: sync any app fleet-wide, but cannot delete or edit project config
      p, role:sre, applications, sync, */*, allow
      p, role:sre, applications, get, */*, allow
      p, role:sre, applications, delete, */*, deny
      g, kloudvin-sre, role:sre

The mental model: g, <group>, <role> grants a group a role; p, <role>, <resource>, <action>, <object>, allow|deny is a permission line where object is project/app. deny always wins over allow, which is how you carve exec/delete out of an otherwise-powerful role. Apply with helm upgrade.
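
A typo in policy.csv tends to fail silently, so a cheap shape check before helm upgrade is worth having. The sketch below only checks the two line shapes described above (three fields for g, six fields ending in allow or deny for p). It assumes a POSIX shell with awk, lint_policy is a hypothetical helper, and it does not evaluate whether a policy grants what you intend:

```shell
# lint_policy: read policy.csv text on stdin and report malformed lines.
# Accepted shapes:
#   g, <group>, <role>
#   p, <role>, <resource>, <action>, <object>, allow|deny
# Exit status is non-zero if any line is malformed.
lint_policy() {
  awk '
    /^[ \t]*#/ || /^[ \t]*$/ { next }   # skip comments and blank lines
    {
      line = $0
      n = split(line, f, ",")
      for (i = 1; i <= n; i++) gsub(/^[ \t]+|[ \t]+$/, "", f[i])
      if (f[1] == "g" && n == 3) next
      if (f[1] == "p" && n == 6 && (f[6] == "allow" || f[6] == "deny")) next
      printf "line %d: malformed: %s\n", NR, line
      bad = 1
    }
    END { exit (bad ? 1 : 0) }
  '
}

# Usage:
# lint_policy < policy.csv
```

For an authoritative answer, recent argocd CLIs ship offline checks under argocd admin settings rbac (validate and can, both taking --policy-file); treat the helper above as a pre-commit convenience, not a replacement.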

With SSO and RBAC proven, retire the local admin account so the only way in is your IdP:

# in argocd-values.yaml
configs:
  cm:
    admin.enabled: "false"

helm upgrade once more. From now on, access is corporate identity only — auditable, MFA-backed, and revoked the moment HR offboards someone.

5. Fan out workloads with ApplicationSets

An ApplicationSet is a controller-driven template that generates Argo CD Applications. Instead of writing one Application per cluster, you write one template and a generator that supplies the parameters. This is how “deploy the monitoring agent to every prod cluster” becomes one object.

Cluster generator — generate one Application per registered cluster matching a label selector. Here we roll a fleet-wide observability and security baseline (the Dynatrace OneAgent for distributed tracing and the CrowdStrike Falcon sensor for runtime threat detection) onto every prod cluster:

# applicationsets/platform-baseline.yaml
apiVersion: argoproj.io/v1alpha1
kind: ApplicationSet
metadata:
  name: platform-baseline
  namespace: argocd
spec:
  goTemplate: true
  generators:
    - clusters:
        selector:
          matchLabels:
            env: prod              # only clusters labelled env=prod (step 3)
  template:
    metadata:
      name: 'baseline-{{.name}}'   # baseline-prod-eu-west-1, baseline-prod-ap-south-1, ...
    spec:
      project: platform            # this AppProject must exist; define it like payments in step 4
      source:
        repoURL: https://github.com/kloudvin/platform-baseline.git
        targetRevision: main
        path: 'overlays/{{.metadata.labels.region}}'   # per-region kustomize overlay
      destination:
        server: '{{.server}}'
        namespace: platform-system
      syncPolicy:
        automated:
          prune: true
          selfHeal: true           # revert manual drift automatically
        syncOptions:
          - CreateNamespace=true

Apply it and watch one Application appear per matching cluster:

kubectl apply -f applicationsets/platform-baseline.yaml
argocd appset list
argocd app list -l argocd.argoproj.io/application-set-name=platform-baseline

Matrix generator for the harder case — every app × every cluster. Combine a Git directory generator (each folder under apps/ is a microservice) with the cluster generator so each service lands on each prod cluster:

  generators:
    - matrix:
        generators:
          - git:
              repoURL: https://github.com/kloudvin/payments-manifests.git
              revision: main
              directories:
                - path: apps/*
          - clusters:
              selector:
                matchLabels:
                  env: prod
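
The matrix generator produces the Cartesian product of its child generators, so the Application count is (folders under apps/) × (matching clusters). A small sketch to preview that fan-out, assuming a POSIX shell; matrix_names is hypothetical, the ledger service is made up for illustration, and the app-cluster naming scheme is an assumption because the fragment above does not show its template:

```shell
# matrix_names "APP1 APP2 ..." "CLUSTER1 CLUSTER2 ..."
# Prints one <app>-<cluster> name per combination, mimicking a matrix generator.
matrix_names() {
  apps=$1
  clusters=$2
  for app in $apps; do            # unquoted on purpose: split on whitespace
    for cluster in $clusters; do
      printf '%s-%s\n' "$app" "$cluster"
    done
  done
}

# Usage:
# matrix_names "checkout ledger" "prod-eu-west-1 prod-ap-south-1"
```

Two services across two prod clusters yields four Applications; adding a third cluster label makes it six with no change to the ApplicationSet.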

This is the real payoff: onboarding cluster #15 is argocd cluster add ... --label env=prod, and within one reconciliation cycle every baseline component and every targeted app reconciles onto it untouched by human hands. The app-of-apps bootstrap pattern ties it together. A single root Application points at the projects/ and applicationsets/ folders in your config repo, so the entire control plane is itself GitOps-managed and reproducible from an empty cluster.

6. Connect the delivery and IaC tooling

GitOps changes how CI/CD draws its boundary. Jenkins or GitHub Actions builds and tests the image, pushes it to the registry, and then its only job is to commit the new image tag to the manifests repo — it does not kubectl apply. Argo CD owns the cluster. A typical GitHub Actions tail:

# .github/workflows/release.yml (final step only)
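      - name: Bump image tag in GitOps repo
        run: |
          yq -i '.image.tag = "${{ github.sha }}"' \
            apps/checkout/values.yaml
          # runners have no git identity by default, and the commit fails without one
          git config user.name  "github-actions[bot]"
          git config user.email "github-actions[bot]@users.noreply.github.com"
          git commit -am "checkout: ${{ github.sha }}"
          git push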

The clusters themselves are provisioned with Terraform (VPCs, node pools, the hub install), and node-level config is converged with Ansible (kernel params and the CrowdStrike Falcon sensor DaemonSet prerequisites, including on the virtual appliances that bridge legacy network segments into the mesh). Security and compliance hook in around the flow. Wiz (with Wiz Code) scans the manifests repo and the running clusters for misconfigurations, exposed secrets, and toxic IAM combinations: Wiz Code fails a PR check before a bad manifest is ever merged, and Wiz continuously flags posture drift in the live fleet. A failed sync or a Wiz critical finding raises a ServiceNow incident automatically, so the platform team gets a ticket with a change record rather than a buried log line. Where the org runs internal training, Moodle hosts the team’s GitOps runbook and the onboarding course every new squad completes before they get a project.

Validation

Prove the whole chain end to end before you hand it over:

# 1. SSO works and RBAC is scoped: log in as a payments-team user. With
#    policy.default: role:readonly they can view every app but change only payments.
argocd login argocd.kloudvin.io --sso --grpc-web
argocd account can-i sync applications 'payments/*'   # expect: yes
argocd account can-i sync applications 'platform/*'   # expect: no

# 2. Fleet is healthy and synced
argocd app list -o wide              # all rows Healthy / Synced
argocd cluster list                  # every spoke Successful

# 3. ApplicationSet fan-out is correct
argocd app list -l argocd.argoproj.io/application-set-name=platform-baseline
# one app per prod cluster, all Synced

# 4. Self-heal actually heals: introduce drift on a spoke, watch it revert
kubectl --context prod-eu-west-1 -n platform-system \
  scale deploy/oneagent --replicas=0
sleep 30
argocd app get baseline-prod-eu-west-1 | grep -i 'sync status'   # back to Synced

A green argocd app get showing Sync Status: Synced and Health Status: Healthy after you deliberately broke a spoke is the single best proof the control loop is real.
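
The fixed sleep 30 in the self-heal check is a guess at how long reconciliation takes. A more patient variant polls until a condition holds or a deadline passes. This is a generic sketch assuming a POSIX shell; wait_until and the is_synced example in the comment are hypothetical helpers:

```shell
# wait_until TIMEOUT_SECONDS INTERVAL_SECONDS COMMAND [ARGS...]
# Re-runs COMMAND every INTERVAL seconds until it succeeds.
# Returns 0 on success, 1 if TIMEOUT elapses first.
wait_until() {
  timeout_s=$1
  interval_s=$2
  shift 2
  elapsed=0
  until "$@"; do
    [ "$elapsed" -ge "$timeout_s" ] && return 1
    sleep "$interval_s"
    elapsed=$(( elapsed + interval_s ))
  done
  return 0
}

# Usage (is_synced wraps the same grep the validation step uses):
# is_synced() { argocd app get "$1" | grep -qi 'sync status:.*synced'; }
# wait_until 120 5 is_synced baseline-prod-eu-west-1
```

If you only need to block on an Argo CD app, the built-in argocd app wait <app> --sync --health --timeout 120 does the same job without a helper.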

Rollback and teardown

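Argo CD makes rollback first-class because every sync maps to a Git revision. The GitOps-clean way to roll an app back is to revert the commit in Git. As a faster stopgap, pin the app to a prior history ID. The CLI takes the Application name (not the project/app form used in RBAC policies), and it refuses to roll back while automated sync is enabled:

argocd app history checkout
argocd app rollback checkout <HISTORY_ID>

To disable auto-sync during an incident so you can roll back or stabilize by hand, set the sync policy to none. For an ApplicationSet-generated app the controller reverts manual edits to the Application spec, so either change the policy in the ApplicationSet template or list /spec/syncPolicy under ignoreApplicationDifferences first:

argocd app set baseline-prod-eu-west-1 --sync-policy none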

Full teardown — remove generated apps first (the ApplicationSet owns them), then spokes, then the hub:

kubectl delete -f applicationsets/platform-baseline.yaml   # removes generated apps
argocd cluster rm prod-eu-west-1
argocd cluster rm prod-ap-south-1
helm uninstall argocd -n argocd
kubectl delete namespace argocd

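Deleting the ApplicationSet cascades to its generated Applications, and deleting those Applications prunes the workloads they created on the spokes. The hub needs each spoke’s stored credential to do that pruning, so order matters: delete the ApplicationSets, wait until argocd app list no longer shows the generated apps, and only then remove the clusters. To remove an ApplicationSet but leave its Applications and workloads running, delete it with kubectl delete --cascade=orphan instead.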

Common pitfalls

Security notes

Treat the hub as a tier-0 asset — it holds credentials to every cluster it manages, so a hub compromise is a fleet compromise. Keep all secrets out of Git: the OIDC client secret and repo credentials live in HashiCorp Vault, injected via the External Secrets Operator, never in the Helm values or a committed manifest. Disable the local admin account once SSO works so every action is tied to a corporate identity with MFA. Scope every team with an AppProject allow-list — repos, destinations, and resource kinds — and prefer deny lines in RBAC for exec and delete so even powerful roles cannot shell into a pod or nuke an app. Wiz Code gates the manifests repo and Wiz watches the live posture; CrowdStrike Falcon covers node runtime; and the bundled Dex plus Akamai’s WAF at the edge keep the auth surface and the public surface tight. Pin the chart and image versions explicitly — a floating latest is a supply-chain hole.

Cost notes

The control plane itself is cheap. The hub is a handful of pods (server, repo-server, app-controller, applicationset-controller, redis-ha, Dex), comfortably a few vCPU and a few GB of RAM, so a small node pool hosts it. The real savings are operational: ApplicationSets collapse N per-cluster manifests into one template, so the human cost of running 14 clusters approaches the cost of running one. selfHeal and Git-as-truth eliminate the multi-hour drift hunts that started this project. Watch two things. First, the repo-server can get CPU-hungry rendering many Helm/Kustomize apps on every refresh; raise its replica count and the controller’s --repo-server-timeout-seconds rather than over-provisioning the whole cluster. Second, on a large fleet lengthen the reconciliation interval (timeout.reconciliation in argocd-cm, default 180s) and let Git webhooks trigger prompt refreshes, so you are not paying for constant Git polling you do not need. If you keep a separate non-prod hub, run it on spot/preemptible-backed nodes and keep the prod hub on on-demand nodes, and the entire fleet-wide GitOps control plane lands well inside a modest monthly budget.
