
Deploy Harbor Registry on Kubernetes with Trivy Scanning, Replication, and Cosign Signing

A fintech platform team has been pulling base images straight from Docker Hub into production for two years, and an auditor finally asked the question that ends that era: “prove that the image running in your payments cluster is the one your pipeline built, that it was scanned, and that nobody swapped a layer in between.” Nobody could. There was no private registry of record, no vulnerability gate, no signature — just a latest tag and trust. This guide builds the thing that makes that audit a five-minute conversation: a self-hosted Harbor OCI registry on Kubernetes, with Trivy scanning every push and blocking pulls of vulnerable images, Cosign and Notation signing so a deploy can be made to refuse anything unsigned, and replication so a second region (and the air-gapped DR cluster) always has the exact same bits. By the end you have a registry that is the single source of truth for every container in the estate, gated and signed, that a CISO will sign off on.

Prerequisites

Target topology


The registry runs as a set of Harbor microservices on the cluster (core, portal, jobservice, registry, and trivy, plus a PostgreSQL database and Redis), fronted by an ingress that terminates TLS. Two paths flow through it. The publish path: CI (GitHub Actions or Jenkins) builds an image, pushes it to a staging project in Harbor, Trivy scans it on arrival, and only if the scan passes the gate does the pipeline run cosign sign and promote the image to a production project. The consume path: Argo CD deploys to the cluster, an admission policy verifies the Cosign signature against your public key before the pod is admitted and the kubelet pulls, and Harbor itself refuses to serve any image carrying a vulnerability at or above the project’s severity threshold. A replication rule continuously mirrors the production project to a second Harbor in the DR region. Wrapped around all of it: Entra ID/Okta for who-can-do-what, HashiCorp Vault holding the signing keys and robot-account tokens, and Wiz independently scanning the running registry and the images it stores for posture drift.

1. Create the namespace, TLS, and storage

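Harbor must be served over TLS with a certificate your clients trust: docker, crane, and cosign all refuse a registry with an untrusted certificate unless told to skip verification, which defeats the point of signing. Issue a certificate first. With cert-manager and a ClusterIssuer already configured: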

kubectl create namespace harbor

cat <<'EOF' | kubectl apply -f -
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: harbor-tls
  namespace: harbor
spec:
  secretName: harbor-tls
  dnsNames:
    - harbor.example.com   # Notary was removed in Harbor 2.9, so no notary hostname is needed
  issuerRef:
    name: letsencrypt-prod
    kind: ClusterIssuer
EOF

kubectl -n harbor get certificate harbor-tls -w

Wait for READY=True before continuing. If you use a corporate CA instead, create the secret directly:

kubectl -n harbor create secret tls harbor-tls \
  --cert=harbor.example.com.crt --key=harbor.example.com.key
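For the cert-manager path above, watching with -w is awkward in a script because it never exits. A minimal sketch, assuming kubectl already points at the cluster, that blocks until the Certificate reports Ready or a timeout passes. The function name and the 300s default are illustrative choices, not part of cert-manager:

```shell
# Block until a cert-manager Certificate is Ready, or fail after a timeout.
# Defaults mirror the names used in this guide; adjust them to your setup.
wait_for_certificate() {
  local namespace="${1:-harbor}"
  local name="${2:-harbor-tls}"
  local timeout="${3:-300s}"
  kubectl -n "$namespace" wait --for=condition=Ready \
    "certificate/${name}" --timeout="$timeout"
}
```

Calling wait_for_certificate harbor harbor-tls 120s returns non-zero if the certificate is still not Ready after two minutes, which lets a bootstrap script stop before Helm installs Harbor against a missing secret.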

2. Install Harbor with Helm

Add the official chart repo and render a values file. The key decisions here are: external HTTPS via the existing ingress, the bundled Trivy scanner enabled, object storage for blobs, and Postgres/Redis sized for real traffic.

helm repo add harbor https://helm.goharbor.io
helm repo update
helm search repo harbor/harbor   # pin to a known chart version, e.g. 1.16.x

Create harbor-values.yaml:

expose:
  type: ingress
  tls:
    enabled: true
    certSource: secret
    secret:
      secretName: harbor-tls
  ingress:
    hosts:
      core: harbor.example.com
    className: nginx
    annotations:
      nginx.ingress.kubernetes.io/proxy-body-size: "0"   # allow large layer pushes

externalURL: https://harbor.example.com

# DO NOT keep this default in production; the install command below overrides it from Vault via --set
harborAdminPassword: "CHANGE_ME_VIA_SECRET"

persistence:
  enabled: true
  imageChartStorage:
    type: s3
    s3:
      region: ap-south-1
      bucket: kloudvin-harbor-prod
      # credentials injected via existingSecret, not inline
  persistentVolumeClaim:
    database:
      size: 20Gi
    redis:
      size: 5Gi

trivy:
  enabled: true
  vulnType: "os,library"
  severity: "CRITICAL,HIGH,MEDIUM"
  ignoreUnfixed: false
  # Trivy pulls its vuln DB from ghcr.io; mirror it for air-gapped installs (step 8)

database:
  type: internal      # swap to 'external' to point at managed Postgres in prod
redis:
  type: internal

jobservice:
  replicas: 2         # replication + scan jobs run here; give them headroom

Install:

helm install harbor harbor/harbor \
  -n harbor \
  -f harbor-values.yaml \
  --set harborAdminPassword="$(vault kv get -field=admin_password secret/harbor/bootstrap)"

kubectl -n harbor rollout status deploy/harbor-core
kubectl -n harbor get pods

You should see harbor-core, harbor-portal, harbor-jobservice, harbor-registry, harbor-trivy, harbor-database, and harbor-redis all Running. Browse to https://harbor.example.com and log in as admin.
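Pods can be Running while a component still cannot reach its database or Redis, so it can help to ask Harbor directly. The sketch below queries Harbor's health endpoint and treats any component reporting "unhealthy" as a failure. The function name is hypothetical, and the pattern matching is a deliberate shortcut to avoid depending on jq, so treat it as a smoke test rather than a full JSON parse:

```shell
# Return 0 only if Harbor's health API answers and no component is unhealthy.
harbor_is_healthy() {
  local base_url="${1:-https://harbor.example.com}"
  local body
  body="$(curl -fsS "${base_url}/api/v2.0/health")" || return 1
  # Any "status":"unhealthy" entry (overall or per component) fails the check.
  if printf '%s' "$body" | grep -Eq '"status"[[:space:]]*:[[:space:]]*"unhealthy"'; then
    return 1
  fi
  # Require at least one explicit "healthy" status so an empty reply does not pass.
  printf '%s' "$body" | grep -Eq '"status"[[:space:]]*:[[:space:]]*"healthy"'
}
```

A bootstrap script might run harbor_is_healthy "$HARBOR" in a retry loop before creating projects.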

3. Wire SSO through Entra ID or Okta

Local accounts do not scale and leave no audit trail your security team trusts. Switch Harbor’s auth mode to OIDC so humans sign in with corporate identity and group membership maps to Harbor roles. Register an app in Entra ID (or an OIDC app in Okta) with redirect URI https://harbor.example.com/c/oidc/callback, then in Harbor go to Administration → Configuration → Authentication and set:

Auth Mode:            OIDC
OIDC Provider Name:   Entra ID
OIDC Endpoint:        https://login.microsoftonline.com/<tenant-id>/v2.0
OIDC Client ID:       <app-registration-client-id>
OIDC Client Secret:   <from Vault — never typed in a ticket>
Group Claim Name:     groups
OIDC Scope:           openid,profile,email,offline_access

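With this in place, group membership drives access once each group is added as a project member: add harbor-developers to a project with the Developer role and harbor-admins with the Project Admin role, and every engineer in those Entra groups inherits that role at sign-in. Note that Entra ID emits group object IDs in the groups claim by default, so the group name Harbor sees may be a GUID rather than a display name. Robot accounts (next step) are what CI uses; humans never share credentials with pipelines. Keep the admin password in Vault strictly as break-glass.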
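The group-to-role mapping can be scripted against Harbor's project members API instead of clicked through the UI. A minimal sketch, assuming the HARBOR and AUTH variables defined in step 4 and that the project already exists. The function names are hypothetical; the role IDs (1 project admin, 2 developer, 3 guest, 4 maintainer) and group_type 3 for OIDC groups reflect Harbor's API as I understand it, so confirm them against your Harbor version's API explorer:

```shell
# Build the JSON body that binds an OIDC group to a Harbor project role.
oidc_group_member_payload() {
  local group_name="$1"
  local role_id="$2"
  printf '{"role_id":%d,"member_group":{"group_name":"%s","group_type":3}}' \
    "$role_id" "$group_name"
}

# POST the binding to a project; relies on $HARBOR and $AUTH from step 4.
add_oidc_group_to_project() {
  local project="$1"
  local group_name="$2"
  local role_id="$3"
  curl -fsS -u "$AUTH" -X POST "${HARBOR}/api/v2.0/projects/${project}/members" \
    -H "Content-Type: application/json" \
    -d "$(oidc_group_member_payload "$group_name" "$role_id")"
}
```

For example, add_oidc_group_to_project staging harbor-developers 2 would grant the Developer role on staging to that group.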

4. Create projects, robot accounts, and the Trivy gate

Structure the registry around promotion. Create two projects — a permissive staging and a locked-down production — and turn on the scan-on-push and the pull-prevention gate for production. Do it via the API so it is reproducible (this is what your Terraform/Ansible would codify):

HARBOR=https://harbor.example.com
AUTH="admin:$(vault kv get -field=admin_password secret/harbor/bootstrap)"

# create the staging and production projects
for project in staging production; do
  curl -s -u "$AUTH" -X POST "$HARBOR/api/v2.0/projects" \
    -H "Content-Type: application/json" \
    -d "{\"project_name\":\"${project}\",\"public\":false}"
done

# enforce: scan on push, and BLOCK pulls of images with a HIGH+ CVE
curl -s -u "$AUTH" -X PUT "$HARBOR/api/v2.0/projects/production/metadatas/auto_scan" \
  -H "Content-Type: application/json" -d '{"auto_scan":"true"}'

curl -s -u "$AUTH" -X PUT "$HARBOR/api/v2.0/projects/production/metadatas/prevent_vul" \
  -H "Content-Type: application/json" -d '{"prevent_vul":"true"}'

curl -s -u "$AUTH" -X PUT "$HARBOR/api/v2.0/projects/production/metadatas/severity" \
  -H "Content-Type: application/json" -d '{"severity":"high"}'

That prevent_vul=true with severity=high is the heart of the supply-chain gate: Harbor refuses to serve any image in production that carries a High or Critical vulnerability, so a vulnerable image cannot reach a node even if a manifest references it. Now mint a scoped robot account for CI to push:

curl -s -u "$AUTH" -X POST "$HARBOR/api/v2.0/robots" \
  -H "Content-Type: application/json" -d '{
    "name":"ci-pusher","duration":-1,"level":"project",
    "permissions":[{"kind":"project","namespace":"staging",
      "access":[{"resource":"repository","action":"push"},
                {"resource":"repository","action":"pull"}]}]
  }'

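Store the returned secret in Vault under secret/harbor/ci-pusher; Harbor returns it only in the creation response. Your pipeline reads it at run time, so no long-lived registry password lives in a CI secret store. Note that "duration":-1 creates a token that never expires; set a number of days instead if your policy requires rotation.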
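Capturing the secret by hand invites copy-paste leaks, so the hand-off to Vault can be scripted. A minimal sketch, assuming the vault CLI is already authenticated and that the Vault field is named secret (an assumption; the guide only fixes the path). The function names are hypothetical, and the sed extraction is a shortcut that works for Harbor's flat creation response; jq would be more robust if it is available:

```shell
# Pull the "secret" field out of a robot-creation JSON response read from stdin.
extract_robot_secret() {
  sed -n 's/.*"secret"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p'
}

# Write the robot secret to Vault without putting it on a command line.
store_robot_secret() {
  local vault_path="$1"
  local response="$2"
  local secret
  secret="$(printf '%s' "$response" | extract_robot_secret)"
  if [ -z "$secret" ]; then
    echo "no secret found in robot response" >&2
    return 1
  fi
  # "secret=-" tells vault to read the value from stdin.
  printf '%s' "$secret" | vault kv put "$vault_path" secret=-
}
```

In practice the response would come from the curl call above, for example store_robot_secret secret/harbor/ci-pusher "$(curl ... )".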

5. Push, scan, and gate from CI

Now the publish path. In GitHub Actions (the Jenkins equivalent is a pipeline stage running the same CLIs), build, push to staging, run Trivy against the pushed image, and fail the build on a High or Critical finding before promoting:

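# .github/workflows/build-and-sign.yml
jobs:
  build:
    runs-on: ubuntu-latest
    permissions:
      contents: read           # needed by actions/checkout once permissions are set explicitly
      id-token: write          # OIDC to Vault for the robot token + signing key
    steps:
      - uses: actions/checkout@v4

      # Fetch the robot secret from Vault via GitHub OIDC. The role name, the KV v2
      # path, and the 'secret' field are assumptions; match them to your Vault setup.
      - name: Read robot secret from Vault
        uses: hashicorp/vault-action@v3
        with:
          url: https://vault.example.com
          method: jwt
          role: harbor-ci
          secrets: |
            secret/data/harbor/ci-pusher secret | ROBOT_SECRET

      - name: Log in to Harbor
        run: |
          echo "${ROBOT_SECRET}" | docker login harbor.example.com \
            -u 'robot$staging+ci-pusher' --password-stdin

      - name: Build and push to staging
        run: |
          docker build -t harbor.example.com/staging/payments-api:${GITHUB_SHA} .
          docker push harbor.example.com/staging/payments-api:${GITHUB_SHA}

      # assumes the trivy CLI is already installed on the runner
      - name: Gate on Trivy (client-side, fail fast)
        run: |
          trivy image --exit-code 1 --severity HIGH,CRITICAL --ignore-unfixed \
            harbor.example.com/staging/payments-api:${GITHUB_SHA}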

Running trivy image in the pipeline gives a fast local gate, while Harbor’s server-side auto_scan is the authoritative record stored against the artifact. Belt and braces: the build fails here, and even if it did not, the production project’s prevent_vul gate would refuse the pull.
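If the pipeline should also wait for Harbor's own scan to finish before promoting, the artifact API exposes the scan state. A minimal sketch, assuming the HARBOR and AUTH variables from step 4 (in CI, AUTH would hold the robot credentials) and a repository name without slashes. The function names, retry count, and delay are hypothetical. A status of Success only means the scan completed; the pass/fail decision still comes from the Trivy gate and the project's prevent_vul policy:

```shell
# Extract the first scan_status value from an artifact JSON document on stdin.
scan_status_from_json() {
  sed -n 's/.*"scan_status"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' | head -n 1
}

# Poll Harbor until the artifact's scan completes, errors out, or we give up.
wait_for_harbor_scan() {
  local project="$1"
  local repo="$2"
  local reference="$3"        # a tag or a sha256:... digest
  local attempts="${4:-30}"
  local delay="${5:-10}"
  local i=0
  local status
  while [ "$i" -lt "$attempts" ]; do
    status="$(curl -fsS -u "$AUTH" \
      "${HARBOR}/api/v2.0/projects/${project}/repositories/${repo}/artifacts/${reference}?with_scan_overview=true" \
      | scan_status_from_json)"
    case "$status" in
      Success) return 0 ;;
      Error|Stopped) echo "scan ended with status: $status" >&2; return 1 ;;
    esac
    sleep "$delay"
    i=$((i + 1))
  done
  echo "timed out waiting for Harbor scan" >&2
  return 1
}
```

A pipeline step could call wait_for_harbor_scan staging payments-api "$DIGEST" right after the push and fail the job on a non-zero exit.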

6. Sign images with Cosign (and add a Notation signature)

A passing scan shows the image cleared the vulnerability gate at scan time; a signature proves the image is the one you built and that nobody altered it since. Generate a Cosign key pair and keep the private key in Vault. Cosign has a native Vault key backend, so the key material never touches the runner’s disk:

# one-time: generate the key in Vault's transit secrets engine (transit must be enabled, VAULT_TOKEN set)
export VAULT_ADDR=https://vault.example.com
cosign generate-key-pair --kms hashivault://harbor-cosign
# public key for verifiers:
cosign public-key --key hashivault://harbor-cosign > cosign.pub

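Add the promote-and-sign step to the pipeline, right after the Trivy gate. Cosign stores a signature next to the image in the same repository, and copying an image with crane does not bring the signature along, so promote first and then sign the production digest. The promotion needs push rights on production, which the staging-scoped ci-pusher robot from step 4 does not have; log in with a second robot account scoped to production for this step:

      - name: Promote and Cosign sign
        env:
          COSIGN_KEY: hashivault://harbor-cosign
        run: |
          IMG=harbor.example.com/staging/payments-api:${GITHUB_SHA}
          DIGEST=$(crane digest "$IMG")
          # promote the exact digest to production
          crane copy \
            harbor.example.com/staging/payments-api@${DIGEST} \
            harbor.example.com/production/payments-api:${GITHUB_SHA}
          # sign the production digest so the signature lives where deployments pull from
          cosign sign --yes --key "$COSIGN_KEY" \
            harbor.example.com/production/payments-api@${DIGEST}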

Always sign and copy by digest, never tag — tags are mutable, digests are not. For environments standardizing on the OCI/CNCF Notation project instead of (or alongside) Cosign, sign with a cert from your CA:

notation sign --signature-format cose \
  harbor.example.com/production/payments-api@${DIGEST} \
  --id <cert-key-id> --plugin azure-kv

Harbor stores both Cosign and Notation signatures as OCI artifacts attached to the image, and shows a “Signed” badge in the UI. You can now turn on Harbor’s project policy to block unsigned artifacts from being pulled.
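The rule about signing by digest is easy to state and easy to break in a hurry, so a small guard in front of cosign can enforce it. A minimal sketch; both function names are hypothetical, and the default key URI is the one used in this guide:

```shell
# Succeed only if the reference ends in @sha256:<64 hex chars>.
is_digest_pinned() {
  printf '%s\n' "$1" | grep -Eq '@sha256:[0-9a-f]{64}$'
}

# Refuse to sign anything that is addressed by a mutable tag.
sign_by_digest() {
  local ref="$1"
  if ! is_digest_pinned "$ref"; then
    echo "refusing to sign mutable reference: $ref" >&2
    return 1
  fi
  cosign sign --yes --key "${COSIGN_KEY:-hashivault://harbor-cosign}" "$ref"
}
```

With this in place, sign_by_digest harbor.example.com/production/payments-api:latest fails before cosign is ever invoked, while the same call with an @sha256 digest goes through.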

7. Enforce signatures at deploy time

Signing is worthless unless something checks it. Two complementary enforcement points:

At the registry, enable the project-level signature requirement so Harbor refuses to serve unsigned images:

curl -s -u "$AUTH" -X PUT \
  "$HARBOR/api/v2.0/projects/production/metadatas/enable_content_trust_cosign" \
  -H "Content-Type: application/json" -d '{"enable_content_trust_cosign":"true"}'

At admission, install a policy controller so the kubelet is never allowed to pull an unverified image. Using the Sigstore policy-controller (Kyverno’s verifyImages rule is an equivalent):

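helm repo add sigstore https://sigstore.github.io/helm-charts
helm repo update
helm install policy-controller sigstore/policy-controller -n cosign-system --create-namespace

# the controller only enforces in namespaces that opt in; 'payments' is a placeholder namespace
kubectl label namespace payments policy.sigstore.dev/include=true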

cat <<EOF | kubectl apply -f -
apiVersion: policy.sigstore.dev/v1beta1
kind: ClusterImagePolicy
metadata:
  name: require-payments-signature
spec:
  images:
    - glob: "harbor.example.com/production/**"
  authorities:
    - key:
        data: |
$(sed 's/^/          /' cosign.pub)
EOF

Now Argo CD can sync the production deployment, and any pod referencing an unsigned or tampered image is rejected at admission with a clear error — the verifiable chain the auditor asked for is complete: built → scanned → signed → verified-on-pull.

8. Configure replication to the DR region

The DR cluster and any air-gapped site need the exact same signed bits. Harbor replication mirrors a project’s images together with their attached signatures, on a schedule or on push. Scan reports are not copied, so enable scan-on-push on the DR project if you want results there too. Register the remote Harbor as an endpoint, then create a rule:

# register the DR registry as a replication target
curl -s -u "$AUTH" -X POST "$HARBOR/api/v2.0/registries" \
  -H "Content-Type: application/json" -d '{
    "name":"harbor-dr","type":"harbor",
    "url":"https://harbor-dr.example.com",
    "credential":{"type":"basic","access_key":"robot$replication",
                  "access_secret":"'"$(vault kv get -field=token secret/harbor/dr-robot)"'"}
  }'

# push-based rule: mirror production to DR on every push
curl -s -u "$AUTH" -X POST "$HARBOR/api/v2.0/replication/policies" \
  -H "Content-Type: application/json" -d '{
    "name":"prod-to-dr","src_registry":null,
    "dest_registry":{"id":1},
    "dest_namespace":"production",
    "trigger":{"type":"event_based"},
    "filters":[{"type":"name","value":"production/**"},
               {"type":"tag","value":"**"}],
    "override":true,"enabled":true,"copy_by_chunk":true
  }'

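event_based means a promoted production image lands in DR shortly after the push. The "dest_registry":{"id":1} value assumes harbor-dr is the first endpoint registered; list $HARBOR/api/v2.0/registries to confirm its ID. For an air-gapped cluster that can open connections to the primary but cannot be reached from it, flip to a pull-based rule on the air-gapped Harbor; for a fully isolated site, export with harbor-cli / crane to a transfer disk. Mirror Trivy’s vulnerability database too, so the offline scanner stays current:

# copy the vulnerability DB artifact as-is from its public home into Harbor
# (trivy image --download-db-only only fills the local cache; it does not produce a db.tar.gz to push)
oras cp ghcr.io/aquasecurity/trivy-db:2 harbor.example.com/library/trivy-db:2
# Java scanning uses a separate DB artifact
oras cp ghcr.io/aquasecurity/trivy-java-db:1 harbor.example.com/library/trivy-java-db:1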

Point the air-gapped Harbor’s Trivy at that internal DB URL so scanning works with zero internet egress.

Validation

Prove every gate fires before you trust it:

# 1. A clean, signed image deploys
kubectl run ok --image=harbor.example.com/production/payments-api:${GOOD_SHA}
kubectl get pod ok           # Running

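# 2. An unsigned image is REJECTED at admission
#    (run in a namespace labeled policy.sigstore.dev/include=true; the controller skips unlabeled ones)
kubectl run bad --image=harbor.example.com/production/nginx:unsigned
# -> error: admission webhook denied: no matching signatures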

# 3. A vulnerable image cannot be pulled from production
docker pull harbor.example.com/production/payments-api:${VULN_SHA}
# -> denied: current image with 1 critical vulnerability cannot be pulled

# 4. Verify a signature by hand
cosign verify --key cosign.pub \
  harbor.example.com/production/payments-api@${DIGEST}

# 5. Confirm DR has the image
crane ls harbor-dr.example.com/production/payments-api

In the Harbor UI, each artifact should show its CVE count, a green Signed badge, and the replication job log should read Succeeded. Check kubectl -n harbor logs deploy/harbor-jobservice if a scan or replication stalls.
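Listing tags in DR shows that something arrived, not that it is the same image. A minimal sketch that compares the digest on both registries, assuming crane is logged in to each. The function name is hypothetical:

```shell
# Print the shared digest if both references resolve to the same image; fail otherwise.
assert_same_digest() {
  local primary_ref="$1"
  local dr_ref="$2"
  local primary
  local dr
  primary="$(crane digest "$primary_ref")" || return 1
  dr="$(crane digest "$dr_ref")" || return 1
  if [ "$primary" != "$dr" ]; then
    echo "digest mismatch: primary=$primary dr=$dr" >&2
    return 1
  fi
  echo "$primary"
}
```

For example, assert_same_digest harbor.example.com/production/payments-api:${GOOD_SHA} harbor-dr.example.com/production/payments-api:${GOOD_SHA} exits non-zero if replication delivered different bits or nothing at all.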

Rollback / teardown

Removing Harbor is clean because Helm owns the workloads, but the data lives outside the release — drop it deliberately, not by accident.

# disable enforcement first so you don't lock yourself out mid-rollback
kubectl delete clusterimagepolicy require-payments-signature
helm uninstall policy-controller -n cosign-system

# remove Harbor itself
helm uninstall harbor -n harbor

# PVCs and the bootstrap secret are intentionally NOT deleted by uninstall:
kubectl -n harbor get pvc
kubectl -n harbor delete pvc --all          # destroys DB + Redis state — irreversible
kubectl delete namespace harbor

# object-storage blobs persist in S3/Blob/GCS — delete the bucket separately if decommissioning

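To roll back an upgrade rather than tear down, run helm rollback harbor <previous-revision>. Harbor’s database migrations only move forward, so a rollback across a schema change also needs the database restored from a snapshot taken before the upgrade. Always snapshot the Postgres PVC before any chart upgrade.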

Common pitfalls

Security notes

The whole design is a software-supply-chain control: nothing reaches a node that was not built by your pipeline (signature), known-clean at promotion (Trivy gate), and unaltered since (digest + verification at admission). Keep the Cosign private key in HashiCorp Vault (or a cloud KMS) so it is never on a runner’s disk; rotate it and re-sign on a schedule. Human access is Entra ID/Okta SSO with group-mapped roles and full audit; CI uses short-scoped robot accounts whose tokens live in Vault, not in CI secret stores. Layer independent verification on top: run Wiz (and Wiz Code in the pipeline) against both the running Harbor and the images it stores, so posture drift, an exposed endpoint, or a critical CVE that slipped a gate is caught out-of-band. Run CrowdStrike Falcon sensors on the Harbor node pool for runtime protection of the registry itself. A blocked-pull or a failed-signature event should fan out to ServiceNow as an incident and to Datadog/Dynatrace as a metric, so security responds to a ticket, not a log line buried in jobservice.

Cost notes

Self-hosting Harbor is mostly compute and storage you already run: budget roughly 1.5–2 vCPU and 4–6 GiB across the core components at idle, scaling with scan concurrency, plus 20 GiB for Postgres and however large your blob store grows. Object storage (S3/Blob/GCS) for layers is the variable cost; enable Harbor’s tag retention and garbage collection policies to reap untagged, replaced layers so a busy staging project does not balloon the bucket. Replication egress to a second region is real cross-region transfer cost; replication skips blobs that already exist at the destination, so only new layers cross the wire, and copy_by_chunk sends large layers in chunks. Versus a managed registry (ACR/ECR/GAR) priced per GB plus per scan, a self-hosted Harbor tends to come out ahead once you store tens of repos and scan thousands of pushes a month, and it puts Trivy gating, Cosign enforcement, and cross-region replication under one policy plane without per-feature SKUs. Observe storage growth and scan volume in Datadog so the GC schedule is tuned before the bucket, not the bill, surprises you.
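Garbage collection can be scheduled through the API so it is codified alongside the project settings. A minimal sketch, assuming the HARBOR and AUTH variables from step 4. The function names are hypothetical, and the payload shape (a Custom schedule with a six-field cron that starts with seconds, plus a delete_untagged parameter) reflects Harbor's GC schedule API as I understand it, so confirm it in your version's API explorer before relying on it:

```shell
# Build the JSON body for a custom GC schedule.
gc_schedule_payload() {
  local cron="$1"
  local delete_untagged="${2:-true}"
  printf '{"schedule":{"type":"Custom","cron":"%s"},"parameters":{"delete_untagged":%s}}' \
    "$cron" "$delete_untagged"
}

# Create the schedule; relies on $HARBOR and $AUTH from step 4.
schedule_gc() {
  curl -fsS -u "$AUTH" -X POST "${HARBOR}/api/v2.0/system/gc/schedule" \
    -H "Content-Type: application/json" \
    -d "$(gc_schedule_payload "$@")"
}
```

For example, schedule_gc "0 0 3 * * 0" would request a weekly run at 03:00 on Sundays that also removes untagged artifacts.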


Vinod is a Senior Cloud Architect (22+ yrs) — available for Azure / AWS / GCP architecture, landing zones, and migrations.
