helm create gets you a chart in five seconds and a maintenance liability in five weeks. This guide walks through the practices that separate a throwaway scaffold from a chart you can hand to twenty teams: shared template libraries, fail-fast input validation, deterministic dependency handling, and a CI pipeline that catches breakage before it reaches a cluster.
1. A chart layout that scales
The default scaffold is fine for one service. Once you have a platform, structure the chart so that intent is obvious and overrides are predictable.
myapp/
  Chart.yaml
  values.yaml              # documented defaults, every key present
  values.schema.json       # contract for what callers may pass
  templates/
    _helpers.tpl           # named templates (fullname, labels, selectors)
    deployment.yaml
    service.yaml
    serviceaccount.yaml
    NOTES.txt
  charts/                  # vendored dependencies (helm dependency build)
  ci/                      # values files used only by chart-testing
    default-values.yaml
    ha-values.yaml
Two rules carry most of the weight. First, every value your templates read must appear in values.yaml with a sane default and a comment — even if the default is {} or "". An undocumented value is a bug waiting for a 2 a.m. page. Second, keep templates/ free of business logic that belongs in helpers; a template should read like a manifest, not a program.
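As a rough illustration of the first rule, a values.yaml in this style might look like the sketch below. The keys mirror the ones used elsewhere in this guide; the image name, the podAnnotations key, and the specific defaults are assumptions for illustration, not a recommended baseline.

```yaml
# values.yaml (illustrative defaults; adjust to your service)

# replicaCount is the number of pod replicas.
replicaCount: 1

image:
  # repository is the image name without a tag.
  repository: ghcr.io/myorg/myapp   # hypothetical image
  # tag falls back to the chart's appVersion when left empty.
  tag: ""
  pullPolicy: IfNotPresent

# nameOverride / fullnameOverride feed the naming helpers.
nameOverride: ""
fullnameOverride: ""

service:
  type: ClusterIP
  port: 80

ingress:
  enabled: false
  # className is required once enabled is true.
  className: ""

# podAnnotations (hypothetical key) is empty by default but still listed,
# so callers can discover it without reading the templates.
podAnnotations: {}
```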
Use helm create once to remember the layout, then delete the generated boilerplate. The scaffolded values.yaml ships opinions (a specific autoscaling block, a sample ingress) you almost certainly do not want as your defaults.
2. DRY templating with named templates and library charts
Named templates (defined with define in _helpers.tpl) are your first lever against duplication. The canonical pair is a name helper and a labels helper:
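{{/* templates/_helpers.tpl */}}
{{/* Short chart name; myapp.labels below includes it. */}}
{{- define "myapp.name" -}}
{{- default .Chart.Name .Values.nameOverride | trunc 63 | trimSuffix "-" -}}
{{- end -}}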
{{- define "myapp.fullname" -}}
{{- $name := default .Chart.Name .Values.nameOverride -}}
{{- if .Values.fullnameOverride -}}
{{- .Values.fullnameOverride | trunc 63 | trimSuffix "-" -}}
{{- else -}}
{{- printf "%s-%s" .Release.Name $name | trunc 63 | trimSuffix "-" -}}
{{- end -}}
{{- end -}}
{{- define "myapp.labels" -}}
helm.sh/chart: {{ printf "%s-%s" .Chart.Name .Chart.Version | replace "+" "_" | trunc 63 | trimSuffix "-" }}
app.kubernetes.io/name: {{ include "myapp.name" . }}
app.kubernetes.io/instance: {{ .Release.Name }}
{{- if .Chart.AppVersion }}
app.kubernetes.io/version: {{ .Chart.AppVersion | quote }}
{{- end }}
app.kubernetes.io/managed-by: {{ .Release.Service }}
{{- end -}}
The trunc 63 is not cosmetic: Kubernetes label values and many resource names are capped at 63 characters, and a long release name will otherwise produce an invalid object that the API server rejects.
When the same helpers need to be shared across many charts, promote them into a library chart. A library chart sets type: library in Chart.yaml, ships only templates/ with define blocks (no rendered manifests), and is consumed as a dependency. The key behavioral difference: Helm does not render a library chart’s templates directly, so it never emits objects on its own — it only exposes named templates.
# common/Chart.yaml
apiVersion: v2
name: common
type: library
version: 1.4.0

# myapp/Chart.yaml
dependencies:
  - name: common
    version: "1.4.0"
    repository: "oci://ghcr.io/myorg/charts"
A widely used pattern is to have the library define a full resource (say, a Deployment) as a named template and let each application chart merge its own overrides on top. Even at a smaller scale, centralizing just your labels, selectorLabels, and image-reference helpers in a library chart eliminates the most common source of drift across a fleet.
3. Validate inputs with values.schema.json
A values.schema.json file at the chart root is validated by Helm automatically on install, upgrade, lint, and template. It is plain JSON Schema (Draft 7 era), and it is the single highest-leverage reliability improvement you can make to a chart: bad config fails at render time with a clear message instead of producing a broken Deployment.
{
  "$schema": "https://json-schema.org/draft-07/schema#",
  "type": "object",
  "required": ["image", "replicaCount"],
  "properties": {
    "replicaCount": {
      "type": "integer",
      "minimum": 1
    },
    "image": {
      "type": "object",
      "required": ["repository"],
      "properties": {
        "repository": { "type": "string", "minLength": 1 },
        "tag": { "type": "string" },
        "pullPolicy": {
          "type": "string",
          "enum": ["Always", "IfNotPresent", "Never"]
        }
      },
      "additionalProperties": false
    },
    "service": {
      "type": "object",
      "properties": {
        "type": {
          "type": "string",
          "enum": ["ClusterIP", "NodePort", "LoadBalancer"]
        },
        "port": { "type": "integer", "minimum": 1, "maximum": 65535 }
      }
    }
  }
}
Two things worth internalizing. JSON Schema validates structure and types, not cross-field business rules — it cannot express “if autoscaling.enabled then replicaCount is ignored.” For those, fail explicitly inside templates with required and fail:
{{- if and .Values.ingress.enabled (not .Values.ingress.className) }}
{{- fail "ingress.enabled=true requires ingress.className" }}
{{- end }}
{{- $repo := required "image.repository is required" .Values.image.repository }}
Also note that additionalProperties: false is strict — it will reject a typo’d key like imagePullPolcy, which is exactly what you want, but it means a caller cannot smuggle in extra keys. Apply it deliberately at the leaf objects you fully control, and be more permissive at the top level if your chart intentionally accepts pass-through blocks.
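Guards like the fail call above are easy to pin down with a unit test. The sketch below uses the failedTemplate assertion from the helm-unittest plugin (covered in section 6). It assumes the guard lives in templates/ingress.yaml, a file name this guide does not show, and exact message matching may differ between plugin versions.

```yaml
# myapp/tests/ingress_guard_test.yaml (hypothetical file name)
suite: ingress guard
templates:
  - ingress.yaml            # assumed location of the fail guard
tests:
  - it: rejects an enabled ingress without a class name
    set:
      ingress.enabled: true
      ingress.className: ""
    asserts:
      - failedTemplate:
          errorMessage: ingress.enabled=true requires ingress.className
```

A test like this keeps the error message itself under review: if someone rewords or removes the guard, the suite fails instead of the first caller who hits it.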
4. Dependencies, subcharts, and global values
Declare dependencies in Chart.yaml and lock them. helm dependency update resolves versions and writes Chart.lock; commit that lock file so CI and production resolve byte-identical charts.
helm dependency update ./myapp # resolves + writes Chart.lock + populates charts/
helm dependency build ./myapp # rebuilds charts/ from an existing Chart.lock
Use condition and tags to make optional dependencies toggleable without editing Chart.yaml:
dependencies:
  - name: postgresql
    version: "15.5.x"
    repository: "oci://registry-1.docker.io/bitnamicharts"
    condition: postgresql.enabled
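The snippet above shows condition only. As a sketch of tags, assume two optional subcharts grouped under a monitoring tag; the chart names, versions, and repository here are placeholders, not real charts.

```yaml
# Chart.yaml (dependency names, versions, and repository are placeholders)
dependencies:
  - name: metrics-exporter
    version: "1.2.x"
    repository: "oci://ghcr.io/myorg/charts"
    tags:
      - monitoring
  - name: dashboards
    version: "1.2.x"
    repository: "oci://ghcr.io/myorg/charts"
    tags:
      - monitoring
---
# values.yaml
tags:
  monitoring: false   # turns off every dependency carrying this tag
```

When a dependency has both, a condition that is set in values takes precedence over its tags, so use tags for coarse groups and conditions for individual toggles.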
The subtlety that bites people is the global scope. Values under .Values.global are visible to the parent chart and every subchart, which makes globals perfect for cross-cutting settings (image registry mirror, image pull secrets, environment name) and dangerous for anything else. A parent can also override a subchart’s values by nesting them under the subchart’s name:
# parent values.yaml
global:
  imageRegistry: registry.internal.example.com
postgresql:              # overrides into the postgresql subchart
  primary:
    persistence:
      size: 50Gi
Resist the urge to push everything into global “just in case.” Globals are an implicit API across all subcharts; once a subchart starts reading one, removing it is a breaking change you cannot see from the parent.
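To make the scoping concrete, here is roughly what each chart sees as .Values given the parent values above. This is an illustration of Helm's scoping rules rather than the output of any command, and the subchart's own remaining defaults are omitted.

```yaml
# Seen by the parent chart as .Values
global:
  imageRegistry: registry.internal.example.com
postgresql:
  primary:
    persistence:
      size: 50Gi
---
# Seen by the postgresql subchart as .Values:
# the parent's postgresql block is lifted to the top level and merged over
# the subchart's defaults, and global is passed through unchanged.
global:
  imageRegistry: registry.internal.example.com
primary:
  persistence:
    size: 50Gi
```

The subchart cannot see any other parent key, which is exactly why global becomes the tempting escape hatch.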
5. Hooks, ordering, and when not to use them
Helm hooks let you run resources at lifecycle points (pre-install, post-install, pre-upgrade, post-delete, and so on), ordered within a phase by helm.sh/hook-weight (lower runs first). The classic use is a schema migration Job before an upgrade.
apiVersion: batch/v1
kind: Job
metadata:
  name: {{ include "myapp.fullname" . }}-migrate
  annotations:
    "helm.sh/hook": pre-upgrade,pre-install
    "helm.sh/hook-weight": "-5"
    "helm.sh/hook-delete-policy": before-hook-creation,hook-succeeded
spec:
  template:
    spec:
      restartPolicy: Never
      containers:
        - name: migrate
          image: "{{ .Values.image.repository }}:{{ .Values.image.tag | default .Chart.AppVersion }}"
          command: ["/app/migrate", "up"]
The critical caveat: hook resources are not tracked as part of the release. Helm creates them out-of-band and does not manage their lifecycle the way it does normal manifests, which is why you set an explicit hook-delete-policy. A failed hook also does not auto-rollback unless you pass --atomic. Reach for hooks when you genuinely need lifecycle ordering — migrations, one-shot setup — and avoid them for anything that should be a first-class, reconciled part of the release. If a Job needs to keep existing, model it as a normal resource, not a hook.
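Hook weights matter once a phase has more than one hook. As a sketch, suppose a backup must finish before the migration Job shown earlier (weight -5) starts. The backup Job, its name, and its image are hypothetical.

```yaml
# A second pre-upgrade hook; Helm runs the lower weight first and waits
# for this Job to complete before starting the next hook.
apiVersion: batch/v1
kind: Job
metadata:
  name: myapp-backup                      # hypothetical name
  annotations:
    "helm.sh/hook": pre-upgrade
    "helm.sh/hook-weight": "-10"          # sorts ahead of the migration at -5
    "helm.sh/hook-delete-policy": before-hook-creation,hook-succeeded
spec:
  backoffLimit: 0                         # fail the upgrade rather than retry
  template:
    spec:
      restartPolicy: Never
      containers:
        - name: backup
          image: ghcr.io/myorg/db-backup:1.0.0   # placeholder image
          command: ["/app/backup"]
```

Weights are strings in the annotation, so quote them; negative values are allowed and sort before the default of 0.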
6. Testing: lint, unit tests, and chart-testing
Layer three independent checks; each catches a different class of failure.
helm lint validates chart structure, runs schema validation, and surfaces obvious template errors. Pass --strict to turn warnings into failures in CI:
helm lint ./myapp --strict --values ./myapp/ci/ha-values.yaml
Unit snapshots with the helm-unittest plugin assert that specific rendered output matches expectations, so a careless template edit that shifts a label or drops a probe fails loudly. Tests live in tests/ and run against the rendered templates:
# myapp/tests/deployment_test.yaml
suite: deployment
templates:
  - deployment.yaml
tests:
  - it: sets the replica count from values
    set:
      replicaCount: 3
    asserts:
      - equal:
          path: spec.replicas
          value: 3
  - it: renders a probe on the main container
    asserts:
      - isNotNull:
          path: spec.template.spec.containers[0].livenessProbe
helm plugin install https://github.com/helm-unittest/helm-unittest
helm unittest ./myapp
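The test above uses targeted assertions. For snapshot-style coverage, helm-unittest also provides matchSnapshot; a minimal sketch follows, where the test file name is hypothetical and the values file is the ci/ha-values.yaml from the layout in section 1.

```yaml
# myapp/tests/snapshot_test.yaml (hypothetical file name)
suite: deployment snapshot
templates:
  - deployment.yaml
tests:
  - it: matches the stored snapshot of the pod spec
    values:
      - ../ci/ha-values.yaml      # path is relative to this test file
    asserts:
      - matchSnapshot:
          path: spec.template.spec
```

The first run records the snapshot next to the tests; later runs fail on any difference until you review the change and refresh it with helm unittest -u ./myapp. Scoping the snapshot to one path keeps unrelated edits from producing noisy diffs.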
chart-testing (the ct tool) is what ties it together in CI: it lints changed charts, validates that the chart version was bumped, and can install each changed chart into an ephemeral cluster (kind works well) to confirm it actually deploys. The ci/*-values.yaml files give ct multiple realistic configurations to exercise.
ct lint --target-branch main --chart-dirs charts
ct install --target-branch main --chart-dirs charts
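The repeated flags can also live in a ct.yaml file, which ct reads from the working directory. The sketch below mirrors the two commands above; the maintainer and timeout settings are assumptions about what a team might want, not requirements.

```yaml
# ct.yaml (keys mirror ct's command-line flags)
target-branch: main
chart-dirs:
  - charts
check-version-increment: true     # fail if a changed chart keeps its version
validate-maintainers: false       # assumption: skip maintainer checks
helm-extra-args: --timeout 600s   # assumption: allow slow installs in kind
```

With this file in place the commands shrink to ct lint and ct install, and local runs use the same settings as CI.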
A minimal GitHub Actions job wiring this up against a kind cluster:
name: chart-ci
on: pull_request
jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0   # ct needs history to diff against the base
      - uses: azure/setup-helm@v4
      - uses: helm/chart-testing-action@v2
      - name: Lint changed charts
        run: ct lint --target-branch ${{ github.event.repository.default_branch }}
      - uses: helm/kind-action@v1
      - name: Install changed charts
        run: ct install --target-branch ${{ github.event.repository.default_branch }}
7. Packaging and distribution via OCI
Helm 3 treats OCI registries as a first-class distribution channel (the support has been generally available since Helm 3.8), so you can store charts next to your images. Package, push, and pull use the registry directly, with no separate chart repo index to maintain.
helm package ./myapp # produces myapp-1.2.0.tgz
helm push myapp-1.2.0.tgz oci://ghcr.io/myorg/charts
helm pull oci://ghcr.io/myorg/charts/myapp --version 1.2.0
helm install myapp oci://ghcr.io/myorg/charts/myapp --version 1.2.0
For supply-chain integrity, Helm supports provenance files. helm package --sign produces a .prov file alongside the .tgz, and helm verify (or helm install --verify) checks the signature against your keyring.
helm package ./myapp --sign --key 'platform-team' --keyring ~/.gnupg/secring.gpg
helm verify myapp-1.2.0.tgz # validates the .prov signature
Many teams now also sign the pushed OCI artifact with cosign in addition to Helm’s PGP provenance. The two are complementary: PGP provenance proves the chart contents, cosign attaches a signature to the registry artifact and integrates with admission policy. Pick at least one and enforce it.
8. Upgrade safety: diff, atomic, and CRDs
Before any production upgrade, render the change, not just the new state. The helm diff plugin shows exactly what will mutate:
helm plugin install https://github.com/databus23/helm-diff
helm diff upgrade myapp oci://ghcr.io/myorg/charts/myapp --version 1.2.0 -f prod-values.yaml
Run upgrades with --atomic --timeout. With --atomic, a failed upgrade automatically rolls back to the prior revision instead of leaving the release wedged half-applied:
helm upgrade myapp oci://ghcr.io/myorg/charts/myapp \
--version 1.2.0 -f prod-values.yaml \
--atomic --timeout 5m
CRDs are the sharpest edge in Helm. Files in a chart’s special crds/ directory are installed before the rest of the chart, but Helm never upgrades or deletes them — this is deliberate, to avoid destroying custom resources cluster-wide. The practical consequence: shipping a new CRD version inside crds/ will not update an existing CRD. Manage CRD lifecycle explicitly, typically by applying CRD updates with kubectl apply as a separate, deliberate step outside the normal chart upgrade.
Enterprise scenario
A platform team running ~40 service charts off a shared common library shipped a “harmless” fix: renaming the selector helper from common.selectorLabels to common.matchLabels and bumping the library to 2.0.0. Lint passed, unit snapshots passed, ct install into kind passed — every check was green. The first production helm upgrade failed with Deployment.apps "checkout" is invalid: spec.selector: field is immutable. The new helper emitted a different spec.selector.matchLabels, and Kubernetes forbids mutating a Deployment’s selector after creation. Their CI only ever ran ct install on a clean cluster, so it never exercised the upgrade path where the immutability rule lives.
The fix had two parts. First, they froze selector labels as a contract: the library’s common.selectorLabels became append-only, asserted by a unit test that fails if the rendered key set changes.
# common/tests/selector_test.yaml
- it: selector labels are frozen (immutable contract)
  template: deployment.yaml
  asserts:
    - equal:
        path: spec.selector.matchLabels
        value:
          app.kubernetes.io/name: checkout
          app.kubernetes.io/instance: RELEASE-NAME
Second, they added an upgrade gate to ct so CI installs the chart, then upgrades over it before tearing down:
# ct.yaml
upgrade: true
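With upgrade: true (the config-file form of ct install --upgrade), ct first deploys the chart's previous revision, taken from the target branch, then upgrades to the PR's version, catching exactly the immutable-field class of break that a from-scratch install hides. One caveat: ct skips the in-place upgrade when the chart's version bump is a breaking (major) change under SemVer, so the gate only protects minor and patch bumps. The lesson: green local renders prove a chart installs; only an upgrade-over-previous test proves it upgrades.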
Verify
Run these against a chart before you trust it:
# 1. Schema + lint pass cleanly, strictly
helm lint ./myapp --strict
# 2. Templates render with defaults AND with a real prod values file
helm template myapp ./myapp > /dev/null && echo "defaults render OK"
helm template myapp ./myapp -f prod-values.yaml > /tmp/rendered.yaml
test -s /tmp/rendered.yaml && echo "rendered OK"
# 3. Bad input is rejected by the schema (expect a non-zero exit)
helm template myapp ./myapp --set replicaCount=0 ; echo "exit=$?"
# 4. Unit snapshots pass
helm unittest ./myapp
# 5. The rendered output is valid against the live API (dry run)
helm install myapp ./myapp --dry-run=server -f prod-values.yaml
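--dry-run=server is meaningfully stronger than the default client dry run: it allows Helm to contact the API server, so lookup calls resolve against real cluster state and errors that a purely local render misses can surface. It is not a full server-side apply, though. To exercise admission webhooks as well, pipe the helm template output through kubectl apply --dry-run=server -f -.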
Checklist
Pitfalls
- Treating global as a convenience. Every global is an implicit contract with every subchart; add them sparingly and document them.
- Forgetting hooks are untracked. Without a hook-delete-policy you accumulate orphaned Jobs; without --atomic a failed hook leaves the release inconsistent.
- Assuming crds/ upgrades CRDs. It does not. Plan CRD versioning as a first-class, manual operation.
- additionalProperties: false everywhere. Strictness is great at leaves you own and painful on blocks meant to pass through to a subchart, so apply it with intent.
- Skipping the server dry run. A chart that renders locally can still be rejected by the live cluster; a server-side dry run is your last cheap gate before a real install.
Next step: pull your label, selector, and image helpers into a type: library chart, version it, and publish it to your OCI registry. Once every service chart depends on the same library, fixing a labeling bug is one release instead of twenty pull requests.