
Kubernetes Storage, In Depth: Volumes, PV, PVC, StorageClass & Access Modes

Containers are deliberately forgetful. When a container restarts, its writable layer is thrown away and recreated from the image; when a Pod is rescheduled to another node, anything it wrote to the container filesystem is gone. That amnesia is exactly what makes stateless apps so easy to operate — but the moment you run a database, a message broker, a file upload service, or anything that must remember something across restarts, you have to give Kubernetes a way to attach storage that outlives the container. This lesson is the complete tour of how Kubernetes does that.

We will start with volumes — storage attached to a Pod — and work through every ephemeral type you will actually use: emptyDir, the config-injection volumes (configMap, secret, downwardAPI, projected), and the newer generic ephemeral volumes. Then we move to persistent storage, where the real depth lives: the PersistentVolume (PV) and PersistentVolumeClaim (PVC) pair, every field on each, the four access modes (RWO, ROX, RWX, RWOP), the three reclaim policies, volumeMode, and how a claim binds to a volume. From there we cover the StorageClass and dynamic provisioning — how Kubernetes creates disks on demand — the CSI (Container Storage Interface) model that drives all modern storage, and the day-two operations that matter: volume snapshots, cloning, and online resize. We finish with subPath and the link to StatefulSet volumeClaimTemplates. By the end you will understand every field you are likely to set, why it is there, and the gotcha that bites people who set it wrong.

Learning objectives

By the end of this lesson you can:

Prerequisites & where this fits

You need a working local cluster and basic comfort with kubectl and YAML — if Pods and Deployments are still new, read Pods, ReplicaSets, Deployments & Services first, and make sure you have a free local cluster running per What Is Kubernetes?. It also helps (but is not required) to have seen ConfigMaps & Secrets, because two of the ephemeral volume types simply mount those objects. This is the storage foundation of the Kubernetes Zero-to-Hero course: every stateful lesson later on — StatefulSets, the Postgres operator, CSI snapshots at scale — assumes you know the material here cold. After this lesson you will move on to Ingress, controllers and TLS.

Core concepts: the storage mental model

Before the fields, fix four ideas in your head. They explain everything that follows.

1. A volume is mounted into a Pod, not into a container. You declare volumes in spec.volumes at the Pod level, then each container mounts the ones it needs via volumeMounts. This is precisely how containers in the same Pod share files: they mount the same volume.

2. There are two lifetimes — and that is the whole taxonomy. An ephemeral volume lives and dies with the Pod (some die with the container). A persistent volume lives independently of any Pod: delete the Pod, the data stays; a new Pod can mount it again. Almost every storage decision starts with “does this data need to survive the Pod?”

3. Persistent storage uses a claim-check pattern. Application authors do not want to know whether the cluster runs on AWS EBS, Google Persistent Disk, Ceph, or an NFS server. So Kubernetes splits the concern in two: the PersistentVolume (PV) is the actual piece of storage (the cluster/admin concern), and the PersistentVolumeClaim (PVC) is a request for storage (the app author’s concern). A Pod references a PVC by name; Kubernetes binds that claim to a suitable PV. It is the same idea as a coat-check: you hand over a claim ticket, the system finds your coat.

4. Modern storage is plugged in via CSI. Kubernetes itself does not know how to create an AWS disk or talk to NetApp. A CSI driver, a vendor-written plugin, does that; Kubernetes just calls a standard interface. Every piece of “magic” you will see (a PVC turning into a real cloud disk in seconds) is a StorageClass pointing at a CSI provisioner.

Jargon check. Provisioning means creating the underlying storage. Static provisioning = an admin creates PVs by hand in advance. Dynamic provisioning = Kubernetes creates a PV automatically the moment a PVC asks for one, using a StorageClass. Dynamic is what you will use 95% of the time.

Ephemeral volumes: storage tied to the Pod

Ephemeral volumes need no PV or PVC — you declare them inline in the Pod spec and they exist only as long as the Pod (or, for some, the container) does. Here are all the ones you will use.

emptyDir — scratch space shared in a Pod

An emptyDir is created empty when the Pod is assigned to a node and exists as long as that Pod runs on that node. It is deleted permanently when the Pod is removed from the node. It is the canonical way for two containers in one Pod to share files, and for a single container to get scratch space.

apiVersion: v1
kind: Pod
metadata:
  name: scratch-demo
spec:
  containers:
    - name: writer
      image: busybox
      command: ["sh", "-c", "echo hello > /data/file && sleep 3600"]
      volumeMounts:
        - name: cache
          mountPath: /data
  volumes:
    - name: cache
      emptyDir:
        sizeLimit: 1Gi        # optional cap; eviction if exceeded
        medium: ""            # "" = node disk (default); "Memory" = tmpfs (RAM)

| Field | What it does | Values | Default | Gotcha |
|---|---|---|---|---|
| medium | Where the dir is backed | "" (node disk) or "Memory" | "" | Memory is a tmpfs in RAM: fast, but counts against the container’s memory limit and is lost on reboot. |
| sizeLimit | Caps total size | quantity (e.g. 1Gi) | unlimited | Exceeding it makes the Pod a candidate for eviction, not a hard write error. |

An emptyDir survives a container crash or restart within the same Pod (the data lives at Pod level), but not a Pod reschedule. Use it for caches, scratch space, and sharing between sidecars, never for data you must keep.
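The RAM-backed variant is worth seeing once, because of the memory-limit interaction noted in the table. The following is a minimal sketch; the Pod name, sizes, and the dd command are illustrative assumptions, not part of the original lesson.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: tmpfs-demo                 # hypothetical name
spec:
  containers:
    - name: worker
      image: busybox
      # Write 64 MiB into the tmpfs-backed directory, then idle.
      command: ["sh", "-c", "dd if=/dev/zero of=/cache/blob bs=1M count=64 && sleep 3600"]
      resources:
        limits:
          memory: 256Mi            # tmpfs pages are charged against this limit
      volumeMounts:
        - name: cache
          mountPath: /cache
  volumes:
    - name: cache
      emptyDir:
        medium: Memory             # back the volume with tmpfs (RAM)
        sizeLimit: 128Mi           # assumed value; keep it well below the memory limit
```

Keeping sizeLimit comfortably under the container's memory limit leaves headroom for the process itself, since files written to /cache and the application's own memory share the same budget.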

configMap and secret — injecting configuration as files

These mount the keys of a ConfigMap or Secret as files. Each top-level key becomes a filename; the value becomes the file contents. This is how you ship config files and credentials into a container without baking them into the image.

volumes:
  - name: app-config
    configMap:
      name: my-config        # the ConfigMap to mount
      defaultMode: 0644       # file permissions (octal); default 0644
      optional: false         # if true, Pod starts even when ConfigMap is missing
      items:                  # optional: project only specific keys, rename them
        - key: app.properties
          path: conf/app.properties   # relative to mountPath
  - name: app-secret
    secret:
      secretName: my-secret
      defaultMode: 0400       # secrets often 0400 (owner-read only)
      optional: true

| Field | Applies to | What it does | Gotcha |
|---|---|---|---|
| defaultMode | both | Default permission bits for projected files (octal, e.g. 0644) | When fsGroup or non-root users are involved, modes interact with ownership; secrets that scripts must read sometimes need 0444/0440. |
| items | both | Select a subset of keys and rename/relocate them | If you use items, only the listed keys appear; keys you forget are silently absent. |
| optional | both | Pod may start even if the object is missing | Default is false for these in a volume, so a missing ConfigMap/Secret blocks Pod start. |

A subtle but important behaviour: mounted ConfigMaps and Secrets are updated in place when the source object changes (eventually — via kubelet sync, typically tens of seconds), except when mounted with subPath (covered later) or when the object is marked immutable. Values consumed as environment variables, by contrast, are not live-updated — only the volume form is.
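For reference, marking an object immutable is a single top-level field on the ConfigMap (Secrets accept the same field). This is a small sketch; the object name and the key contents are made up for illustration.

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: my-config-v2          # hypothetical; a fresh name per config revision
immutable: true               # data can no longer be edited in place
data:
  app.properties: |
    log.level=info
```

Because an immutable object cannot be edited, a config change means creating a new object (here with a new name) and pointing the Pod template at it, which also gives you an explicit rollout instead of a silent in-place update.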

downwardAPI — exposing Pod metadata as files

The downward API lets a container read information about itself — labels, annotations, the Pod name, namespace, resource limits — as files (or env vars). Useful when an app needs its own identity without calling the API server.

volumes:
  - name: podinfo
    downwardAPI:
      items:
        - path: "labels"
          fieldRef:
            fieldPath: metadata.labels
        - path: "cpu_limit"
          resourceFieldRef:
            containerName: app
            resource: limits.cpu
            divisor: "1m"

You can expose metadata.name, metadata.namespace, metadata.uid, metadata.labels, metadata.annotations via fieldRef, and requests/limits for cpu/memory/ephemeral-storage via resourceFieldRef. Labels and annotations exposed via a volume are updated live when they change; the same data exposed as env vars is fixed at start.
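To see the fragment above in context, here is a sketch of a complete Pod that mounts the downward API volume and prints the files. The Pod name, labels, CPU limit, and mount path are assumptions chosen for the example.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: podinfo-demo               # hypothetical name
  labels:
    app: demo
    tier: backend
spec:
  containers:
    - name: app                    # must match containerName below
      image: busybox
      command: ["sh", "-c", "cat /etc/podinfo/labels; echo; cat /etc/podinfo/cpu_limit; sleep 3600"]
      resources:
        limits:
          cpu: 500m
      volumeMounts:
        - name: podinfo
          mountPath: /etc/podinfo
  volumes:
    - name: podinfo
      downwardAPI:
        items:
          - path: labels           # one file listing all Pod labels
            fieldRef:
              fieldPath: metadata.labels
          - path: cpu_limit        # CPU limit expressed in millicores
            resourceFieldRef:
              containerName: app
              resource: limits.cpu
              divisor: 1m
```

Reading the container log (kubectl logs podinfo-demo) should show the label list followed by the CPU limit expressed in units of the divisor.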

projected — combining several sources into one directory

A projected volume merges multiple sources — configMap, secret, downwardAPI, and serviceAccountToken — into a single directory. The killer feature is serviceAccountToken, which mounts a short-lived, audience-scoped, auto-rotated token for the Pod’s ServiceAccount. This is the modern, secure way Pods authenticate to the API server (and the pattern external secret stores build on).

volumes:
  - name: combined
    projected:
      defaultMode: 0644               # octal mode applied to the projected files
      sources:
        - serviceAccountToken:
            path: token
            audience: vault           # who the token is for
            expirationSeconds: 3600   # min 600; kubelet rotates before expiry
        - configMap:
            name: app-config
        - secret:
            name: app-secret
        - downwardAPI:
            items:
              - path: "namespace"
                fieldRef:
                  fieldPath: metadata.namespace

Generic ephemeral volumes — per-Pod volumes with full storage features

Sometimes you want scratch space that is bigger than node disk allows, on a specific StorageClass, or even snapshottable — but still tied to the Pod lifetime. That is a generic ephemeral volume: it dynamically provisions a real volume (via a StorageClass and CSI driver) that is created when the Pod starts and deleted when the Pod is removed. It gives you the power of persistent storage with ephemeral semantics.

spec:
  containers:
    - name: app
      image: busybox
      command: ["sh", "-c", "sleep 3600"]
      volumeMounts:
        - name: scratch
          mountPath: /scratch
  volumes:
    - name: scratch
      ephemeral:
        volumeClaimTemplate:
          metadata:
            labels: { type: scratch }
          spec:
            accessModes: ["ReadWriteOnce"]
            storageClassName: "fast-ssd"
            resources:
              requests:
                storage: 20Gi

Behind the scenes Kubernetes creates a PVC named <pod-name>-<volume-name>, owned by the Pod, so it is garbage-collected automatically when the Pod dies. Contrast with CSI ephemeral inline volumes (csi: in volumes), which let specialised drivers (e.g. secrets-store CSI) inject ephemeral data directly — those do not use a PVC at all and are driver-specific.

Container Storage Interface ephemeral note. There are two distinct “ephemeral + CSI” things: generic ephemeral volumes (above, use any provisioner, full PVC features) and CSI ephemeral inline volumes (driver-provided, lightweight, e.g. mounting secrets). Reach for generic ephemeral when you want normal storage that just happens to be Pod-scoped.
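For contrast with the volumeClaimTemplate form, a CSI ephemeral inline volume looks roughly like the sketch below. It assumes the Secrets Store CSI Driver is installed in the cluster and that a SecretProviderClass named my-provider exists; that name, the Pod name, and the mount path are hypothetical.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: inline-csi-demo              # hypothetical name
spec:
  containers:
    - name: app
      image: busybox
      command: ["sh", "-c", "ls /mnt/secrets && sleep 3600"]
      volumeMounts:
        - name: secrets
          mountPath: /mnt/secrets
          readOnly: true
  volumes:
    - name: secrets
      csi:                           # inline CSI volume: no PVC, no StorageClass
        driver: secrets-store.csi.k8s.io
        readOnly: true
        volumeAttributes:            # opaque, driver-specific settings
          secretProviderClass: my-provider   # assumed to exist already
```

Note that there is no size, access mode, or class here: everything under volumeAttributes is interpreted by the driver alone, which is why this form is driver-specific.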

PersistentVolume and PersistentVolumeClaim

Now the core. A PersistentVolume (PV) is a cluster resource representing a real piece of storage; a PersistentVolumeClaim (PVC) is a namespaced request that binds to a PV. Pods reference the PVC, never the PV directly.

The PersistentVolume spec — every field

Here is a statically-defined PV (e.g. an admin wiring up an existing NFS export or a pre-created cloud disk). With dynamic provisioning you rarely write these by hand, but you must be able to read one.

apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv-data
spec:
  capacity:
    storage: 100Gi                 # how much storage this PV offers
  volumeMode: Filesystem           # Filesystem (default) | Block
  accessModes:
    - ReadWriteOnce                # how it can be mounted (see access modes)
  persistentVolumeReclaimPolicy: Retain   # Retain | Delete | Recycle(deprecated)
  storageClassName: ""             # "" = no class; matched by PVCs asking for ""
  mountOptions:                    # passed to the mount command (driver-dependent)
    - hard
    - nfsvers=4.1
  nodeAffinity:                    # restrict which nodes can use it (local/zonal)
    required:
      nodeSelectorTerms:
        - matchExpressions:
            - key: topology.kubernetes.io/zone
              operator: In
              values: ["eu-west-1a"]
  nfs:                             # the actual storage backend (one of many)
    server: 10.0.0.10
    path: /exports/data

| Field | What it does | Values | Notes / gotcha |
|---|---|---|---|
| capacity.storage | Size the PV advertises | quantity (100Gi) | A PVC binds only if PV capacity ≥ the request. Statically, exact-fit is wise; dynamically, the PV is created at the requested size. |
| volumeMode | Filesystem vs raw block | Filesystem or Block | Block exposes a raw device (no filesystem) for DBs that manage their own; mount via volumeDevices, not volumeMounts. |
| accessModes | How it may be mounted | RWO / ROX / RWX / RWOP | A PV may list several modes, but a volume is mounted with only one mode at a time, and the backend must actually support it. |
| persistentVolumeReclaimPolicy | What happens to data when the PVC is deleted | Retain, Delete, or Recycle | Dynamically-provisioned PVs inherit this from the StorageClass (default Delete). |
| storageClassName | Class this PV belongs to | string or "" | Must match the PVC’s storageClassName for binding. "" ≠ unset. |
| mountOptions | Extra mount flags | list | Not validated by Kubernetes; an invalid option fails the mount at attach time. |
| nodeAffinity | Which nodes can access it | node selector | Required for local and zonal volumes so the scheduler co-locates the Pod with the disk. |
| backend (nfs, csi, hostPath, …) | The actual storage source | one block | Exactly one. Modern PVs use csi:; hostPath is single-node/testing only; in-tree types like awsElasticBlockStore are deprecated in favour of CSI. |

A PV moves through phases you will see in kubectl get pv: Available (free, unbound), Bound (matched to a PVC), Released (the PVC was deleted but the PV is not yet reclaimed — common with Retain), and Failed (automatic reclamation failed). A Released PV is not automatically reusable: with Retain you must manually scrub the data and clear spec.claimRef to make it Available again.

The PersistentVolumeClaim spec — every field

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: data-claim
spec:
  accessModes:
    - ReadWriteOnce
  volumeMode: Filesystem
  resources:
    requests:
      storage: 20Gi          # minimum size required
    # limits:                # optional upper bound (rarely used)
    #   storage: 20Gi
  storageClassName: fast-ssd # which StorageClass to provision from
  selector:                  # optional: bind only to PVs with these labels
    matchLabels:
      tier: gold
  volumeName: pv-data        # optional: bind to a specific PV by name
  dataSource:                # optional: clone from a PVC or restore a snapshot
    kind: VolumeSnapshot
    apiGroup: snapshot.storage.k8s.io
    name: nightly-snap

| Field | What it does | Gotcha |
|---|---|---|
| accessModes | Modes the claim needs | Must be satisfiable by the bound PV / provisioner. |
| volumeMode | Filesystem or Block | Must match the PV’s mode to bind. |
| resources.requests.storage | Minimum capacity | The PVC may bind to a larger PV (static) or provision exactly this (dynamic). To grow later, edit this field (see resize). |
| storageClassName | Class to use | Omitting it uses the cluster’s default StorageClass; setting "" explicitly disables dynamic provisioning (static binding only). These two are different! |
| selector | Label-match a specific PV | Only meaningful for static binding; a claim with a non-empty selector is not dynamically provisioned. |
| volumeName | Bind to one named PV | Pre-binding; the named PV must match modes/size or the claim stays Pending. |
| dataSource / dataSourceRef | Populate from a snapshot or clone a PVC | dataSourceRef is the newer, more general form (allows cross-namespace and custom populators). |

A Pod consumes the claim like this:

spec:
  containers:
    - name: app
      image: postgres:16
      volumeMounts:
        - name: data
          mountPath: /var/lib/postgresql/data
  volumes:
    - name: data
      persistentVolumeClaim:
        claimName: data-claim     # reference the PVC by name
        readOnly: false

Binding: how a claim finds its volume

When you create a PVC, the control plane tries to bind it:

Binding is one-to-one and exclusive — a bound PV serves exactly one PVC. If nothing matches, the PVC sits in Pending until a suitable PV appears (static) or provisioning succeeds (dynamic). kubectl describe pvc <name> shows the events that explain a stuck claim.

The “different storageClassName” trap, stated plainly. Omit storageClassName → use the default class (dynamic). Set it to a name → use that class (dynamic). Set it to "" (empty string) → no dynamic provisioning; bind only to a pre-created PV that also has "". Mixing these up is the number-one reason a PVC is unexpectedly Pending (or unexpectedly provisions a disk you did not want).
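The three cases side by side, as a sketch. The claim names and sizes are hypothetical; fast-ssd is the class name used elsewhere in this lesson.

```yaml
# 1) Field omitted: the cluster's default StorageClass provisions the volume.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: claim-default-class        # hypothetical name
spec:
  accessModes: ["ReadWriteOnce"]
  resources:
    requests:
      storage: 5Gi
---
# 2) Named class: that class provisions the volume.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: claim-named-class          # hypothetical name
spec:
  accessModes: ["ReadWriteOnce"]
  storageClassName: fast-ssd
  resources:
    requests:
      storage: 5Gi
---
# 3) Empty string: no dynamic provisioning; binds only to a pre-created
#    PV whose storageClassName is also "".
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: claim-static-only          # hypothetical name
spec:
  accessModes: ["ReadWriteOnce"]
  storageClassName: ""
  resources:
    requests:
      storage: 5Gi
```

If the third claim stays Pending, that usually just means no matching hand-made PV exists yet, which is the intended behaviour for static binding.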

Access modes: RWO, ROX, RWX, RWOP

Access modes describe how many nodes can mount a volume and in what way. They are a property of capability — the backend must support the mode you ask for; asking for ReadWriteMany on a plain block device (EBS, GCE PD) will not work because block devices attach to one node at a time.

| Mode | Short | Meaning | Typical backends | When to use |
|---|---|---|---|---|
| ReadWriteOnce | RWO | Read-write by one node (many Pods on that node may share it) | Cloud block disks (EBS, GCE PD, Azure Disk), Ceph RBD | Databases, single-writer apps; the default and most common. |
| ReadOnlyMany | ROX | Read-only by many nodes at once | NFS, CephFS, object-backed FS, pre-loaded disks | Shared read-only data (static assets, ML model artefacts). |
| ReadWriteMany | RWX | Read-write by many nodes at once | NFS, CephFS, Azure Files, EFS | Shared upload dirs, CMS media, anything multiple Pods on different nodes must write. |
| ReadWriteOncePod | RWOP | Read-write by exactly one Pod in the whole cluster | CSI drivers supporting it (beta in Kubernetes 1.27, GA in 1.29) | Strict single-writer guarantee, e.g. leader-only databases where even two Pods writing would corrupt data. |

Two clarifications people miss. First, RWO is per-node, not per-Pod: several Pods scheduled to the same node can all mount one RWO volume — which is why RWOP exists when you need a true single-Pod lock. Second, the mode is enforced by the kubelet/CSI at attach/mount time, not by Kubernetes guessing — so the underlying storage genuinely has to support concurrency for RWX/ROX.
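Requesting the single-Pod lock is just a different access mode on the claim. This is a sketch: the claim name and size are hypothetical, and it assumes the fast-ssd class is backed by a CSI driver that supports ReadWriteOncePod.

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: leader-data                # hypothetical name
spec:
  accessModes:
    - ReadWriteOncePod             # used on its own, not alongside other modes
  storageClassName: fast-ssd       # assumed to support RWOP
  resources:
    requests:
      storage: 10Gi
```

With this claim, a second Pod that references leader-data should stay Pending rather than start next to the first one, even on the same node.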

Reclaim policy and volumeMode

Reclaim policy decides what happens to the underlying storage when its PVC is deleted:

| Policy | What happens on PVC deletion | Use when |
|---|---|---|
| Delete | The PV and the backing storage (cloud disk, etc.) are deleted | Default for dynamic provisioning; fine for reproducible data. Dangerous for anything you cannot lose. |
| Retain | The PV becomes Released; data is kept; an admin must manually reclaim | Production databases and anything where accidental PVC deletion must not destroy data. |
| Recycle | (Deprecated) basic scrub (rm -rf) then made Available | Do not use; gone in favour of dynamic provisioning. |

For dynamically-provisioned volumes the policy comes from the StorageClass (reclaimPolicy), defaulting to Delete. A common production pattern is a StorageClass with reclaimPolicy: Retain for stateful data so a fat-fingered kubectl delete pvc does not wipe the disk.
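A sketch of such a class follows. The class name is hypothetical, and the provisioner and parameters assume the AWS EBS CSI driver used in this lesson's other examples; substitute your own driver's values.

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: fast-ssd-retain            # hypothetical name
provisioner: ebs.csi.aws.com       # assumption: AWS EBS CSI driver is installed
parameters:
  type: gp3
reclaimPolicy: Retain              # PVs from this class survive PVC deletion
volumeBindingMode: WaitForFirstConsumer
allowVolumeExpansion: true
```

The trade-off is operational: released volumes pile up (and, on a cloud, keep billing) until someone reviews and removes them by hand.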

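volumeMode controls the abstraction the Pod sees: Filesystem (the default) gives the container a mounted directory through volumeMounts, while Block hands it a raw, unformatted device through volumeDevices.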
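Here is a sketch of the Block case, since it is the one that differs from every other example in this lesson. The object names and the device path are hypothetical, and it assumes the fast-ssd class is backed by a driver that supports raw block volumes.

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: raw-claim                  # hypothetical name
spec:
  accessModes: ["ReadWriteOnce"]
  volumeMode: Block                # raw device, no filesystem created
  storageClassName: fast-ssd       # assumed to support block mode
  resources:
    requests:
      storage: 10Gi
---
apiVersion: v1
kind: Pod
metadata:
  name: raw-consumer               # hypothetical name
spec:
  containers:
    - name: app
      image: busybox
      command: ["sh", "-c", "ls -l /dev/xvda && sleep 3600"]
      volumeDevices:               # note: volumeDevices, not volumeMounts
        - name: raw
          devicePath: /dev/xvda    # where the device node appears in the container
  volumes:
    - name: raw
      persistentVolumeClaim:
        claimName: raw-claim
```

The container receives a device node rather than a directory, so the application (or you) is responsible for whatever layout goes on it.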

StorageClass and dynamic provisioning

A StorageClass is the template that turns a PVC into a real PV automatically. It names a provisioner (a CSI driver) and the parameters that driver needs (disk type, IOPS, filesystem, encryption keys), plus policy fields.

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: fast-ssd
  annotations:
    storageclass.kubernetes.io/is-default-class: "true"   # makes this the default
provisioner: ebs.csi.aws.com          # the CSI driver that creates volumes
parameters:                            # driver-specific (NOT validated by k8s)
  type: gp3
  iops: "5000"
  throughput: "250"
  encrypted: "true"
  csi.storage.k8s.io/fstype: ext4
reclaimPolicy: Delete                  # Delete (default) | Retain
volumeBindingMode: WaitForFirstConsumer  # Immediate | WaitForFirstConsumer
allowVolumeExpansion: true             # permit growing PVCs later
mountOptions:
  - noatime
allowedTopologies:                     # restrict where volumes are created
  - matchLabelExpressions:
      - key: topology.kubernetes.io/zone
        values: ["eu-west-1a", "eu-west-1b"]

| Field | What it does | Values | Default | When to set / gotcha |
|---|---|---|---|---|
| provisioner | Which driver creates volumes | CSI driver name (e.g. ebs.csi.aws.com, disk.csi.azure.com, pd.csi.storage.gke.io) or kubernetes.io/no-provisioner for static-only | none (required) | Must match an installed driver. no-provisioner is used for local volumes you create by hand. |
| parameters | Driver-specific options | key/values | none | Opaque to Kubernetes, so typos are caught only when provisioning fails. Check the driver’s docs. |
| reclaimPolicy | Policy stamped onto provisioned PVs | Delete or Retain | Delete | Use Retain for irreplaceable data. |
| volumeBindingMode | When binding/provisioning happens | Immediate or WaitForFirstConsumer | Immediate | Use WaitForFirstConsumer for zonal block storage so the disk is created in the zone the Pod lands in; otherwise a Pod can become unschedulable because its disk sits in a different zone from the nodes with spare capacity. |
| allowVolumeExpansion | Whether PVCs can grow | true or false | false | Must be true before you try to resize. With some drivers, enabling it on the class later does not help volumes that are already bound, so set it up front. |
| mountOptions | Mount flags for provisioned PVs | list | none | Driver/filesystem dependent. |
| allowedTopologies | Constrain placement | topology selector | none | Pin volumes to specific zones; pairs with WaitForFirstConsumer. |
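The kubernetes.io/no-provisioner case from the table can be sketched as a class plus one hand-made local PV. The names, the disk path, and the node name are hypothetical.

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: local-disks                          # hypothetical name
provisioner: kubernetes.io/no-provisioner    # nothing is created automatically
volumeBindingMode: WaitForFirstConsumer      # bind only once a Pod is scheduled
---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: local-pv-node1                       # hypothetical name
spec:
  capacity:
    storage: 50Gi
  accessModes: ["ReadWriteOnce"]
  persistentVolumeReclaimPolicy: Retain
  storageClassName: local-disks
  local:
    path: /mnt/disks/ssd1                    # assumed path of a disk mounted on the node
  nodeAffinity:                              # mandatory for local volumes
    required:
      nodeSelectorTerms:
        - matchExpressions:
            - key: kubernetes.io/hostname
              operator: In
              values: ["node-1"]             # assumed node name
```

A PVC that names local-disks then binds to this PV, and the scheduler places the consuming Pod on node-1 because of the PV's nodeAffinity.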

Setting the default class. Mark exactly one StorageClass with the annotation storageclass.kubernetes.io/is-default-class: "true"; PVCs that omit storageClassName get that class. Kubernetes does not stop you from marking two. In current releases a PVC without a class then uses the most recently created default (older releases rejected such PVCs instead), which is a classic source of “why did my PVC use the wrong disk type” confusion.

Immediate vs WaitForFirstConsumer, the single most important storage tuning knob. Immediate provisions the volume the instant the PVC is created — before any Pod is scheduled — so Kubernetes has to guess the zone, and then the scheduler must place the Pod where that disk already is. In a multi-zone cluster this routinely strands Pods (the disk is in zone A, but the only spare capacity is in zone B). WaitForFirstConsumer delays provisioning until a Pod consumes the claim, so the volume is cut in the same zone the scheduler chose. For any zonal block storage, set it.

The CSI model

The Container Storage Interface (CSI) is the standard that lets storage vendors write a single driver that works across container orchestrators. Since the in-tree volume plugins were deprecated and migrated (the “CSI migration” effort), CSI is how essentially all real storage works in modern Kubernetes.

A CSI driver typically ships as:

You interact with all of this indirectly: you install the driver (often a Helm chart), you create a StorageClass with provisioner: <driver-name>, and from then on you only ever touch PVCs. The CSIDriver object advertises how Kubernetes should treat the driver (does it need an attach step? how should fsGroup be applied? does it support ephemeral inline volumes?), and CSINode objects track which drivers each node runs. You almost never edit these, but knowing they exist explains how a PVC becomes a disk.
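For orientation, a CSIDriver object looks roughly like the sketch below. The driver name is hypothetical and the field values are illustrative; real drivers ship their own object, which you can inspect with kubectl get csidriver -o yaml.

```yaml
apiVersion: storage.k8s.io/v1
kind: CSIDriver
metadata:
  name: example.csi.vendor.io    # hypothetical driver name
spec:
  attachRequired: true           # volumes need an attach step before mounting
  podInfoOnMount: false          # do not pass Pod details to the driver on mount
  fsGroupPolicy: File            # kubelet applies fsGroup ownership to the volume
  volumeLifecycleModes:
    - Persistent                 # usable through PV/PVC
    - Ephemeral                  # usable as a CSI ephemeral inline volume
```

The metadata.name here is the same string a StorageClass puts in its provisioner field, which is how the two objects are tied together.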

Volume snapshots, cloning, and online resize

These are the day-two operations that separate “I can create a PVC” from “I can operate stateful workloads.”

Snapshots

A VolumeSnapshot is a point-in-time copy of a PVC, created through CSI. It needs a VolumeSnapshotClass (analogous to a StorageClass, naming the CSI driver and its snapshot parameters) and the snapshot CRDs + the external-snapshotter controller installed.

apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshotClass
metadata:
  name: csi-snapclass
driver: ebs.csi.aws.com
deletionPolicy: Delete          # Delete | Retain (mirrors reclaim policy)
---
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshot
metadata:
  name: nightly-snap
spec:
  volumeSnapshotClassName: csi-snapclass
  source:
    persistentVolumeClaimName: data-claim   # the PVC to snapshot

A VolumeSnapshot (namespaced, the user’s request) binds to a VolumeSnapshotContent (cluster-scoped, the actual snapshot) — exactly mirroring the PVC↔PV pattern. You then restore by creating a new PVC with dataSource pointing at the snapshot (shown earlier).
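A standalone restore claim, as a sketch: the claim name is hypothetical, while nightly-snap and fast-ssd are the names used in this lesson's examples. It assumes the class belongs to the same CSI driver that took the snapshot.

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: data-restored              # hypothetical name
spec:
  storageClassName: fast-ssd
  accessModes: ["ReadWriteOnce"]
  resources:
    requests:
      storage: 20Gi                # at least the size of the snapshotted volume
  dataSource:
    apiGroup: snapshot.storage.k8s.io
    kind: VolumeSnapshot
    name: nightly-snap             # snapshot in the same namespace
```

The original data-claim is untouched; the restore always produces a new volume, so you can verify the data before repointing any workload at it.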

Cloning

A clone creates a new, independent PVC pre-populated from an existing PVC (no snapshot needed), if the CSI driver supports it. Same dataSource mechanism, kind: PersistentVolumeClaim:

spec:
  storageClassName: fast-ssd
  accessModes: ["ReadWriteOnce"]
  resources:
    requests:
      storage: 20Gi
  dataSource:
    kind: PersistentVolumeClaim
    name: data-claim          # source PVC to clone

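The clone must live in the same namespace as the source, use the same volumeMode, and request at least the source’s size. Whether a clone may use a different StorageClass depends on the driver, so keeping the same class is the safe choice. Cloning is great for spinning up a copy of production data for a test environment.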

Online resize (volume expansion)

To grow a volume, the StorageClass must have allowVolumeExpansion: true. Then you simply increase spec.resources.requests.storage on the PVC and apply. With modern CSI drivers the disk expands and the filesystem grows online — no Pod restart needed. You can only ever grow, never shrink. If a filesystem expansion needs the node, the PVC carries a FileSystemResizePending condition until the Pod next mounts it.

kubectl patch pvc data-claim -p '{"spec":{"resources":{"requests":{"storage":"40Gi"}}}}'

subPath: mounting one file or sub-directory

By default a volume mount replaces the entire contents of mountPath. subPath lets you mount just one sub-directory or file of a volume into a path, leaving the rest of the container’s directory intact. The classic use is mounting a single config file into /etc without hiding everything else there, or giving two containers different sub-directories of one shared volume.

volumeMounts:
  - name: app-config
    mountPath: /etc/myapp/app.conf
    subPath: app.conf              # mount only this key/file
  - name: data
    mountPath: /var/lib/db
    subPath: postgres             # mount the "postgres" sub-dir of the volume

subPathExpr is a variant that lets you build the sub-path from environment variables (e.g. per-Pod directories using the downward-API Pod name).
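A sketch of the per-Pod directory pattern follows. The Pod name, the POD_NAME variable, and the paths are hypothetical; an emptyDir is used only to keep the example self-contained.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: subpath-expr-demo          # hypothetical name
spec:
  containers:
    - name: app
      image: busybox
      command: ["sh", "-c", "echo started > /logs/app.log && sleep 3600"]
      env:
        - name: POD_NAME           # filled in from the downward API
          valueFrom:
            fieldRef:
              fieldPath: metadata.name
      volumeMounts:
        - name: logs
          mountPath: /logs
          subPathExpr: $(POD_NAME) # expands to a directory named after the Pod
  volumes:
    - name: logs
      emptyDir: {}
```

The pattern becomes useful when the volume is shared by several Pods (for example an RWX claim), because each Pod then writes into its own sub-directory instead of colliding on one path.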

The big subPath gotcha. A volume mounted with subPath does NOT receive live updates when the source ConfigMap/Secret changes — unlike a normal mount. If you rely on hot-reloading config, mount the whole volume (no subPath) and point your app at the specific file, or restart the Pod on config change. Historically subPath also had CVEs around symlink traversal; keep your kubelet patched.

StatefulSets and volumeClaimTemplates

A Deployment’s Pods are interchangeable, so they cannot each own a stable, individual disk. A StatefulSet can — via volumeClaimTemplates. Each replica (web-0, web-1, …) gets its own PVC, created from the template, with a stable name (<claim>-<statefulset>-<ordinal>), and that PVC follows the Pod across reschedules. This is how you run replicated databases where each member keeps its own data.

apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: web
spec:
  serviceName: web
  replicas: 3
  selector:
    matchLabels: { app: web }
  template:
    metadata:
      labels: { app: web }
    spec:
      containers:
        - name: nginx
          image: nginx
          volumeMounts:
            - name: data
              mountPath: /usr/share/nginx/html
  volumeClaimTemplates:            # each Pod gets its own PVC from this
    - metadata:
        name: data
      spec:
        accessModes: ["ReadWriteOnce"]
        storageClassName: fast-ssd
        resources:
          requests:
            storage: 1Gi

Two behaviours to remember. First, the per-Pod PVCs are not deleted when you scale down or delete the StatefulSet by default — so scaling 3→1 then 1→3 re-attaches the same data to web-1 and web-2. (The newer persistentVolumeClaimRetentionPolicy field can opt into deletion on scale-down/delete if you want that.) Second, pair volumeClaimTemplates with a StorageClass using WaitForFirstConsumer so each replica’s disk is provisioned in the zone its Pod is scheduled to.
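The retention policy mentioned above is set on the StatefulSet spec. The sketch below reuses the web example and adds only that field; the chosen values are illustrative, and the field needs a Kubernetes version in which it is enabled.

```yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: web
spec:
  serviceName: web
  replicas: 3
  persistentVolumeClaimRetentionPolicy:
    whenDeleted: Retain            # keep PVCs if the StatefulSet is deleted (the default)
    whenScaled: Delete             # remove a replica's PVC when it is scaled away
  selector:
    matchLabels: { app: web }
  template:
    metadata:
      labels: { app: web }
    spec:
      containers:
        - name: nginx
          image: nginx
          volumeMounts:
            - name: data
              mountPath: /usr/share/nginx/html
  volumeClaimTemplates:
    - metadata:
        name: data
      spec:
        accessModes: ["ReadWriteOnce"]
        storageClassName: fast-ssd
        resources:
          requests:
            storage: 1Gi
```

With whenScaled: Delete, scaling 3→1 and back to 3 gives web-1 and web-2 fresh, empty volumes, so only choose it when each replica can rebuild its data.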

[Figure: Kubernetes storage: volumes, PV, PVC, StorageClass]

The diagram traces the full path: a Pod mounts a PVC, which binds to a PV; for dynamic provisioning the StorageClass drives a CSI provisioner that creates the real disk on demand, while ephemeral volumes (top) live and die with the Pod.

Hands-on lab

Everything below runs on a free local cluster. We will use kind, whose default StorageClass (standard, backed by the local-path-provisioner) supports dynamic provisioning — so you can practise PV/PVC/StorageClass mechanics without any cloud account.

1. Create a cluster and confirm the default StorageClass.

kind create cluster --name storage-lab
kubectl get storageclass
# NAME                 PROVISIONER             RECLAIMPOLICY   VOLUMEBINDINGMODE      ...
# standard (default)   rancher.io/local-path   Delete          WaitForFirstConsumer   ...

2. emptyDir sharing between two containers in one Pod.

cat <<'EOF' | kubectl apply -f -
apiVersion: v1
kind: Pod
metadata: { name: shared }
spec:
  containers:
    - name: writer
      image: busybox
      command: ["sh","-c","echo 'from writer' > /shared/msg && sleep 3600"]
      volumeMounts: [{ name: scratch, mountPath: /shared }]
    - name: reader
      image: busybox
      command: ["sh","-c","sleep 3600"]
      volumeMounts: [{ name: scratch, mountPath: /shared }]
  volumes:
    - name: scratch
      emptyDir: {}
EOF
kubectl wait --for=condition=Ready pod/shared --timeout=60s
kubectl exec shared -c reader -- cat /shared/msg     # -> from writer

The reader sees the writer’s file: that is volume sharing inside a Pod.

3. Dynamic provisioning — a PVC that becomes a PV automatically.

cat <<'EOF' | kubectl apply -f -
apiVersion: v1
kind: PersistentVolumeClaim
metadata: { name: data-claim }
spec:
  accessModes: ["ReadWriteOnce"]
  resources: { requests: { storage: 1Gi } }
  # no storageClassName -> uses the default class
EOF

kubectl get pvc data-claim          # may show Pending with WaitForFirstConsumer

With kind’s local-path class (which uses WaitForFirstConsumer), the PVC stays Pending until a Pod uses it — exactly the behaviour we discussed.

4. Consume the PVC from a Pod and watch it bind.

cat <<'EOF' | kubectl apply -f -
apiVersion: v1
kind: Pod
metadata: { name: app }
spec:
  containers:
    - name: app
      image: busybox
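      # Write the state file only if it is missing, so a recreated Pod keeps the old value.
      command: ["sh","-c","[ -f /data/state ] || echo persisted-$(date +%s) > /data/state; sleep 3600"]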
      volumeMounts: [{ name: data, mountPath: /data }]
  volumes:
    - name: data
      persistentVolumeClaim: { claimName: data-claim }
EOF

kubectl wait --for=condition=Ready pod/app --timeout=90s
kubectl get pvc data-claim          # now Bound
kubectl get pv                      # a PV was created automatically
kubectl exec app -- cat /data/state # shows your value

5. Prove persistence across a Pod delete.

VAL=$(kubectl exec app -- cat /data/state)
kubectl delete pod app
# recreate the same Pod spec (re-run the apply from step 4)
kubectl wait --for=condition=Ready pod/app --timeout=90s
kubectl exec app -- cat /data/state    # SAME value as $VAL -> data survived

The Pod was destroyed and recreated, yet the data is intact — because it lives on the PV, not in the Pod.

6. (Optional) Online resize. kind’s local-path class does not support expansion, so this is read-only learning: on a cloud cluster you would set allowVolumeExpansion: true on the class, then kubectl patch pvc data-claim -p '{"spec":{"resources":{"requests":{"storage":"2Gi"}}}}' and watch the volume grow without restarting the Pod.

Validation. You should have seen: an emptyDir shared between containers; a PVC go Pending → Bound; a PV created on demand; and data survive a Pod deletion. If your PVC is stuck Pending, run kubectl describe pvc data-claim and read the events.

Cleanup.

kubectl delete pod shared app --ignore-not-found
kubectl delete pvc data-claim --ignore-not-found
kind delete cluster --name storage-lab

Cost note. Entirely free: kind runs in Docker on your laptop and provisions volumes as directories on the node. On a real cloud, every dynamically provisioned PVC is backed by a billable disk. With reclaimPolicy: Delete the disk disappears when you delete the PVC; with Retain the disk keeps billing until you delete it manually.

Common mistakes & troubleshooting

| Symptom | Likely cause | Fix |
| --- | --- | --- |
| PVC stuck Pending forever | No matching PV (static) / no default StorageClass / storageClassName: "" set by mistake | kubectl describe pvc; set a valid class or mark a default; remove the empty "" if you meant dynamic. |
| Pod Pending, events say “node(s) had volume node affinity conflict” | Volume provisioned in one zone (Immediate), Pod must run elsewhere | Use volumeBindingMode: WaitForFirstConsumer on the StorageClass. |
| Pod ContainerCreating, “Multi-Attach error” | An RWO volume is being attached to a second node before the first detaches (e.g. fast reschedule) | Wait for detach; ensure only one node mounts RWO; use RWOP to enforce a single writer Pod, or RWX if you truly need multi-node access. |
| Resize edit ignored | StorageClass has allowVolumeExpansion: false, or driver lacks support, or you tried to shrink | Set allowVolumeExpansion: true (before creating); only grow, never shrink. |
| Mounted ConfigMap not updating in the container | Mounted with subPath, or marked immutable, or value used as env var | Mount without subPath; for env vars, restart the Pod. |
| Deleting a PVC wiped the data unexpectedly | StorageClass reclaimPolicy: Delete (the default) | Use Retain for important data; back up with snapshots. |
| RWX PVC won’t mount on a block-storage class | Block disks (EBS/GCE PD/Azure Disk) cannot do ReadWriteMany | Use a file/shared backend (NFS, EFS, Azure Files, CephFS). |
| Released PV won’t rebind | Retain policy leaves claimRef set | Scrub the data, then kubectl edit pv to clear spec.claimRef, returning it to Available. |
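Several of the fixes listed above come down to one kubectl call each. The helpers below are a minimal sketch; the function names are hypothetical, and they assume kubectl is already pointed at the right cluster and namespace.

```shell
# pvc_events NAME: show the events recorded for a PVC (first stop when it is Pending).
pvc_events() {
  kubectl get events \
    --field-selector "involvedObject.kind=PersistentVolumeClaim,involvedObject.name=$1"
}

# retain_pv NAME: switch an existing PV to Retain so deleting its PVC keeps the disk.
retain_pv() {
  kubectl patch pv "$1" -p '{"spec":{"persistentVolumeReclaimPolicy":"Retain"}}'
}

# unbind_released_pv NAME: clear claimRef so a Released PV returns to Available.
# Scrub the old data first; the next claim sees whatever is left on the volume.
unbind_released_pv() {
  kubectl patch pv "$1" --type merge -p '{"spec":{"claimRef":null}}'
}
```

For example, pvc_events data-claim prints the provisioning events for the lab's claim, and unbind_released_pv followed by a PV name from kubectl get pv frees a Released volume for a new claim.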

Best practices

Security notes

Interview & exam questions

  1. What is the difference between a PV and a PVC? A PV is the actual storage resource (admin/cluster concern); a PVC is a namespaced request for storage (app concern). Pods reference the PVC, which binds to a PV one-to-one.

  2. Static vs dynamic provisioning? Static: an admin pre-creates PVs and PVCs bind to matching ones. Dynamic: a StorageClass + CSI provisioner creates a PV automatically when a PVC asks. Dynamic is the norm.

  3. Explain the four access modes. RWO = read-write by one node; ROX = read-only by many nodes; RWX = read-write by many nodes; RWOP = read-write by exactly one Pod cluster-wide. RWO is per-node (multiple Pods on the same node can share), which is why RWOP exists for strict single-writer.

  4. What does volumeBindingMode: WaitForFirstConsumer solve? It delays volume provisioning until a Pod is scheduled, so the volume is created in the same zone/node as the Pod — preventing unschedulable Pods when Immediate provisions in the wrong zone.

  5. What are the reclaim policies and what do they do? Delete removes the PV and backing storage when the PVC is deleted (default for dynamic); Retain keeps the data and marks the PV Released for manual reclaim; Recycle is deprecated.

  6. Difference between omitting storageClassName, setting it to a name, and setting it to ""? Omit → default class (dynamic); a name → that class (dynamic); "" → disable dynamic provisioning, bind only to a static PV that also has "".

  7. What is CSI and why does it matter? The Container Storage Interface is the standard plugin API for storage; vendors ship one driver (controller Deployment + node DaemonSet + community sidecars) and Kubernetes drives it through StorageClasses. In-tree plugins are deprecated/migrated to CSI.

  8. How do you grow a volume, and what are the constraints? Ensure the StorageClass has allowVolumeExpansion: true, then increase spec.resources.requests.storage on the PVC. Modern CSI does it online; you can only grow, never shrink.

  9. Why and how do StatefulSets use volumeClaimTemplates? So each replica gets its own stable, individually-named PVC that follows the Pod across reschedules — essential for replicated databases. PVCs persist on scale-down by default.

  10. When would you use volumeMode: Block? When the app wants a raw device with no filesystem (high-performance databases managing their own I/O). The volume is exposed to the container through volumeDevices (with a devicePath) rather than mounted through volumeMounts.

  11. What is the subPath update gotcha? Volumes mounted with subPath do not receive live ConfigMap/Secret updates; mount the whole volume if you need hot reload.

  12. What is a generic ephemeral volume vs an emptyDir? Both are Pod-lifetime, but a generic ephemeral volume is dynamically provisioned through a StorageClass/CSI (so it can be large, on specific media, snapshottable) whereas emptyDir is plain node disk or RAM.
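Question 10, and the RWOP part of question 3, can be made concrete with a manifest sketch. All object names and the device path are hypothetical, and the sketch assumes a CSI driver that supports both raw block volumes and ReadWriteOncePod; kind's local-path class is not expected to, so read this as illustration rather than something to run in the lab.

```shell
# Write a raw-block PVC and a Pod that consumes it as a device (not applied).
cat > raw-block.yaml <<'EOF'
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: raw-claim                      # hypothetical name
spec:
  accessModes: ["ReadWriteOncePod"]    # exactly one Pod cluster-wide may use it
  volumeMode: Block                    # raw device, no filesystem
  resources:
    requests:
      storage: 1Gi
---
apiVersion: v1
kind: Pod
metadata:
  name: raw-consumer                   # hypothetical name
spec:
  containers:
    - name: app
      image: busybox
      command: ["sh", "-c", "ls -l /dev/xvda && sleep 3600"]
      volumeDevices:                   # note: volumeDevices, not volumeMounts
        - name: data
          devicePath: /dev/xvda        # hypothetical device path inside the container
  volumes:
    - name: data
      persistentVolumeClaim:
        claimName: raw-claim
EOF
```

The pairing to notice is volumeMode: Block on the claim with volumeDevices and devicePath on the container; a Filesystem volume would use volumeMounts and mountPath instead.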

Quick check

  1. A Pod must keep data across rescheduling to another node. emptyDir or PVC?
  2. Your StorageClass uses Immediate and Pods are unschedulable across zones. What one field do you change, and to what?
  3. Three Pods on different nodes must all write to one shared volume. Which access mode, and which kind of backend?
  4. You delete a PVC and its cloud disk vanishes. Which StorageClass field caused this, and what value would have preserved it?
  5. You edit a mounted ConfigMap but the container never sees the change. Name two reasons.

Answers

  1. PVC. emptyDir dies with the Pod; a PVC binds to a PV that persists independently.
  2. volumeBindingMode: WaitForFirstConsumer on the StorageClass, so the volume is provisioned in the Pod’s zone.
  3. RWX (ReadWriteMany) on a shared/file backend (NFS, CephFS, Azure Files, EFS) — block disks cannot do RWX.
  4. reclaimPolicy: Delete (the default); Retain would have kept the disk.
  5. It was mounted with subPath, the ConfigMap is immutable, or the value is consumed as an environment variable (env vars are not live-updated). (Any two.)
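Answer 5 is easy to observe for yourself. The sketch below mounts one ConfigMap twice in the same container, once as a whole volume and once through subPath; the object names and mount paths are hypothetical.

```shell
# Write a ConfigMap plus a Pod that mounts it both ways (not applied).
cat > configmap-mounts.yaml <<'EOF'
apiVersion: v1
kind: ConfigMap
metadata:
  name: app-config                     # hypothetical name
data:
  app.conf: "level=info"
---
apiVersion: v1
kind: Pod
metadata:
  name: config-demo                    # hypothetical name
spec:
  containers:
    - name: app
      image: busybox
      command: ["sh", "-c", "sleep 3600"]
      volumeMounts:
        - name: cfg
          mountPath: /etc/live               # whole volume: refreshed after edits
        - name: cfg
          mountPath: /etc/frozen/app.conf
          subPath: app.conf                  # subPath: keeps the value from Pod start
  volumes:
    - name: cfg
      configMap:
        name: app-config
EOF
```

After applying it, change the ConfigMap, wait a minute or so for the kubelet to sync, and compare kubectl exec config-demo -- cat /etc/live/app.conf with the same command on /etc/frozen/app.conf; only the first should show the new value.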

Exercise

On a local kind cluster, build a tiny stateful app end to end:

  1. Create a StorageClass named lab-retain that uses the cluster’s local-path provisioner, with reclaimPolicy: Retain and volumeBindingMode: WaitForFirstConsumer.
  2. Create a StatefulSet with 2 replicas and a volumeClaimTemplates entry using lab-retain, each Pod writing its own hostname into a file on its volume.
  3. Confirm two distinct PVCs (...-0, ...-1) were created and bound, and that each Pod’s file contains its own ordinal name.
  4. Delete the StatefulSet and observe that the PVCs and PVs remain (StatefulSet PVCs are not auto-deleted, and the reclaim policy is Retain).
  5. Recreate the StatefulSet and confirm each Pod re-attaches its original data.
  6. Clean up: delete the StatefulSet, the leftover PVCs, then any Released PVs, then the cluster. Note in a sentence why the PVs needed manual deletion.
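If you get stuck, one possible starting point for steps 1 and 2 is sketched below. The names lab and data and the file name are hypothetical, and rancher.io/local-path is an assumption about the provisioner kind ships; confirm it with kubectl get storageclass first.

```shell
# Write a starter manifest for the exercise (not applied).
cat > lab-exercise.yaml <<'EOF'
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: lab-retain
provisioner: rancher.io/local-path     # assumption: kind's local-path provisioner
reclaimPolicy: Retain
volumeBindingMode: WaitForFirstConsumer
---
apiVersion: v1
kind: Service
metadata:
  name: lab                            # headless Service referenced by serviceName
spec:
  clusterIP: None
  selector: { app: lab }
  ports:
    - port: 80
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: lab
spec:
  serviceName: lab
  replicas: 2
  selector:
    matchLabels: { app: lab }
  template:
    metadata:
      labels: { app: lab }
    spec:
      containers:
        - name: app
          image: busybox
          command: ["sh", "-c", "hostname > /data/who && sleep 3600"]
          volumeMounts:
            - name: data
              mountPath: /data
  volumeClaimTemplates:
    - metadata:
        name: data
      spec:
        accessModes: ["ReadWriteOnce"]
        storageClassName: lab-retain
        resources:
          requests:
            storage: 1Gi
EOF
```

With these names the claims would be called data-lab-0 and data-lab-1, and kubectl exec lab-0 -- cat /data/who is one way to check step 3.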

Certification mapping

Glossary

Next steps

Next, learn how to expose your stateful and stateless apps to the outside world in Kubernetes Ingress, In Depth: Controllers, Rules, TLS, IngressClass & the Gateway API. To go deeper on the storage operations introduced here — snapshots across regions, cloning at scale, and topology-aware provisioning — read CSI Volume Snapshots, Cloning, Resize & Topology, and to put persistent storage to work in a real database, see the StatefulSet Postgres operator lesson.
