Containers & Docker Basics: Images, Layers, and Registries

Before you can run anything on Kubernetes, you need the thing Kubernetes actually runs: a container. This is lesson 1 of the Kubernetes Zero-to-Hero course, and it deliberately starts one level below Kubernetes — with Docker, images, and registries — because every confusing thing you’ll hit later (“ImagePullBackOff”, “it works on my laptop but not in the cluster”, “why is my image 1.2 GB?”) traces straight back to the concepts here.

By the end you’ll know what a container really is (and how it differs from a virtual machine), the difference between an image and a container, the full Dockerfile → build → image → registry → run lifecycle, and why layer caching and base-image choice make builds fast or slow. Then you’ll build, run, and inspect a real image on your own laptop — free, no cloud account required.

Learning objectives

By the end of this lesson you will be able to:

Explain what a container is in plain terms, and how it differs from a virtual machine (and where namespaces and cgroups fit in).
Distinguish an image (the read-only template) from a container (a running instance of that image).
Read a simple Dockerfile and trace the full lifecycle: Dockerfile → docker build → image (layers) → registry → docker run.
Reason about tags, base images, and layer caching, and explain why instruction order changes build speed.
Identify the common registries (Docker Hub, GHCR, ACR/ECR/Artifact Registry) and what an image reference like ghcr.io/acme/web:1.4.2 means.
Build, run, inspect, and clean up a container image locally.

Prerequisites & where this fits

You need only basic comfort with a terminal — cd, ls, editing a small text file — and Docker Desktop (macOS/Windows) or Docker Engine / Podman (Linux) installed. No Kubernetes, no cloud account, no prior container experience is assumed.

This is the first lesson in the Kubernetes Fundamentals module of the Kubernetes Zero-to-Hero course. Everything Kubernetes orchestrates is a container, so we build that foundation here. The next lesson, What Is Kubernetes? Control Plane, Nodes, etcd & the kubelet, introduces the cluster that schedules and runs these images at scale.

What a container actually is (vs a virtual machine)

A container packages an application with everything it needs to run — code, runtime, libraries, system tools — into one isolated, portable unit that behaves the same on your laptop, a CI runner, and a production server. That portability (“build once, run anywhere”) is the whole point.

The usual starting comparison is a virtual machine (VM). A VM virtualizes hardware: a hypervisor (VMware, Hyper-V, KVM) splits a physical machine into VMs, each running its own full operating system — its own kernel, its own boot process. That’s powerful but heavy: gigabytes in size, tens of seconds to boot.

A container virtualizes the operating system instead. All containers on a host share the host’s Linux kernel — there’s no second OS to boot. Each container just gets its own isolated view of the system (its own filesystem, process tree, and network), so it feels like it owns the machine while really being a set of normal processes behind strong fences. The result: megabytes, not gigabytes, and startup in milliseconds.

That isolation comes from two Linux kernel features. Docker sets them up for you, but knowing the names demystifies a lot:

Namespaces provide isolation — a container’s own private view of a resource: the process tree (it sees its app as PID 1, not host processes), the network (its own interfaces and ports), mounts (its own filesystem root), and more. Namespaces are what a container can see.
cgroups (control groups) provide limits — caps on CPU, memory, and I/O so one container can’t starve the others. cgroups are how much a container can use.

In one line: a container is a normal process, boxed in by namespaces (its view of the world) and cgroups (its resource budget), sharing the host kernel.
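
On a Linux host you can see both mechanisms from any process, containerized or not. The sketch below is illustrative only: it assumes a Linux /proc filesystem (elsewhere it simply returns empty results), and the function names are made up for this example. It lists the namespaces and cgroup the current process belongs to, which is the kind of bookkeeping Docker sets up for a container.

```python
import os
from pathlib import Path


def namespace_ids(pid="self"):
    """Map each namespace type (pid, net, mnt, ...) to its kernel identifier."""
    ns_dir = Path("/proc") / str(pid) / "ns"
    ids = {}
    try:
        for entry in sorted(ns_dir.iterdir()):
            # Each entry is a symlink whose target looks like "pid:[4026531836]".
            ids[entry.name] = os.readlink(entry)
    except OSError:
        # No /proc (non-Linux) or no permission: report nothing rather than fail.
        pass
    return ids


def cgroup_membership(pid="self"):
    """Return the raw lines of /proc/<pid>/cgroup (empty list if unavailable)."""
    try:
        return (Path("/proc") / str(pid) / "cgroup").read_text().splitlines()
    except OSError:
        return []


if __name__ == "__main__":
    for name, ident in namespace_ids().items():
        print(f"namespace {name:10s} {ident}")
    for line in cgroup_membership():
        print("cgroup   ", line)
```

If you run it once on the host and once inside a container whose image includes Python, differing identifiers for the same namespace type indicate that the two processes live in separate namespaces.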

	Virtual machine	Container
Isolates by virtualizing	Hardware (own kernel + OS)	The OS (shares host kernel)
Typical size	Gigabytes	Megabytes
Start time	Tens of seconds	Milliseconds
Isolation strength	Stronger (separate kernel)	Strong, but shared kernel
Density per host	Tens	Hundreds to thousands

The trade-off is real: sharing the host kernel makes the isolation boundary thinner than a VM’s. That’s fine for most workloads and is exactly what makes containers cheap and fast, but it’s also why dropping privileges and (at the high end) sandboxed runtimes such as gVisor or Kata Containers exist. We touch on security at the end of this lesson.

Images vs containers: template vs running instance

This is the most common point of confusion for beginners, so let’s nail it.

An image is a read-only template — an immutable snapshot of a filesystem plus a little metadata (which command to run, which ports to expose). It does nothing on its own; it just sits there, like a .iso file or a class definition in code.

A container is a running instance of an image. docker run takes that read-only template, adds a thin writable layer on top, and starts the process inside. You can start many containers from one image — like creating many objects from one class — and each gets its own writable layer, so one container’s changes don’t affect the image or the others.

Image	Container
Read-only template	Running (or stopped) instance
Immutable; built once	Has a writable top layer
Like a class / blueprint	Like an object / a process
Stored in a registry	Lives on a host while it runs
`docker build` produces it	`docker run` creates it

A practical consequence: containers are ephemeral. Anything written inside a running container’s writable layer disappears when the container is removed. That’s by design — it’s why we keep state in volumes and databases, not inside containers — and it’s a foundational idea you’ll meet again the moment you learn about Pods in Kubernetes.
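
The class/object analogy can be made concrete with a toy model. This is only a sketch of the idea, not how Docker stores files: the image is a read-only mapping of paths to contents, and each container is a fresh writable dict layered over it with Python’s ChainMap. The names IMAGE and start_container are invented for this example.

```python
from collections import ChainMap
from types import MappingProxyType

# The image: a read-only mapping of path -> file content.
IMAGE = MappingProxyType({
    "/usr/share/nginx/html/index.html": "<h1>Hello</h1>",
})


def start_container(image):
    """Return a container view: a fresh writable layer over the read-only image."""
    writable_layer = {}
    # ChainMap looks keys up in writable_layer first, then falls back to image.
    # All writes go to the first mapping only, so the image is never modified.
    return ChainMap(writable_layer, image)


if __name__ == "__main__":
    a = start_container(IMAGE)
    b = start_container(IMAGE)
    a["/tmp/scratch.txt"] = "only in container a"
    a["/usr/share/nginx/html/index.html"] = "<h1>Edited in a</h1>"

    print(b["/usr/share/nginx/html/index.html"])  # still the image's version
    print("/tmp/scratch.txt" in b)                 # False
    del a  # like removing the container: its writable layer is gone for good
```

Container a’s edits never reach the image or container b, and discarding a discards its changes, which is the ephemerality described above.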

The lifecycle: Dockerfile → build → image → registry → run

Here is the path every container takes, from a text file on your machine to a running process. This is the backbone of the whole lesson.

Figure: The Docker image lifecycle. A Dockerfile is built into a layered, read-only image, pushed to a registry, then pulled and run as a container.

As the diagram shows, a Dockerfile (your recipe) is turned by docker build into a layered, read-only image; that image is pushed to a registry (a shared store); later, any host pulls the image and runs it as a container. Let’s walk each stage.

1. The Dockerfile (the recipe)

A Dockerfile is a plain-text file of instructions describing how to assemble an image. A minimal one for a tiny static site looks like this:

# syntax=docker/dockerfile:1
# 1. Start FROM a base image.
FROM nginx:1.27-alpine
# 2. Add our content.
COPY index.html /usr/share/nginx/html/index.html
# 3. Document the port the app listens on.
EXPOSE 80
# nginx's own image already defines the start command (CMD), so we inherit it.
# (Comments need their own line; a trailing "# ..." is read as arguments.)

The instructions you’ll meet first:

FROM — the base image you build on (every image starts FROM something).
COPY (and ADD) — copy files from your project into the image.
RUN — run a command at build time (e.g. install packages); each RUN bakes its result into a layer.
EXPOSE — documents which port the app listens on (metadata, not a firewall rule).
CMD / ENTRYPOINT — the command that runs when the container starts. This build-time vs run-time split trips up newcomers: RUN happens once while building; CMD happens every time you start a container.
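
If the build-time vs run-time split feels abstract, here is a small analogy in Python. It is a toy model under obvious assumptions (the functions build, run, install_packages, and start_app are invented for illustration and have nothing to do with Docker’s internals): RUN steps execute once when the image is built, while CMD is only recorded and then executed each time a container starts.

```python
def build(run_steps, cmd):
    """Build time: execute every RUN step once, and only record CMD."""
    filesystem = {}
    for step in run_steps:
        step(filesystem)
    return {"filesystem": filesystem, "cmd": cmd}


def run(image):
    """Run time: give the container its own copy of the files, then execute CMD."""
    container_fs = dict(image["filesystem"])
    return image["cmd"](container_fs)


calls = {"RUN": 0, "CMD": 0}


def install_packages(fs):
    """Stands in for a RUN instruction such as a package install."""
    calls["RUN"] += 1
    fs["/usr/bin/tool"] = "<binary>"


def start_app(fs):
    """Stands in for the CMD: it can rely on what RUN left behind."""
    calls["CMD"] += 1
    return "/usr/bin/tool" in fs


if __name__ == "__main__":
    image = build([install_packages], start_app)
    for _ in range(3):
        run(image)
    print(calls)  # {'RUN': 1, 'CMD': 3}
```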

2. `docker build` → an image made of layers

docker build executes the Dockerfile top to bottom and produces an image. The crucial detail: each instruction that changes the filesystem (RUN, COPY, ADD) creates a new read-only layer, stacked on the one before, while metadata instructions such as EXPOSE and CMD only record configuration and add no files. An image is therefore an ordered stack of layers, each a diff (a set of filesystem changes) over the previous one.

┌─────────────────────────────┐  ← COPY index.html   (your content)
├─────────────────────────────┤  ← nginx config / binaries
├─────────────────────────────┤  ← Alpine base packages
└─────────────────────────────┘  ← FROM nginx:1.27-alpine (base)

Layers are content-addressed and shared between images. If ten of your images all start FROM node:20, that base is stored once on the host and reused — saving disk and download time.
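
“Content-addressed” simply means a layer’s name is a hash of its bytes, so identical layers collapse into one stored copy. The sketch below models that idea with SHA-256 and an in-memory dict; LayerStore and store_image are hypothetical names, and the byte strings are placeholders rather than real layer archives.

```python
import hashlib


class LayerStore:
    """Stores each layer once, keyed by the SHA-256 digest of its bytes."""

    def __init__(self):
        self.blobs = {}

    def add(self, content: bytes) -> str:
        digest = "sha256:" + hashlib.sha256(content).hexdigest()
        self.blobs.setdefault(digest, content)  # no-op if already stored
        return digest


def store_image(store, layers):
    """Store an image's layers and return an ordered list of their digests."""
    return [store.add(layer) for layer in layers]


if __name__ == "__main__":
    store = LayerStore()
    base = b"placeholder bytes for a shared base layer"
    web = store_image(store, [base, b"web app files"])
    api = store_image(store, [base, b"api app files"])
    print(web[0] == api[0])   # True: same bytes, same digest
    print(len(store.blobs))   # 3 blobs stored, not 4
```

Two images with four layers in total occupy three blobs, because the shared base is stored once.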

3. Push to a registry

A built image lives only on the machine that built it until you push it to a registry — a server that stores and distributes images — making it available to teammates, CI, and clusters.

4. Pull and run

On any other host, docker run (or, later, Kubernetes) will pull the image if it isn’t already present, then start a container from it. Same image, same behavior, anywhere.

Keep this loop in your head: author the Dockerfile, build the image, push to a registry, pull & run as a container. When something breaks in Kubernetes later, you’ll almost always be debugging one of these four stages.

Tags, base images, and layer caching

These three ideas separate someone who “can build an image” from someone who builds good images.

Tags: naming and versioning images

An image reference has the shape registry/repository:tag, e.g. ghcr.io/acme/web:1.4.2:

registry — where it lives (ghcr.io); if omitted, Docker assumes Docker Hub.
repository — the image’s name/namespace (acme/web).
tag — a version label (1.4.2); if omitted, Docker assumes latest.
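
To make the defaults tangible, here is a simplified parser for image references. Treat it as a sketch rather than Docker’s actual implementation: it assumes the common rule that the first path component is a registry only if it contains a dot or a colon or equals localhost, and that single-name repositories on Docker Hub live under library/. ImageRef and parse_image_reference are names invented for this example.

```python
from typing import NamedTuple, Optional


class ImageRef(NamedTuple):
    registry: str
    repository: str
    tag: Optional[str]
    digest: Optional[str]


def parse_image_reference(ref: str) -> ImageRef:
    # A digest, if present, follows "@".
    digest = None
    if "@" in ref:
        ref, digest = ref.split("@", 1)

    # The first component is a registry only if it looks like a host name.
    first, slash, rest = ref.partition("/")
    if slash and ("." in first or ":" in first or first == "localhost"):
        registry, remainder = first, rest
    else:
        registry, remainder = "docker.io", ref

    # A tag is whatever follows a colon located after the last slash.
    tag = None
    colon = remainder.rfind(":")
    if colon > remainder.rfind("/"):
        remainder, tag = remainder[:colon], remainder[colon + 1:]

    # Docker Hub's official images live in the "library" namespace.
    if registry == "docker.io" and "/" not in remainder:
        remainder = "library/" + remainder
    # No tag and no digest means the default tag.
    if tag is None and digest is None:
        tag = "latest"
    return ImageRef(registry, remainder, tag, digest)


if __name__ == "__main__":
    for example in ("ghcr.io/acme/web:1.4.2", "nginx", "localhost:5000/kv-web:1.0"):
        print(example, "->", parse_image_reference(example))
```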

A tag is just a movable label, not a guaranteed version. latest is the classic trap: it doesn’t mean “newest and stable,” it’s simply the default tag and can be re-pointed to different content over time. In production, pin specific tags (1.4.2) or, for true immutability, pin by digest (web@sha256:...), which refers to exact bytes and can never move. Treat latest as “unspecified.”
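
The difference between a tag and a digest is easy to see in a toy registry. This sketch is an assumption-laden model (ToyRegistry and its methods are hypothetical, and real registries store manifests and layers, not single byte strings), but the core behavior is the same: a tag is a pointer that a later push can move, while a digest is derived from the content itself.

```python
import hashlib


def digest_of(image_bytes: bytes) -> str:
    return "sha256:" + hashlib.sha256(image_bytes).hexdigest()


class ToyRegistry:
    def __init__(self):
        self.by_digest = {}  # digest -> content (immutable by construction)
        self.tags = {}       # tag -> digest (a movable pointer)

    def push(self, tag: str, image_bytes: bytes) -> str:
        digest = digest_of(image_bytes)
        self.by_digest[digest] = image_bytes
        self.tags[tag] = digest  # silently re-points an existing tag
        return digest

    def pull(self, ref: str) -> bytes:
        digest = ref if ref.startswith("sha256:") else self.tags[ref]
        return self.by_digest[digest]


if __name__ == "__main__":
    registry = ToyRegistry()
    monday = registry.push("latest", b"build from Monday")
    registry.push("latest", b"build from Friday")
    print(registry.pull("latest"))  # b'build from Friday': the tag moved
    print(registry.pull(monday))    # b'build from Monday': the digest did not
```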

Base images: what you build FROM

Your base image dictates your image’s size, attack surface, and which tools are available inside. Common choices:

Base	Size (rough)	When to use
`ubuntu` / `debian`	~70–120 MB	Familiar, lots of packages, easy debugging
`*-slim` (e.g. `python:3.12-slim`)	~40–80 MB	Trimmed distro; a sensible default
`alpine`	~5–10 MB	Tiny; uses musl libc (occasional compatibility quirks)
`distroless` / `scratch`	near-zero	No shell or package manager; smallest + most secure for compiled apps

Smaller is generally better — less to download, fewer packages that can carry vulnerabilities — but it also means fewer debugging tools inside, so it’s a trade-off. Prefer official images and pin a real version tag (python:3.12-slim, not python:latest).

Layer caching: why instruction order matters

Because layers are diffs, Docker caches them. On a rebuild, Docker reuses a cached layer as long as that instruction, the files it copies in (for COPY and ADD), and everything before it are unchanged. The moment one instruction or one of its input files changes, that layer and every layer after it must be rebuilt.

The practical rule: put rarely-changing things early, frequently-changing things late. The classic case is dependency install vs. copying source:

# syntax=docker/dockerfile:1
FROM node:20-slim
WORKDIR /app

# 1. Copy ONLY the dependency manifests first, then install.
#    These files change rarely, so this expensive layer stays cached.
COPY package.json package-lock.json ./
RUN npm ci

# 2. Copy the source LAST. It changes on every commit, but it's a
#    cheap layer, and the npm ci step above is reused from cache.
COPY . .
CMD ["node", "server.js"]

If you instead did COPY . . before RUN npm ci, then any one-character change to your source would invalidate the cache and force a full reinstall of dependencies on every build — slow and wasteful. Ordering your Dockerfile for the cache is one of the highest-leverage habits in containerization.
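
You can reason about this with a small model of the cache. The sketch below is a simplification, not BuildKit’s real algorithm: it assumes each step’s cache key is a hash of the previous step’s key, the instruction text, and a fingerprint of any files the step copies. All function names are invented for this example.

```python
import hashlib


def cache_keys(steps):
    """steps: list of (instruction, input_fingerprint). One chained key per step."""
    keys, parent = [], ""
    for instruction, inputs in steps:
        parent = hashlib.sha256(f"{parent}\n{instruction}\n{inputs}".encode()).hexdigest()
        keys.append(parent)
    return keys


def rebuilt(previous_steps, current_steps):
    """Instructions in current_steps that cannot be served from the previous build."""
    cached = set(cache_keys(previous_steps))
    return [
        instruction
        for (instruction, _), key in zip(current_steps, cache_keys(current_steps))
        if key not in cached
    ]


def manifests_first(manifests, source):
    return [
        ("FROM node:20-slim", ""),
        ("WORKDIR /app", ""),
        ("COPY package.json package-lock.json ./", manifests),
        ("RUN npm ci", ""),
        ("COPY . .", manifests + source),
        ('CMD ["node", "server.js"]', ""),
    ]


def source_first(manifests, source):
    return [
        ("FROM node:20-slim", ""),
        ("WORKDIR /app", ""),
        ("COPY . .", manifests + source),
        ("RUN npm ci", ""),
        ('CMD ["node", "server.js"]', ""),
    ]


if __name__ == "__main__":
    # Same dependencies, one source edit between the two builds.
    print(rebuilt(manifests_first("deps-v1", "src-v1"), manifests_first("deps-v1", "src-v2")))
    print(rebuilt(source_first("deps-v1", "src-v1"), source_first("deps-v1", "src-v2")))
```

With manifests copied first, only COPY . . and the metadata-only CMD step miss the cache. With the source copied first, RUN npm ci is invalidated too, which is the slow part.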

Registries: where images live

A container registry is a server that stores, versions, and distributes images. You push to it and pull from it. The ones you’ll encounter:

Registry	Who runs it	Typical use
Docker Hub (`docker.io`)	Docker	The default; home of most official base images
GHCR (`ghcr.io`)	GitHub	Images built from GitHub repos / Actions
Amazon ECR	AWS	Private images for workloads on AWS
Azure Container Registry (ACR)	Microsoft Azure	Private images for workloads on Azure
Google Artifact Registry	Google Cloud	Private images for workloads on GCP

Registries can be public (anyone can pull) or private (pull requires authentication). The cloud registries (ECR/ACR/Artifact Registry) hold a team’s own application images close to where they run; Docker Hub is where you pull base images. One note for later: Docker Hub rate-limits anonymous pulls, a real cause of ImagePullBackOff errors you’ll meet in a few lessons.

Hands-on lab

You’ll build a tiny image, run it, inspect its layers, and clean up. Everything here is free and runs entirely on your laptop: no cloud account, no cluster yet. We skip pushing to a remote registry (to stay zero-cost and zero-signup); optional step 7 shows the same push/pull flow against a local registry container.

0. Verify Docker is working

docker version
docker run --rm hello-world

Expected (trimmed): the hello-world container prints a confirmation message, including:

Hello from Docker!
This message shows that your installation appears to be working correctly.

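If you use Podman, substitute podman for docker in the commands below. The CLI is designed to be compatible, though output formatting and a few details (for example, pushing to the plain-HTTP local registry in step 7) may differ.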

1. Create a tiny project

mkdir kv-docker-lab && cd kv-docker-lab

cat > index.html <<'EOF'
<!doctype html>
<h1>Hello from KloudVin 🐳</h1>
<p>Served by nginx inside a container.</p>
EOF

cat > Dockerfile <<'EOF'
# syntax=docker/dockerfile:1
FROM nginx:1.27-alpine
COPY index.html /usr/share/nginx/html/index.html
EXPOSE 80
EOF

2. Build the image

docker build -t kv-web:1.0 .

Expected output ends with lines like:

 => => writing image sha256:...
 => => naming to docker.io/library/kv-web:1.0

You just turned a Dockerfile into a tagged image, kv-web:1.0. Confirm it exists:

docker images kv-web

REPOSITORY   TAG   IMAGE ID       CREATED         SIZE
kv-web       1.0   a1b2c3d4e5f6   2 seconds ago   53.2MB

3. Run a container from the image

docker run -d --name kv-web -p 8080:80 kv-web:1.0

-d runs it detached (in the background).
--name kv-web gives the container a friendly name.
-p 8080:80 maps port 8080 on your laptop to port 80 inside the container.

Now visit http://localhost:8080 in a browser, or:

curl -s http://localhost:8080

Expected:

<!doctype html>
<h1>Hello from KloudVin 🐳</h1>
<p>Served by nginx inside a container.</p>

You now have a running container (an instance) created from an image (the template). Confirm it’s running:

docker ps

CONTAINER ID   IMAGE        COMMAND                  CREATED         STATUS         PORTS                  NAMES
f9e8d7c6b5a4   kv-web:1.0   "/docker-entrypoint.…"   6 seconds ago   Up 5 seconds   0.0.0.0:8080->80/tcp   kv-web

4. Inspect the layers

See exactly how the image was assembled, layer by layer:

docker history kv-web:1.0

IMAGE          CREATED          CREATED BY                                      SIZE
a1b2c3d4e5f6   2 minutes ago    EXPOSE map[80/tcp:{}]                           0B
<missing>      2 minutes ago    COPY index.html /usr/share/nginx/html/index…   141B
<missing>      3 weeks ago      /bin/sh -c #(nop)  CMD ["nginx" "-g" "daemon…   0B
...
<missing>      3 weeks ago      /bin/sh -c #(nop) ADD file:... in /            8.4MB

The top two lines are yours: EXPOSE (0B, because it only records metadata) and, below it, your COPY layer (tiny, just your HTML). Everything beneath came from the nginx:1.27-alpine base image. <missing> in the IMAGE column is normal; it only means no separate intermediate image is stored locally for that row, which is the case for pulled base layers and for steps built with BuildKit. Exact wording in the CREATED BY column varies by Docker version. This is the layer stack from earlier, made real.

To see layer caching pay off, rebuild without changing anything:

docker build -t kv-web:1.0 .

The COPY step now reports CACHED and the build finishes almost instantly, proof that unchanged instructions reuse cached layers.

5. Look inside the running container (optional)

docker exec -it kv-web sh
# now you're inside the container:
ls /usr/share/nginx/html
cat /etc/os-release   # note: this is Alpine, from the base image
exit

This demonstrates the isolated filesystem view from the namespaces discussion — inside the container, you see its world, not your laptop’s.

6. Validation

You’re done with the core lab if all of these are true:

docker images kv-web lists kv-web:1.0.
docker ps shows the kv-web container Up.
curl http://localhost:8080 returns your HTML.
docker history kv-web:1.0 shows your COPY layer on top of the nginx base.
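
If you would rather script the HTTP check than eyeball curl output, a few lines of standard-library Python will do. This is a minimal sketch: page_contains is a made-up helper, and the URL and marker text are the ones used in this lab.

```python
import http.client
import urllib.error
import urllib.request


def page_contains(url, needle, timeout=3.0):
    """Return True if an HTTP GET of url succeeds and the body contains needle."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            body = response.read().decode("utf-8", errors="replace")
    except (urllib.error.URLError, OSError, http.client.HTTPException):
        # Connection refused, timeout, HTTP error status, or a malformed reply.
        return False
    return needle in body


if __name__ == "__main__":
    ok = page_contains("http://localhost:8080", "Hello from KloudVin")
    print("container is serving the page" if ok else "page not reachable")
```

It returns False rather than raising when nothing is listening, so it is safe to run before the container is up.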

7. (Optional) Push to a local registry — still free

If you want to feel the push/pull half of the lifecycle without any cloud signup, run a registry as a container on your own machine:

docker run -d -p 5000:5000 --name registry registry:2     # a local registry
docker tag kv-web:1.0 localhost:5000/kv-web:1.0           # retag for that registry
docker push localhost:5000/kv-web:1.0                     # push
docker rm -f kv-web                                       # an image still used by a container can't be removed
docker rmi kv-web:1.0 localhost:5000/kv-web:1.0           # remove local copies
docker pull localhost:5000/kv-web:1.0                     # pull it back

That’s the exact tag → push → pull flow you’d use with Docker Hub or a cloud registry, just pointed at localhost.

8. Cleanup

docker rm -f kv-web 2>/dev/null        # stop + remove the container (if still present)
docker rmi kv-web:1.0 2>/dev/null      # remove the image (this tag is already gone if you ran step 7)
docker rm -f registry 2>/dev/null      # remove the local registry (if you ran step 7)
docker rmi registry:2 2>/dev/null      # and its image
docker rmi localhost:5000/kv-web:1.0 2>/dev/null
cd .. && rm -rf kv-docker-lab          # remove the project folder
docker system prune -f                 # reclaim dangling layers/build cache; also removes ALL stopped containers on this machine

Cost note: Free / local. Docker Desktop (personal use), Podman, and a local registry container all run on your own machine at no charge. Nothing in this lab provisions a cloud resource, so there is nothing to bill and nothing to leave running. The docker system prune -f at the end reclaims any leftover disk space.

Common mistakes & troubleshooting

Symptom	Likely cause	Fix
`docker: Cannot connect to the Docker daemon`	Docker Desktop / the daemon isn’t running	Start Docker Desktop (or `sudo systemctl start docker` on Linux) and retry
`port is already allocated` on `docker run -p 8080:80`	Something already uses host port 8080	Pick another host port, e.g. `-p 8081:80`, or stop the other process
Edited `index.html` but the page didn’t change	The running container still uses the old image	Rebuild (`docker build`) and recreate the container (`docker rm -f` then `docker run`) — running containers don’t auto-update
Every build reinstalls dependencies (slow)	`COPY . .` placed before the install step, busting the cache	Copy dependency manifests first, `RUN` the install, then `COPY` the source last
Image is surprisingly large	Heavy base image, or files copied in that aren’t needed	Use a `-slim`/`alpine`/distroless base and add a `.dockerignore`
`denied: requested access to the resource is denied` on push	Not logged in, or wrong repository/registry name	`docker login <registry>` and verify the `registry/repo:tag` reference
`latest` pulled something unexpected	`latest` is just the default tag and can move	Pin a real version tag (`1.4.2`) or a digest (`@sha256:...`)

Best practices

Pin real tags, never rely on latest. Use python:3.12-slim, not python:latest. In production, consider pinning by digest for full immutability.
Order instructions for the cache: rarely-changing steps (base, dependency install) first; frequently-changing steps (your source) last.
Choose the smallest base that’s still debuggable. -slim is a great default; reach for alpine/distroless when size and attack surface matter most.
Add a .dockerignore (excluding .git, node_modules, build output, .env) so you don’t ship junk into the build context.
One concern per image. Build the app into the image; keep configuration and state out (inject config via environment/volumes). This pays off directly when you reach Kubernetes ConfigMaps and Secrets.
Prefer official and verified base images over random third-party ones.
Keep images stateless. The writable layer is ephemeral; never treat it as durable storage.

Security notes

The kernel is shared, so reduce privilege. Don’t run as root inside the container when you can avoid it — add a USER instruction to drop to a non-root user. A compromised root process in a container is more dangerous precisely because the kernel is shared with the host.
Smaller base = smaller attack surface. Distroless/slim images contain fewer packages, and fewer packages mean fewer potential vulnerabilities. Scan images (e.g. docker scout, Trivy, Grype) before shipping.
Never bake secrets into images. Anything COPY’d in lives in the image layers, and anything set with ENV is recorded in the image configuration; either can be extracted by anyone who pulls the image, even if a later layer “deletes” it. Pass secrets at runtime instead.
Pull from trusted registries and pin versions; treat latest from an unknown source as untrusted.
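
Why doesn’t deleting a secret in a later layer help? Because layers are diffs, a deletion only hides the file from the final view; the earlier layer still contains it. The toy model below illustrates this under simplifying assumptions: a layer is a dict of path to content, None marks a deletion (real layers use whiteout files), and the key shown is an obvious placeholder. The function names are invented for this example.

```python
def visible_files(layers):
    """Apply layers bottom-to-top: the filesystem a container would see."""
    view = {}
    for layer in layers:
        for path, content in layer.items():
            if content is None:
                view.pop(path, None)  # a deletion hides the file from here up
            else:
                view[path] = content
    return view


def find_in_any_layer(layers, path):
    """Return (layer_index, content) for every layer that still holds the file."""
    return [
        (index, layer[path])
        for index, layer in enumerate(layers)
        if layer.get(path) is not None
    ]


LAYERS = [
    {"/app/server.js": "console.log('hi')"},
    {"/app/.env": "API_KEY=example-not-a-real-key"},  # COPY .env (the mistake)
    {"/app/.env": None},                              # RUN rm .env (the "fix")
]

if __name__ == "__main__":
    print("/app/.env" in visible_files(LAYERS))      # False: hidden at runtime
    print(find_in_any_layer(LAYERS, "/app/.env"))    # still present in layer 1
```

The safer habit is to keep the file out of the build entirely, for example with a .dockerignore entry, and to supply the value when the container starts.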

Quick check

In one sentence, what’s the core difference between a container and a virtual machine?
What’s the difference between an image and a container?
In the reference ghcr.io/acme/web:1.4.2, name each of the three parts.
Why does putting COPY . . before RUN npm ci slow your builds down?
What does the tag latest actually mean, and why shouldn’t you depend on it in production?

Answers

A VM virtualizes hardware and runs its own full OS/kernel; a container virtualizes the OS and shares the host kernel, isolated by namespaces and limited by cgroups — making it far smaller and faster to start.
An image is a read-only template (a blueprint); a container is a running instance created from that image, with its own thin writable layer. One image, many containers.
ghcr.io is the registry, acme/web is the repository (name/namespace), and 1.4.2 is the tag (the version label).
Because each instruction is a cached layer that’s invalidated when it or anything before it changes. Copying source first means any code change busts the cache for the npm ci step below it, forcing a full dependency reinstall every build. Copy manifests first, install, then copy source.
latest is simply the default tag used when you don’t specify one — not a guarantee of “newest” or “stable.” It can be re-pointed to different content, so builds become non-reproducible; pin a real version tag or a digest instead.

Exercise

Take a tiny program in any language you like (a one-file Python, Node, or Go “hello” works well) and containerize it from scratch:

Write a Dockerfile that starts from an appropriate official base image, copies your file in, and sets a CMD to run it.
Build it as myapp:1.0 and run it.
Run docker history myapp:1.0 and identify which layer is yours versus the base image’s.
Make a deliberate one-line change to your source and rebuild. Note in the build output which steps were CACHED and which were rebuilt, and confirm the result matched your expectation.
Then reorder the Dockerfile to copy dependency files before source (if your language has dependencies), rebuild twice, and observe the cache behaviour improve.
Clean up everything (docker rm -f, docker rmi, docker system prune -f) and confirm with docker images and docker ps that nothing is left.

Bonus: replace your base image with a -slim or alpine variant and compare docker images sizes before and after.

Interview questions

Q: What’s the difference between a container and a VM, and when would you still choose a VM? A: A VM virtualizes hardware and runs a full guest OS with its own kernel via a hypervisor — strong isolation, but heavy (GBs, slow boot). A container shares the host kernel and is isolated by namespaces/cgroups — lightweight (MBs, instant start) and dense. You’d still choose a VM when you need a different OS kernel, the strongest isolation boundary (e.g. running untrusted multi-tenant code without a sandboxed runtime), or kernel-level features a container can’t get.

Q: Explain the difference between an image and a container. A: An image is an immutable, read-only template — a stack of filesystem layers plus metadata. A container is a runtime instance of an image with an added writable layer and a running process. You can start many containers from one image; the image is the class, the container is the object.

Q: What is an image layer, and why does it matter for build performance? A: Each Dockerfile instruction that changes the filesystem (RUN, COPY, ADD) creates a read-only layer that’s a diff over the previous one; metadata instructions such as CMD or EXPOSE add no files. An image is an ordered stack of these layers. Layers are cached and shared. Build performance depends on ordering: Docker reuses cached layers until the first changed instruction, then rebuilds it and everything after. Putting stable steps (base, dependency install) early and volatile steps (source copy) late maximizes cache hits.

Q: Why is relying on the latest tag considered an anti-pattern? A: latest is just the default tag, not a stable channel — it can be moved to point at different image content, so the “same” reference can yield different bytes over time. That breaks reproducibility and makes rollbacks unreliable. Pin explicit version tags, and pin by digest (@sha256:...) when you need guaranteed immutability.

Q: What’s a container registry, and what kinds have you used? A: A registry stores and distributes images; you push to it and pull from it. Public ones like Docker Hub host base/official images; private ones like Amazon ECR, Azure ACR, Google Artifact Registry, and GHCR hold an organization’s own images near where they run. Access can be public or authenticated, and rate limits (e.g. Docker Hub anonymous pulls) can affect deployments.

Q: How do you keep an image small and secure? A: Start from a minimal, official, version-pinned base (-slim/alpine/distroless); use multi-stage builds so build tools don’t ship to runtime; add a .dockerignore; run as a non-root USER; never bake secrets into layers; and scan the image for vulnerabilities before shipping.

Certification mapping

This lesson maps to the KCNA (Kubernetes and Cloud Native Associate) exam — the entry-level, multiple-choice certification that’s the natural first goal for this course.

KCNA → Cloud Native Architecture / Container Orchestration: understanding what containers are, images vs containers, and the role of container runtimes and registries is foundational KCNA material. KCNA expects conceptual fluency (no hands-on terminal tasks), so the mental models in this lesson — containers vs VMs, the image lifecycle, registries — are exactly what’s tested.
It also lays groundwork for the hands-on certs later in the course: CKAD (developer) and CKA (administrator) both assume you’re comfortable with images and docker build basics before you ever touch kubectl. Those exams themselves are covered in later lessons.

Glossary

Container — an isolated, portable unit packaging an app and its dependencies, run as a process that shares the host kernel.
Image — a read-only, layered template from which containers are created.
Layer — a read-only filesystem diff produced by one Dockerfile instruction; images are stacks of layers.
Dockerfile — the text recipe of instructions used to build an image.
Base image — the image you build FROM; it forms the foundation of your image.
Tag — a movable label identifying a version of an image (e.g. 1.4.2); defaults to latest.
Digest — a content hash (sha256:...) that identifies exact image bytes immutably.
Registry — a server that stores and distributes images (Docker Hub, GHCR, ECR, ACR, Artifact Registry).
Namespaces — Linux kernel feature giving a container its own isolated view of resources (processes, network, filesystem).
cgroups — Linux kernel feature that limits a container’s CPU, memory, and I/O usage.

Next steps

You can now package and run software as containers — the unit Kubernetes is built to orchestrate. Next, meet the system that runs these images across many machines:

Next lesson: What Is Kubernetes? Control Plane, Nodes, etcd & the kubelet