Containerization Fundamentals

Containers & Docker Basics: Images, Layers, and Registries

Before you can run anything on Kubernetes, you need the thing Kubernetes actually runs: a container. This is lesson 1 of the Kubernetes Zero-to-Hero course, and it deliberately starts one level below Kubernetes — with Docker, images, and registries — because every confusing thing you’ll hit later (“ImagePullBackOff”, “it works on my laptop but not in the cluster”, “why is my image 1.2 GB?”) traces straight back to the concepts here.

By the end you’ll know what a container really is (and how it differs from a virtual machine), the difference between an image and a container, the full Dockerfile → build → image → registry → run lifecycle, and why layer caching and base-image choice make builds fast or slow. Then you’ll build, run, and inspect a real image on your own laptop — free, no cloud account required.

Learning objectives

By the end of this lesson you will be able to:

Prerequisites & where this fits

You need only basic comfort with a terminal — cd, ls, editing a small text file — and Docker Desktop (macOS/Windows) or Docker Engine / Podman (Linux) installed. No Kubernetes, no cloud account, no prior container experience is assumed.

This is the first lesson in the Kubernetes Fundamentals module of the Kubernetes Zero-to-Hero course. Everything Kubernetes orchestrates is a container, so we build that foundation here. The next lesson, What Is Kubernetes? Control Plane, Nodes, etcd & the kubelet, introduces the cluster that schedules and runs these images at scale.

What a container actually is (vs a virtual machine)

A container packages an application with everything it needs to run — code, runtime, libraries, system tools — into one isolated, portable unit that behaves the same on your laptop, a CI runner, and a production server. That portability (“build once, run anywhere”) is the whole point.

The usual starting comparison is a virtual machine (VM). A VM virtualizes hardware: a hypervisor (VMware, Hyper-V, KVM) splits a physical machine into VMs, each running its own full operating system — its own kernel, its own boot process. That’s powerful but heavy: gigabytes in size, tens of seconds to boot.

A container virtualizes the operating system instead. All containers on a host share the host’s Linux kernel — there’s no second OS to boot. Each container just gets its own isolated view of the system (its own filesystem, process tree, and network), so it feels like it owns the machine while really being a set of normal processes behind strong fences. The result: megabytes, not gigabytes, and startup in milliseconds.

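That isolation comes from two Linux kernel features. Docker sets them up for you, but knowing the names demystifies a lot:

Namespaces give each container its own view of the system: its own process tree, network stack, mount points, and hostname.

Control groups (cgroups) limit and account for the resources a container may use, such as CPU, memory, and I/O.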

In one line: a container is a normal process, boxed in by namespaces (its view of the world) and cgroups (its resource budget), sharing the host kernel.
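To make those two names concrete, any Linux process can be inspected through /proc. The sketch below is an illustration rather than part of the lab. It assumes a Linux shell (on macOS or Windows, run it inside a container, since /proc is a Linux kernel interface); the variable names are only for this example.

```shell
# Assumption: running on Linux, where the kernel exposes /proc.
# Every process belongs to exactly one namespace of each type; the
# bracketed number identifies which one.
pid_ns=$(readlink /proc/self/ns/pid)
net_ns=$(readlink /proc/self/ns/net)
mnt_ns=$(readlink /proc/self/ns/mnt)
echo "pid namespace:   $pid_ns"
echo "net namespace:   $net_ns"
echo "mount namespace: $mnt_ns"

# The cgroup that processes started from this shell are accounted under
# (their resource budget).
cgroup_info=$(cat /proc/self/cgroup)
echo "cgroup: $cgroup_info"
```

Run the same commands on a Linux host and then inside a container on that host (for example via docker exec), and the bracketed namespace IDs will normally differ: same kernel, different view.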

| | Virtual machine | Container |
|---|---|---|
| Isolates by virtualizing | Hardware (own kernel + OS) | The OS (shares host kernel) |
| Typical size | Gigabytes | Megabytes |
| Start time | Tens of seconds | Milliseconds |
| Isolation strength | Stronger (separate kernel) | Strong, but shared kernel |
| Density per host | Tens | Hundreds to thousands |

The trade-off is real: sharing the host kernel makes the isolation boundary thinner than a VM’s. That’s fine for most workloads and is exactly what makes containers cheap and fast — but it’s why dropping privileges and (at the high end) sandboxed runtimes exist. We touch security at the end of this lesson.

Images vs containers: template vs running instance

This is the most common point of confusion for beginners, so let’s nail it.

An image is a read-only template — an immutable snapshot of a filesystem plus a little metadata (which command to run, which ports to expose). It does nothing on its own; it just sits there, like a .iso file or a class definition in code.

A container is a running instance of an image. docker run takes that read-only template, adds a thin writable layer on top, and starts the process inside. You can start many containers from one image — like creating many objects from one class — and each gets its own writable layer, so one container’s changes don’t affect the image or the others.

| Image | Container |
|---|---|
| Read-only template | Running (or stopped) instance |
| Immutable; built once | Has a writable top layer |
| Like a class / blueprint | Like an object / a process |
| Stored in a registry | Lives on a host while it runs |
| docker build produces it | docker run creates it |

A practical consequence: containers are ephemeral. Anything written inside a running container’s writable layer disappears when the container is removed. That’s by design — it’s why we keep state in volumes and databases, not inside containers — and it’s a foundational idea you’ll meet again the moment you learn about Pods in Kubernetes.

The lifecycle: Dockerfile → build → image → registry → run

Here is the path every container takes, from a text file on your machine to a running process. This is the backbone of the whole lesson.

Figure: The Docker image lifecycle. A Dockerfile is built into a layered, read-only image, pushed to a registry, then pulled and run as a container.

As the diagram shows, a Dockerfile (your recipe) is turned by docker build into a layered, read-only image; that image is pushed to a registry (a shared store); later, any host pulls the image and runs it as a container. Let’s walk each stage.

1. The Dockerfile (the recipe)

A Dockerfile is a plain-text file of instructions describing how to assemble an image. A minimal one for a tiny static site looks like this:

# syntax=docker/dockerfile:1
# (Dockerfile comments must sit on their own line; trailing ones are parsed as arguments)
# 1. Start FROM a base image
FROM nginx:1.27-alpine
# 2. Add our content
COPY index.html /usr/share/nginx/html/index.html
# 3. Document the port the app listens on
EXPOSE 80
# nginx's own image already defines the start command (CMD), so we inherit it

The instructions you’ll meet first:

2. docker build → an image made of layers

docker build executes the Dockerfile top to bottom and produces an image. The crucial detail: each instruction that changes the filesystem (RUN, COPY, ADD) creates a new read-only layer, stacked on the one before; instructions such as EXPOSE and CMD only record metadata and add no filesystem content. An image is therefore an ordered stack of layers, each a diff (a set of filesystem changes) over the previous one.

┌─────────────────────────────┐  ← COPY index.html   (your content)
├─────────────────────────────┤  ← nginx config / binaries
├─────────────────────────────┤  ← Alpine base packages
└─────────────────────────────┘  ← FROM nginx:1.27-alpine (base)

Layers are content-addressed and shared between images. If ten of your images all start FROM node:20, that base is stored once on the host and reused — saving disk and download time.
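“Content-addressed” means a layer is identified by a hash of its bytes, so identical layers collapse into one stored copy. The sketch below models that idea with plain files. It is a simplification, not how Docker lays out its storage: put_layer, the file names, and the store directory are hypothetical, and it assumes GNU coreutils’ sha256sum is available.

```shell
# Simplified model of a content-addressed layer store.
# Assumptions: sha256sum (GNU coreutils) is installed; put_layer and the
# file names below are hypothetical, not Docker commands.
work=$(mktemp -d)
store="$work/store"
mkdir "$store"

put_layer() {
  # Store the file under the SHA-256 of its bytes; skip the copy if an
  # identical blob is already present. Prints the digest.
  digest=$(sha256sum "$1" | cut -d' ' -f1)
  [ -e "$store/$digest" ] || cp "$1" "$store/$digest"
  echo "sha256:$digest"
}

# Two images that share the same base layer bytes...
printf 'base layer bytes\n' > "$work/image_a_base"
printf 'base layer bytes\n' > "$work/image_b_base"
# ...and one layer that is unique to image A.
printf 'app layer bytes\n'  > "$work/image_a_app"

a_base=$(put_layer "$work/image_a_base")
b_base=$(put_layer "$work/image_b_base")
a_app=$(put_layer "$work/image_a_app")

echo "image A base: $a_base"
echo "image B base: $b_base"
blob_count=$(ls "$store" | wc -l | tr -d ' ')
echo "blobs stored: $blob_count"   # 2, not 3: the shared base is kept once
```

Three layers were submitted but only two blobs exist, because the two base layers hash to the same digest. That is the same reason ten images built FROM one base download and store that base only once.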

3. Push to a registry

A built image lives only on the machine that built it until you push it to a registry — a server that stores and distributes images — making it available to teammates, CI, and clusters.

4. Pull and run

On any other host, docker run (or, later, Kubernetes) will pull the image if it isn’t already present, then start a container from it. Same image, same behavior, anywhere.

Keep this loop in your head: author the Dockerfile, build the image, push to a registry, pull & run as a container. When something breaks in Kubernetes later, you’ll almost always be debugging one of these four stages.

Tags, base images, and layer caching

These three ideas separate someone who “can build an image” from someone who builds good images.

Tags: naming and versioning images

An image reference has the shape registry/repository:tag. In ghcr.io/acme/web:1.4.2, for example, ghcr.io is the registry, acme/web is the repository, and 1.4.2 is the tag.

A tag is just a movable label, not a guaranteed version. latest is the classic trap: it doesn’t mean “newest and stable,” it’s simply the default tag and can be re-pointed to different content over time. In production, pin specific tags (1.4.2) or, for true immutability, pin by digest (web@sha256:...), which refers to exact bytes and can never move. Treat latest as “unspecified.”
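As a rough illustration of how a reference breaks down, here is a small shell sketch. parse_ref is a hypothetical helper written for this lesson, not a Docker command. It assumes Docker’s usual defaults (docker.io as the registry, the library/ namespace for single-name repositories, and latest when neither a tag nor a digest is given) and ignores edge cases that a real parser validates.

```shell
# parse_ref REF: splits an image reference into the shell variables
# registry, repository, tag and digest. Hypothetical helper, simplified.
parse_ref() {
  ref=$1 registry="" repository="" tag="" digest=""

  # A digest, if present, follows "@".
  case $ref in
    *@*) digest=${ref#*@}; ref=${ref%%@*} ;;
  esac

  # A tag follows the last ":" but only within the final path component,
  # so the port in "localhost:5000/app" is not mistaken for a tag.
  last=${ref##*/}
  case $last in
    *:*) tag=${last##*:}; ref=${ref%:*} ;;
  esac

  # The first component is a registry only if it looks like a host name
  # (contains "." or ":") or is "localhost"; otherwise Docker Hub is implied.
  first=${ref%%/*}
  case $ref in
    */*)
      case $first in
        *.*|*:*|localhost) registry=$first; repository=${ref#*/} ;;
        *) registry=docker.io; repository=$ref ;;
      esac ;;
    *) registry=docker.io; repository=library/$ref ;;
  esac

  # No tag and no digest means the default tag.
  if [ -z "$tag" ] && [ -z "$digest" ]; then tag=latest; fi
}

for r in ghcr.io/acme/web:1.4.2 localhost:5000/kv-web:1.0 nginx; do
  parse_ref "$r"
  printf '%-28s registry=%s repository=%s tag=%s\n' \
    "$r" "$registry" "$repository" "$tag"
done
```

The bare name nginx expanding to docker.io/library/nginx:latest is why an unqualified image name quietly depends on both Docker Hub and the movable latest tag.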

Base images: what you build FROM

Your base image dictates your image’s size, attack surface, and which tools are available inside. Common choices:

| Base | Size (rough) | When to use |
|---|---|---|
| ubuntu / debian | ~70–120 MB | Familiar, lots of packages, easy debugging |
| *-slim (e.g. python:3.12-slim) | ~40–80 MB | Trimmed distro; a sensible default |
| alpine | ~5–10 MB | Tiny; uses musl libc (occasional compatibility quirks) |
| distroless / scratch | near-zero | No shell or package manager; smallest + most secure for compiled apps |

Smaller is generally better — less to download, fewer packages that can carry vulnerabilities — but it also means fewer debugging tools inside, so it’s a trade-off. Prefer official images and pin a real version tag (python:3.12-slim, not python:latest).

Layer caching: why instruction order matters

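Because each layer is a diff produced by one instruction, Docker can cache it. On a rebuild, Docker reuses a cached layer as long as that instruction and everything before it is unchanged; for COPY and ADD, “unchanged” also covers the contents of the files being copied, which Docker checksums. The moment one instruction or its input files change, that layer and every layer after it must be rebuilt.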

The practical rule: put rarely-changing things early, frequently-changing things late. The classic case is dependency install vs. copying source:

# syntax=docker/dockerfile:1
FROM node:20-slim
WORKDIR /app

# 1. Copy ONLY the dependency manifests first, then install.
#    These files change rarely, so this expensive layer stays cached.
COPY package.json package-lock.json ./
RUN npm ci

# 2. Copy the source LAST. It changes on every commit, but it's a
#    cheap layer, and the npm install above is reused from cache.
COPY . .
CMD ["node", "server.js"]

If you instead did COPY . . before RUN npm ci, then any one-character change to your source would invalidate the cache and force a full reinstall of dependencies on every build — slow and wasteful. Ordering your Dockerfile for the cache is one of the highest-leverage habits in containerization.
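The effect of ordering can be modelled in a few lines of shell. This is a deliberately simplified model and not Docker’s real cache implementation: it treats a layer’s cache key as a hash of its parent’s key, the instruction text, and (for COPY) the content being copied. The function names are hypothetical, the WORKDIR step is left out for brevity, and GNU coreutils’ sha256sum is assumed.

```shell
# Simplified cache model, NOT Docker's actual algorithm.
# layer_key PARENT_KEY INSTRUCTION [COPIED_CONTENT] -> short cache key
layer_key() {
  printf '%s\n%s\n%s\n' "$1" "$2" "${3:-}" | sha256sum | cut -c1-12
}

# Recommended order: manifests, install, then source.
# Args: MANIFEST SOURCE. Prints the cache key of the "RUN npm ci" layer.
install_key_good() {
  k=$(layer_key "" "FROM node:20-slim")
  k=$(layer_key "$k" "COPY package.json package-lock.json ./" "$1")
  layer_key "$k" "RUN npm ci"   # the source ($2) is copied after this layer
}

# Source copied before the install step.
install_key_bad() {
  k=$(layer_key "" "FROM node:20-slim")
  k=$(layer_key "$k" "COPY . ." "$1$2")
  layer_key "$k" "RUN npm ci"
}

# Same (made-up) manifest, source changes from v1 to v2.
manifest='{"dependencies":{"express":"4"}}'
good_before=$(install_key_good "$manifest" 'console.log("v1")')
good_after=$(install_key_good  "$manifest" 'console.log("v2")')
bad_before=$(install_key_bad   "$manifest" 'console.log("v1")')
bad_after=$(install_key_bad    "$manifest" 'console.log("v2")')

[ "$good_before" = "$good_after" ] && good=CACHED || good=REBUILT
[ "$bad_before"  = "$bad_after"  ] && bad=CACHED  || bad=REBUILT
echo "manifests first: npm ci layer is $good"   # CACHED
echo "source first:    npm ci layer is $bad"    # REBUILT
```

In the first ordering the install layer’s key depends only on the manifest, so a source edit leaves it untouched. In the second, the source feeds into a parent layer, so every key downstream changes.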

Registries: where images live

A container registry is a server that stores, versions, and distributes images. You push to it and pull from it. The ones you’ll encounter:

| Registry | Who runs it | Typical use |
|---|---|---|
| Docker Hub (docker.io) | Docker | The default; home of most official base images |
| GHCR (ghcr.io) | GitHub | Images built from GitHub repos / Actions |
| Amazon ECR | AWS | Private images for workloads on AWS |
| Azure ACR | Azure | Private images for workloads on Azure |
| Google Artifact Registry | Google Cloud | Private images for workloads on GCP |

Registries can be public (anyone can pull) or private (pull requires authentication). The cloud registries (ECR/ACR/Artifact Registry) hold a team’s own application images close to where they run; Docker Hub is where you pull base images. One note for later: Docker Hub rate-limits anonymous pulls, a real cause of ImagePullBackOff errors you’ll meet in a few lessons.

Hands-on lab

You’ll build a tiny image, run it, inspect its layers, and clean up. Everything here is free and runs entirely on your laptop: no cloud account, no cluster yet. We skip pushing to a remote registry (to stay zero-cost and zero-signup); the optional step 7 shows the same push/pull flow using a local registry container.

0. Verify Docker is working

docker version
docker run --rm hello-world

Expected (trimmed): the hello-world container prints a confirmation message, including:

Hello from Docker!
This message shows that your installation appears to be working correctly.

If you use Podman, substitute podman for docker in the commands below; its CLI is designed to be compatible with Docker’s.

1. Create a tiny project

mkdir kv-docker-lab && cd kv-docker-lab

cat > index.html <<'EOF'
<!doctype html>
<h1>Hello from KloudVin 🐳</h1>
<p>Served by nginx inside a container.</p>
EOF

cat > Dockerfile <<'EOF'
# syntax=docker/dockerfile:1
FROM nginx:1.27-alpine
COPY index.html /usr/share/nginx/html/index.html
EXPOSE 80
EOF

2. Build the image

docker build -t kv-web:1.0 .

Expected output ends with lines like:

 => => writing image sha256:...
 => => naming to docker.io/library/kv-web:1.0

You just turned a Dockerfile into a tagged image, kv-web:1.0. Confirm it exists:

docker images kv-web
REPOSITORY   TAG   IMAGE ID       CREATED         SIZE
kv-web       1.0   a1b2c3d4e5f6   2 seconds ago   53.2MB

3. Run a container from the image

docker run -d --name kv-web -p 8080:80 kv-web:1.0

Now visit http://localhost:8080 in a browser, or:

curl -s http://localhost:8080

Expected:

<!doctype html>
<h1>Hello from KloudVin 🐳</h1>
<p>Served by nginx inside a container.</p>

You now have a running container (an instance) created from an image (the template). Confirm it’s running:

docker ps
CONTAINER ID   IMAGE        COMMAND                  STATUS         PORTS                  NAMES
f9e8d7c6b5a4   kv-web:1.0   "/docker-entrypoint.…"   Up 5 seconds   0.0.0.0:8080->80/tcp   kv-web

4. Inspect the layers

See exactly how the image was assembled, layer by layer:

docker history kv-web:1.0
IMAGE          CREATED          CREATED BY                                      SIZE
a1b2c3d4e5f6   2 minutes ago    COPY index.html /usr/share/nginx/html/index…   141B
<missing>      3 weeks ago      /bin/sh -c #(nop)  EXPOSE 80                     0B
<missing>      3 weeks ago      /bin/sh -c #(nop)  CMD ["nginx" "-g" "daemon…   0B
...
<missing>      3 weeks ago      /bin/sh -c #(nop) ADD file:... in /            8.4MB

The top line is your COPY layer (tiny — just your HTML); everything below it came from the nginx:1.27-alpine base image. The <missing> IDs are simply the base image’s own layers, which don’t have local build records. This is the layer stack from earlier, made real.

To see layer caching pay off, rebuild without changing anything:

docker build -t kv-web:1.0 .

Every step now reports CACHED and the build finishes almost instantly — proof that unchanged instructions reuse cached layers.

5. Look inside the running container (optional)

docker exec -it kv-web sh
# now you're inside the container:
ls /usr/share/nginx/html
cat /etc/os-release   # note: this is Alpine, from the base image
exit

This demonstrates the isolated filesystem view from the namespaces discussion — inside the container, you see its world, not your laptop’s.

6. Validation

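You’re done with the core lab if all of these are true:

  1. docker images kv-web lists kv-web:1.0.
  2. docker ps shows the kv-web container as Up, and curl -s http://localhost:8080 returns your index.html.
  3. docker history kv-web:1.0 shows your COPY layer on top of the base image’s layers.
  4. A second docker build with no changes reports every step as CACHED.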

7. (Optional) Push to a local registry — still free

If you want to feel the push/pull half of the lifecycle without any cloud signup, run a registry as a container on your own machine:

docker run -d -p 5000:5000 --name registry registry:2     # a local registry
docker tag kv-web:1.0 localhost:5000/kv-web:1.0           # retag for that registry
docker push localhost:5000/kv-web:1.0                     # push
docker rmi localhost:5000/kv-web:1.0                      # drop the local registry tag (kv-web:1.0 stays: its container is still running, and step 8 removes it)
docker pull localhost:5000/kv-web:1.0                     # pull it back

That’s the exact tag → push → pull flow you’d use with Docker Hub or a cloud registry, just pointed at localhost.

8. Cleanup

docker rm -f kv-web                    # stop + remove the container
docker rmi kv-web:1.0                  # remove the image
docker rm -f registry 2>/dev/null      # remove the local registry (if you ran step 7)
docker rmi registry:2 2>/dev/null      # and its image
docker rmi localhost:5000/kv-web:1.0 2>/dev/null
cd .. && rm -rf kv-docker-lab          # remove the project folder
docker system prune -f                 # reclaim dangling layers/build cache

Cost note: Free / local. Docker Desktop (personal use), Podman, and a local registry container all run on your own machine at no charge. Nothing in this lab provisions a cloud resource, so there is nothing to bill and nothing to leave running. The docker system prune -f at the end reclaims any leftover disk space.

Common mistakes & troubleshooting

| Symptom | Likely cause | Fix |
|---|---|---|
| docker: Cannot connect to the Docker daemon | Docker Desktop / the daemon isn’t running | Start Docker Desktop (or sudo systemctl start docker on Linux) and retry |
| port is already allocated on docker run -p 8080:80 | Something already uses host port 8080 | Pick another host port, e.g. -p 8081:80, or stop the other process |
| Edited index.html but the page didn’t change | The running container still uses the old image | Rebuild (docker build) and recreate the container (docker rm -f then docker run); running containers don’t auto-update |
| Every build reinstalls dependencies (slow) | COPY . . placed before the install step, busting the cache | Copy dependency manifests first, RUN the install, then COPY the source last |
| Image is surprisingly large | Heavy base image, or files copied in that aren’t needed | Use a -slim/alpine/distroless base and add a .dockerignore |
| denied: requested access on push | Not logged in, or wrong repository/registry name | docker login <registry> and verify the registry/repo:tag reference |
| latest pulled something unexpected | latest is just the default tag and can move | Pin a real version tag (1.4.2) or a digest (@sha256:...) |

Best practices

Security notes

Quick check

  1. In one sentence, what’s the core difference between a container and a virtual machine?
  2. What’s the difference between an image and a container?
  3. In the reference ghcr.io/acme/web:1.4.2, name each of the three parts.
  4. Why does putting COPY . . before RUN npm ci slow your builds down?
  5. What does the tag latest actually mean, and why shouldn’t you depend on it in production?

Answers

  1. A VM virtualizes hardware and runs its own full OS/kernel; a container virtualizes the OS and shares the host kernel, isolated by namespaces and limited by cgroups — making it far smaller and faster to start.
  2. An image is a read-only template (a blueprint); a container is a running instance created from that image, with its own thin writable layer. One image, many containers.
  3. ghcr.io is the registry, acme/web is the repository (name/namespace), and 1.4.2 is the tag (the version label).
  4. Because each instruction is a cached layer that’s invalidated when it or anything before it changes. Copying source first means any code change busts the cache for the npm ci step below it, forcing a full dependency reinstall every build. Copy manifests first, install, then copy source.
  5. latest is simply the default tag used when you don’t specify one — not a guarantee of “newest” or “stable.” It can be re-pointed to different content, so builds become non-reproducible; pin a real version tag or a digest instead.

Exercise

Take a tiny program in any language you like (a one-file Python, Node, or Go “hello” works well) and containerize it from scratch:

  1. Write a Dockerfile that starts from an appropriate official base image, copies your file in, and sets a CMD to run it.
  2. Build it as myapp:1.0 and run it.
  3. Run docker history myapp:1.0 and identify which layer is yours versus the base image’s.
  4. Make a deliberate one-line change to your source and rebuild. Note in the build output which steps were CACHED and which were rebuilt, and confirm that this matched your expectation.
  5. Then reorder the Dockerfile to copy dependency files before source (if your language has dependencies), rebuild twice, and observe the cache behaviour improve.
  6. Clean up everything (docker rm -f, docker rmi, docker system prune -f) and confirm with docker images and docker ps that nothing is left.

Bonus: replace your base image with a -slim or alpine variant and compare docker images sizes before and after.

Interview questions

Q: What’s the difference between a container and a VM, and when would you still choose a VM? A: A VM virtualizes hardware and runs a full guest OS with its own kernel via a hypervisor — strong isolation, but heavy (GBs, slow boot). A container shares the host kernel and is isolated by namespaces/cgroups — lightweight (MBs, instant start) and dense. You’d still choose a VM when you need a different OS kernel, the strongest isolation boundary (e.g. running untrusted multi-tenant code without a sandboxed runtime), or kernel-level features a container can’t get.

Q: Explain the difference between an image and a container. A: An image is an immutable, read-only template — a stack of filesystem layers plus metadata. A container is a runtime instance of an image with an added writable layer and a running process. You can start many containers from one image; the image is the class, the container is the object.

Q: What is an image layer, and why does it matter for build performance? A: Each Dockerfile instruction that changes the filesystem (RUN, COPY, ADD) creates a read-only layer that’s a diff over the previous one; an image is an ordered stack of these layers plus metadata. Layers are cached and shared. Build performance depends on ordering: Docker reuses cached layers until the first changed instruction (or changed input file), then rebuilds it and everything after. Putting stable steps (base, dependency install) early and volatile steps (source copy) late maximizes cache hits.

Q: Why is relying on the latest tag considered an anti-pattern? A: latest is just the default tag, not a stable channel — it can be moved to point at different image content, so the “same” reference can yield different bytes over time. That breaks reproducibility and makes rollbacks unreliable. Pin explicit version tags, and pin by digest (@sha256:...) when you need guaranteed immutability.

Q: What’s a container registry, and what kinds have you used? A: A registry stores and distributes images; you push to it and pull from it. Public ones like Docker Hub host base/official images; private ones like Amazon ECR, Azure ACR, Google Artifact Registry, and GHCR hold an organization’s own images near where they run. Access can be public or authenticated, and rate limits (e.g. Docker Hub anonymous pulls) can affect deployments.

Q: How do you keep an image small and secure? A: Start from a minimal, official, version-pinned base (-slim/alpine/distroless); use multi-stage builds so build tools don’t ship to runtime; add a .dockerignore; run as a non-root USER; never bake secrets into layers; and scan the image for vulnerabilities before shipping.
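Several of those habits can be seen together in one small file. The sketch below writes a multi-stage Dockerfile and a .dockerignore in the same heredoc style as the lab. It is an illustration under stated assumptions, not a recommended production setup: it assumes a hypothetical Go module (go.mod, go.sum, main.go) would sit next to the Dockerfile, and the demo directory name and the image tags are placeholders to replace with versions you have tested.

```shell
# Hypothetical example. Assumptions: a small Go module (go.mod, go.sum,
# main.go) lives next to this Dockerfile; image tags are illustrative.
demo_dir=multistage-demo
mkdir -p "$demo_dir"

cat > "$demo_dir/Dockerfile" <<'EOF'
# syntax=docker/dockerfile:1

# Stage 1: build with the full toolchain.
FROM golang:1.22 AS build
WORKDIR /src
# Dependency manifests first, so the download layer stays cached.
COPY go.mod go.sum ./
RUN go mod download
COPY . .
RUN CGO_ENABLED=0 go build -o /out/app .

# Stage 2: ship only the compiled binary on a minimal base.
FROM gcr.io/distroless/static-debian12:nonroot
COPY --from=build /out/app /app
# Run as the image's unprivileged user rather than root.
USER nonroot
ENTRYPOINT ["/app"]
EOF

# Keep files the build doesn't need out of the build context.
cat > "$demo_dir/.dockerignore" <<'EOF'
.git
*.md
EOF

stage_count=$(grep -c '^FROM ' "$demo_dir/Dockerfile")
echo "Wrote $demo_dir/Dockerfile with $stage_count stages"
```

Only the final stage becomes the shipped image, so the Go toolchain and source never reach the runtime image. Secrets are simply absent from every instruction, and scanning would be a separate step after the build.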

Certification mapping

This lesson maps to the KCNA (Kubernetes and Cloud Native Associate) exam — the entry-level, multiple-choice certification that’s the natural first goal for this course.

Glossary

Next steps

You can now package and run software as containers, the unit Kubernetes is built to orchestrate. Next, meet the system that runs these images across many machines in the next lesson, What Is Kubernetes? Control Plane, Nodes, etcd & the kubelet.


