GCP Lesson 9 of 98

Google Compute Engine, In Depth: Machine Types, Disks, Images, Metadata & Every Option

A Google Compute Engine (GCE) instance is the most fundamental piece of compute you can rent on Google Cloud: a virtual machine — vCPUs, memory, disks, a network interface — running on Google’s infrastructure, that you control from the operating system upward. It is pure Infrastructure as a Service (IaaS). Google runs the physical host, the hypervisor, the datacentre, the power and the network fabric; you own the OS, the patches, the software you install, and your data. If you have ever installed Ubuntu or Windows Server on a laptop, you already understand most of what a VM is. The remaining part — the part interviewers and the ACE and Professional Cloud Architect exams probe relentlessly — is the dozens of choices GCE asks you to make when you create an instance, and the operations you can (and cannot) perform afterwards.

This lesson is deliberately exhaustive. We go family by family through every machine type, then through images, disks (from Balanced Persistent Disk all the way to Hyperdisk), every provisioning and discount model (on-demand, Spot/preemptible, sole-tenant, committed and sustained use), networking, the metadata server with startup and shutdown scripts, OS Login vs SSH keys, service accounts and access scopes, and the security shells Shielded VM and Confidential VM. Every option gets the same treatment: what it is · the choices · the default · when to pick which · the trade-off · the limit · the cost impact · the gotcha. Each core operation comes with a real gcloud command so you can do this by hand or wire it into Terraform later. By the end you will know the Compute Engine instance end to end — enough to ace an ACE or PCA question, sail through an interview, and run VMs safely in production.

Learning objectives

By the end of this lesson you can:

Prerequisites & where this fits

You should already understand Google Cloud’s resource hierarchy — organisation → folder → project → resource — what a region and a zone are, and how to run gcloud from Cloud Shell or a local SDK install (covered in the Fundamentals module). No prior VM experience is assumed; we define every term. This is the anchor lesson of the Compute module in the GCP Zero-to-Hero course: it introduces the machine types, disks, images, metadata, and identity model that the rest of the compute track — managed instance groups, Cloud Run, GKE — builds on. Once you can drive a single instance fluently, the leap to a self-healing fleet in Regional Managed Instance Groups: Autohealing, Canary Rollouts, and Stateful MIGs is small.

Core concepts

Before the options, fix five mental models. They explain why the settings are shaped the way they are.

An instance is an assembly, not a single resource. When you “create a VM” you actually create and wire several objects: the instance itself, one or more disks (a boot disk, optional data disks), a network interface attached to a subnet, optionally an external IP, and an attached service account. The console hides this behind one form; gcloud and Terraform make it explicit. It matters for deletion too: by default the boot disk is deleted with the instance, but additional disks are not unless you set auto-delete — a classic source of orphaned-disk cost.
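As a minimal sketch of controlling that deletion behaviour, the two wrappers below flip the auto-delete flag on an attached disk with gcloud compute instances set-disk-auto-delete. The function names and the example instance and disk names are illustrative, and a default zone is assumed to be configured already.

```shell
# Keep a disk when its instance is deleted (the default for additional disks).
keep_disk_on_delete() {
  # $1 = instance name, $2 = disk name
  gcloud compute instances set-disk-auto-delete "$1" --disk="$2" --no-auto-delete
}

# Delete a disk together with its instance (avoids orphaned-disk charges).
delete_disk_with_instance() {
  # $1 = instance name, $2 = disk name
  gcloud compute instances set-disk-auto-delete "$1" --disk="$2" --auto-delete
}

# Example (hypothetical names):
#   delete_disk_with_instance web-1 web-1-data
```

Setting auto-delete on throwaway data disks is one way to stop leftovers accumulating; leave it off for anything you would want to recover after the instance is gone.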

Compute and storage are decoupled. The instance is the CPU and RAM; Persistent Disk (PD) and Hyperdisk are independent network-attached block devices. This is the single most important architectural idea: you can stop an instance (largely stop paying for compute) while keeping its disks, change the machine type without touching the disks, or detach a disk and attach it to another instance for recovery. The exceptions are Local SSD (physically attached to the host, blisteringly fast but ephemeral) and the small in-host scratch: data there survives a reboot and a live migration, but is lost when the instance is stopped, deleted, or preempted, or when the host fails.
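The stop, resize, start cycle described here can be sketched as a small wrapper. The function name is an illustration, not part of gcloud, and a default zone is assumed to be set.

```shell
# Resize a VM in place. The disks are separate resources, so they are untouched.
resize_vm() {
  # $1 = instance name, $2 = new machine type (for example n2-standard-4)
  gcloud compute instances stop "$1" &&
  gcloud compute instances set-machine-type "$1" --machine-type="$2" &&
  gcloud compute instances start "$1"
}

# Example (hypothetical instance name):
#   resize_vm app-1 n2-standard-4
```

The machine type can only be changed while the instance is stopped, which is why the stop comes first; each step runs only if the previous one succeeded.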

Zonal vs regional resources. An instance is a zonal resource — it lives in exactly one zone (e.g. europe-west2-a). A standard Persistent Disk is also zonal and must be in the same zone as the instance it attaches to. Regional Persistent Disk synchronously replicates across two zones in a region for high availability. Images, snapshots, instance templates, and firewall rules are global; subnets are regional. Knowing the scope of each resource explains where you can move things and what survives a zone outage.

Live migration keeps you running through maintenance. Unlike many clouds, GCE can live-migrate a running instance to another host during planned host maintenance with no reboot — controlled by the instance’s availability policy (onHostMaintenance = MIGRATE by default for standard VMs). Spot VMs and some accelerator/Confidential configurations cannot migrate and are instead terminated on maintenance. This is why GCE rarely forces a reboot for host patching.
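To see or change the availability policy on an existing instance, a sketch along these lines should work. The wrapper names are illustrative, and a default zone is assumed.

```shell
# Print the current host-maintenance behaviour (MIGRATE or TERMINATE).
show_maintenance_policy() {
  # $1 = instance name
  gcloud compute instances describe "$1" \
    --format="value(scheduling.onHostMaintenance)"
}

# Change it. Not every configuration accepts MIGRATE (Spot VMs, for example).
set_maintenance_policy() {
  # $1 = instance name, $2 = MIGRATE or TERMINATE
  gcloud compute instances set-scheduling "$1" --maintenance-policy="$2"
}

# Example (hypothetical instance name):
#   show_maintenance_policy app-1
```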

Projects carry quota, defaults, and billing. vCPU counts, IP addresses, and disk capacity are all governed by per-region quotas on the project. Every instance also runs as a service account identity and bills to the project’s billing account. Key terms used throughout: vCPU (one hyperthread on most families), machine type (a named shape such as n2-standard-4 = 4 vCPU, 16 GiB), image (the OS template you boot from), metadata (key/value config the instance can read about itself), and access scope (a legacy ceiling on what the attached service account’s token may do).

Choosing a machine type: every family

A machine type defines the instance’s vCPU count, memory, and the underlying CPU platform. Google groups machine types into families by purpose, and within a family into series (a hardware generation) and types (standard, highmem, highcpu, sometimes highgpu/ultramem). The four broad categories are general-purpose, compute-optimised, memory-optimised, and accelerator-optimised.

Family | Series | Category | CPU platform | vCPU:memory feel | Live migration | Typical use cases
E2 | e2 | General-purpose (cost) | Intel/AMD (abstracted) | Balanced (standard 1:4) | Yes | Dev/test, small/medium web and app servers, microservices on a budget
N2 | n2 | General-purpose (balanced) | Intel Cascade/Ice Lake | standard 1:4, highmem 1:8, highcpu 1:1 | Yes | Most production workloads needing predictable Intel performance
N2D | n2d | General-purpose (balanced) | AMD EPYC (Rome/Milan) | Same ratios as N2 | Yes | Same as N2 but cheaper per vCPU on AMD; scale-out web/app
T2D / T2A | t2d (AMD), t2a (Arm) | General-purpose (scale-out) | AMD Milan / Ampere Altra Arm | 1:4, no SMT (vCPU = physical core) | T2D yes; T2A no | High throughput-per-cost scale-out; T2A for Arm-native workloads
C3 / C3D | c3 (Intel), c3d (AMD) | Compute-optimised (latest) | Intel Sapphire Rapids / AMD Genoa | standard 1:4, highcpu 1:2, highmem 1:8 | Yes (with newer support) | CPU-bound, latency-sensitive: gaming, HPC front-ends, ad serving, high-traffic web
C2 / C2D | c2, c2d | Compute-optimised (prior gen) | Intel Cascade Lake / AMD Milan | High clock, 1:4 | Yes | Single-thread-sensitive, HPC, electronic design
M3 / M2 / M1 | m3, m2, m1 | Memory-optimised | Intel | megamem/ultramem up to ~1:28+ | Limited | SAP HANA, large in-memory databases, big analytics
A3 / A2 / G2 | a3, a2 (NVIDIA), g2 | Accelerator-optimised | Intel + NVIDIA H100/A100 / L4 | GPU-attached | No (terminated) | AI/ML training and inference, rendering, GPU compute

A few rules make sense of the table:

Custom machine types

The predefined types may not fit your ratio — perhaps you need 6 vCPU with 40 GiB rather than the 24 GiB a standard gives. Custom machine types (E2, N2, N2D, and others) let you choose vCPU and memory independently within the family’s limits, with extended memory available above the normal per-vCPU ceiling at a small premium.

Aspect | Predefined | Custom
vCPU/memory | Fixed combinations | You pick both (within family rules)
When to use | Standard ratios, simplest billing | Right-sizing to avoid paying for unused RAM or CPU
Constraints | n/a | vCPU even numbers above 1; memory between 0.5–8 GiB per vCPU (more with extended memory)
Cost | List price per shape | Per-vCPU + per-GiB pricing; extended memory billed higher
Gotcha | May force you up a size | Slightly higher unit price than the closest predefined; not every family supports custom

Create a custom shape with --custom-cpu and --custom-memory (use --custom-extensions for extended memory). Right-sizing with custom types is one of the cheapest wins on a GCE bill, and Google’s rightsizing recommendations in the console will suggest moves based on observed utilisation.
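The sketch below checks a requested shape against the per-vCPU memory window quoted in the table (0.5–8 GiB per vCPU, vCPU count of 1 or an even number) and then builds the matching gcloud command. Both helper names are hypothetical, memory is taken in whole GiB for simplicity, and the family-specific ceiling for extended memory is not checked.

```shell
# Classify a custom shape: prints "ok", "extended" (needs extended memory)
# or "invalid". Integer arithmetic only, so memory is given in whole GiB.
check_custom_shape() {
  vcpus=$1
  mem_gib=$2
  # vCPU count must be 1 or an even number
  if [ "$vcpus" -ne 1 ] && [ $((vcpus % 2)) -ne 0 ]; then
    echo invalid; return 1
  fi
  # Lower bound: at least 0.5 GiB per vCPU, i.e. 2 * mem >= vcpus
  if [ $((mem_gib * 2)) -lt "$vcpus" ]; then
    echo invalid; return 1
  fi
  # Upper bound: 8 GiB per vCPU; anything above needs extended memory
  if [ "$mem_gib" -gt $((vcpus * 8)) ]; then
    echo extended
  else
    echo ok
  fi
}

# Create the instance, adding --custom-extensions only when required.
create_custom_vm() {
  # $1 = name, $2 = vCPUs, $3 = memory in GiB, $4 = family (for example n2)
  case "$(check_custom_shape "$2" "$3")" in
    ok)
      gcloud compute instances create "$1" --custom-vm-type="$4" \
        --custom-cpu="$2" --custom-memory="${3}GB" ;;
    extended)
      gcloud compute instances create "$1" --custom-vm-type="$4" \
        --custom-cpu="$2" --custom-memory="${3}GB" --custom-extensions ;;
    *)
      echo "shape $2 vCPU / $3 GiB is outside the custom-type rules" >&2
      return 1 ;;
  esac
}

# Example: the 6 vCPU / 40 GiB shape from the text fits without extended memory.
#   create_custom_vm app-1 6 40 n2
```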

Choosing an image

An image is the OS template the boot disk is created from. GCE offers three categories.

Image kind | What it is | When to use | Gotcha
Public images | Google- and partner-maintained OS images: Debian, Ubuntu, RHEL, Rocky, SLES, Windows Server, Container-Optimized OS (COS) | The default starting point; COS for running containers directly on a VM | Some (RHEL, SLES, Windows) carry a per-second premium licence charge on top of the VM
Custom images | Your own image baked from a configured disk (a “golden image”) | Bake dependencies and hardening once, boot identical VMs fast | You own patching and lifecycle; store in a dedicated image project
Image families | A named pointer (e.g. debian-12, or your my-app-prod) that always resolves to the latest non-deprecated image | Templates and MIGs: get patches without editing the template | Pin a specific image instead when you need fully reproducible builds

Reference a public image by --image-family + --image-project (e.g. --image-family=debian-12 --image-project=debian-cloud). Image families are the right default for instance templates because they let a rebuild pick up the latest patched image automatically; pin an exact image (--image=...) when you need byte-for-byte reproducibility. Images are global resources and can be shared across projects via IAM. Machine images (a related but distinct object) capture the whole instance — config plus all disks — and are handy for cloning or backup.
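One way to get the reproducibility of a pinned image while still starting from a family is to resolve the family once and record the result. The sketch below does that with gcloud compute images describe-from-family; the helper names are illustrative.

```shell
# Resolve an image family to the concrete image it currently points at.
latest_image_in_family() {
  # $1 = image family, $2 = image project
  gcloud compute images describe-from-family "$1" \
    --project="$2" --format="value(name)"
}

# Create a VM pinned to that exact image instead of the moving family pointer.
create_pinned_vm() {
  # $1 = instance name, $2 = image family, $3 = image project
  image=$(latest_image_in_family "$2" "$3") || return 1
  gcloud compute instances create "$1" --image="$image" --image-project="$3"
}

# Example (hypothetical instance name):
#   create_pinned_vm web-1 debian-12 debian-cloud
```

Recording the resolved name in your build log or template means a later rebuild can use exactly the same image even after the family has moved on.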

Disks: every type

Storage is decoupled from compute, so the disk choice is its own decision. The boot disk holds the OS; data disks hold everything you want to survive a machine-type change or a rebuild. The modern, recommended block-storage line is Hyperdisk; the long-standing line is Persistent Disk (PD); Local SSD is ephemeral host-attached storage.

Disk type | Media | Performance model | Boot disk? | Best for | Cost feel | Gotcha
Standard PD (pd-standard) | HDD | Throughput scales with size; low IOPS | Yes | Cold/sequential, logs, cheap bulk | Lowest | Poor random-IOPS; avoid for databases
Balanced PD (pd-balanced) | SSD | Good IOPS/throughput per GB; the sensible default | Yes | Most boot disks and general workloads | Mid | n/a (this is the default pick)
SSD PD (pd-ssd) | SSD | Higher IOPS/throughput per GB than Balanced | Yes | Latency-sensitive databases, high-IOPS apps | Higher | More expensive; size still gates performance
Extreme PD (pd-extreme) | SSD | Provisioned IOPS independent of size | Yes (limited) | Highest-performance PD workloads (large DBs) | High | Only on larger machine types; you pay for provisioned IOPS
Hyperdisk Balanced | SSD (next-gen) | Independently provisioned IOPS and throughput | Yes | New general workloads wanting tuned performance | Mid–high | Family/region support varies; the strategic default going forward
Hyperdisk Extreme | SSD (next-gen) | Very high provisioned IOPS | Data | Mission-critical DBs (SAP HANA, large SQL) | High | Larger machine types only
Hyperdisk Throughput | SSD (next-gen) | Provisioned throughput, cost-efficient | Data | Throughput-oriented analytics, Kafka, Hadoop | Mid | Optimised for MB/s, not IOPS
Hyperdisk ML | SSD (next-gen) | Very high read throughput, multi-attach read-only | Data | Loading large ML datasets/models to many VMs | Varies | Read-optimised; specialised use
Local SSD | Physically attached NVMe | Highest IOPS/lowest latency; ephemeral | No | Scratch, caches, temp, shuffle space | Per-device | Data lost on stop, delete, preemption, or host error (it survives live migration); back it up if it matters

Three properties cut across all disk types:

Provisioning and discount models

How you acquire the instance changes both its resilience and its price. There are three provisioning models and two automatic/committed discount programmes.

Model | Discount | Eviction/termination | SLA | When to use | Gotcha
On-demand (standard) | None (list price) | None | Standard VM SLA | Steady production you cannot interrupt | Most expensive per hour
Spot VMs | ~60–91% off | Google can preempt any time (30-second notice); cannot live-migrate | No SLA | Fault-tolerant, stateless, batch, CI, rendering, MIG burst capacity | Can vanish mid-run; never for stateful primaries
Preemptible VMs (legacy) | Similar discount | Preempted, hard 24-hour cap | No SLA | Legacy; prefer Spot | Spot is the modern replacement with no 24h cap
Sole-tenant nodes | n/a (premium) | None | Standard | Licensing (BYOL per-core), compliance/isolation requirements | You pay for the whole physical host; more expensive

And the two discount programmes that apply on top:

Programme | What it is | Commitment | Typical saving | Applies to | Gotcha
Sustained Use Discounts (SUDs) | Automatic discount the longer an instance runs in a month | None (applied automatically) | Up to ~20–30% (general-purpose/memory-optimised) | E2 is excluded; N2/N2D/C2 etc. qualify | Nothing to do; not stackable with CUDs on the same vCPUs
Committed Use Discounts (CUDs) | A 1- or 3-year commitment to spend/usage | 1 or 3 years | Up to ~57% (resource-based) / flexible (spend-based) | Resource-based (per family/region) or spend-based (flexible) | Pay even if unused; plan to baseline, not peak

The mental model: Spot for anything interruptible (huge savings), CUDs for your steady baseline (commit to what you always run), and let SUDs apply automatically to the rest. Reserve on-demand for the spiky top of the curve. Create a Spot instance with --provisioning-model=SPOT --instance-termination-action=STOP (or DELETE); set --max-run-duration if you want a self-terminating box.
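Put together, the Spot flags named above look roughly like this. The wrapper name and the example instance are illustrative; the termination action defaults to STOP here purely as a choice for the sketch.

```shell
# Create a Spot VM. On preemption the instance is stopped (disks kept)
# or deleted, depending on the termination action.
create_spot_vm() {
  # $1 = name, $2 = machine type, $3 = STOP or DELETE (optional, default STOP)
  gcloud compute instances create "$1" \
    --machine-type="$2" \
    --provisioning-model=SPOT \
    --instance-termination-action="${3:-STOP}"
}

# Example (hypothetical name): a throwaway batch worker removed on preemption.
#   create_spot_vm batch-1 e2-standard-2 DELETE
```

STOP keeps the disks so the work can resume when capacity returns; DELETE suits stateless workers that a MIG will recreate anyway.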

Networking

Every instance attaches to a VPC network and a subnet through a network interface, and that placement governs its IP addressing and reachability.

You will go far deeper on subnets, routes, firewall rules, and Cloud NAT in the Networking module; here, just place the instance in the right subnet and tag it for the firewall rules it needs.
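As a small illustration of placing and tagging an instance, the sketch below creates a VM in a named subnet with no external IP and a network tag, then adds a firewall rule that targets that tag. The function, subnet, network, and rule names are placeholders; the source range is the IAP TCP-forwarding range used elsewhere in this lesson.

```shell
# Create an instance in a specific subnet, internal IP only, with a network tag.
create_private_vm() {
  # $1 = instance name, $2 = subnet name, $3 = network tag
  gcloud compute instances create "$1" --subnet="$2" --no-address --tags="$3"
}

# Allow SSH from the IAP range to instances carrying the tag.
allow_iap_ssh() {
  # $1 = rule name, $2 = VPC network, $3 = target tag
  gcloud compute firewall-rules create "$1" \
    --network="$2" --direction=INGRESS --allow=tcp:22 \
    --source-ranges=35.235.240.0/20 --target-tags="$3"
}

# Example (hypothetical names):
#   create_private_vm web-1 app-subnet web
#   allow_iap_ssh allow-iap-ssh-web app-vpc web
```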

The metadata server, startup and shutdown scripts

Every instance can query a metadata server at the link-local address http://metadata.google.internal/ (i.e. 169.254.169.254) to learn about itself and its project, and to fetch credentials for its service account. Requests must send the header Metadata-Flavor: Google — a deliberate guard against confused-deputy SSRF attacks.

Metadata kind | Scope | Examples | Set/changed by
Project metadata | All instances in the project | Project-wide SSH keys, enable-oslogin, custom keys | Project editors
Instance metadata | A single instance | startup-script, shutdown-script, custom app config, per-instance SSH keys | Instance owner
Default/derived | Per instance | Hostname, zone, machine type, network, service-account token | Google (read-only)

Two metadata keys are workhorses:

Fetch the service-account token or any attribute from inside the VM:

# Read this instance's zone and the active service-account access token
curl -s -H "Metadata-Flavor: Google" \
  http://metadata.google.internal/computeMetadata/v1/instance/zone

curl -s -H "Metadata-Flavor: Google" \
  http://metadata.google.internal/computeMetadata/v1/instance/service-accounts/default/token

This token is how Application Default Credentials authenticate on a VM with no key file — the recommended pattern. Gotcha: never expose the metadata endpoint through a proxy or SSRF-prone app; the Metadata-Flavor header requirement and firewalling exist precisely because the token is sensitive.
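For scripts that read several values, a pair of small helpers keeps the mandatory header in one place. This is a sketch only: md_get and zone_short are made-up names, and the calls in the usage comments work only from inside a Compute Engine VM.

```shell
# Fetch one path below computeMetadata/v1/. The Metadata-Flavor header is
# mandatory; -f makes curl return a non-zero status on an HTTP error.
md_get() {
  # $1 = path, for example instance/zone
  curl -s -f -H "Metadata-Flavor: Google" \
    "http://metadata.google.internal/computeMetadata/v1/$1"
}

# The zone is returned as projects/<number>/zones/<zone>; keep the last segment.
zone_short() {
  echo "${1##*/}"
}

# Usage on a VM:
#   zone_short "$(md_get instance/zone)"
#   md_get instance/attributes/startup-script   # a custom metadata key
#   md_get instance/preempted                   # TRUE once a Spot VM is preempted
```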

OS Login vs SSH keys

There are two ways to grant Linux SSH access, and choosing correctly is a recurring exam and security topic.

Aspect | OS Login (recommended) | Metadata SSH keys (legacy)
Where identity lives | Tied to Google/Cloud Identity users via IAM | Public keys stored in project or instance metadata
Access control | IAM roles: roles/compute.osLogin, roles/compute.osAdminLogin | Whoever’s public key is in metadata can log in
Provisioning | POSIX accounts created automatically from the directory | You manually add/rotate keys in metadata
2FA / centralisation | Supports 2-step verification; central audit | None; sprawls across instances
Audit | Logins tied to a Google identity in Cloud Audit Logs | Hard to attribute; keys outlive people
Enable | enable-oslogin=TRUE in project or instance metadata | Default if OS Login is off
Gotcha | IAM role needed in addition to network access; org policy can enforce it | Stale keys are a real breach vector; avoid at scale

Use OS Login. Set enable-oslogin=TRUE at the project level, grant users roles/compute.osLogin (or osAdminLogin for sudo), and you get IAM-governed, auditable, centrally revocable SSH with optional 2FA — and you can enforce it org-wide with an org policy. Metadata keys remain useful only for break-glass or automation that cannot use a Google identity. For day-to-day admin without an external IP, combine OS Login with IAP TCP forwarding (gcloud compute ssh --tunnel-through-iap) so you never open port 22 to the internet.
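The three steps in that recommendation (enable at project level, grant the role, connect through IAP) can be sketched as follows. The function names, project ID, and email address are placeholders.

```shell
# Turn on OS Login for every instance in the project.
enable_os_login_project() {
  # $1 = project id
  gcloud compute project-info add-metadata --project="$1" \
    --metadata=enable-oslogin=TRUE
}

# Grant a user SSH access; pass "admin" as the third argument for sudo rights.
grant_os_login() {
  # $1 = project id, $2 = user email, $3 = "admin" (optional)
  role=roles/compute.osLogin
  if [ "${3:-}" = "admin" ]; then
    role=roles/compute.osAdminLogin
  fi
  gcloud projects add-iam-policy-binding "$1" --member="user:$2" --role="$role"
}

# Connect without an external IP by tunnelling through IAP.
ssh_via_iap() {
  # $1 = instance name
  gcloud compute ssh "$1" --tunnel-through-iap
}

# Example (hypothetical project and user):
#   enable_os_login_project my-project
#   grant_os_login my-project alice@example.com admin
```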

Service account & access scopes

Every instance runs as a service account identity. By default that is the Compute Engine default service account, but you should attach a dedicated, least-privilege service account per workload. What the instance can actually do is the intersection of two controls:

Best practice: set the scope to cloud-platform (--scopes=cloud-platform) and control everything precisely with IAM roles on a dedicated service account (--service-account=...). Scopes are an artefact from before fine-grained IAM existed; treating IAM as the single source of truth avoids the confusing “I granted the role but it still says permission denied” trap caused by a too-narrow scope.
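A minimal sketch of that pattern: create a dedicated service account, grant it only the roles the workload needs, and attach it with the broad scope. The helper names are illustrative, the two roles are simply the logging and monitoring writers used in this lesson's exercise, and the instance must be stopped before its service account can be changed.

```shell
# Build the email address of a service account from its id and project.
sa_email() {
  # $1 = account id, $2 = project id
  echo "$1@$2.iam.gserviceaccount.com"
}

# Create the account and bind a short, explicit list of roles to it.
create_workload_sa() {
  # $1 = account id, $2 = project id
  gcloud iam service-accounts create "$1" --project="$2" || return 1
  for role in roles/logging.logWriter roles/monitoring.metricWriter; do
    gcloud projects add-iam-policy-binding "$2" \
      --member="serviceAccount:$(sa_email "$1" "$2")" --role="$role" || return 1
  done
}

# Attach the account to a stopped instance with the cloud-platform scope,
# so that IAM alone decides what the workload may do.
attach_sa() {
  # $1 = instance name, $2 = service account email
  gcloud compute instances set-service-account "$1" \
    --service-account="$2" --scopes=cloud-platform
}

# Example (hypothetical names):
#   create_workload_sa web-sa my-project
#   attach_sa web-1 "$(sa_email web-sa my-project)"
```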

Shielded VM & Confidential VM

Two security shells harden the instance itself.

From an instance to a fleet: templates & MIGs

A single instance is fine for a pet server, but production runs fleets. An instance template is an immutable, global definition of an instance — machine type, image, disks, network, metadata, service account, every option above frozen into one object. A Managed Instance Group (MIG) stamps out identical instances from a template and then autoheals (recreates failed instances against a health check), autoscales (adds/removes instances on a signal), and performs rolling and canary updates (shift the fleet to a new template gradually). Build the template once with gcloud compute instance-templates create, then drive the fleet — zonal or, for production, regional across zones — exactly as covered in Regional Managed Instance Groups: Autohealing, Canary Rollouts, and Stateful MIGs. Everything you have learned about machine types, disks, provisioning (including Spot for cheap burst capacity), metadata, and identity flows straight into the template.
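To make the template-then-fleet flow concrete, here is a rough sketch. The function names and the example template, group, and region are placeholders, and the template options are just a small subset chosen for illustration.

```shell
# Freeze a handful of instance options into an immutable template.
create_template() {
  # $1 = template name
  gcloud compute instance-templates create "$1" \
    --machine-type=e2-small \
    --image-family=debian-12 --image-project=debian-cloud \
    --boot-disk-type=pd-balanced \
    --scopes=cloud-platform
}

# Stamp out a regional managed instance group from that template.
create_regional_mig() {
  # $1 = group name, $2 = template name, $3 = size, $4 = region
  gcloud compute instance-groups managed create "$1" \
    --template="$2" --size="$3" --region="$4"
}

# Example (hypothetical names):
#   create_template web-tmpl-v1
#   create_regional_mig web-mig web-tmpl-v1 3 europe-west2
```

Because templates are immutable, a change means creating a new template (for example web-tmpl-v2) and rolling the group onto it rather than editing the old one.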

Google Compute Engine anatomy & options

The diagram above shows the full anatomy of a Compute Engine instance — the machine type and CPU platform at the centre, the boot and data disks (and ephemeral Local SSD) attaching from the storage plane, the network interface into a regional subnet with optional external IP and network tags, the metadata server feeding startup/shutdown scripts and the service-account token, and the OS Login, Shielded, and Confidential security shells wrapping it — and how a template projects all of this onto a managed instance group.

Hands-on lab

We will create a small instance on the Free Tier, inspect it, read its metadata, attach a data disk, then clean everything up. The e2-micro in an eligible US region is part of the GCE Always Free allowance, so this lab is effectively free; a $300 free-trial credit covers it comfortably regardless.

1. Set your project and a default zone.

gcloud config set project YOUR_PROJECT_ID
gcloud config set compute/zone us-central1-a

2. Create the instance — an e2-micro, Debian 12, Balanced boot disk, no external IP, with a startup script and the broad scope so IAM governs permissions.

gcloud compute instances create gce-lab-01 \
  --machine-type=e2-micro \
  --image-family=debian-12 --image-project=debian-cloud \
  --boot-disk-type=pd-balanced --boot-disk-size=10GB \
  --no-address \
  --shielded-secure-boot --shielded-vtpm --shielded-integrity-monitoring \
  --scopes=cloud-platform \
  --metadata=enable-oslogin=TRUE \
  --metadata-from-file=startup-script=<(echo '#!/bin/bash
echo "hello from $(hostname) in $(curl -s -H "Metadata-Flavor: Google" http://metadata.google.internal/computeMetadata/v1/instance/zone | cut -d/ -f4)" > /var/tmp/lab.txt')

Expected output: a table showing gce-lab-01, its zone, machine type e2-micro, an internal IP, and STATUS: RUNNING.

3. Validate. Confirm it is running and read back the metadata and machine type.

gcloud compute instances describe gce-lab-01 \
  --format="value(status, machineType.basename(), networkInterfaces[0].networkIP)"

You should see RUNNING e2-micro 10.x.x.x. Because there is no external IP, SSH in over IAP (OS Login is enabled, so use your Google identity):

gcloud compute ssh gce-lab-01 --tunnel-through-iap --command="cat /var/tmp/lab.txt"

It should print hello from gce-lab-01 in us-central1-a, proving the startup script and metadata server both worked.

4. Attach a data disk. Create a 10 GB Balanced PD and attach it; because additional disks are not auto-deleted by default, it outlives the instance unless you delete it yourself.

gcloud compute disks create gce-lab-data --size=10GB --type=pd-balanced
gcloud compute instances attach-disk gce-lab-01 \
  --disk=gce-lab-data --device-name=data1

Confirm the OS sees a new block device: over the IAP SSH session, lsblk lists a new 10G disk (typically sdb), and ls -l /dev/disk/by-id/google-data1 shows the symlink that points to it.
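The attached disk is still blank. If you want to use it, a sketch like the one below (run inside the VM) formats it as ext4 and mounts it. The function name and the mount point are illustrative, the device name data1 comes from the attach command above, and formatting erases anything already on the disk.

```shell
# Format a freshly attached disk and mount it. Destroys existing data on it.
format_and_mount() {
  # $1 = device name given at attach time (for example data1)
  # $2 = mount point (for example /mnt/disks/data1)
  dev="/dev/disk/by-id/google-$1"
  sudo mkfs.ext4 -m 0 -E lazy_itable_init=0,lazy_journal_init=0,discard "$dev" &&
  sudo mkdir -p "$2" &&
  sudo mount -o discard,defaults "$dev" "$2"
}

# Example, inside the VM:
#   format_and_mount data1 /mnt/disks/data1
```

Using the /dev/disk/by-id path rather than /dev/sdb keeps the command correct even if device letters change between boots.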

5. Cleanup. Delete the instance (and its boot disk), then the data disk, to stop all charges.

gcloud compute instances delete gce-lab-01 --quiet
gcloud compute disks delete gce-lab-data --quiet

Cost note. An e2-micro in us-central1/us-west1/us-east1 falls under the Always Free tier (one per month) with a 30 GB-month standard PD allowance; this lab’s 10 GB Balanced boot disk plus a short-lived 10 GB data disk costs only a few pennies even outside the free allowance, and nothing once deleted. The number-one source of surprise GCE cost is leftover disks and external IPs after the instance is gone — the explicit disk delete above is why we clean up by hand.

Common mistakes & troubleshooting

Symptom | Likely cause | Fix
Permission denied calling an API from the VM despite the right IAM role | Access scope on the instance is too narrow (legacy default) | Recreate/stop-edit with --scopes=cloud-platform; let IAM gate access
Cannot SSH; gcloud compute ssh times out | No external IP and no IAP, or firewall blocks 22/IAP range | Use --tunnel-through-iap and allow IAP source range 35.235.240.0/20 to port 22
Data lost after stopping a VM | Data was on Local SSD (ephemeral) | Move persistent data to PD/Hyperdisk; back up Local SSD before stop
Orphaned disks/IPs still billing after deleting VMs | Additional disks default to auto-delete=no; reserved IPs persist | Delete leftover disks and release static IPs explicitly
Database disk feels slow | pd-standard or a tiny pd-ssd (perf scales with size) | Use Balanced/SSD/Hyperdisk; size up or provision IOPS (Extreme/Hyperdisk)
Instance unexpectedly terminated | It is a Spot VM and was preempted | Expected; handle with a shutdown script and MIG recreate; use on-demand for stateful
OS Login user cannot log in | Missing roles/compute.osLogin even with network access | Grant the OS Login (or osAdminLogin) IAM role to the user
Secure Boot blocks a custom kernel module | Module is unsigned; Shielded Secure Boot rejects it | Sign the module or temporarily disable Secure Boot to validate
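For the orphaned-resource row in particular, a quick audit can be sketched with two list commands. The function names are made up, and the filters are assumptions worth checking in your own project: the first relies on unattached disks having an empty users field, the second on unused static addresses reporting status RESERVED.

```shell
# Disks that no instance is using (their "users" field is empty).
list_unattached_disks() {
  gcloud compute disks list --filter="-users:*" \
    --format="value(name,zone.basename())"
}

# Static IP addresses that are reserved but not attached to anything.
list_unused_static_ips() {
  gcloud compute addresses list --filter="status=RESERVED" \
    --format="value(name,address)"
}

# Example:
#   list_unattached_disks
#   list_unused_static_ips
```

Review the output before deleting anything; a detached disk may be a deliberate backup rather than a leftover.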

Best practices

Security notes

Interview & exam questions

  1. What is the difference between E2, N2, and N2D, and when would you choose each? E2 is the cost-optimised general-purpose family (Intel/AMD abstracted, no SUDs); N2 is balanced Intel; N2D is the same balance on AMD EPYC at a lower per-vCPU price. Use E2 for dev/test and budget services, N2 for predictable Intel production, N2D when AMD price/performance wins.
  2. Explain Spot VMs vs preemptible VMs. Both are deeply discounted surplus capacity with no SLA that Google can reclaim. Preemptible (legacy) has a hard 24-hour lifetime; Spot is the modern replacement with no 24-hour cap and configurable termination action. Use either only for interruptible, stateless, or checkpointed work.
  3. CUDs vs SUDs? Sustained Use Discounts are automatic, applied as an instance runs longer in a month (E2 excluded). Committed Use Discounts require a 1- or 3-year commitment for a larger, predictable saving. Commit CUDs to your baseline; SUDs need no action.
  4. Why might an API call from a VM fail even though the service account has the IAM role? The instance’s legacy access scope is too narrow and caps the token below what IAM allows. Set the scope to cloud-platform and govern with IAM.
  5. OS Login vs metadata SSH keys — which and why? OS Login ties SSH to Google identities and IAM (auditable, centrally revocable, 2FA-capable); metadata keys are static public keys that sprawl and go stale. Prefer OS Login, enforced by org policy.
  6. What does the metadata server do, and how do you query it safely? It serves instance/project metadata and the service-account token at 169.254.169.254; requests require the header Metadata-Flavor: Google, which (with firewalling) guards against SSRF. It powers Application Default Credentials.
  7. Difference between a startup script and a shutdown script? A startup script runs as root on every boot; a shutdown script runs best-effort on graceful stop/delete and on Spot preemption (~30 s) — keep it short and idempotent.
  8. Standard PD vs Balanced vs SSD vs Extreme vs Hyperdisk? Standard is HDD (cheap, low IOPS); Balanced is the SSD default; SSD is higher IOPS per GB; Extreme and Hyperdisk let you provision IOPS/throughput independently of size for the most demanding databases.
  9. Persistent Disk vs Local SSD? PD/Hyperdisk are network-attached, durable, and survive a stop; Local SSD is host-attached, fastest, but ephemeral: data is lost on stop, delete, preemption, or host error, although it survives a live migration.
  10. What is live migration and when does it not happen? GCE can move a running standard VM to a new host during maintenance with no reboot (onHostMaintenance=MIGRATE). Spot VMs, GPU/accelerator VMs, and some Confidential VM configurations cannot migrate and are terminated on maintenance instead.
  11. Shielded VM vs Confidential VM? Shielded protects boot integrity (Secure Boot, vTPM, integrity monitoring); Confidential encrypts memory in use (AMD SEV-SNP / Intel TDX) so the hypervisor cannot read RAM.
  12. When do you reach for a custom machine type? When no predefined shape matches your vCPU:memory ratio — right-size to avoid paying for unused RAM or CPU, using extended memory for very RAM-heavy needs.

Quick check

  1. Which machine family gives a vCPU as a full physical core (no SMT) and offers an Arm variant?
  2. Which discount applies automatically the longer an instance runs in a month, and which family is excluded from it?
  3. What header must a request to the metadata server include?
  4. Which disk types let you provision IOPS independently of disk size?
  5. What single instance setting should you set to cloud-platform so that IAM becomes the sole permission gate?

Answers

  1. T2D (AMD Milan; no SMT, so one vCPU = one physical core), with T2A as its Arm sibling.
  2. Sustained Use Discounts (SUDs) apply automatically; the E2 family is excluded.
  3. Metadata-Flavor: Google.
  4. Extreme Persistent Disk and the Hyperdisk family (Hyperdisk Balanced/Extreme), which decouple performance from capacity.
  5. The instance access scope (--scopes=cloud-platform); permissions are then governed entirely by the attached service account’s IAM roles.

Exercise

Provision a small production-shaped web instance and harden it. Using gcloud: (a) create a dedicated service account with only roles/logging.logWriter and roles/monitoring.metricWriter; (b) launch an e2-small Debian 12 instance with no external IP, Balanced boot disk, --scopes=cloud-platform, enable-oslogin=TRUE, Shielded VM on, the dedicated service account attached, and a startup script that installs nginx (with no external IP, the subnet needs Cloud NAT so that apt can reach the package repositories); (c) add the network tag web and confirm a firewall rule allowing the IAP range to port 22; (d) SSH in through IAP, verify nginx is running, and read the instance’s zone from the metadata server; (e) take a snapshot of the boot disk; then (f) delete the instance, the disk, and the snapshot. Note in a sentence why you set the scope to cloud-platform rather than relying on legacy narrow scopes.

Certification mapping

Glossary

Next steps

You can now drive a single Compute Engine instance end to end. The natural next step is to turn one instance into a resilient, self-managing fleet: read Regional Managed Instance Groups: Autohealing, Canary Rollouts, and Stateful MIGs to learn instance templates, autohealing health checks, autoscaling signals, and canary rollouts across zones. From there, continue into the deep dive on Cloud Run for serverless containers when you want to stop managing VMs altogether, and the VPC networking module to master the subnets, firewall rules, and Cloud NAT that the instances in this lesson depend on.
