AWS Certification Prep Kit: CLF, SAA, SOA, DVA, SAP & DOP — Checklists, Practice Questions & Tips

Certifications do not make you an architect, but they do two useful things: they force you to fill the gaps you have been quietly working around, and they give a hiring manager a cheap signal that you have breadth. The trouble with most prep is that it teaches you to recognise answers rather than to reason about scenarios — which is exactly what the modern AWS exams refuse to reward. Since the SAA-C03 and the C02-generation exams landed, almost every question is a short scenario: a workload, a constraint or two (cost, latency, operational overhead, compliance), and four plausible designs. You pass by eliminating the three that violate a constraint, not by remembering a definition.

This kit is built for that reality. It covers the whole AWS ladder — the foundational CLF-C02, the three associates SAA-C03 / SOA-C02 / DVA-C02, and the two professionals SAP-C02 / DOP-C02 — plus a touch of the specialties. For each exam you get the domain breakdown with official weightings, a one-page cheat sheet, and a bank of scenario questions with worked answers and an explanation of why each wrong option is wrong, because the distractor analysis is where the real learning lives. There is a dedicated section on the services examiners deliberately confuse you between (SQS vs SNS vs EventBridge, ALB vs NLB, EBS vs EFS vs S3, security group vs NACL), a recommended order, a study-plan template you can copy, and a plain explanation of how the scaled 100–1000 score actually works so you stop panicking about “how many can I get wrong”.

Learning objectives

By the end of this lesson you will be able to:

Prerequisites

You should already have hands-on AWS exposure roughly equal to the earlier lessons in this course: comfort with the global infrastructure and pricing model (AWS Cloud Fundamentals), IAM (IAM Fundamentals), core compute/storage/networking, and ideally the architecting and portfolio lessons (the Architecting Ladder and Portfolio Projects). This lesson is the readiness layer on top of that knowledge — it assumes the concepts and drills the exam. If a service name here is unfamiliar, treat it as a gap to close before booking the test. This is the final study lesson before the Well-Architected capstone.

The AWS certification ladder and how to choose

AWS groups its certifications into four tiers. The ladder is not strictly linear — there are no formal prerequisites any more — but there is a sensible order, and trying to skip rungs usually wastes money on a failed sitting.

| Tier | Exam | Code | Questions | Time | Cost (USD) | Who it is for |
| --- | --- | --- | --- | --- | --- | --- |
| Foundational | Cloud Practitioner | CLF-C02 | 65 | 90 min | 100 | Anyone new to AWS; non-engineers; sales/PM/finance |
| Associate | Solutions Architect – Associate | SAA-C03 | 65 | 130 min | 150 | The default engineer/architect cert; broadest value |
| Associate | SysOps Administrator – Associate | SOA-C02 | 65 | 130 min | 150 | Operations, SRE, on-call; ops-heavy roles |
| Associate | Developer – Associate | DVA-C02 | 65 | 130 min | 150 | Application developers building on AWS SDKs/serverless |
| Professional | Solutions Architect – Professional | SAP-C02 | 75 | 180 min | 300 | Senior architects; complex, multi-account, migration scope |
| Professional | DevOps Engineer – Professional | DOP-C02 | 75 | 180 min | 300 | Senior platform/SRE; CI/CD, IaC, observability at scale |
| Specialty | Advanced Networking | ANS-C01 | 65 | 170 min | 300 | Network specialists; hybrid, Transit Gateway, Direct Connect |
| Specialty | Security | SCS-C02 | 65 | 170 min | 300 | Security engineers; the most broadly useful specialty |
| Specialty | Machine Learning | MLS-C01 | 65 | 180 min | 300 | ML engineers/data scientists (legacy flagship) |
| Associate | ML Engineer – Associate | MLA-C01 | 65 | 130 min | 150 | Operationalising ML on AWS (the newer, narrower cert) |
| Associate | Data Engineer – Associate | DEA-C01 | 65 | 130 min | 150 | Pipelines, analytics, Glue/Redshift/Kinesis |

A few practical notes. Question and time figures are the published targets. Every exam includes unscored trial items that you cannot identify: per the published exam guides, 15 of the 65 questions on the foundational, associate and specialty exams, and 10 of the 75 on the professionals. Treat every question as if it counts. Prices are the standard global fee in US dollars and vary by region and currency. The professional exams are a genuine step up in difficulty and reading load: 75 dense scenarios in 180 minutes is about 2.4 minutes per question, with a long stem to parse each time.
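The per-question budget follows directly from the published question counts and durations. A minimal sketch of the arithmetic, using only the figures from the table above (the function name is illustrative):

```python
# Published question counts and durations (minutes), copied from the table above.
EXAMS = {
    "CLF-C02": (65, 90),
    "SAA-C03": (65, 130),
    "SOA-C02": (65, 130),
    "DVA-C02": (65, 130),
    "SAP-C02": (75, 180),
    "DOP-C02": (75, 180),
}


def seconds_per_question(questions: int, minutes: int) -> float:
    """Average time available per question, in seconds."""
    return minutes * 60 / questions


for code, (questions, minutes) in EXAMS.items():
    budget = seconds_per_question(questions, minutes)
    print(f"{code}: {budget:.0f} s per question ({budget / 60:.1f} min)")
```

The associates work out to 120 seconds per question and the professionals to 144 seconds, which is the budget to hold yourself to in timed practice.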

[Figure: AWS certification ladder]

The diagram above lays the ladder out as a path: start at the foundational rung if you are new, take the associate that matches your job, then climb to the professional in the same column — most people go CLF (optional) → SAA → SAP, or SAA → DOP if their work is platform/CI-CD heavy, and bolt on the Security specialty when their role demands it.

Recommended order

Question formats and how the exam is built

Every AWS exam draws from two basic item types, with a handful of newer styles appearing on some exams:

| Format | What it is | How to handle it |
| --- | --- | --- |
| Multiple choice | One correct answer, three distractors | Read the stem for the deciding constraint, eliminate to one |
| Multiple response | “Select TWO” / “Select THREE”; every correct option must be selected | Treat each option as an independent true/false; there is no partial credit, so an incomplete selection scores zero |
| Ordering | Arrange steps into the correct sequence | Anchor the first and last steps you are certain of, fill the middle |
| Matching | Pair items across two columns | Do the pairs you are sure of first; they constrain the rest |
| Case study | One scenario, several linked questions | Read the scenario once carefully; constraints carry across questions |

There is no penalty for wrong answers — the score is based only on correct ones — so never leave a question blank. Flag-and-review is available; mark anything that takes more than your per-question budget and come back. The exams are scenario-led: a typical SAA or professional stem describes a workload and then asks for the option that is “MOST cost-effective”, “with the LEAST operational overhead”, “MOST highly available”, or “with the FEWEST changes”. Those capitalised qualifiers are the whole question — two options are often both technically correct and only one satisfies the qualifier.

A repeatable technique that works across all of them:

  1. Read the last sentence first to find what is actually being asked and the deciding qualifier (cost / overhead / latency / availability / changes).
  2. Extract the hard constraints from the stem (compliance, RTO/RPO, “no servers to manage”, “existing on-prem”, a specific protocol).
  3. Eliminate options that violate a constraint — usually two fall immediately.
  4. Choose between the survivors using the qualifier, not your personal preference.
  5. Flag and move on if you are over budget; speed on easy questions buys time for hard ones.
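Steps 3 and 4 of the technique can be sketched as a two-stage filter: drop every option that violates a hard constraint, then rank the survivors on the qualifier alone. This is only an illustration of the reasoning; the `Option` type, the constraint names and the rankings below are invented for a hypothetical question.

```python
from dataclasses import dataclass, field


@dataclass
class Option:
    label: str
    satisfies: set                               # hard constraints this design meets
    scores: dict = field(default_factory=dict)   # qualifier -> rank, lower is better


def pick(options, hard_constraints, qualifier):
    # Step 3: eliminate anything that violates a hard constraint.
    survivors = [o for o in options if hard_constraints <= o.satisfies]
    if not survivors:
        raise ValueError("every option violates a constraint; re-read the stem")
    # Step 4: decide between the survivors on the qualifier, nothing else.
    return min(survivors, key=lambda o: o.scores[qualifier]).label


# Hypothetical question: the stem demands Multi-AZ and "no servers to manage".
options = [
    Option("A", {"multi-az"}, {"overhead": 2, "cost": 2}),
    Option("B", {"multi-az", "serverless"}, {"overhead": 1, "cost": 3}),
    Option("C", {"multi-az", "serverless"}, {"overhead": 3, "cost": 1}),
    Option("D", {"serverless"}, {"overhead": 1, "cost": 1}),
]
constraints = {"multi-az", "serverless"}

print(pick(options, constraints, "overhead"))  # B
print(pick(options, constraints, "cost"))      # C
```

B and C both survive elimination; only the qualifier separates them, which is why the same stem can have a different answer when "LEAST operational overhead" becomes "MOST cost-effective".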

CLF-C02 — Cloud Practitioner

Foundational breadth: cloud value, security and compliance basics, core services, and billing. No deep architecture, no code. The goal is vocabulary and the shape of the platform.

Domain Weighting
1. Cloud Concepts 24%
2. Security and Compliance 30%
3. Cloud Technology and Services 34%
4. Billing, Pricing, and Support 12%

Checklist: shared-responsibility model (who secures what); the value proposition of cloud (capex→opex, elasticity, agility, global reach); the global infrastructure (Regions, AZs, edge locations); core compute (EC2, Lambda, ECS/EKS at a name level), storage (S3 classes, EBS, EFS), database (RDS, Aurora, DynamoDB), networking (VPC, Route 53, CloudFront); IAM basics (users, groups, roles, MFA, root-account protection); the Well-Architected Framework’s six pillars by name; pricing models (On-Demand, Reserved, Savings Plans, Spot, Free Tier) and what drives cost; Billing tools (Cost Explorer, Budgets, Cost and Usage Report); support plans (Basic, Developer, Business, Enterprise On-Ramp, Enterprise) and what each includes; AWS Organizations and consolidated billing; the Trusted Advisor and Health Dashboard at a concept level.

CLF-C02 cheat sheet

SAA-C03 — Solutions Architect Associate

The flagship associate. Heavily scenario-based around designing resilient, performant, secure and cost-optimised architectures.

Domain Weighting
1. Design Secure Architectures 30%
2. Design Resilient Architectures 26%
3. Design High-Performing Architectures 24%
4. Design Cost-Optimized Architectures 20%

Checklist: IAM deep enough to reason about policy evaluation, roles, and cross-account access; S3 (storage classes, lifecycle, replication, encryption, Block Public Access, pre-signed URLs); EC2 + EBS + EFS + the purchasing options; Auto Scaling and ELB (ALB vs NLB vs GWLB); VPC design (subnets, route tables, IGW/NAT, security groups vs NACLs, VPC endpoints, peering, Transit Gateway at a concept level); RDS Multi-AZ vs read replicas, Aurora, DynamoDB (capacity modes, GSIs, global tables, DAX); decoupling with SQS/SNS/EventBridge; serverless (Lambda, API Gateway, Step Functions); CloudFront + Route 53 routing policies; caching (CloudFront, ElastiCache, DAX); encryption with KMS; resilience patterns (Multi-AZ, multi-Region, backup/restore vs pilot light vs warm standby vs active-active); cost tools and the cheapest-that-meets-requirements instinct.

SAA-C03 cheat sheet

SOA-C02 — SysOps Administrator Associate

Operations focus: deploy, manage, and operate workloads; monitoring, automation, security and compliance, networking, and cost/performance. Historically the only AWS exam with hands-on lab questions; AWS has at times paused the labs, so confirm the current format on the official exam guide before you book.

Domain Weighting
1. Monitoring, Logging, and Remediation 20%
2. Reliability and Business Continuity 16%
3. Deployment, Provisioning, and Automation 18%
4. Security and Compliance 16%
5. Networking and Content Delivery 18%
6. Cost and Performance Optimization 12%

Checklist: CloudWatch (metrics, custom metrics, alarms, composite alarms, dashboards, Logs, Logs Insights, agent); CloudTrail (management vs data events, organisation trails); AWS Config (rules, conformance packs, remediation); Systems Manager (Parameter Store, Session Manager, Run Command, Patch Manager, State Manager, Automation runbooks); EventBridge for automated remediation; Auto Scaling lifecycle hooks and health checks; backup (AWS Backup, EBS snapshots, RDS automated backups, lifecycle); CloudFormation (stacks, change sets, drift, StackSets, nested stacks); ELB health checks and access logs; VPC operations (flow logs, Reachability Analyzer, route troubleshooting); Trusted Advisor and Cost Explorer; quotas/Service Quotas; encryption operations and certificate management with ACM.

SOA-C02 cheat sheet

DVA-C02 — Developer Associate

For application developers building on AWS — serverless, SDK behaviour, deployment, security from the code’s point of view, and troubleshooting.

Domain Weighting
1. Development with AWS Services 32%
2. Security 26%
3. Deployment 24%
4. Troubleshooting and Optimization 18%

Checklist: Lambda in depth (handlers, environment variables, layers, versions/aliases, concurrency — reserved vs provisioned, event source mappings, destinations, SnapStart); API Gateway (REST vs HTTP APIs, stages, authorizers, throttling, caching, mapping templates); DynamoDB for developers (queries vs scans, partition-key design, conditional writes, optimistic locking, DynamoDB Streams, TTL, transactions); S3 SDK patterns (multipart upload, pre-signed URLs, event notifications); messaging (SQS visibility timeout, long polling, DLQs; SNS; EventBridge); IAM for code (roles vs keys, STS, least privilege, resource policies); Secrets Manager and Parameter Store; the exponential backoff with jitter retry pattern and idempotency; X-Ray tracing and instrumentation; deployment with SAM, CodeDeploy (in-place vs blue/green, canary/linear), CodePipeline/CodeBuild; caching strategies (write-through vs lazy loading) with ElastiCache/DAX; envelope encryption with KMS.
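The retry pattern named in the checklist is worth seeing once in code. Below is a minimal sketch of exponential backoff with "full jitter", where each delay is a random value between zero and a capped exponential ceiling. The base, cap and attempt counts are illustrative assumptions, not AWS SDK defaults, and the operation being retried must be idempotent for retries to be safe.

```python
import random
import time


def backoff_delays(attempts, base=0.1, cap=5.0, rng=random.random):
    """Full jitter: delay n is uniform in [0, min(cap, base * 2**n))."""
    return [rng() * min(cap, base * 2 ** n) for n in range(attempts)]


def call_with_retries(operation, attempts=5, sleep=time.sleep):
    """Call operation(); on failure wait a jittered delay and try again."""
    delays = backoff_delays(attempts - 1)
    for attempt in range(attempts):
        try:
            return operation()
        except Exception:  # real code would catch only throttling/transient errors
            if attempt == attempts - 1:
                raise
            sleep(delays[attempt])


# Worst-case delays (rng pinned to 1.0) show the exponential growth and the cap.
print(backoff_delays(7, rng=lambda: 1.0))
```

The jitter matters because many clients that fail together would otherwise retry together; randomising the delay spreads the retries out.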

DVA-C02 cheat sheet

SAP-C02 — Solutions Architect Professional

The senior architecture exam: complex, multi-account, organisation-scale design; migration and modernisation; cost control and continuous improvement across large estates. Long stems, multiple defensible options, decided by subtle constraints.

Domain Weighting
1. Design Solutions for Organizational Complexity 26%
2. Design for New Solutions 29%
3. Continuous Improvement for Existing Solutions 25%
4. Accelerate Workload Migration and Modernization 20%

Checklist: multi-account strategy with Organizations, SCPs, Control Tower, landing zones, IAM Identity Center; cross-account networking (Transit Gateway, PrivateLink, Direct Connect, hybrid DNS with Route 53 Resolver); advanced resilience and DR (RTO/RPO trade-offs across the four strategies, multi-Region active-active with DynamoDB global tables and Aurora Global Database, Route 53 failover/latency/geo routing); migration tooling (Application Migration Service/MGN, DMS + SCT, DataSync, Snow family, Migration Hub, the 7 Rs); cost governance at scale (Savings Plans across accounts, consolidated billing, allocation tags, Budgets, anomaly detection); security at scale (GuardDuty, Security Hub, Macie, KMS multi-Region keys, Secrets Manager rotation); decoupling and modernisation (containers vs serverless trade-offs, event-driven, Step Functions); data strategy across analytics services. The skill being tested is judgement under competing constraints, not recall.

SAP-C02 cheat sheet

DOP-C02 — DevOps Engineer Professional

Senior platform/SRE: CI/CD, infrastructure as code, configuration management, monitoring/logging, incident response, and security automation across the SDLC. It overlaps heavily with the combination of DVA and SOA experience.

Domain Weighting
1. SDLC Automation 22%
2. Configuration Management and IaC 17%
3. Resilient Cloud Solutions 15%
4. Monitoring and Logging 15%
5. Incident and Event Response 14%
6. Security and Compliance Automation 17%

Checklist: the CodeCatalyst/Code* suite (CodePipeline, CodeBuild, CodeDeploy, CodeArtifact) and integrating third-party CI; deployment strategies in depth (in-place, blue/green with ELB/Route 53, canary/linear for Lambda and ECS, all-at-once) and automated rollback on CloudWatch alarms; CloudFormation mastery (StackSets, nested stacks, change sets, drift, custom resources, hooks) plus CDK and an awareness of Terraform; configuration management with Systems Manager and OpsWorks legacy; resilience automation (Auto Scaling, multi-AZ/Region, AWS Backup, self-healing via EventBridge→SSM/Lambda); observability (CloudWatch metrics/alarms/Logs Insights, X-Ray, synthetics, ServiceLens, centralised logging); incident response (EventBridge patterns, Systems Manager Incident Manager, runbooks, GuardDuty→remediation); security automation (Config rules + auto-remediation, Security Hub, Secrets Manager rotation, IAM Access Analyzer, image scanning). The exam rewards automation that removes humans from the loop.

DOP-C02 cheat sheet

A touch of the specialties

You will not study these from this kit, but a SAP/DOP candidate should recognise where they begin:

| Specialty | Code | The one-line scope | The signature services |
| --- | --- | --- | --- |
| Advanced Networking | ANS-C01 | Hybrid + complex VPC connectivity | Transit Gateway, Direct Connect, Route 53 Resolver, Global Accelerator, Network Firewall |
| Security | SCS-C02 | Detective + preventive + data protection | GuardDuty, Security Hub, Macie, KMS, IAM, WAF/Shield, Detective |
| Machine Learning | MLS-C01 | End-to-end ML lifecycle | SageMaker, data engineering for ML, modelling, ops |
| ML Engineer – Associate | MLA-C01 | Operationalising ML | SageMaker pipelines, deployment, monitoring |
| Data Engineer – Associate | DEA-C01 | Pipelines and analytics | Glue, Redshift, Kinesis, EMR, Lake Formation, Athena |

SCS-C02 (Security) is the highest-value addition for most engineers because security questions leak into every other exam. ANS-C01 is worth it if hybrid networking is your day job. The data/ML certs are discipline-specific.

Scenario practice questions with explained answers

This is the core of the kit. Work each one cold: read the stem, decide your answer, then read the explanation. Pay attention to the distractor analysis — being able to say why a wrong option is wrong is the skill the exam tests.

Q1 (SAA-C03) — decoupling and fan-out

A retail application must, on each new order, (a) update inventory, (b) email the customer, and (c) push the event to an analytics pipeline, with each step independent and durable and each consumer able to retry without affecting the others. Which design is most appropriate?

A. Publish the order to an SNS topic; subscribe three SQS queues; each downstream service polls its own queue. B. Write the order to a single SQS queue that all three services poll. C. Invoke three Lambda functions synchronously from the order service. D. Publish to an SNS topic with three direct Lambda subscriptions.

Answer: A. SNS fan-out into per-consumer SQS queues gives each consumer its own durable buffer, independent retries, and a DLQ — the classic durable fan-out pattern.

Distractor analysis. B is wrong because a single shared queue means each message is consumed once by one poller; the three services would compete for the same messages, not each get a copy. C couples the order service’s latency and availability to all three downstreams and has no durability — a failed downstream fails the order. D fans out but loses the buffer: if a Lambda subscriber throttles or errors past its retries, the message can be lost; SQS between SNS and the consumer is what makes it durable and independently retryable.
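The difference between options A and B is easy to see in a toy model. The sketch below is an in-memory simulation, not the SNS or SQS API: the `Topic` and `Queue` classes are hypothetical stand-ins that copy only the delivery semantics (a topic copies each message to every subscribed queue; a queue hands each message to one receiver).

```python
from collections import deque


class Queue:
    """Stand-in for an SQS queue: each message goes to exactly one receiver."""

    def __init__(self):
        self.messages = deque()

    def send(self, message):
        self.messages.append(message)

    def receive(self):
        return self.messages.popleft() if self.messages else None


class Topic:
    """Stand-in for an SNS topic: every subscriber gets its own copy."""

    def __init__(self):
        self.subscribers = []

    def subscribe(self, queue):
        self.subscribers.append(queue)

    def publish(self, message):
        for queue in self.subscribers:
            queue.send(message)


# Option A: one topic, one queue per consumer.
topic = Topic()
queues = {name: Queue() for name in ("inventory", "email", "analytics")}
for queue in queues.values():
    topic.subscribe(queue)
topic.publish("order-1")
fan_out = {name: queue.receive() for name, queue in queues.items()}

# Option B: a single queue shared by all three consumers.
shared = Queue()
shared.send("order-1")
shared_queue = {name: shared.receive() for name in queues}

print("A:", fan_out)
print("B:", shared_queue)
```

In the fan-out case all three consumers see `order-1`; with the shared queue only the first poller does, and the other two receive nothing.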

Q2 (SAA-C03) — load balancer choice

A multiplayer game backend needs a load balancer that handles millions of TCP connections at ultra-low latency and must expose a static IP for an allow-list partners maintain. Which load balancer?

A. Application Load Balancer B. Network Load Balancer C. Gateway Load Balancer D. Classic Load Balancer

Answer: B. NLB operates at layer 4 (TCP/UDP/TLS), scales to millions of connections with very low latency, and provides a static IP per AZ (and supports Elastic IPs) — ideal for partner allow-listing.

Distractor analysis. A is layer 7 HTTP/HTTPS; it does not expose a static IP (only a DNS name) and adds latency unsuited to raw TCP gaming traffic. C is for inserting third-party network appliances inline, not for serving application traffic. D is legacy and should not be chosen for new designs.

Q3 (SAA-C03) — shared file storage

Three EC2 instances across two Availability Zones must read and write the same files concurrently with POSIX semantics. Which storage service?

A. Amazon EBS io2 volume attached to all three instances B. Amazon S3 mounted via the SDK C. Amazon EFS D. Instance store

Answer: C. EFS is a managed, multi-AZ NFS file system that many instances can mount and share with POSIX semantics — exactly the requirement.

Distractor analysis. A is wrong: a standard EBS volume attaches to a single instance in a single AZ (Multi-Attach exists only for io1/io2 within one AZ and needs a cluster-aware filesystem — it does not span AZs). B is object storage, not a POSIX filesystem; concurrent read/write file semantics do not apply. D is ephemeral, instance-local, and lost on stop — never shared.

Q4 (SOA-C02) — missing metrics

An operator needs a CloudWatch alarm on memory utilisation of a fleet of EC2 instances but cannot find the metric. What is the correct fix?

A. Enable detailed monitoring on the instances. B. Install and configure the CloudWatch agent to publish a memory metric. C. Raise a support case to enable the metric. D. Use Compute Optimizer instead.

Answer: B. Memory (and disk) are guest-OS metrics that AWS cannot see from the hypervisor; you must install the CloudWatch agent to publish them as custom metrics.

Distractor analysis. A: detailed monitoring only changes EC2 metric granularity from 5-minute to 1-minute; it does not add memory. C: a support case is unnecessary; this is a configuration task, not an account flag. D: Compute Optimizer gives right-sizing recommendations, not a real-time alarmable memory metric.
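For reference, the part of the CloudWatch agent configuration that turns on a memory metric is small. The sketch below builds a minimal `metrics` section in Python and prints it as JSON; the 60-second interval is an illustrative choice, and a real deployment would normally add dimensions and more measurements.

```python
import json

# Minimal CloudWatch agent configuration fragment that publishes memory usage.
# The agent writes to the CWAgent namespace by default, not AWS/EC2.
agent_config = {
    "metrics": {
        "namespace": "CWAgent",
        "append_dimensions": {"InstanceId": "${aws:InstanceId}"},
        "metrics_collected": {
            "mem": {
                "measurement": ["mem_used_percent"],
                "metrics_collection_interval": 60,  # seconds; illustrative
            }
        },
    }
}

print(json.dumps(agent_config, indent=2))
```

Once the agent is running with a configuration like this, the alarm is created on `mem_used_percent` in the agent's namespace, with the instance ID as a dimension.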

Q5 (SOA-C02) — who changed it

A security group rule changed unexpectedly and the operator must find who made the change and when. Which service answers this?

A. Amazon CloudWatch Logs B. AWS CloudTrail C. AWS Config D. VPC Flow Logs

Answer: B. CloudTrail records the API call — the principal, the time, the parameters — for the AuthorizeSecurityGroupIngress/Revoke... action. That is the “who did what, when” audit.

Distractor analysis. C: Config tells you the security group’s state changed and can show a before/after configuration item, but the authoritative actor/identity attribution is CloudTrail (Config even references the CloudTrail event). A: CloudWatch Logs holds application/system logs, not the AWS API audit. D: VPC Flow Logs capture network traffic metadata, not control-plane changes. (In practice Config + CloudTrail are used together, but the who is CloudTrail.)

Q6 (DVA-C02) — duplicate processing

A Lambda consumer reading from SQS occasionally processes the same message twice. The processing takes up to 90 seconds. What is the most likely cause and fix?

A. The DLQ is misconfigured; add a DLQ. B. The queue is standard not FIFO; switch to FIFO. C. The visibility timeout is shorter than the processing time; increase it. D. Long polling is disabled; enable it.

Answer: C. If processing (90 s) exceeds the visibility timeout, the message becomes visible again and a second consumer picks it up — classic double-processing. Set the visibility timeout safely above the max processing time (and ideally 6× the function timeout for Lambda event source mappings).

Distractor analysis. A: a DLQ handles poison messages after repeated failures; it does not stop a successfully-processing message from being redelivered early. B: FIFO guarantees ordering and exactly-once processing within the dedup window, but the root cause here is the timeout; switching queue type is a heavier, often unnecessary change and FIFO has throughput limits. D: long polling reduces empty receives and cost; it has nothing to do with redelivery.
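The timing behind this answer can be modelled in a few lines. The sketch below is a deliberately simplified timeline, not the SQS API: it assumes a consumer is always free to pick the message up the moment it becomes visible again, and that the first consumer's delete at the end of processing removes the message.

```python
def receive_times(processing_s: int, visibility_timeout_s: int) -> list:
    """Seconds at which the same message is handed to a consumer."""
    times, t = [], 0
    while t < processing_s:          # message still exists: nobody has deleted it yet
        times.append(t)
        t += visibility_timeout_s    # it becomes visible again after the timeout
    return times


def lambda_recommended_timeout(function_timeout_s: int) -> int:
    """Six times the function timeout, the guidance cited in the answer above."""
    return 6 * function_timeout_s


print(receive_times(processing_s=90, visibility_timeout_s=30))   # [0, 30, 60]
print(receive_times(processing_s=90, visibility_timeout_s=120))  # [0]
print(lambda_recommended_timeout(90))                            # 540
```

With a 30-second timeout the 90-second job is delivered three times; once the timeout exceeds the processing time there is exactly one delivery.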

Q7 (DVA-C02) — safe concurrent updates

Two Lambda invocations may update the same DynamoDB item concurrently; the application must prevent a lost update without a separate lock service. Which approach?

A. Enable DynamoDB Streams. B. Use a conditional write with a version attribute (optimistic locking). C. Switch the table to provisioned capacity. D. Use a global secondary index.

Answer: B. Optimistic locking — a version attribute plus a ConditionExpression that the version is unchanged — makes the write fail if another writer got there first, so the loser retries. No external lock needed.

Distractor analysis. A: Streams capture changes for downstream processing; they do not coordinate concurrent writers. C: capacity mode affects throughput/cost, not consistency between writers. D: a GSI is an alternate query path, irrelevant to write conflicts.
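Optimistic locking is easier to remember after watching one write lose the race. The sketch below uses an in-memory dictionary as a stand-in for the table; the `Table` class and its method names are hypothetical. In DynamoDB the same check is expressed as a `ConditionExpression` on the write, and a failed check surfaces as a `ConditionalCheckFailedException`.

```python
class ConditionalCheckFailed(Exception):
    """Raised when the stored version is not the one the writer read."""


class Table:
    def __init__(self):
        self.items = {}

    def get(self, key):
        return dict(self.items[key])

    def put_if_version(self, key, attributes, expected_version):
        current = self.items.get(key)
        if current is not None and current["version"] != expected_version:
            raise ConditionalCheckFailed(key)
        self.items[key] = {**attributes, "version": expected_version + 1}


table = Table()
table.items["sku-1"] = {"stock": 10, "version": 1}

# Both writers read version 1 before either writes.
read_a = table.get("sku-1")
read_b = table.get("sku-1")

table.put_if_version("sku-1", {"stock": read_a["stock"] - 1}, read_a["version"])

try:
    table.put_if_version("sku-1", {"stock": read_b["stock"] - 1}, read_b["version"])
    lost_update_prevented = False
except ConditionalCheckFailed:
    lost_update_prevented = True
    # The loser re-reads the item and retries against the new version.
    fresh = table.get("sku-1")
    table.put_if_version("sku-1", {"stock": fresh["stock"] - 1}, fresh["version"])

print(lost_update_prevented, table.get("sku-1"))
```

Both decrements are applied (stock ends at 8, version 3) because the second writer was forced to retry instead of overwriting the first.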

Q8 (SAP-C02) — org-wide guardrail

A platform team must guarantee that no account in a production OU can disable CloudTrail, regardless of any IAM permissions an account admin grants. What enforces this?

A. An IAM policy attached to every role in those accounts. B. A Service Control Policy on the production OU denying cloudtrail:StopLogging and cloudtrail:DeleteTrail. C. AWS Config rules detecting the change. D. A permission boundary on each admin user.

Answer: B. An SCP sets the maximum permissions for every principal in the OU; an explicit deny on the CloudTrail stop/delete actions cannot be overridden by any IAM grant inside the account. That is the only option that is preventive and unconditional across the OU.

Distractor analysis. A: per-role IAM policies can be changed or bypassed by an account admin and must be maintained on every principal, so they are not a guarantee. C: Config is detective; it tells you after the fact and does not prevent the action. D: permission boundaries limit specific principals, not the whole account, and an admin could create principals outside the boundary or alter it; they are not an org-wide guarantee.
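The policy in option B is short enough to write out. The sketch below builds it as a Python dictionary and prints the JSON you would attach to the production OU; the `Sid` is an arbitrary label chosen for this example.

```python
import json

# Service Control Policy: an explicit deny that no IAM grant inside
# a member account can override.
scp = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "DenyCloudTrailTampering",  # illustrative label
            "Effect": "Deny",
            "Action": ["cloudtrail:StopLogging", "cloudtrail:DeleteTrail"],
            "Resource": "*",
        }
    ],
}

print(json.dumps(scp, indent=2))
```

Two facts the exams like to probe: an SCP never grants permissions, it only caps them, and SCPs do not restrict the organisation's management account.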

Q9 (SAP-C02) — multi-Region low RTO

A global write-heavy application needs active-active in two Regions with a recovery point and time measured in seconds for its primary data store. Which data layer?

A. RDS Multi-AZ with a cross-Region read replica. B. DynamoDB global tables. C. Aurora with a cross-Region snapshot copy schedule. D. S3 Cross-Region Replication.

Answer: B. DynamoDB global tables provide multi-active, multi-Region replication with last-writer-wins conflict resolution — writes accepted in every Region, RPO/RTO in seconds. That matches “active-active, write-heavy, seconds”.

Distractor analysis. A: Multi-AZ is single-Region HA; a cross-Region read replica is read-only and promotion is manual, so it is neither active-active nor seconds. C: snapshot copies give an RPO of however often you copy (hours), not seconds, and are restore-based. D: S3 is object storage and asynchronous; it is not the application’s transactional write store. (Aurora Global Database would be the relational answer for low-RTO multi-Region, but the option given is snapshot copy, which is the trap.)

Q10 (DOP-C02) — automatic rollback

A team deploys an ECS service via CodeDeploy blue/green and wants the deployment to automatically roll back if error rates spike during the canary. What wires this up?

A. A manual approval action in CodePipeline. B. A CloudWatch alarm associated with the CodeDeploy deployment group so a breach triggers automatic rollback. C. A Lambda function polling logs after deployment. D. Enabling termination protection on the tasks.

Answer: B. CodeDeploy can be configured with CloudWatch alarms; if an alarm goes into ALARM during deployment, CodeDeploy halts and rolls back automatically — humans stay out of the loop, which is exactly the DevOps-professional instinct.

Distractor analysis. A: a manual approval inserts a human and a delay; it does not react to error rates. C: a polling Lambda is a fragile reinvention of a built-in feature and runs after the window. D: termination protection prevents accidental task termination; it has nothing to do with rollback on metrics.
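As a rough illustration of option B, these are the two settings on a CodeDeploy deployment group that connect an alarm to a rollback, written as the keyword arguments you would pass to an update-deployment-group call. The application, deployment group and alarm names are hypothetical, and the block only builds and prints the arguments; it makes no AWS call.

```python
import json

# Hypothetical names: replace with your own application, group and alarm.
deployment_group_update = {
    "applicationName": "orders-app",
    "currentDeploymentGroupName": "orders-ecs-bluegreen",
    # Stop the deployment if this CloudWatch alarm enters the ALARM state.
    "alarmConfiguration": {
        "enabled": True,
        "alarms": [{"name": "orders-5xx-rate-high"}],
    },
    # Roll back automatically when the deployment fails or is stopped by an alarm.
    "autoRollbackConfiguration": {
        "enabled": True,
        "events": ["DEPLOYMENT_FAILURE", "DEPLOYMENT_STOP_ON_ALARM"],
    },
}

print(json.dumps(deployment_group_update, indent=2))
```

Both halves are needed: the alarm configuration stops the deployment, and the rollback configuration tells CodeDeploy what to do once it has stopped.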

Q11 (DOP-C02) — self-healing remediation

When GuardDuty detects an EC2 instance making connections to a known crypto-mining endpoint, the platform must automatically isolate the instance with zero human action. What pattern achieves this?

A. A CloudWatch dashboard with an alarm emailing the on-call. B. An EventBridge rule matching the GuardDuty finding that triggers an SSM Automation runbook (or Lambda) to apply an isolation security group. C. AWS Config with a conformance pack. D. A scheduled Lambda that scans for findings hourly.

Answer: B. GuardDuty emits findings as events; an EventBridge rule on the finding type invokes an SSM Automation runbook / Lambda that swaps the instance into an isolation security group — event-driven, immediate, no human.

Distractor analysis. A: emailing on-call is detection plus a human, not automatic remediation. C: Config evaluates resource configuration compliance; it does not react to threat findings. D: an hourly scan adds up to an hour of dwell time and reinvents the native event integration.
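The EventBridge rule in option B is driven by an event pattern. The sketch below shows one plausible pattern for crypto-mining findings on EC2, plus a tiny matcher so the behaviour can be checked locally. The matcher is a hypothetical helper that implements only exact and prefix matching, a small subset of what EventBridge supports, and the sample events are trimmed-down illustrations rather than real findings.

```python
pattern = {
    "source": ["aws.guardduty"],
    "detail-type": ["GuardDuty Finding"],
    "detail": {"type": [{"prefix": "CryptoCurrency:EC2/"}]},
}


def _match_one(rule, value):
    if isinstance(rule, dict) and "prefix" in rule:
        return isinstance(value, str) and value.startswith(rule["prefix"])
    return rule == value


def matches(pattern, event):
    """True if every field in the pattern matches the event (exact or prefix)."""
    for key, expected in pattern.items():
        if key not in event:
            return False
        actual = event[key]
        if isinstance(expected, dict):
            if not isinstance(actual, dict) or not matches(expected, actual):
                return False
        elif not any(_match_one(rule, actual) for rule in expected):
            return False
    return True


mining = {
    "source": "aws.guardduty",
    "detail-type": "GuardDuty Finding",
    "detail": {
        "type": "CryptoCurrency:EC2/BitcoinTool.B!DNS",
        "resource": {"instanceDetails": {"instanceId": "i-0123456789abcdef0"}},
    },
}
port_probe = {**mining, "detail": {"type": "Recon:EC2/PortProbeUnprotectedPort"}}

print(matches(pattern, mining), matches(pattern, port_probe))  # True False
```

The remediation target then reads the instance ID from the finding's `detail` and swaps the instance into the isolation security group.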

Q12 (CLF-C02) — shared responsibility

Under the AWS shared-responsibility model, which task is the customer’s responsibility?

A. Patching the hypervisor on the EC2 host. B. Configuring security groups and encrypting application data. C. Maintaining the physical security of data centres. D. Replacing failed disks in the storage fleet.

Answer: B. The customer is responsible for security in the cloud — IAM, security group rules, OS patching on EC2, and choosing/managing encryption of their data.

Distractor analysis. A, C and D are all AWS’s responsibility of the cloud — the hypervisor, physical security, and hardware lifecycle are managed by AWS.

Q13 (SAA-C03) — cost optimisation with a constraint

A nightly batch job runs for two hours, is fully fault-tolerant (checkpoints and resumes), and the team wants the lowest compute cost. Which purchasing option?

A. On-Demand Instances. B. A 3-year Standard Reserved Instance. C. Spot Instances. D. A 1-year Compute Savings Plan.

Answer: C. Spot is the cheapest (up to ~90% off On-Demand) and is appropriate precisely because the workload is interruptible and fault-tolerant — the deciding constraint in the stem.

Distractor analysis. B and D commit you to 1–3 years of baseline usage; for a job that runs two hours a night you would pay for capacity you do not use, so they are not lowest cost here. A: On-Demand is more expensive than Spot and brings no benefit for a fault-tolerant job. The fault-tolerance is the signal that Spot is safe.
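A quick calculation shows why the commitment options lose for a two-hour nightly job. Every price and discount below is a made-up round number chosen for illustration, not a published AWS rate; the point is the shape of the comparison, in which a reservation is billed for every hour of the month while On-Demand and Spot are billed only while the job runs.

```python
HOURS_PER_MONTH = 730
ON_DEMAND_RATE = 0.10      # hypothetical USD per instance-hour
SPOT_DISCOUNT = 0.70       # hypothetical discount versus On-Demand
RESERVED_DISCOUNT = 0.60   # hypothetical 3-year reservation discount

job_hours = 2 * 30         # two hours a night, thirty nights

on_demand = job_hours * ON_DEMAND_RATE
spot = job_hours * ON_DEMAND_RATE * (1 - SPOT_DISCOUNT)
# The reservation is paid for around the clock, used or not.
reserved = HOURS_PER_MONTH * ON_DEMAND_RATE * (1 - RESERVED_DISCOUNT)

for name, cost in [("On-Demand", on_demand), ("Spot", spot), ("Reserved", reserved)]:
    print(f"{name:<10} ${cost:.2f} per month")
```

Under these assumptions the reservation costs several times more than plain On-Demand, despite its lower hourly rate, because the instance sits idle for 22 hours a day.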

Commonly-confused services — the exam tips

A surprising share of associate-level questions reduce to telling two similar services apart. Burn these distinctions in.

SQS vs SNS vs EventBridge

| | SQS | SNS | EventBridge |
| --- | --- | --- | --- |
| Model | Queue (point-to-point, pull) | Pub/sub (push, fan-out) | Event bus / router (push, filtered) |
| Consumers | One consumer per message | Many subscribers, each gets a copy | Many targets via rules |
| Buffering/retry | Yes: durable buffer, DLQ | Limited; pair with SQS for durability | Retries + DLQ to targets |
| Filtering | No (consumer filters) | Subscription filter policies (message attributes or body) | Rich content-based pattern matching |
| Sources | Your producers | Your publishers | AWS services, SaaS partners, custom |
| Pick when | Decouple + smooth load + retries | Broadcast one message to many | Route/filter events, schedule, integrate SaaS |

Tip: “fan-out durably” = SNS → SQS. “Route events from AWS/SaaS with filtering or on a schedule” = EventBridge. “Buffer work between a producer and a worker” = SQS.

ALB vs NLB (vs GWLB)

| | ALB | NLB | GWLB |
| --- | --- | --- | --- |
| Layer | 7 (HTTP/HTTPS) | 4 (TCP/UDP/TLS) | 3/4 (GENEVE) |
| Routing | Path, host, header, method | Connection (flow hash) | To/from appliances |
| Static IP | No (DNS only) | Yes (per-AZ; Elastic IP) | n/a |
| Latency | Higher (L7 processing) | Very low | Inline appliance |
| Pick when | Web apps, microservice routing, WebSockets | Extreme performance, TCP/UDP, static IP, TLS passthrough | Insert firewalls/IDS/IPS inline |

Tip: static IP or non-HTTP or millions of low-latency connections → NLB. HTTP routing on path/host → ALB. Third-party security appliance inline → GWLB.

EBS vs EFS vs S3

| | EBS | EFS | S3 |
| --- | --- | --- | --- |
| Type | Block | File (NFS, POSIX) | Object |
| Access | One instance (one AZ) | Many instances, multi-AZ | Internet-scale, many clients |
| Durability/scope | AZ-scoped volume | Regional, elastic | 11 nines durability; Regional buckets with globally unique names |
| Use | Boot/database volumes | Shared app files, lift-and-shift | Backups, data lake, static assets, large objects |

Tip: “attached to one instance / database disk” → EBS. “Several instances share the same files” → EFS. “Objects, web assets, backups, virtually unlimited” → S3.

Security group vs NACL

| | Security group | Network ACL |
| --- | --- | --- |
| Scope | Instance/ENI level | Subnet level |
| State | Stateful (return traffic auto-allowed) | Stateless (must allow return explicitly) |
| Rules | Allow only | Allow and deny |
| Evaluation | All rules evaluated | Numbered, lowest first, first match wins |
| Default | Deny inbound, allow outbound | Default allows all; custom denies all until rules added |

Tip: need an explicit deny (e.g. block one IP) or subnet-wide control → NACL. Per-instance allow-listing with automatic return traffic → security group. The single most-tested fact: security groups are stateful, NACLs are stateless.
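The two evaluation models can be contrasted in a toy simulation. This is a sketch of the semantics only, with invented rule shapes and documentation-range IP addresses: the NACL function walks numbered rules and stops at the first match, while the security-group class has allow rules only and remembers allowed connections so that replies pass automatically.

```python
def nacl_decision(rules, src_ip, port):
    """rules: (number, action, ip, port) tuples; '*' is a wildcard."""
    for _number, action, rule_ip, rule_port in sorted(rules, key=lambda r: r[0]):
        if rule_ip in ("*", src_ip) and rule_port in ("*", port):
            return action            # lowest-numbered match wins
    return "deny"                    # implicit final deny


class SecurityGroup:
    """Allow rules only; stateful, so replies to allowed traffic pass."""

    def __init__(self, inbound):
        self.inbound = inbound       # list of (ip, port); '*' is a wildcard
        self.tracked = set()

    def inbound_allowed(self, src_ip, port):
        ok = any(ip in ("*", src_ip) and p in ("*", port) for ip, p in self.inbound)
        if ok:
            self.tracked.add((src_ip, port))
        return ok

    def return_allowed(self, dst_ip, port):
        return (dst_ip, port) in self.tracked


nacl_inbound = [(100, "deny", "203.0.113.7", "*"), (200, "allow", "*", 443)]
print(nacl_decision(nacl_inbound, "203.0.113.7", 443))   # deny: rule 100 matches first
print(nacl_decision(nacl_inbound, "198.51.100.1", 443))  # allow: falls through to rule 200
print(nacl_decision([], "198.51.100.1", 443))            # deny: no rule allows the traffic

sg = SecurityGroup(inbound=[("*", 443)])
print(sg.inbound_allowed("203.0.113.7", 443))            # True: a group cannot single out one IP
print(sg.return_allowed("203.0.113.7", 443))             # True: reply is tracked automatically
```

The empty-rules case is the stateless trap: a NACL that allows inbound 443 still drops the replies unless a matching outbound rule exists.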

Hands-on lab — a free, self-marking practice harness

You cannot replicate the real exam, but you can build the habit of timed, scenario-style practice for free. This lab spins up nothing chargeable — it uses the AWS Free Tier only to confirm the service facts behind a few questions, then a tiny local quiz loop to drill the elimination technique.

Step 1 — confirm a fact the exam will test (free, read-only). Verify that a default EC2 instance has no memory metric, which is Q4’s point:

# Lists CloudWatch metrics in the AWS/EC2 namespace for your account.
# You will see CPUUtilization, NetworkIn/Out, etc., but never a memory metric:
# the CloudWatch agent publishes mem_used_percent to its own namespace
# (CWAgent by default), and only once it is installed and configured.
# That absence IS the lesson.
aws cloudwatch list-metrics --namespace AWS/EC2 \
  --query "Metrics[].MetricName" --output text | tr '\t' '\n' | sort -u

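Expected output: a list including CPUUtilization, NetworkIn, NetworkOut, StatusCheckFailed, and similar, with no memory metric anywhere in it. If the list is empty, no instance in this account and Region has reported metrics in the last two weeks, which is the window list-metrics covers; run the command again after an instance has been up for a few minutes. list-metrics is a read-only call and is free.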

Step 2 — confirm security-group statefulness conceptually (free). Describe the default security group and note that, unless someone has edited it, it allows all outbound traffic while its only inbound rule admits traffic from members of the same group. Return traffic for an allowed inbound connection needs no outbound rule of its own, because the group is stateful:

aws ec2 describe-security-groups \
  --filters Name=group-name,Values=default \
  --query "SecurityGroups[0].{In:IpPermissions,Out:IpPermissionsEgress}"

Step 3 — build a local timed quiz loop (no AWS, no cost). Save a few questions as JSON and drill them with a timer so you practise the budget (about 2 minutes each):

cat > /tmp/quiz.json <<'JSON'
[
  {"q":"Durable fan-out to 3 independent consumers?","a":"SNS->SQS per consumer"},
  {"q":"Static IP + millions of low-latency TCP connections?","a":"NLB"},
  {"q":"Several instances share the same POSIX files?","a":"EFS"},
  {"q":"Stop any account in an OU from disabling CloudTrail?","a":"SCP deny"},
  {"q":"Who changed a security group rule, and when?","a":"CloudTrail"}
]
JSON

python3 - <<'PY'
import json, sys, time

# The script itself arrives on stdin via the heredoc, so by the time it runs
# stdin is at EOF and input() would raise EOFError. Read answers from the terminal.
sys.stdin = open("/dev/tty")

with open("/tmp/quiz.json") as f:
    qs = json.load(f)
score = 0
for i, item in enumerate(qs, 1):
    start = time.time()
    print(f"\nQ{i}: {item['q']}")
    input("  (think, then press Enter to reveal) ")
    print(f"  Answer: {item['a']}   [{time.time()-start:.0f}s]")
    if input("  Did you get it right? (y/n) ").strip().lower() == "y":
        score += 1
print(f"\nScore: {score}/{len(qs)}. Aim for sub-120s per question.")
PY

Validation: Step 1 should show no memory metric (proving Q4); Step 3 should report your score and per-question time. If any single question took more than ~120 seconds, that is a topic to revise.
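
If you note the bracketed times as you go, the revise-what-was-slow rule is easy to apply mechanically. A minimal sketch, assuming you record each question and its time by hand; the `topics_to_revise` helper and the sample timings are hypothetical and not part of the lab above.

```python
BUDGET_SECONDS = 120  # the lab's approximate per-question budget


def topics_to_revise(timings, budget=BUDGET_SECONDS):
    """Return the questions whose answer time exceeded the budget."""
    return [question for question, seconds in timings if seconds > budget]


# Hypothetical timings in seconds, noted down while running the quiz loop.
timings = [
    ("Durable fan-out to 3 independent consumers?", 45),
    ("Static IP + millions of low-latency TCP connections?", 150),
    ("Several instances share the same POSIX files?", 30),
]
print(topics_to_revise(timings))
```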

Cleanup: there is nothing chargeable to delete — only remove the temp files:

rm -f /tmp/quiz.json

Cost note: every command here is either a read-only API call (list-metrics, describe-security-groups — free) or runs locally. The lab cost is 0. The lesson: build the timed-elimination habit before you pay for the real sitting.

Common mistakes & troubleshooting

| Symptom | Likely cause | Fix |
| --- | --- | --- |
| You “know the services” but fail practice scenarios | Answering on recognition, not by eliminating against the constraint | Read the last sentence first; find the qualifier (cost/overhead/latency); eliminate |
| Multiple-response questions score zero despite “mostly right” | Partial credit does not exist: one wrong selection voids the item | Treat each option as independent true/false; only select what you can defend |
| Running out of time on the professional exams | Spending too long on hard items early | Budget ~2 min (assoc) / ~2.5 min (pro); flag-and-move; never leave blanks |
| Confusing two similar services repeatedly (SQS/SNS, ALB/NLB) | Studied features in isolation, not side by side | Drill the comparison tables in this lesson until the distinctions are reflexive |
| Picking a “correct but not best” option | Ignoring the capitalised qualifier (MOST/LEAST/FEWEST) | Underline the qualifier mentally; choose among technically-correct survivors by it |
| Over-engineering the answer | Reaching for the most advanced service | Prefer the option that meets the stated requirement with the least overhead/cost |
| Booking too early and failing | No timed full-length practice at passing standard | Sit timed mocks; only book when consistently above the passing range |
| Panicking over “how many can I miss” | Misunderstanding scaled scoring | It is scaled 100–1000 with ~15 unscored items; calibrate on practice %, not raw counts |
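
The per-question budgets in the table fall out of simple division. The sketch below assumes question counts and durations as I understand them from the published exam guides (65 questions in 90 minutes for CLF-C02, 65 in 130 for SAA-C03, 75 in 180 for SAP-C02); check the current guide for your exam, and note that `EXAM_FORMAT` and `minutes_per_question` are illustrative names.

```python
# (questions, minutes) per exam; assumed from the published exam guides,
# so verify against the current guide before relying on them.
EXAM_FORMAT = {
    "CLF-C02": (65, 90),
    "SAA-C03": (65, 130),
    "SAP-C02": (75, 180),
}


def minutes_per_question(exam, review_minutes=0):
    """Average time per question after reserving time for a final review pass."""
    questions, minutes = EXAM_FORMAT[exam]
    return (minutes - review_minutes) / questions


for exam in EXAM_FORMAT:
    print(f"{exam}: {minutes_per_question(exam):.2f} min per question")
```

Reserving a review pass tightens the budget: `minutes_per_question("SAA-C03", review_minutes=10)` drops the associate figure below two minutes, which is the argument for flag-and-move.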

Best practices

Security notes

Certification study is also security study — much of every blueprint is security, and the habits transfer to production:

Interview & exam questions

  1. Q: When would you choose SNS→SQS over a direct SNS→Lambda subscription? A: When each consumer needs a durable buffer, independent retries, and a DLQ; SQS decouples consumer availability/throttling from delivery so a slow or failing consumer cannot lose messages.

  2. Q: Security group vs NACL — give the two differences that decide most questions. A: Security groups are stateful (return traffic auto-allowed) and allow-only, applied at the instance/ENI; NACLs are stateless (must allow return explicitly), support deny rules, and apply at the subnet.

  3. Q: A read replica versus Multi-AZ on RDS — what does each give you? A: Multi-AZ is a synchronous standby for availability/failover (not for reads); read replicas are asynchronous, for read scaling and can be cross-Region. They solve different problems and are often used together.

  4. Q: What makes an SQS consumer process the same message twice, and how do you stop it? A: The visibility timeout is shorter than processing time, so the message reappears mid-processing. Raise the visibility timeout above the maximum processing time (for Lambda event source mappings, ~6× the function timeout). Standard queues are also at-least-once by design, so make the handler idempotent to absorb the occasional duplicate that remains.

  5. Q: How do you guarantee no account in an OU can disable a control, regardless of IAM? A: A Service Control Policy with an explicit deny on the relevant actions; SCPs cap the maximum permissions for every principal in the OU and cannot be overridden by in-account IAM grants.

  6. Q: Which AWS service answers “who changed this resource and when”? A: CloudTrail records the API call with the principal, time, and parameters. Config shows the state change and references the CloudTrail event, but the identity attribution is CloudTrail.

  7. Q: NLB or ALB for a service needing a static IP and TCP at scale? A: NLB — layer 4, ultra-low latency, millions of connections, and a static IP per AZ (plus Elastic IP support). ALB exposes only a DNS name and is layer 7.

  8. Q: How do you achieve automatic rollback on a bad deployment? A: Associate CloudWatch alarms with the CodeDeploy deployment group (or the Lambda/ECS deployment); an alarm breach during the canary/linear window halts and rolls back automatically — no human in the loop.

  9. Q: Lazy loading vs write-through caching — trade-offs? A: Lazy loading populates the cache on a miss (cheap, resilient to cache failure, but can serve stale data and has a cold-cache penalty); write-through writes to the cache on every DB write (data is never stale, but every write costs a cache write and the cache fills with rarely-read data). Often combined with a TTL.

  10. Q: Give the DR strategies in order of RTO/RPO and cost. A: Backup & Restore (cheapest, hours) → Pilot Light (core running, minutes) → Warm Standby (scaled-down full stack, low minutes) → Multi-site Active-Active (full scale in multiple Regions, seconds, most expensive).

  11. Q: EBS, EFS, or S3 for a fleet of app servers that must share the same uploaded files? A: EFS — a shared, multi-AZ NFS filesystem many instances mount with POSIX semantics. EBS is single-instance/single-AZ block; S3 is object storage without filesystem semantics.

  12. Q: How does scaled scoring work and how should it change your strategy? A: Scores are scaled 100–1000 with a fixed passing line per exam (700 for CLF-C02, 720 for the associates, 750 for the professionals), and every exam includes unscored pilot items you cannot identify (15 on the foundational and associate exams, 10 on the professionals). Because you cannot know which count and there is no wrong-answer penalty, you answer every question and calibrate readiness on your practice percentage, not a raw “allowed misses” count.
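
The lazy-loading versus write-through trade-off in question 9 is easier to remember once you have seen the stale read happen. A minimal sketch, assuming a plain dict stands in for both the database and the cache; the class names are hypothetical and there is no TTL or eviction.

```python
class LazyCache:
    """Lazy loading (cache-aside): fill the cache only on a read miss."""

    def __init__(self, db):
        self.db, self.cache = db, {}

    def read(self, key):
        if key not in self.cache:          # miss: pay the database read once
            self.cache[key] = self.db[key]
        return self.cache[key]

    def write(self, key, value):
        self.db[key] = value               # cache untouched, so it may go stale


class WriteThroughCache(LazyCache):
    """Write-through: every database write also updates the cache."""

    def write(self, key, value):
        self.db[key] = value
        self.cache[key] = value            # extra cache write on every update


lazy = LazyCache({"price": 10})
lazy.read("price")            # warms the cache with 10
lazy.write("price", 12)
print(lazy.read("price"))     # 10: stale until the entry expires or is evicted

through = WriteThroughCache({"price": 10})
through.read("price")
through.write("price", 12)
print(through.read("price"))  # 12: the cache was updated with the write
```

Adding a TTL to the lazy variant bounds how long the stale value can survive, which is why the two strategies are often combined.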

Quick check

  1. You must fan out one order event to three consumers, each with its own durable buffer and retries. What do you use?
  2. Which is stateful — a security group or a NACL?
  3. A workload is fully fault-tolerant and you want the cheapest compute. Which purchasing option?
  4. Which service tells you who disabled CloudTrail and when?
  5. The exam score is scaled over what range, and should you ever leave a question blank?

Answers

  1. SNS → SQS with one SQS queue per consumer (durable fan-out). A direct SNS→Lambda fan-out lacks the buffer.
  2. The security group is stateful (return traffic auto-allowed); the NACL is stateless.
  3. Spot Instances — the fault-tolerance is the signal that interruption is acceptable, and Spot is the cheapest.
  4. AWS CloudTrail (the API audit; Config references the same CloudTrail event for attribution).
  5. 100–1000, with a fixed pass line and ~15 unscored items. Never leave a question blank — there is no penalty for guessing.

Exercise

Pick the next exam you intend to sit and produce a one-page readiness plan of your own:

  1. Download the official exam guide for your target (e.g. SAA-C03) and copy its domain table with weightings.
  2. Self-score 1–5 per domain on honest current confidence, take the gap to full confidence (5 minus your score), and multiply each gap by the domain weighting to get a priority score; study the highest-priority gaps first.
  3. Write your own four “confused-services” cards for the pairs you personally muddle (beyond the four in this lesson).
  4. Draft a four-week plan using the template below, ending with two timed full-length mocks.
  5. Book the exam for the end of week four to create the deadline — and only move it if your timed mocks are not yet consistently above the passing range.
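
Step 2's priority score can be worked through in a few lines. This sketch reads "gap" as 5 minus the self-score, which is one reasonable interpretation rather than a fixed rule; the weightings are the SAA-C03 domain percentages as I understand them from the exam guide (verify against the current version), and the confidence scores are hypothetical.

```python
def priority_scores(domains):
    """Rank domains by (5 - confidence) * weighting, highest first."""
    scored = [
        (name, (5 - confidence) * weight)
        for name, (weight, confidence) in domains.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# name: (weighting in percent, self-scored confidence from 1 to 5)
domains = {
    "Design Secure Architectures": (30, 4),
    "Design Resilient Architectures": (26, 2),
    "Design High-Performing Architectures": (24, 3),
    "Design Cost-Optimized Architectures": (20, 1),
}
for name, score in priority_scores(domains):
    print(f"{score:>3}  {name}")
```

With these numbers the lowest-weighted domain ranks first, because a confidence of 1 outweighs its smaller share of the exam; that is the point of weighting the gap rather than studying in blueprint order.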

Four-week study-plan template (adapt to your timeline and exam):

Week Focus Activity Output
1 Highest-weighted/lowest-confidence domain Read exam guide + course lessons; build one small lab Notes + a deployed mini-project
2 Next two domains Hands-on for each; start a confused-services sheet Working examples + the sheet
3 Remaining domains + cross-cutting (security, cost) Targeted reading; first timed mock Mock score + error log
4 Weak areas from the mock Re-drill errors; second timed mock; light review Consistent pass-range mocks → sit the exam

Certification mapping

This lesson is the readiness layer for the entire AWS ladder: CLF-C02, SAA-C03, SOA-C02, DVA-C02, SAP-C02 and DOP-C02, with pointers into the specialties (ANS-C01, SCS-C02, MLS-C01/MLA-C01, DEA-C01). The domain checklists and weightings map directly to each official exam guide; the practice questions are tagged by exam; the confused-services section targets the associate level where those distinctions decide the most questions; and the scoring/format notes apply to every exam in the catalogue.

Glossary

Next steps

You now have the checklists, the question-working technique, the confused-services distinctions, and a plan. Turn study into proof by building the real thing: continue to the AWS Capstone — Build a Well-Architected Multi-Account Landing Zone + 3-Tier App, which exercises the SAA/SAP/DOP blueprints end to end. For depth on the topics the questions probe, revisit the AWS Architecting Ladder, Portfolio Projects, IAM Fundamentals, and the troubleshooting playbooks (single-service and multi-service RCA). Book the date, work the plan, and pass.

Vinod is a Senior Cloud Architect (22+ yrs) — available for Azure / AWS / GCP architecture, landing zones, and migrations.
