AWS Well-Architected: Sustainability — Region Selection, Demand, Software, Data, Hardware, and Deployment Patterns

Where this fits

Sustainability is the sixth and newest pillar of the AWS Well-Architected Framework (added in late 2021, after Operational Excellence, Security, Reliability, Performance Efficiency, and Cost Optimization). Its single design goal is to maximize the useful work done per unit of resource consumed so that the carbon, energy, water, and embodied-material footprint of a workload falls over time. AWS frames this through the shared responsibility model for sustainability: AWS is responsible for sustainability of the cloud (efficient data centers, renewable-energy procurement, custom silicon, the path to net-zero), while you are responsible for sustainability in the cloud — the architecture, code, and data choices that govern how much of that infrastructure your workload actually draws on. The pillar decomposes into six improvement areas, each expressed as a numbered best-practice question (SUS 1 through SUS 6): Region selection, alignment to demand (user-behavior patterns), software and architecture patterns, data patterns, hardware patterns, and development and deployment process patterns. This article walks each area as you would implement it in a real AWS estate, naming the concrete metrics, artifacts, and services involved.

Before the six areas, internalize the pillar’s working method, because it is unusual: sustainability is hard to measure directly, so the Framework tells you to establish proxy metrics — energy, or more usefully a business-output-normalized metric such as “watt-hours per 1,000 requests,” “gCO₂e per active user,” or “vCPU-hours per transaction.” You set improvement targets against those proxies, evaluate every change for whether it raises useful work per resource, and treat the sustainability of your demand, data, and deployment as a continuous optimization — not a one-off audit. Keep that “maximize utilization, minimize total provisioned resources, normalize per unit of business value” lens active through every section below.
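
As a minimal sketch of what a business-output-normalized proxy looks like in practice, the snippet below divides a resource total by the units of business output delivered in the same period. The function name `normalized` and all input figures are hypothetical placeholders, not measurements from any real workload.

```python
def normalized(resource_total: float, business_units: int, per: int = 1) -> float:
    """Return resource consumed per `per` units of business output.

    resource_total: e.g. watt-hours or vCPU-hours for the period (assumed input)
    business_units: e.g. requests or transactions served in the same period
    """
    if business_units <= 0:
        raise ValueError("business_units must be positive to normalize")
    # Multiply before dividing so round figures stay exact in floating point.
    return resource_total * per / business_units


# Hypothetical month: 120,000 Wh drawn while serving 4,000,000 requests.
wh_per_1000_requests = normalized(120_000, 4_000_000, per=1000)

# Hypothetical month: 2,000 vCPU-hours spent on 500,000 transactions.
vcpu_hours_per_transaction = normalized(2_000, 500_000)

print(wh_per_1000_requests)        # 30.0
print(vcpu_hours_per_transaction)  # 0.004
```

Tracking the normalized figure rather than the raw total is what lets you compare periods with different traffic levels.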

Region selection (SUS 1)

What it is. Region selection is the decision of where on the planet your workload physically runs, evaluated through a sustainability lens rather than only latency and cost. It maps to SUS 1 (“How do you select Regions for your workload?”). Two Regions running identical infrastructure can have very different carbon footprints because the local electricity grid has a different fuel mix, and because AWS’s renewable-energy matching varies by location.

Why it matters. Region choice is often the single highest-leverage sustainability decision you make: it is essentially free to get right at design time and painful to change later. The carbon intensity of the grid that feeds a data center can vary by an order of magnitude between a hydro- or nuclear-heavy grid and a coal-heavy one. Choosing the cleaner Region for a new, latency-insensitive workload can cut its operational emissions dramatically with zero code change.

How to do it well. Start from your hard constraints — data-residency and compliance, the latency your users can tolerate, and which services the workload needs (not every service is in every Region). Within the set of Regions that satisfy those constraints, prefer the one with the lowest grid carbon intensity and the strongest renewable matching. AWS publishes which Regions are powered by a high share of renewable energy and reports on its progress toward powering operations with 100% renewable energy; combine that with public grid-intensity data (e.g., the kind surfaced by Electricity Maps / WattTime) to rank candidates. For new, non-latency-critical work — batch analytics, ML training, async pipelines, dev/test — deliberately place it in a low-carbon Region even if it is not the closest one, since the user-facing latency penalty is irrelevant for asynchronous work. Keep latency-sensitive interactive traffic close to users, but push everything else toward the cleanest viable Region. Re-evaluate periodically: AWS continually brings new renewable projects online, so a Region’s profile improves over time, and your placement assumptions should be revisited.

Decision input | Where it comes from | How it shapes Region choice
Data residency / compliance | Legal, GDPR/sovereignty rules | Hard filter: eliminates non-compliant Regions first
User latency budget | RUM / synthetic latency tests | Interactive traffic stays near users; async does not
Service & feature availability | AWS Regional Services List | Filters Regions lacking required services
Grid carbon intensity | Electricity Maps / WattTime, AWS renewable reporting | Rank survivors by gCO₂e/kWh; prefer the cleanest
Renewable matching status | AWS Sustainability site, CCFT trend | Prefer Regions AWS reports as renewable-powered
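
The filter-then-rank procedure described above can be sketched in a few lines. This is an illustration under stated assumptions: the `Candidate` type, the `rank_regions` helper, the region names, and every latency and intensity figure are hypothetical placeholders, not real grid data.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Candidate:
    region: str                # placeholder name, not a real AWS Region
    compliant: bool            # passes data-residency / compliance rules
    has_services: bool         # every required service is available
    latency_ms: float          # measured user latency (hypothetical)
    grid_gco2e_per_kwh: float  # grid carbon intensity (hypothetical)


def rank_regions(candidates: List[Candidate],
                 max_latency_ms: Optional[float] = None) -> List[str]:
    """Apply the hard filters first, then rank survivors cleanest-first."""
    survivors = [
        c for c in candidates
        if c.compliant
        and c.has_services
        and (max_latency_ms is None or c.latency_ms <= max_latency_ms)
    ]
    return [c.region for c in sorted(survivors, key=lambda c: c.grid_gco2e_per_kwh)]


candidates = [
    Candidate("region-a", True, True, 20, 400),
    Candidate("region-b", True, True, 90, 50),
    Candidate("region-c", False, True, 30, 30),   # fails compliance
    Candidate("region-d", True, False, 25, 40),   # missing a required service
]

# Interactive traffic has a latency budget; async/batch work does not.
print(rank_regions(candidates, max_latency_ms=50))  # ['region-a']
print(rank_regions(candidates))                     # ['region-b', 'region-a']
```

Running the ranking once per component, with and without a latency budget, mirrors the advice to resolve placement by component rather than by workload.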

Artifacts and decisions. A documented Region-selection decision record per workload (constraints considered, candidate Regions, the carbon ranking, and the final choice with rationale); a workload-placement map distinguishing latency-critical from latency-tolerant components; and a policy that new asynchronous/batch workloads default to a designated low-carbon Region. The recurring decision is the trade-off between proximity and grid cleanliness — resolve it by component, not by workload: it is entirely legitimate to serve the API from a near Region while running nightly training jobs in a far, clean one.

Alignment to demand — user-behavior patterns (SUS 2)

What it is. This area is about scaling the resources you provision to match real demand as tightly as possible, and about influencing user and consumer behavior so that demand itself becomes more efficient. It maps to SUS 2 (“How do you take advantage of user-behavior patterns to support your sustainability goals?”). The core insight: every idle resource is pure waste — it consumes energy and embodied carbon while doing zero useful work.

Why it matters. Most environments are provisioned for peak (or peak-times-a-safety-factor) and then run far below that peak the vast majority of the time. The gap between provisioned and used capacity is the largest, most reliably recoverable source of wasted energy in a typical estate. Shrinking it directly raises the “useful work per resource” ratio the whole pillar optimizes for.

How to do it well. Eliminate the idle gap from both sides — supply and demand.

Lever | Mechanism | AWS service | Effect on the idle gap
Dynamic scaling | Target-tracking / predictive | EC2 Auto Scaling, Karpenter | Tracks load instead of static peak
Scale to zero | Serverless / on-demand | Lambda, Fargate, DynamoDB, Aurora Serverless v2 | Near-zero resource when idle
Time-based off | Stop non-prod off-hours | AWS Instance Scheduler | Removes nights/weekends (~70% of week)
Demand shaping | Queue + batch + off-peak | SQS, EventBridge, Batch | Fewer, fuller instances; shift to clean hours
Demand reduction | Caching, right-sized payloads | CloudFront, ElastiCache | Less origin work per user request
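
To make the idle gap concrete, the sketch below computes it from paired samples of provisioned and used capacity, and works out the share of the week removed by a non-production schedule. The helper names and all sample values are hypothetical; the 50 hours per week figure is an assumed schedule chosen for illustration.

```python
from typing import Sequence

HOURS_PER_WEEK = 7 * 24  # 168


def idle_gap(provisioned: Sequence[float], used: Sequence[float]) -> float:
    """Fraction of provisioned capacity that did no useful work."""
    total_provisioned = sum(provisioned)
    if total_provisioned <= 0:
        raise ValueError("provisioned capacity must be positive")
    return 1 - sum(used) / total_provisioned


def off_fraction(running_hours_per_week: float) -> float:
    """Share of the week a scheduled environment is switched off."""
    return 1 - running_hours_per_week / HOURS_PER_WEEK


used = [2, 3, 9, 4]                 # hypothetical instance-equivalents of load
static_peak = [10, 10, 10, 10]      # sized for peak, never changes
tracking = [3, 4, 10, 5]            # follows load with a small headroom

print(f"{idle_gap(static_peak, used):.0%}")  # 55%
print(f"{idle_gap(tracking, used):.0%}")     # 18%
print(f"{off_fraction(50):.0%}")             # 70%
```

The last line shows where the roughly 70% figure for time-based shutdown comes from: an environment that runs 50 of 168 hours is off for about 70% of the week.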

Artifacts and decisions. A scaling policy per service (metric, target, min/max); a non-production scheduling calendar; a demand-shaping design that names which flows are async and buffered; and a utilization baseline that defines your “idle” threshold. The key decision is your minimum-capacity floor: set it too high and you re-create the idle gap; set it too low and you risk cold-start or scaling lag on real spikes — tune it against measured traffic, not guesses.

Software and architecture patterns (SUS 3)

What it is. This area covers the software-design and architecture choices that determine how much compute, memory, and network a workload needs to deliver a given outcome. It maps to SUS 3 (“How do you take advantage of software and architecture patterns to support your sustainability goals?”). It is the difference between two systems that do the same thing where one needs half the fleet.

Why it matters. Inefficient software taxes every resource underneath it forever. A hot loop that wastes CPU, a chatty service mesh, or an event handler that polls instead of reacts multiplies energy use across every instance and every hour the workload runs. Architecture is where you bank the largest structural efficiency gains — the ones that compound with scale.

How to do it well. Favor event-driven and asynchronous architectures over busy-wait and constant polling: an EventBridge/SQS-triggered Lambda consumes resources only when there is work, whereas a service polling every second burns CPU around the clock for nothing. Decompose monoliths so each component can be scaled and optimized independently; the right-sizing and scale-to-zero levers above only work cleanly on well-separated components. Move work off the synchronous request path: defer, batch, and queue anything the user doesn’t need to wait for. Choose efficient runtimes and algorithms, because a workload’s language, framework, and data structures materially change its CPU and memory draw. Offload heavy lifting to managed services, letting AWS run the always-on plumbing of a database, queue, or search cluster at a fleet-wide efficiency you can’t match per workload. Right-size aggressively using AWS Compute Optimizer recommendations, and re-architect chronically over-provisioned services. Where it suits the workload, edge placement (CloudFront Functions, Lambda@Edge) cuts the round trips and origin compute per request.

Anti-pattern | Why it wastes | Well-Architected pattern
Constant polling / busy-wait | Burns CPU with no work to do | Event-driven (EventBridge, SQS, Lambda)
Monolith scaled as one block | Must over-provision the whole to satisfy one hot part | Decoupled components scaled independently
Synchronous everything | Holds resources while the user waits | Async/queued for anything off the critical path
Self-managed always-on infra | Per-workload idle plumbing | Managed services at fleet-wide efficiency
Static, over-sized instances | Pays energy for headroom never used | Compute Optimizer-driven right-sizing
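
The cost of polling versus reacting can be illustrated with a small deterministic simulation. It does not call any AWS service; `simulate_polling` and the event timestamps are hypothetical, and a "wakeup" simply stands in for any moment the process consumes CPU to check for work.

```python
from typing import Sequence, Tuple


def simulate_polling(event_times_s: Sequence[int], window_s: int,
                     interval_s: int = 1) -> Tuple[int, int]:
    """Return (total wakeups, wakeups that found work) for a fixed-interval poller."""
    pending = sorted(event_times_s)
    wakeups = useful = 0
    next_event = 0
    for now in range(0, window_s, interval_s):
        wakeups += 1
        found = False
        # Drain every event that has arrived by this tick.
        while next_event < len(pending) and pending[next_event] <= now:
            next_event += 1
            found = True
        if found:
            useful += 1
    return wakeups, useful


# Four events arrive during a one-hour window (timestamps in seconds).
events = [5, 600, 1800, 3599]

polling_wakeups, polling_useful = simulate_polling(events, window_s=3600)
event_driven_invocations = len(events)  # one invocation per event, none when idle

print(polling_wakeups, polling_useful)  # 3600 4
print(event_driven_invocations)         # 4
```

In this toy hour the poller wakes 3,600 times to do four units of work, while an event-triggered handler runs four times.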

Artifacts and decisions. An architecture decision record per major component justifying the synchronous-vs-async and managed-vs-self-managed choices on efficiency grounds; a Compute Optimizer right-sizing backlog; and a list of identified hot paths with their optimization status. The central decision is where to spend complexity: event-driven and decoupled designs are more efficient but harder to operate, so apply them where the resource savings (and scale) justify the added moving parts.

Data patterns (SUS 4)

What it is. This area covers how you classify, store, move, and retain data so that you keep the minimum data on the minimum-footprint medium for the minimum necessary time. It maps to SUS 4 (“How do you take advantage of data access and usage patterns to support your sustainability goals?”). Storage looks cheap and inert, but every byte you keep occupies a spinning disk or flash cell, gets replicated, gets backed up, and gets copied across networks — all of which consume energy indefinitely.

Why it matters. Data is the silent, ever-growing footprint. Unlike compute, which scales down when idle, stored data keeps consuming until you actively remove or demote it. Untiered logs, forgotten snapshots, duplicate datasets, and over-replicated archives quietly accumulate into one of the largest avoidable footprints in a mature estate.

How to do it well. Treat data lifecycle as a first-class sustainability control.

Data class | Access pattern | Target medium | Sustainability lever
Hot transactional | Frequent, low-latency | S3 Standard / DynamoDB / SSD | Keep small, cache, compress
Warm / occasional | Periodic | S3 IA / Intelligent-Tiering | Auto-tier on access frequency
Cold archive | Rare, recovery-only | Glacier Flexible / Deep Archive | Lowest-energy medium
Logs / telemetry | Decaying value | CloudWatch Logs + S3 + expiry | Retention + delete, don’t hoard
Analytics | Scan-heavy | Parquet/ORC in S3 + Athena | Columnar + compression = less scanned
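
One way to express a tier-then-expire rule as code is to build the lifecycle document as plain data and keep it in version control. The sketch below only constructs and prints the document; it makes no AWS call. The dictionary follows the general shape of an S3 lifecycle configuration as accepted by SDK calls such as boto3's `put_bucket_lifecycle_configuration`, but the helper name, rule ID, prefix, and day counts are all assumptions for illustration, and the 30-day minimum check reflects my understanding of the Standard-IA transition constraint rather than anything stated in this article.

```python
import json


def lifecycle_rule(rule_id: str, prefix: str, ia_after_days: int,
                   archive_after_days: int, expire_after_days: int) -> dict:
    """Build one rule: Standard -> Standard-IA -> Deep Archive -> delete."""
    # Assumed constraint: transitions to Standard-IA need at least 30 days,
    # and each later step must come strictly after the previous one.
    if not 30 <= ia_after_days < archive_after_days < expire_after_days:
        raise ValueError("expected 30 <= IA < archive < expiry (in days)")
    return {
        "ID": rule_id,
        "Filter": {"Prefix": prefix},
        "Status": "Enabled",
        "Transitions": [
            {"Days": ia_after_days, "StorageClass": "STANDARD_IA"},
            {"Days": archive_after_days, "StorageClass": "DEEP_ARCHIVE"},
        ],
        "Expiration": {"Days": expire_after_days},
    }


# Hypothetical retention for application logs: tier at 30 and 180 days, delete at 365.
config = {"Rules": [lifecycle_rule("logs-decay", "logs/", 30, 180, 365)]}
print(json.dumps(config, indent=2))
```

Validating the ordering of the day counts in code catches a mis-typed retention window before it reaches a bucket.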

Artifacts and decisions. A data classification and retention matrix mapping each class to a storage tier and a deletion rule; lifecycle policies as code; a Storage Lens dashboard with a recurring cold-data review; and a replication/backup policy justified per dataset. The key decision is retention duration — anchor it to the actual legal/business requirement, not to “keep everything forever just in case,” because the default of infinite retention is the most expensive sustainability anti-pattern in storage.

Hardware patterns (SUS 5)

What it is. This area is about choosing the most efficient underlying hardware for the work and using the least of it — picking instance types and accelerators whose performance-per-watt is highest, and minimizing the total devices you provision. It maps to SUS 5 (“How do your hardware management and usage practices support your sustainability goals?”). It also accounts for embodied carbon — the emissions baked into manufacturing the hardware — which you amortize better by using fewer devices at higher utilization.

Why it matters. The same workload on a more efficient processor draws materially less energy for the same result, and AWS’s custom silicon offers some of the best performance-per-watt available. Beyond energy, minimizing the count of physical devices reduces the embodied-carbon share of your footprint, since manufacturing emissions are fixed per device and only get amortized through high utilization and long, effective use.

How to do it well. Migrate compute to AWS Graviton (Arm-based) instances wherever the workload supports it. Graviton consistently delivers better performance-per-watt (AWS cites large energy-efficiency gains versus comparable x86 instances), and most managed services (Lambda, Fargate, RDS, ElastiCache, OpenSearch, EMR) offer a Graviton option, so the migration is frequently a configuration change plus a recompile/test pass. For ML, use purpose-built accelerators: AWS Trainium for training and AWS Inferentia for inference. Both are designed for far higher throughput-per-watt than running the same models on general-purpose GPUs, so you do more ML work for the same energy. Use the newest instance generation for a given family, since each generation typically improves efficiency. Maximize utilization through bin-packing (consolidate workloads onto fewer, fuller hosts with ECS/EKS and Karpenter, which actively consolidates underutilized nodes), and prefer managed services so AWS runs the hardware at fleet-wide utilization you cannot reach in a single-tenant fleet. Use Spot capacity for fault-tolerant work to consume otherwise-idle pooled capacity efficiently. Right-size with Compute Optimizer, including its recommendations to move to Graviton.

Workload | Less efficient default | Efficient hardware choice | Why it’s better
General compute / web / microservices | x86 (Intel/AMD) | AWS Graviton (Arm) | Higher performance-per-watt; lower energy per request
ML training | General-purpose GPU | AWS Trainium | Purpose-built throughput-per-watt for training
ML inference | General-purpose GPU | AWS Inferentia | Far better inferences-per-watt at scale
Fragmented small instances | Many under-used hosts | Bin-pack via Karpenter / ECS | Fewer devices; better embodied-carbon amortization
Fault-tolerant batch | On-demand only | Spot | Uses pooled idle capacity efficiently
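
The idea behind bin-packing can be shown with the classic first-fit-decreasing heuristic. This is a generic textbook technique used here for illustration only; it is not a description of how Karpenter or ECS place workloads, and the vCPU demands and node size are hypothetical.

```python
from typing import List, Sequence


def first_fit_decreasing(demands: Sequence[int], node_capacity: int) -> List[List[int]]:
    """Pack workload demands onto as few equal-sized nodes as the heuristic finds."""
    nodes: List[List[int]] = []
    for demand in sorted(demands, reverse=True):
        if demand > node_capacity:
            raise ValueError(f"demand {demand} exceeds node capacity {node_capacity}")
        for node in nodes:
            # Place the workload on the first node that still has room.
            if sum(node) + demand <= node_capacity:
                node.append(demand)
                break
        else:
            nodes.append([demand])  # no existing node fits, so open a new one
    return nodes


# Eight hypothetical workloads (vCPU each), previously on one host apiece.
vcpu_demands = [2, 1, 3, 1, 2, 4, 1, 2]
packed = first_fit_decreasing(vcpu_demands, node_capacity=8)

print(len(vcpu_demands), "hosts before")  # 8 hosts before
print(len(packed), "nodes after")         # 2 nodes after
print(packed)                             # [[4, 3, 1], [2, 2, 2, 1, 1]]
```

Going from eight lightly used hosts to two full ones is the utilization gain that improves embodied-carbon amortization.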

Artifacts and decisions. A hardware-migration plan (which services move to Graviton, in what order, with the compatibility/test gate); an ML accelerator decision for training vs inference; a utilization target per cluster with a bin-packing strategy; and a “newest-generation by default” instance policy. The core decision is the Graviton migration trade-off: the efficiency win is large, but it requires validating Arm compatibility for native dependencies — prioritize the high-volume, long-running services where the per-watt savings compound, and stage the migration behind tests.

Development and deployment process patterns (SUS 6)

What it is. This area covers the process by which you build, test, and operate the workload — keeping development and operational overhead lean, adopting efficiency improvements quickly, and measuring the sustainability impact of changes. It maps to SUS 6 (“How do your development and deployment processes support your sustainability goals?”). The principle: the pipeline, the test estate, and the unused features are part of the footprint too.

Why it matters. A surprising share of cloud energy goes to non-production — sprawling dev/test/staging environments, build farms that idle between commits, redundant pre-production copies, and stale resources nobody owns. And the speed at which you can ship matters for sustainability: if adopting a more efficient instance type, runtime, or service takes six months of manual change, you forgo months of savings. Automation and a tight feedback loop turn efficiency from a project into a habit.

How to do it well. Keep non-production small and ephemeral — spin environments up on demand and tear them down after use (Infrastructure as Code with CloudFormation/AWS CDK/Terraform, ephemeral preview environments, Instance Scheduler on what must persist) rather than running full-time clones of production. Build on managed CI/CD (CodePipeline / CodeBuild or your platform) so the build fleet itself scales to demand instead of idling. Adopt new efficient technologies fast by keeping the path to change short: automated tests, canary/blue-green deploys, and IaC let you roll out a Graviton move or a runtime upgrade safely and quickly. Measure the impact of changes against your proxy metrics, and crucially, use the AWS Customer Carbon Footprint Tool (CCFT) in the Billing console to track your estate’s estimated emissions over time and validate that your optimizations actually move the number — close the loop so sustainability decisions are evidence-based, not assumed. Reduce unused surface area: retire features, environments, and resources that no longer earn their footprint, and feed utilization data (CloudWatch, Compute Optimizer, Storage Lens) back into the backlog as recurring sustainability work.

Process lever | Anti-pattern | Well-Architected pattern | AWS support
Non-prod environments | Full-time prod clones | Ephemeral, on-demand, scheduled-off | IaC (CDK/CFN), Instance Scheduler
Build infrastructure | Always-on build servers | Demand-scaled managed CI/CD | CodeBuild / CodePipeline
Adopting efficiency gains | Slow, manual, risky rollouts | Fast, tested, canary/blue-green | Deploy pipelines, automated tests
Knowing if it worked | Assume savings | Measure emissions trend | Customer Carbon Footprint Tool
Unused surface area | Accumulates silently | Periodic retirement of stale resources | CloudWatch, Compute Optimizer, Storage Lens
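
A periodic retirement review needs some rule for what counts as stale. The sketch below flags non-production environments that have been idle longer than a threshold. The `Environment` type, the environment names, the dates, and the seven-day threshold are all hypothetical; in a real estate the last-used timestamps would come from your own inventory or tagging data.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import List


@dataclass(frozen=True)
class Environment:
    name: str
    kind: str            # "prod" or any non-production label
    last_used: datetime  # last deploy or traffic seen (assumed to be tracked)


def stale_environments(envs: List[Environment], now: datetime,
                       max_idle: timedelta = timedelta(days=7)) -> List[str]:
    """Names of non-production environments idle for longer than max_idle."""
    return [
        env.name for env in envs
        if env.kind != "prod" and now - env.last_used > max_idle
    ]


now = datetime(2024, 1, 31, tzinfo=timezone.utc)
envs = [
    Environment("prod-api", "prod", datetime(2024, 1, 1, tzinfo=timezone.utc)),
    Environment("preview-pr-101", "preview", datetime(2024, 1, 30, tzinfo=timezone.utc)),
    Environment("staging-clone-2", "staging", datetime(2024, 1, 10, tzinfo=timezone.utc)),
    Environment("dev-sandbox", "dev", datetime(2024, 1, 24, tzinfo=timezone.utc)),
]

print(stale_environments(envs, now))  # ['staging-clone-2']
```

Production is excluded by design, and an environment idle for exactly the threshold is kept, so the rule errs toward not tearing down something still in use.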

Artifacts and decisions. An IaC standard mandating ephemeral non-production; a CCFT-based emissions trend report reviewed on a regular cadence; a sustainability backlog fed by utilization/right-sizing findings; and a definition-of-done that includes the proxy-metric impact of significant changes. The key decision is what to measure as your business-output proxy — pick a normalized KPI (e.g., gCO₂e or watt-hours per 1,000 requests or per active user) that ties resource use to value delivered, so growth in usage doesn’t masquerade as a sustainability regression.
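
A definition-of-done check on the proxy metric could look like the sketch below, which compares a release against its baseline on the normalized KPI rather than on absolute emissions. The function names, the 2% tolerance, and all emission and request figures are hypothetical.

```python
def per_thousand(emissions_g: float, requests: int) -> float:
    """gCO2e per 1,000 requests for one period."""
    if requests <= 0:
        raise ValueError("requests must be positive")
    return emissions_g * 1000 / requests


def relative_change(baseline: float, candidate: float) -> float:
    """Signed change of the candidate against the baseline (negative is better)."""
    return (candidate - baseline) / baseline


def passes_gate(baseline: float, candidate: float, max_regression: float = 0.02) -> bool:
    """Pass unless the normalized metric worsens by more than max_regression."""
    return candidate <= baseline * (1 + max_regression)


# Hypothetical periods: traffic grew 20%, absolute emissions grew 8%.
baseline = per_thousand(emissions_g=500_000, requests=10_000_000)
candidate = per_thousand(emissions_g=540_000, requests=12_000_000)

print(baseline, candidate)                             # 50.0 45.0
print(f"{relative_change(baseline, candidate):+.0%}")  # -10%
print(passes_gate(baseline, candidate))                # True
```

Absolute emissions rose in this example, yet the per-request figure improved by 10%, which is the case a raw total would misreport as a regression.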

Real-world enterprise scenario

StreamForge Media is a fictional ad-supported video-streaming company: ~12 million monthly active users, a 60-engineer platform org across four teams, and a workload that includes a customer-facing API, a video-transcoding pipeline, a recommendations ML system, and a large analytics/data-lake estate. They run primarily in us-east-1 for historical reasons, with everything (prod, staging, dev, and the build farm) provisioned for peak and running 24/7. A board-level ESG commitment to cut operational emissions per active user by 40% within a year forces them to apply the Sustainability pillar end to end. They begin by enabling the Customer Carbon Footprint Tool to set a baseline and defining a proxy KPI: gCO₂e per 1,000 minutes streamed.

Region selection. They keep the latency-sensitive streaming API and CDN origin near users, but audit their asynchronous workloads — nightly transcoding, recommendation-model training, and the analytics batch layer — none of which are latency-critical. Cross-referencing AWS renewable-Region reporting with grid-intensity data, they relocate transcoding, ML training, and the data-lake batch jobs to a lower-carbon Region, capturing a large operational-emissions cut on those components with no user-facing latency change. They record a Region-selection decision per workload and set a policy that new async/batch services default to the clean Region.

Alignment to demand (user-behavior). Their fleet was sized for prime-time peak and ran flat all day. They move the API to EC2 Auto Scaling with predictive scaling, put the transcoding workers on AWS Batch with Spot, and schedule all non-production with AWS Instance Scheduler (dev/staging now run ~50 hrs/week, not 168). They convert transcoding from synchronous-on-upload to SQS-buffered batch processed in off-peak, lower-carbon windows, and push more delivery to CloudFront to cut origin work. Idle gap on the API fleet drops from roughly 65% to under 20%.

Software and architecture. A profiling pass finds a recommendations service polling a feature store every second; they re-architect it to EventBridge-driven updates, eliminating constant idle CPU. They decompose the transcoding monolith so each stage scales independently and offload session/state to managed ElastiCache and DynamoDB instead of self-managed always-on instances. Compute Optimizer drives a right-sizing pass across 40+ over-provisioned services.

Data patterns. The data lake had years of un-tiered logs and orphaned snapshots. They apply S3 Intelligent-Tiering and Lifecycle rules (Standard → IA → Glacier Deep Archive), set CloudWatch Logs and CloudTrail retention windows, add DynamoDB TTL on ephemeral tables, and script cleanup of unattached EBS volumes and stale RDS snapshots. They convert analytics datasets to Parquet so Athena scans far less per query. Storage Lens surfaces 300+ TB of cold, non-current data they expire. Total stored footprint falls sharply.

Hardware patterns. They migrate the API, ElastiCache, RDS, and Lambda functions to AWS Graviton, validated behind their test suite, for a substantial per-request energy reduction. ML training moves to AWS Trainium and inference to AWS Inferentia, doing the same ML work at far better throughput-per-watt. Karpenter bin-packs the EKS fleet onto fewer, fuller Graviton nodes, improving embodied-carbon amortization, and a “newest-generation by default” policy is adopted.

Development and deployment. Non-production becomes ephemeral: PR preview environments spin up via AWS CDK and tear down on merge, replacing three full-time staging clones. Builds move to demand-scaled CodeBuild. Blue-green pipelines let them roll out the Graviton and runtime changes in weeks, not quarters. The Customer Carbon Footprint Tool trend, reviewed monthly, confirms the savings are real, and right-sizing findings feed a standing sustainability backlog.

Measurable outcome. Within the year: gCO₂e per 1,000 minutes streamed falls 47% (beating the 40% target); API fleet idle drops from ~65% to <20%; ~60% of compute now runs on Graviton/Trainium/Inferentia; stored data footprint down ~35% after expiring 300+ TB of cold data and tiering the rest; non-production energy down ~70% from scheduling and ephemeral environments; and — a co-benefit — the AWS bill falls by a low-seven-figure annual sum, since nearly every sustainability move (less idle, fewer/efficient instances, less stored data) is also a cost move.

Deliverables & checklist

Common pitfalls

What’s next

This concludes the AWS Well-Architected Framework series: with Sustainability covered alongside Operational Excellence, Security, Reliability, Performance Efficiency, and Cost Optimization, the next step is to run the AWS Well-Architected Tool review across all six pillars to turn these practices into a prioritized, tracked improvement plan for your own workloads.

// part 6 of 6 · AWS Well-Architected Framework
