
AWS Enterprise Architecture: Serverless REST/GraphQL API

A REST endpoint and a GraphQL endpoint look identical to the business — both return JSON, both sit behind a domain name, both need auth. But the two protocols pull an architecture in opposite directions: REST is request/response and resource-shaped, GraphQL is schema-shaped and loves to fan out into N child fetches per query. The interesting engineering question for an enterprise serverless API is not “which one” — mature platforms ship both — but how to put them on the same identity, the same data, and the same operational substrate without building two parallel stacks. This article is that blueprint, built entirely on managed AWS services: Amazon API Gateway and AWS AppSync at the front door, AWS Lambda for business logic, Amazon DynamoDB for data, and Amazon Cognito for identity.

The business scenario

Northwind Lockers (fictional, used throughout) runs smart parcel lockers for apartment complexes and offices. They started with a single REST API that a courier mobile app called to drop and pick up parcels. Three years in, the surface area has exploded:

The team is 9 engineers. They have no appetite to run Kubernetes, patch EC2 fleets, or babysit a self-managed GraphQL server at 2 a.m. Traffic is spiky and seasonal: quiet overnight, a delivery surge 11 a.m.–2 p.m., and a brutal December peak that is 8x the July baseline. They tried a fixed EC2 + ALB tier and spent the year either over-provisioned (paying for December in March) or falling over (paying for March in December).

The mandate from the new VP of Engineering is specific:

  1. One identity across every channel — courier, building manager, partner, machine. No second auth system.
  2. Both REST and GraphQL, because forcing the partner onto GraphQL or the mobile team onto REST would each waste a quarter.
  3. Scale to zero overnight and absorb the December peak without a capacity meeting.
  4. A real DR story — a region can fail and parcels keep moving, because a locker that won’t open at 9 p.m. is a support call and a churned building.

This is the sweet spot for serverless: variable, event-driven, multi-protocol traffic where the per-request cost matters more than steady-state utilization, and where the team’s scarcest resource is operational attention.

Architecture overview

The end-to-end picture is a two-front-door, shared-core design. Two managed API layers — REST via API Gateway, GraphQL via AppSync — terminate the public edge, validate identity against a single Cognito user pool, and then converge on a shared pool of Lambda functions and a shared DynamoDB data layer. Nothing about “REST vs GraphQL” leaks below the front door.

[Figure: AWS serverless API reference architecture. A Route 53/CloudFront/WAF edge and one Cognito identity front two managed doors, API Gateway (REST) and AppSync (GraphQL), over a shared Lambda compute and single-table DynamoDB core, with an IoT/EventBridge event path and a DynamoDB Streams to AppSync real-time subscription fan-out.]

The request path (REST, partner billing call):

  1. The partner’s generated client calls https://api.northwind.example/v1/invoices/{id} over TLS 1.3. DNS resolves through Amazon Route 53 to a CloudFront distribution; an AWS WAF web ACL on CloudFront strips the obvious garbage (SQLi/XSS signatures, IP reputation lists, a rate-based rule).
  2. CloudFront forwards to a regional API Gateway REST API. A Cognito authorizer on the method validates the partner’s JWT (machine-to-machine, OAuth2 client-credentials grant from a Cognito app client). API Gateway also enforces request validation against the JSON Schema in the OpenAPI definition and applies a usage plan (API key + throttle + quota) so one partner cannot exhaust the account.
  3. API Gateway proxy-integrates to a Lambda function (invoices-api). The function reads from DynamoDB and returns JSON. Idempotent reads are cached in API Gateway’s response cache for hot invoice IDs.
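A minimal sketch of what the invoices-api function in step 3 might look like. It assumes the Cognito authorizer exposes the verified token claims under requestContext.authorizer.claims and that the partner is identified by the client_id claim; the FAKE_TABLE dictionary and its key format are hypothetical stand-ins for the real DynamoDB read.

```python
import json

# Hypothetical in-memory stand-in for the DynamoDB table; a real function
# would read the table (for example with boto3) instead.
FAKE_TABLE = {
    ("PARTNER#acme", "INVOICE#inv-001"): {"invoiceId": "inv-001", "amountCents": 12500},
}


def _response(status, body):
    """Build the response shape a Lambda proxy integration returns to API Gateway."""
    return {
        "statusCode": status,
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps(body),
    }


def handler(event, context=None):
    # Assumption: claims are placed here by the Cognito authorizer after it
    # has validated the JWT.
    claims = event.get("requestContext", {}).get("authorizer", {}).get("claims", {})
    client_id = claims.get("client_id")
    if not client_id:
        return _response(401, {"message": "missing verified client identity"})

    invoice_id = (event.get("pathParameters") or {}).get("id")
    if not invoice_id:
        return _response(400, {"message": "invoice id is required"})

    # The partner partition is derived from the verified claim, never from
    # anything the caller supplies in the path or body.
    item = FAKE_TABLE.get((f"PARTNER#{client_id}", f"INVOICE#{invoice_id}"))
    if item is None:
        return _response(404, {"message": "invoice not found"})
    return _response(200, item)
```

Because the lookup key includes the caller's own partition, a partner asking for another partner's invoice ID gets a 404 rather than someone else's data.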

The request path (GraphQL, courier mobile query + subscription):

  1. The mobile app calls the AppSync GraphQL endpoint (also fronted by Route 53 + custom domain). AppSync authorizes the user’s Cognito ID token directly — no separate Lambda authorizer needed for the common case.
  2. The GraphQL query resolves field-by-field. A getDelivery query hits a DynamoDB resolver via a VTL or JavaScript (APPSYNC_JS) resolver with no Lambda in the path — AppSync talks to DynamoDB directly, which is the cheapest and lowest-latency option. A delivery.signature field that needs an S3 pre-signed URL is resolved by a Lambda data source (a pipeline resolver chains the DynamoDB read and the Lambda step).
  3. When a courier marks a parcel delivered (a markDelivered mutation), AppSync writes to DynamoDB and publishes a subscription event. Every building-manager dashboard subscribed to that locker receives the update over a managed WebSocket — AppSync owns the connection fan-out, so no one runs a socket server.

The event path (IoT sensor ingestion):

  1. Locker sensors publish state changes to AWS IoT Core, which routes via an IoT rule onto an Amazon EventBridge bus (or directly to an SQS queue). A Lambda consumer (sensor-projector) writes the new door state into the same DynamoDB table.
  2. That same write triggers a DynamoDB Stream. A Lambda (stream-fanout) reads the stream and calls an AppSync mutation as the system, which causes the live subscription push to dashboards. This is the trick that makes a backend-originated change appear instantly on a client subscription: write → stream → AppSync mutation → subscription fan-out.
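The write → stream → AppSync mutation step can be sketched as follows. This shows only the translation from stream records to GraphQL request bodies; signing the request with IAM credentials and posting it to the AppSync endpoint are left out. The mutation name publishLockerState and the attribute names lockerId and doorState are hypothetical, and the sketch assumes the stream is configured to include new images.

```python
import json

# Hypothetical mutation; the real schema's mutation name and fields may differ.
MUTATION = """
mutation PublishLockerState($lockerId: ID!, $doorState: String!) {
  publishLockerState(lockerId: $lockerId, doorState: $doorState) {
    lockerId
    doorState
  }
}
"""


def build_mutation_payloads(stream_event):
    """Turn DynamoDB Streams records into GraphQL request bodies, one per changed item."""
    payloads = []
    for record in stream_event.get("Records", []):
        if record.get("eventName") not in ("INSERT", "MODIFY"):
            continue  # removals carry no new image to publish
        image = record.get("dynamodb", {}).get("NewImage", {})
        # Stream images use DynamoDB's typed attribute format, e.g. {"S": "value"}.
        locker_id = image.get("lockerId", {}).get("S")
        door_state = image.get("doorState", {}).get("S")
        if locker_id is None or door_state is None:
            continue  # some other item type sharing the same table
        payloads.append(json.dumps({
            "query": MUTATION,
            "variables": {"lockerId": locker_id, "doorState": door_state},
        }))
    return payloads
```

Each returned string is a ready-to-send request body; subscribed dashboards receive the update once AppSync executes the mutation.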

The data layer is a single DynamoDB table in single-table design, with Global Secondary Indexes for the access patterns, DynamoDB Streams feeding the real-time fan-out, point-in-time recovery (PITR) on, and a global table replica in a second region for DR. Cold/large artifacts (parcel photos, signature images) live in S3, referenced from DynamoDB by key.

The whole thing is regional and stateless at the compute tier — every Lambda is horizontally scalable and idempotent, every front door is a managed service that scales without our involvement, and the only durable state is in DynamoDB (multi-region) and S3 (cross-region replicated).

Component breakdown

| Component | Service | Role | Key configuration choices |
|---|---|---|---|
| Edge / CDN / DDoS | CloudFront + AWS WAF + Shield Standard | TLS termination, caching, L7 filtering | WAF managed rule groups (Core, Known-Bad-Inputs, IP reputation) + a rate-based rule; CloudFront in front of both API Gateway and AppSync for one edge and one WAF |
| REST front door | API Gateway (REST, regional) | REST contract, request validation, throttle/quota, response cache | Cognito authorizer; request validators from OpenAPI JSON Schema; usage plans per partner; per-method throttling; access logs to CloudWatch |
| GraphQL front door | AWS AppSync | Schema, resolvers, managed subscriptions | Direct DynamoDB resolvers where possible (no Lambda); pipeline resolvers for multi-step; APPSYNC_JS resolvers; enhanced subscription filtering; caching tier for hot queries |
| Identity | Amazon Cognito user pool | One identity for humans + machines | Groups → roles (courier, manager, partner, admin); app clients per channel; client-credentials for M2M; advanced security (compromised-credential + adaptive MFA); token validity tuned (short access tokens, refresh rotation) |
| Compute | AWS Lambda | Business logic, integrations, stream/event consumers | ARM64 (Graviton) for ~20% better price/perf; Lambda SnapStart or Provisioned/Reserved Concurrency on latency-critical functions; tight per-function IAM; Powertools for logging/tracing/metrics; idempotency layer |
| Data | Amazon DynamoDB | Primary store, single-table | On-demand capacity (matches spiky/seasonal load); single table + GSIs; Streams enabled; PITR on; TTL for ephemeral items; global table for DR |
| Real-time fan-out | DynamoDB Streams + Lambda → AppSync mutation | Backend changes pushed to subscribed clients | Stream batches → idempotent projector → AppSync mutation with IAM auth as the system principal |
| Async / events | EventBridge + SQS + IoT Core | Decoupled ingestion, retries, buffering | SQS as a shock absorber in front of Lambda; DLQs everywhere; EventBridge for routing and future fan-out; partial batch response on SQS/streams |
| Blobs | Amazon S3 | Photos, signatures, exports | Referenced by key from DynamoDB; pre-signed URLs minted by Lambda/AppSync; SSE-KMS; cross-region replication for DR; lifecycle to Intelligent-Tiering |
| Secrets / config | Secrets Manager + SSM Parameter Store | Partner credentials, feature flags | Rotation on secrets; Parameter Store for non-secret config; fetched via Lambda extension/cache, never baked into images |
| Observability | CloudWatch + X-Ray + Powertools | Logs, metrics, traces, alarms | Structured JSON logs; X-Ray traces spanning API GW/AppSync → Lambda → DynamoDB; CloudWatch dashboards + composite alarms; embedded metrics for business KPIs |

A few choices deserve the “why,” because they are where this architecture differs from a naive serverless app.

Why AppSync resolves straight to DynamoDB, not “AppSync → Lambda → DynamoDB.” The reflex is to route every GraphQL field through a Lambda. For simple CRUD that adds latency, cost, and a cold-start surface for no benefit. AppSync’s native DynamoDB resolver (now writable in JavaScript via APPSYNC_JS, not just VTL) handles get/query/put/update directly. Lambda earns its place only when a field needs logic AppSync can’t express cleanly — calling a third party, minting a pre-signed URL, complex authorization. The rule: Lambda is a data source you reach for, not the default path.

Why a single DynamoDB table. Both REST and GraphQL serve the same nouns (deliveries, lockers, invoices, users). Modeling each as its own table would force the GraphQL resolvers and the REST Lambdas to join across tables in application code — slow and bug-prone. A single-table design with a deliberate partition/sort-key scheme and a handful of GSIs lets one item collection answer “get this delivery,” “list a courier’s deliveries today,” and “list a locker’s history” with single-digit-millisecond queries and no joins.
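To make the key scheme concrete, here is one possible layout, sketched with an in-memory stand-in for the table. The attribute names (PK, SK, GSI1PK, and so on) and the prefix conventions are illustrative assumptions, not Northwind's actual schema; query_index only mimics the shape of a DynamoDB Query (exact partition key plus a sort-key prefix).

```python
def delivery_item(delivery_id, courier_id, locker_id, delivered_at):
    """Build one delivery item carrying keys for the base table and two GSIs."""
    return {
        # Base table: "get this delivery"
        "PK": f"DELIVERY#{delivery_id}",
        "SK": "META",
        # GSI1: "list a courier's deliveries", sorted by timestamp
        "GSI1PK": f"COURIER#{courier_id}",
        "GSI1SK": f"DELIVERED#{delivered_at}",
        # GSI2: "list a locker's history", sorted by timestamp
        "GSI2PK": f"LOCKER#{locker_id}",
        "GSI2SK": f"DELIVERED#{delivered_at}",
        "deliveryId": delivery_id,
    }


def query_index(items, pk_attr, sk_attr, pk_value, sk_prefix=""):
    """In-memory stand-in for a Query: exact partition key, sort-key prefix match."""
    matches = [
        item for item in items
        if item.get(pk_attr) == pk_value and item.get(sk_attr, "").startswith(sk_prefix)
    ]
    return sorted(matches, key=lambda item: item[sk_attr])


items = [
    delivery_item("d1", "c7", "l3", "2024-12-20T11:05:00Z"),
    delivery_item("d2", "c7", "l9", "2024-12-21T09:30:00Z"),
    delivery_item("d3", "c8", "l3", "2024-12-20T12:40:00Z"),
]

# A courier's deliveries on one day, then one locker's full history.
today_for_c7 = query_index(items, "GSI1PK", "GSI1SK", "COURIER#c7", "DELIVERED#2024-12-20")
history_for_l3 = query_index(items, "GSI2PK", "GSI2SK", "LOCKER#l3")
```

ISO-8601 timestamps in the sort key keep results in chronological order and let a date prefix act as the "today" filter, so each access pattern is a single query with no join.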

Why on-demand capacity, not provisioned + autoscaling. Northwind’s load is the textbook on-demand case: spiky within the day, seasonal across the year, with an 8x December peak. Provisioned capacity with autoscaling lags sudden spikes (the scaling alarm fires after throttling starts) and you pay for headroom. On-demand absorbs the spike instantly and bills per request. If a workload later becomes large and predictable, provisioned-with-autoscaling (or a reserved-capacity commit) becomes cheaper — but that is an optimization to earn with data, not a starting assumption.

Why Cognito is the single identity even for machines. The partner integration is machine-to-machine, which tempts teams to bolt on a separate API-key or homegrown JWT scheme. Cognito app clients support the OAuth2 client-credentials grant, so the partner gets a real OAuth token from the same issuer humans use. One JWKS endpoint, one set of authorizers, one audit trail. API Gateway and AppSync both natively validate Cognito tokens, so “one identity, two front doors” is literally one user pool referenced twice.
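For illustration, the partner's token request might be built roughly like this. The sketch constructs, but does not send, a client-credentials request to the user pool domain's /oauth2/token endpoint; the domain, client ID, secret, and scope values shown are hypothetical placeholders.

```python
import base64
import urllib.parse
import urllib.request


def build_token_request(domain, client_id, client_secret, scope):
    """Build (but do not send) an OAuth2 client-credentials request for a Cognito app client."""
    credentials = f"{client_id}:{client_secret}".encode("utf-8")
    body = urllib.parse.urlencode({
        "grant_type": "client_credentials",
        "scope": scope,
    }).encode("utf-8")
    return urllib.request.Request(
        url=f"https://{domain}/oauth2/token",
        data=body,
        method="POST",
        headers={
            # The app client authenticates with HTTP Basic: base64(client_id:client_secret).
            "Authorization": "Basic " + base64.b64encode(credentials).decode("ascii"),
            "Content-Type": "application/x-www-form-urlencoded",
        },
    )


# Hypothetical values; a real partner would load the secret from a secrets store.
request = build_token_request("auth.northwind.example", "partner-client-id", "partner-secret", "billing/read")
```

Passing the request to urllib.request.urlopen would return a JSON body containing the access token, which the partner then sends as a bearer token to the REST API.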

Implementation guidance

Provision with Terraform (the team’s house standard) using a layered state layout so blast radius is contained: a network-edge layer (Route 53 zones, ACM certs, CloudFront, WAF), an identity layer (Cognito pool, app clients, groups, IAM roles), a data layer (DynamoDB table, GSIs, streams, S3 buckets, global-table replica), and an app layer (Lambdas, API Gateway, AppSync, EventBridge, SQS). Each layer keeps its own remote state in S3 with a DynamoDB lock table, and the layers are wired together with terraform_remote_state data sources or SSM parameters. The serverless functions themselves are best authored with the AWS Serverless Application Model (SAM) or Serverless Framework and consumed by Terraform, or kept fully in Terraform if the team prefers one tool. Either way, keep the function handler code out of the IaC repo’s critical path: build artifacts in CI, publish a versioned Lambda, and let IaC point at the version/alias.

Concretely:

Networking — and the deliberate choice to stay out of the VPC. This is a point teams get wrong. Lambda, DynamoDB, S3, AppSync, and API Gateway are all “VPC-optional” managed services that reach each other over the AWS network without a VPC. Putting Lambda in a VPC just to talk to DynamoDB adds ENI cold-start cost and a NAT bill for no security benefit — DynamoDB access is governed by IAM, not network reachability. So the default here is no VPC: functions reach DynamoDB/S3/AppSync over their service endpoints, secured by IAM. A Lambda is attached to a VPC only if it must reach something private — an RDS instance, an internal service, a partner over a private link. When that happens, the function gets a VPC config with VPC (Gateway/Interface) endpoints for DynamoDB and S3 so traffic never traverses a NAT, and the NAT is reserved for genuine internet egress. Network isolation here is an IAM-and-resource-policy problem, not a subnet problem.

Identity wiring. One Cognito user pool. Groups (courier, manager, partner, admin) map to IAM roles through a Cognito identity pool’s role mapping for any AWS-resource access, and to scopes/claims for app-level authorization. Separate app clients per channel (mobile, web, partner-M2M) so you can set different token lifetimes, revoke one channel without touching others, and enable the client-credentials grant only on the partner client. Fine-grained authorization lives in two tiers. Coarse checks run at the front door: the Cognito authorizer rejects an unauthenticated or wrong-audience token before any compute runs. Fine checks run in resolvers and Lambdas: a courier can only read their own deliveries, enforced by constraining the DynamoDB query to the caller’s partition, derived from the verified sub claim and never from a client-supplied user ID. For complex policies, AppSync + Amazon Verified Permissions (Cedar) externalizes authorization as policy rather than scattered if statements.
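The fine-grained tier can be sketched as a function that builds the DynamoDB Query parameters from the verified claims alone. The table name, index name, and key prefixes below are hypothetical; the parameter names follow the low-level DynamoDB Query API.

```python
def deliveries_query_for(claims, day, requested_user_id=None):
    """Build DynamoDB Query parameters scoped to the caller's own partition."""
    sub = claims.get("sub")
    if not sub:
        raise PermissionError("token carries no verified sub claim")
    # requested_user_id is deliberately ignored: the partition comes from the
    # verified token, so a caller cannot read another courier's deliveries by
    # passing a different ID.
    return {
        "TableName": "northwind-core",  # hypothetical table name
        "IndexName": "GSI1",            # hypothetical index name
        "KeyConditionExpression": "GSI1PK = :pk AND begins_with(GSI1SK, :day)",
        "ExpressionAttributeValues": {
            ":pk": {"S": f"COURIER#{sub}"},
            ":day": {"S": f"DELIVERED#{day}"},
        },
    }


params = deliveries_query_for({"sub": "c-42"}, "2024-12-20", requested_user_id="c-99")
```

Here params targets the partition COURIER#c-42 even though the caller asked for c-99, which is the point of deriving the key from the token.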

Enterprise considerations

Security and Zero Trust. The architecture is Zero-Trust by construction: every request is authenticated (Cognito JWT) and authorized at the edge and re-checked at the data boundary, with no implicit trust from “being inside the network,” because there largely is no network perimeter to be inside. WAF on CloudFront filters L7 attacks; AWS Shield Standard absorbs common DDoS for free (Shield Advanced if the contractual SLA demands it). Every service-to-service hop is least-privilege IAM: each Lambda’s execution role grants only the specific table actions and item-level conditions it needs (for example, dynamodb:LeadingKeys conditions to scope a function to a tenant’s partition). Data is encrypted at rest with KMS (DynamoDB, S3, Secrets Manager) and in transit with TLS 1.2+ everywhere. Partner secrets rotate in Secrets Manager. The single biggest Zero-Trust win over a server-based design: there is no long-lived host to compromise, patch, or pivot from, because compute is ephemeral and per-request.
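As an illustration of the item-level scoping mentioned above, the following sketch generates an execution-role policy document that uses the dynamodb:LeadingKeys condition key. The table ARN, the action list, and the partition key value are assumptions chosen for the example.

```python
import json


def tenant_scoped_policy(table_arn, tenant_partition_key):
    """Return an IAM policy document limiting access to items in one partition."""
    return {
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            "Action": [
                "dynamodb:GetItem",
                "dynamodb:Query",
                "dynamodb:PutItem",
                "dynamodb:UpdateItem",
            ],
            "Resource": [table_arn],
            "Condition": {
                # Every partition key value touched by the request must match.
                "ForAllValues:StringEquals": {
                    "dynamodb:LeadingKeys": [tenant_partition_key],
                },
            },
        }],
    }


# Hypothetical ARN and partition key.
policy = tenant_scoped_policy(
    "arn:aws:dynamodb:eu-west-1:123456789012:table/northwind-core",
    "TENANT#acme",
)
policy_json = json.dumps(policy, indent=2)
```

The resulting JSON can be attached to the function's role through whatever IaC tool provisions it; a request for any other partition key is then denied by IAM before DynamoDB evaluates it.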

Cost optimization. Serverless flips the cost model from “pay for capacity” to “pay for use,” which is exactly right for an 8x seasonal swing. Levers, roughly in order of impact:

Scalability. Each tier scales independently and natively. The real governors to set deliberately: Lambda reserved concurrency to protect downstream systems (and a per-account concurrency budget so one runaway function can’t starve the others), DynamoDB on-demand’s automatic scaling (with adaptive capacity smoothing hot partitions — which a good key design avoids in the first place), and AppSync/API Gateway throttles. The classic serverless scaling trap is a downstream that does not scale — if a Lambda calls a fixed-size relational database, Lambda will happily open 10,000 connections and melt it. Northwind avoids this by keeping the hot path on DynamoDB; any relational dependency would sit behind RDS Proxy to pool connections.

Reliability and DR (RTO/RPO). Within a region, every component is multi-AZ by default (managed services), so single-AZ failure is invisible. For regional DR the design uses DynamoDB global tables (active-active, multi-region, typically sub-second replication, so RPO is measured in seconds), S3 cross-region replication for blobs, and infrastructure-as-code redeployable into the second region in minutes. Front-door failover is a Route 53 health-checked failover (or latency) routing policy flipping traffic to the standby region’s API Gateway/AppSync stack (CloudFront is a global service and is not deployed per region). Targets: RPO of seconds (global-table replication lag) and RTO of minutes (DNS failover + already-warm managed services in region 2). Because Lambda/API Gateway/AppSync are deploy-from-IaC and hold no state, “standby region” can be a genuinely warm stack rather than a cold rebuild. Idempotency (Powertools idempotency keys, conditional DynamoDB writes) makes retries and dual-region replays safe. DLQs on every async consumer mean a poison message parks instead of blocking the queue, and partial batch responses on SQS/stream sources mean one bad record doesn’t fail a whole batch.

Observability. Structured JSON logs from every Lambda (Powertools), X-Ray distributed tracing stitching API Gateway/AppSync → Lambda → DynamoDB into one trace so you can see exactly where a slow request spent its milliseconds, CloudWatch metrics (including embedded-metric-format business KPIs like “deliveries completed per minute”), and composite alarms that page only on genuine, correlated problems (error-rate and latency, not a single noisy metric). Track the serverless-specific signals: cold-start rate and duration, concurrency utilization vs. limit, DynamoDB throttles, async DLQ depth, and AppSync subscription connection counts. A dashboard per channel (REST partner, GraphQL mobile, dashboard subscriptions, IoT ingest) keeps a problem in one channel from being masked by health in the others.
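A business KPI in embedded metric format is just a structured log line with an _aws metadata block. The sketch below builds one by hand to show the shape; in practice Powertools produces this for you. The namespace, dimension name, and metric name are hypothetical.

```python
import json
import time


def emf_record(namespace, channel, metric_name, value, unit="Count", now_ms=None):
    """Build one embedded-metric-format log line with a single Channel dimension."""
    timestamp = int(time.time() * 1000) if now_ms is None else now_ms
    return json.dumps({
        "_aws": {
            "Timestamp": timestamp,  # milliseconds since the epoch
            "CloudWatchMetrics": [{
                "Namespace": namespace,
                "Dimensions": [["Channel"]],
                "Metrics": [{"Name": metric_name, "Unit": unit}],
            }],
        },
        # Dimension and metric values sit at the top level of the record.
        "Channel": channel,
        metric_name: value,
    })


# Printing the line from a Lambda function sends it to CloudWatch Logs,
# where it is extracted as a metric.
print(emf_record("Northwind/Lockers", "graphql-mobile", "DeliveriesCompleted", 1))
```

Because the channel is a dimension, the same metric can be graphed per channel on the dashboards described above.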

Governance. Multi-account via AWS Organizations / Control Tower (separate dev/stage/prod accounts), Service Control Policies to enforce guardrails (deny public S3, require encryption, pin allowed regions), tagging standards for cost allocation per channel and per environment, AWS Config rules for drift and compliance, and CloudTrail for an immutable audit log. Cognito’s audit events and API Gateway/AppSync access logs give a per-request, per-identity trail end to end.

Reference enterprise example

Northwind Lockers, December peak readiness review. Baseline (July): ~3.5 million API/GraphQL operations/day. December peak: ~28 million/day, concentrated 11 a.m.–8 p.m., with a hard spike on the three days before Christmas. Roughly 60% of operations are GraphQL (mobile couriers), 30% REST (partner billing + dashboard CRUD), 10% IoT sensor ingest.

Decisions they made and why:

Cost outcome. The retired EC2 + ALB + self-managed-GraphQL tier had cost a flat ~$4,100/month — sized for a peak that occurred a few days a year. The serverless platform billed ~$1,250 in a quiet month and ~$6,800 in the December peak month, averaging ~$2,300/month across the year — a ~44% reduction — while the December peak was handled with no engineer paged for capacity and the overnight hours cost almost nothing. The team also deleted an entire class of work: no GraphQL servers to patch, no socket fleet to scale, no auth service to run.
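The headline percentage follows directly from the two monthly averages quoted above; a quick check, using only those figures:

```python
def reduction_percent(before, after):
    """Percentage saved when a monthly cost drops from `before` to `after`."""
    return (before - after) / before * 100


legacy_monthly = 4100        # flat EC2 + ALB + self-managed GraphQL tier
serverless_average = 2300    # stated yearly average for the serverless platform

reduction = reduction_percent(legacy_monthly, serverless_average)
annual_saving = (legacy_monthly - serverless_average) * 12

print(round(reduction, 1), annual_saving)  # 43.9 21600
```

That is the ~44% quoted, or roughly $21,600 a year at the stated averages.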

Where they spent the savings. Two engineers’ worth of reclaimed operational time went into the things serverless doesn’t give you for free: a shared idempotency/observability Lambda layer, the OpenAPI-and-schema contract discipline, and the cross-region GameDay automation.

When to use it

Use this architecture when:

Trade-offs and anti-patterns to avoid:

Alternatives worth naming: a container-based API (ALB + ECS/EKS Fargate) when you need long-lived connections, large in-memory state, non-HTTP protocols, or constant high throughput; Aurora Serverless v2 when the domain is relational; AWS Amplify when a small team wants AppSync + Cognito + DynamoDB scaffolded end to end (Amplify generates much of exactly this stack); and AWS Step Functions layered in when the business logic is a long, multi-step, stateful workflow rather than a request/response API. The front-door pattern — Cognito identity, CloudFront/WAF edge, DynamoDB core — survives most of these swaps, which is the real reason to start here.


Vinod is a Senior Cloud Architect (22+ yrs) — available for Azure / AWS / GCP architecture, landing zones, and migrations.
