AWS Lambda, In Depth: Runtimes, Triggers, Layers, Concurrency & Every Setting

AWS Lambda runs your code without a server you can see. You hand AWS a function — a zip of code or a container image, plus a handful of settings — and AWS runs it on demand, scaling from zero to thousands of concurrent executions and back to zero, charging you only for the milliseconds your code actually runs. There is no instance to patch, no fleet to size, no operating system to harden. That is the promise, and for event-driven workloads — an object lands in S3, a message arrives on a queue, an HTTP request hits an API — it is one of the highest-leverage services in all of AWS.

The catch is that “no server to manage” does not mean “nothing to understand”. Lambda has a precise execution model, a memory slider that secretly controls your CPU, three distinct invocation paths that behave very differently when things fail, two kinds of concurrency that are constantly confused in interviews, and a VPC story that has tripped up engineers for a decade. Get these wrong and you will pay for cold starts you could have priced away, lose messages you assumed were retried, or wonder why your function can reach the public internet but not your own database.

This lesson is the exhaustive version. We will walk every function setting as the console presents it, then the invocation models, then every common event source, then layers and extensions, then concurrency in full, then the execution role, resource policy, VPC access, versions and aliases, and failure handling. Cold starts get a short treatment here and a dedicated companion lesson for the deep performance work. By the end you will be able to configure a production Lambda function with confidence and answer the questions every certification exam and interviewer asks about it.

Learning objectives

By the end of this lesson you will be able to:

Prerequisites & where this fits

You need an AWS account, the AWS CLI configured (aws configure), and a working grasp of IAM: Lambda’s execution role is an IAM role, and every event source you connect is gated by IAM permissions. Familiarity with at least one of the supported languages (Python or Node.js are easiest to start with) helps but is not required to follow the configuration. This is a Compute lesson in the AWS Zero-to-Hero course, sitting alongside the EC2 and Auto Scaling deep dives: where EC2 gives you servers you size and run, Lambda gives you functions AWS runs for you. After this, the course moves on to storage with the Amazon S3 deep dive, fittingly, since S3 is one of Lambda’s most common triggers.

Core concepts: the Lambda execution model

Before the settings, fix the mental model. A few terms recur throughout and are worth defining once, properly.

The two parameters every handler receives are the event and the context. The event is a JSON object describing what triggered the invocation — its shape depends entirely on the source (an S3 event looks nothing like an API Gateway event). The context is an object with runtime information: the request ID, the function name and version, the CloudWatch log group, and crucially getRemainingTimeInMillis() — how long you have before the timeout fires. A robust handler reads the event, does its work, watches the remaining time, and returns (or throws) cleanly.
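As a minimal sketch of that pattern, the Python handler below reads the event, checks the remaining time before each unit of work, and returns cleanly. The event shape (an "items" list), the upper-casing "work", and the 500 ms safety margin are assumptions for illustration only; get_remaining_time_in_millis() and aws_request_id are the Python names for the context members described above.

```python
SAFETY_MARGIN_MS = 500  # hypothetical margin; tune it for your workload


def lambda_handler(event, context):
    """Process items from the event until the time budget runs low."""
    items = event.get("items", [])  # hypothetical event shape
    processed = []
    for item in items:
        # Stop early rather than being killed mid-item by the timeout.
        if context.get_remaining_time_in_millis() < SAFETY_MARGIN_MS:
            break
        processed.append(str(item).upper())  # stand-in for real work
    return {
        "request_id": context.aws_request_id,
        "processed": processed,
        "unprocessed": len(items) - len(processed),
    }
```

Returning the count of unprocessed items lets the caller (or a follow-up invocation) pick up the remainder instead of losing it to a timeout.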

The other concept to internalise now is that Lambda is stateless in the sense that matters: you cannot rely on anything surviving between invocations. The same environment may be reused, and you should exploit that for connection reuse and caching by putting expensive set-up outside the handler, but AWS may also create a new environment at any time and recycle old ones. Never assume /tmp contents, in-memory state, or even the same environment will survive between two invocations. Treat reuse as an optimisation, not a guarantee.

Creating a function: every core setting

When you create a Lambda function (console: Lambda → Create function, or aws lambda create-function), you choose an authoring path and then configure the function. The settings below are the ones you will touch on essentially every function.

Authoring path: zip, container, or blueprint

| Path | What it is | When to pick it | Gotcha |
| --- | --- | --- | --- |
| Author from scratch | An empty function with a runtime you choose; deploy code as a .zip (up to 50 MB zipped by direct upload, larger archives via S3, and 250 MB unzipped in either case) | Most functions; fastest path; smallest cold starts | The 250 MB unzipped limit includes layers, so large dependency trees hit it |
| Container image | Package code and dependencies as an OCI image (up to 10 GB) pushed to Amazon ECR | Heavy dependencies (ML libraries), custom system packages, or teams standardised on Docker | Larger images can mean slower cold starts; you own the base image’s patching |
| Use a blueprint | A pre-written sample for a common pattern | Learning, or scaffolding a known integration | Sample code is a starting point, not production-ready |

The zip path and the container path are functionally equivalent at runtime; the choice is about how you package and how big your artefact is. Container images do not make Lambda “just run any container” — your image must implement the Lambda runtime API (AWS base images do this for you).

Runtime and handler

The runtime is the language and version Lambda provides to execute your code.

| Setting | What it does | Choices (2026) | Default | When to change | Gotcha |
| --- | --- | --- | --- | --- | --- |
| Runtime | The managed language environment | Node.js, Python, Java, .NET, Ruby, and the OS-only runtime (provided.al2023) for Go, Rust, C++, or custom | The latest LTS of your chosen language | Pin to a supported version; migrate before a runtime is deprecated | Deprecated runtimes lose security patches and eventually cannot be updated or invoked, so track the deprecation schedule |
| Handler | The entry-point method | file.method (e.g. index.handler, app.lambda_handler) | Set by the blueprint/template | Must match your code’s actual file and exported function name | A mismatch yields the dreaded Unable to import module / handler not found at invoke time |

Go and Rust ship as a self-contained executable named bootstrap on the provided.al2023 OS-only runtime — there is no language-specific managed runtime for them, and they typically deliver the fastest cold starts because there is no interpreter or JVM to warm.

Memory — and the CPU it secretly buys

This is the single most important performance setting and the most misunderstood.

| Setting | What it does | Range | Default | When to change | Gotcha |
| --- | --- | --- | --- | --- | --- |
| Memory | Allocates RAM and a proportional share of vCPU | 128 MB – 10,240 MB (10 GB), in 1 MB steps | 128 MB | Raise it whenever the function is CPU-bound or latency-sensitive | At 128 MB you get a fraction of one vCPU; you cross roughly one full vCPU at ~1,769 MB and reach the 6-vCPU ceiling at 10,240 MB |

The coupling is the whole point: memory is CPU. You cannot set CPU directly; you buy more CPU by raising the memory slider. A function doing real computation at 128 MB is throttled to a sliver of a core and runs slowly; bumping it to 1,769 MB gives it a whole vCPU, and it often runs both faster and cheaper because it finishes in a fraction of the time. You pay per GB-second, so halving the duration exactly offsets doubling the memory, and any speed-up beyond that lowers the bill. Multi-threaded code sees no parallelism until you allocate enough memory to cross multiple vCPUs; below ~1,769 MB there is effectively one core to share. Right-sizing this with a tool such as Lambda Power Tuning is covered in the performance companion lesson.
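The GB-second arithmetic is easy to check by hand. The sketch below computes only the duration charge (it ignores the per-request fee and the free tier), and the price constant and the example durations are assumed values for illustration; check the current price list for your Region and architecture.

```python
def duration_cost(memory_mb, duration_ms, invocations, price_per_gb_second):
    """Duration charge only: GB allocated x seconds run x invocations x price."""
    gb = memory_mb / 1024
    seconds = duration_ms / 1000
    return gb * seconds * invocations * price_per_gb_second


PRICE = 0.0000166667  # assumed example rate per GB-second, not authoritative

# One million invocations of a hypothetical CPU-bound function.
small = duration_cost(512, 800, 1_000_000, PRICE)        # slow at 512 MB
break_even = duration_cost(1024, 400, 1_000_000, PRICE)  # 2x memory, half the time
faster = duration_cost(1024, 300, 1_000_000, PRICE)      # 2x memory, better than half

print(f"512 MB / 800 ms:  ${small:.2f}")
print(f"1024 MB / 400 ms: ${break_even:.2f}")
print(f"1024 MB / 300 ms: ${faster:.2f}")
```

Doubling memory while halving duration costs the same, so the larger setting only saves money when the speed-up is better than proportional; it always improves latency.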

Timeout

| Setting | What it does | Range | Default | When to change | Gotcha |
| --- | --- | --- | --- | --- | --- |
| Timeout | Maximum wall-clock time for one invocation before Lambda kills it | 1 second – 900 seconds (15 minutes) | 3 seconds | Set it to a realistic ceiling for the work, with margin | Too low truncates legitimate work; too high lets a hung call burn cost and hold concurrency. The hard cap is 15 minutes; longer jobs belong in Step Functions, Fargate, or batch |

Set the timeout to a little above your expected worst-case duration, not to 900 by default. A generous timeout combined with a downstream that hangs means each stuck invocation occupies a concurrency slot for the full 15 minutes, which can quietly exhaust your account limit.

Ephemeral storage (/tmp)

| Setting | What it does | Range | Default | When to change | Gotcha |
| --- | --- | --- | --- | --- | --- |
| Ephemeral storage | Size of the writable /tmp directory in the environment | 512 MB – 10,240 MB (10 GB) | 512 MB | Raise it when you download, unzip, or process large files on disk | It is ephemeral and per-environment: contents may persist across warm invocations on the same environment but are never guaranteed; never use it for durable state |

/tmp is the only writable part of the function’s filesystem (the deployment package itself is read-only). It is local scratch space — handy for buffering a large object you pulled from S3 — but it is not shared between concurrent environments and not durable. For shared file state, mount Amazon EFS (below).

Environment variables

| Setting | What it does | Limit | Default | When to change | Gotcha |
| --- | --- | --- | --- | --- | --- |
| Environment variables | Key/value config injected into the runtime | Up to 4 KB total for all keys+values | None | Externalise config (table names, endpoints, feature flags) so code is environment-agnostic | They are visible in the console and CLI to anyone with read access, so do not put secrets here in plain text |

Lambda can encrypt environment variables at rest with a KMS key (a default AWS-managed key by default, or your own customer-managed key). For genuine secrets, store them in AWS Secrets Manager or SSM Parameter Store and fetch them at init time (cached for the life of the environment), or use the Parameters and Secrets Lambda Extension which adds a local caching layer. Several reserved variables (e.g. AWS_REGION, AWS_LAMBDA_FUNCTION_NAME, _HANDLER) are populated by the platform automatically.
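A minimal sketch of the fetch-once-and-cache approach follows. It fetches lazily on first use rather than strictly during init, which is a simplification. The environment variable name DB_SECRET_ID and the injectable fetch parameter are assumptions made for illustration; the default fetcher uses boto3's Secrets Manager get_secret_value call, imported lazily so the module loads even where boto3 is absent.

```python
import os

_secret_cache = {}  # lives for as long as this execution environment does


def _fetch_from_secrets_manager(secret_id):
    # Imported lazily so the sketch can be loaded without boto3 installed.
    import boto3
    client = boto3.client("secretsmanager")
    return client.get_secret_value(SecretId=secret_id)["SecretString"]


def get_secret(secret_id, fetch=_fetch_from_secrets_manager):
    """Return the secret, calling the backing store at most once per environment."""
    if secret_id not in _secret_cache:
        _secret_cache[secret_id] = fetch(secret_id)
    return _secret_cache[secret_id]


def lambda_handler(event, context):
    # DB_SECRET_ID is a hypothetical variable: the environment carries only a
    # pointer to the secret, never the secret value itself.
    secret = get_secret(os.environ["DB_SECRET_ID"])
    return {"secret_length": len(secret)}
```

A cache like this never refreshes, so a rotated secret is only picked up by new environments; add an expiry if rotation matters for your workload.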

CPU architecture: x86_64 vs arm64

| Setting | What it does | Choices | Default | When to change | Gotcha |
| --- | --- | --- | --- | --- | --- |
| Architecture | The instruction set the function runs on | x86_64 or arm64 (AWS Graviton) | x86_64 | Switch to arm64 for ~20% lower price and often better performance per watt | Your code and all native dependencies/layers must have Arm builds; pure-interpreted code usually “just works”, compiled extensions may not |

Arm64/Graviton is the cheaper default choice for new functions when your dependencies support it: AWS prices arm64 invocations roughly 20% lower than x86 and many workloads run as fast or faster. The migration risk is entirely in native code — a Python wheel or Node native addon compiled only for x86 will fail on Arm. Test before switching production traffic.

A complete create command

The CLI ties the core settings together:

# Package
zip function.zip app.py

# Create the function with explicit core settings
aws lambda create-function \
  --function-name order-processor \
  --runtime python3.13 \
  --handler app.lambda_handler \
  --architectures arm64 \
  --memory-size 512 \
  --timeout 30 \
  --ephemeral-storage Size=512 \
  --environment "Variables={TABLE_NAME=orders,LOG_LEVEL=INFO}" \
  --role arn:aws:iam::111122223333:role/order-processor-role \
  --zip-file fileb://function.zip

Every flag here maps directly to a setting above; the --role is the execution role, covered in its own section below.

The three invocation models

How a function is invoked determines how it retries, where errors go, and whether the caller waits. There are exactly three models, and confusing them is the source of most “my events disappeared” incidents.

| Model | Who waits | Retries on error | Where failures can go | Typical sources |
| --- | --- | --- | --- | --- |
| Synchronous (request/response) | The caller blocks for the result | None by Lambda; the error is returned to the caller, who decides | The caller (e.g. API Gateway returns 5xx) | API Gateway, Function URLs, ALB, aws lambda invoke, Step Functions (Lambda task) |
| Asynchronous (event) | Caller gets an immediate 202 Accepted; result is discarded | Lambda retries twice (3 attempts total) with backoff | Dead-letter queue or on-failure destination | S3, SNS, EventBridge, SES |
| Event source mapping (poll-based) | No caller; Lambda polls the source | Depends on source; stream sources retry until success or record expiry by default | On-failure destination, or messages return to the queue/stream | SQS, Kinesis Data Streams, DynamoDB Streams, Amazon MQ, self-managed/MSK Kafka |

A few consequences worth stating plainly:

You can tell which model an integration uses by asking “does Lambda poll the source, or does the source call Lambda?” SQS/Kinesis/DynamoDB Streams are polled (event source mappings); S3/SNS/EventBridge push asynchronously; API Gateway/Function URLs call synchronously.
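The same question can be asked of the event payload itself. The sketch below guesses the invocation model from well-known fields in common event shapes; it is illustrative only, covers just a handful of sources, and cannot be authoritative because anyone with permission can invoke a function directly with an arbitrary payload.

```python
POLLED_SOURCES = ("aws:sqs", "aws:kinesis", "aws:dynamodb")
PUSHED_SOURCES = ("aws:s3", "aws:sns")


def invocation_model(event):
    """Guess the invocation model from the event's shape (illustrative only)."""
    records = event.get("Records") or []
    if records:
        first = records[0]
        # SNS records use "EventSource"; S3, SQS, Kinesis and DynamoDB Streams
        # records use "eventSource".
        source = first.get("eventSource") or first.get("EventSource", "")
        if source in POLLED_SOURCES:
            return "event-source-mapping"
        if source in PUSHED_SOURCES:
            return "asynchronous"
    if "requestContext" in event:
        return "synchronous"   # API Gateway, ALB, or a Function URL
    if "detail-type" in event:
        return "asynchronous"  # EventBridge
    return "unknown"
```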

Triggers and event sources, one by one

A trigger is the configuration that connects an event source to your function. Here are the common ones with the settings that matter.

API Gateway (synchronous HTTP)

Fronts your function with a managed HTTP API. HTTP APIs are the cheaper, lower-latency, simpler option for most cases; REST APIs add request validation, API keys/usage plans, WAF integration, and edge-optimised endpoints. The function receives an event containing the HTTP method, path, headers, query string, and body, and must return a response object (status code, headers, body) — for HTTP APIs you can use the simplified payload format v2.0. Lambda grants API Gateway permission to invoke via a resource-based policy statement (added automatically when you create the trigger in the console).
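As a rough sketch of the request/response contract, the handler below assumes an HTTP API using payload format version 2.0, where the method lives under requestContext.http.method and the path under rawPath. The POST-only rule and the echoed response body are hypothetical choices for the example.

```python
import base64
import json


def lambda_handler(event, context):
    """Minimal handler for an HTTP API using payload format version 2.0."""
    http = event.get("requestContext", {}).get("http", {})
    method = http.get("method", "")
    path = event.get("rawPath", "/")

    body = event.get("body") or ""
    if event.get("isBase64Encoded"):
        body = base64.b64decode(body).decode("utf-8")

    if method != "POST":  # hypothetical rule for this example
        return {"statusCode": 405, "body": json.dumps({"error": "POST only"})}

    try:
        payload = json.loads(body or "{}")
    except json.JSONDecodeError:
        return {"statusCode": 400, "body": json.dumps({"error": "invalid JSON"})}

    return {
        "statusCode": 200,
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps({"path": path, "received": payload}),
    }
```

Because the invocation is synchronous, an unhandled exception here is not retried by Lambda; the caller simply receives an error response, so mapping bad input to a 4xx in code keeps client mistakes from looking like server faults.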

Function URLs (synchronous HTTP, no API Gateway)

A built-in, dedicated HTTPS endpoint for a single function — no API Gateway in front. Configuration is minimal:

| Setting | Choices | Notes |
| --- | --- | --- |
| Auth type | AWS_IAM or NONE | NONE is public: anyone with the URL can invoke; use only deliberately |
| CORS | Origins, methods, headers | Configure here rather than in code for browser callers |
| Invoke mode | BUFFERED (default) or RESPONSE_STREAM | Streaming returns large/early responses progressively |

Function URLs are ideal for webhooks and simple endpoints where you do not need API Gateway’s routing, throttling, or validation. The trade-off: no usage plans, no request validation, no path-based routing — one URL, one function.

Amazon S3 (asynchronous)

Invoke a function when objects are created, removed, restored, or replicated. You configure an event notification on the bucket filtered by event type (e.g. s3:ObjectCreated:*) and optionally by prefix and suffix (e.g. only uploads/ keys ending in .jpg). The event delivers bucket and object metadata — name, key, size, ETag — not the object body; your function fetches the object itself if needed. Because it is asynchronous, configure a DLQ or destination for failures. A long-standing gotcha: a function that writes to the same bucket and prefix that triggers it creates an infinite loop — always scope the trigger filter and write outputs elsewhere.
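A sketch of reading an S3 notification defensively is shown below. The two prefixes are hypothetical; the record layout (Records[].s3.bucket.name, Records[].s3.object.key and size) follows the S3 event notification structure, and object keys arrive URL-encoded, so they are decoded before use.

```python
from urllib.parse import unquote_plus

TRIGGER_PREFIX = "uploads/"   # hypothetical: prefix the notification is scoped to
OUTPUT_PREFIX = "processed/"  # hypothetical: where this function writes results


def objects_to_process(event):
    """Return (bucket, key, size) for records this function should handle."""
    selected = []
    for record in event.get("Records", []):
        s3 = record["s3"]
        bucket = s3["bucket"]["name"]
        # Keys are URL-encoded in the event (spaces arrive as '+').
        key = unquote_plus(s3["object"]["key"])
        # Second line of defence against the infinite-loop gotcha: skip
        # anything outside the trigger prefix, including our own output.
        if key.startswith(OUTPUT_PREFIX) or not key.startswith(TRIGGER_PREFIX):
            continue
        selected.append((bucket, key, s3["object"].get("size", 0)))
    return selected
```

The in-code check complements the bucket-side prefix filter rather than replacing it; the filter stops the invocations from happening at all, which is what actually saves money.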

Amazon EventBridge (asynchronous)

The event bus for AWS. A rule matches events — either by event pattern (e.g. every RunInstances API call from CloudTrail, or any custom event with detail-type: OrderPlaced) or on a schedule (cron/rate, the modern replacement for “CloudWatch Events” scheduled invokes) — and targets your function. EventBridge is the right choice for decoupled, fan-out event-driven architectures and for scheduled functions. Like S3 it is asynchronous, so the same retry/DLQ rules apply, and EventBridge itself can be given a DLQ for events it fails to deliver.

Amazon SQS (event source mapping)

Lambda polls a standard or FIFO queue and invokes your function with a batch of messages.

| Setting | What it does | Range / choices | Gotcha |
| --- | --- | --- | --- |
| Batch size | Max messages per invocation | 1–10,000 for standard queues (a batch size above 10 requires a batch window of at least 1 s); 1–10 for FIFO queues | Larger batches amortise overhead but one poison message can fail the whole batch |
| Batch window | Max seconds to gather a batch before invoking | 0–300 s | Trades latency for fewer, fuller invocations |
| Maximum concurrency | Cap on the concurrent function invocations this queue’s mapping can drive | 2–1,000 | Protects downstreams from a flood |
| Report batch item failures | Return only the failed message IDs | on/off | Turn this on so successful messages in a batch are not reprocessed |
| Filter criteria | Drop non-matching messages before invoking | JSON pattern | Saves invocations and cost |

The two critical points: set the queue’s visibility timeout to at least 6× the function timeout (so a message is not redelivered while still being processed), and enable partial batch responses (ReportBatchItemFailures) so that one bad message does not force the entire batch to be retried. Failed messages return to the queue and, after maxReceiveCount, move to the queue’s DLQ (configured on the queue, not the mapping).
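With ReportBatchItemFailures enabled on the mapping, a partial batch response has roughly the following shape. The process function and its order_id rule are hypothetical business logic; the response contract (a batchItemFailures list of itemIdentifier entries holding the failed messageId values) is the part that matters.

```python
import json


def process(body):
    """Hypothetical business logic: reject orders without an order_id."""
    if "order_id" not in body:
        raise ValueError("missing order_id")


def lambda_handler(event, context):
    failures = []
    for record in event.get("Records", []):
        try:
            process(json.loads(record["body"]))
        except Exception:
            # Report only this message; the rest of the batch is treated as
            # successfully processed and deleted from the queue.
            failures.append({"itemIdentifier": record["messageId"]})
    return {"batchItemFailures": failures}
```

An empty batchItemFailures list means the whole batch succeeded. If the handler raises instead of returning, the entire batch becomes visible again and is retried, which is the behaviour this setting exists to avoid.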

Amazon Kinesis Data Streams and DynamoDB Streams (event source mapping)

Both are ordered, sharded streams polled by Lambda, and they share configuration. Lambda invokes one concurrent execution per shard (raise this with parallelisation factor, up to 10, to fan out within a shard while preserving per-partition-key order).

| Setting | What it does | Notes |
| --- | --- | --- |
| Batch size | Records per invocation | Up to 10,000 (Kinesis) / 10,000 (DynamoDB Streams) |
| Starting position | Where to begin | LATEST, TRIM_HORIZON, or AT_TIMESTAMP (AT_TIMESTAMP is not available for DynamoDB Streams) |
| Parallelisation factor | Concurrent invocations per shard | 1–10; preserves order within a partition key |
| Maximum retry attempts | Retries for a failing batch | Default is unlimited (retry until the record expires); set a finite number to avoid blocking |
| Maximum record age | Skip records older than this | Prevents a poison record blocking the shard forever |
| Bisect batch on function error | Split a failing batch to isolate the bad record | Pinpoints poison records |
| On-failure destination | SQS/SNS for discarded batches | Where to send records you give up on |

The defining trait of stream sources is strict ordering per shard/partition key and the head-of-line blocking failure mode: by default a failing batch is retried until it succeeds or the records expire, which stalls that shard in the meantime. Always set a maximum retry attempts and/or maximum record age, enable bisect on error, and route give-ups to an on-failure destination so one bad record cannot freeze a partition indefinitely. As with SQS, enable partial batch responses to checkpoint past the records that did succeed.
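For stream sources the partial batch response works as a checkpoint. The sketch below, written for a Kinesis event with ReportBatchItemFailures enabled, stops at the first failing record and reports its sequence number so that Lambda retries from that record rather than from the start of the batch. The handle function and its "ok" rule are hypothetical.

```python
import base64
import json


def handle(payload):
    """Hypothetical business logic: fail on records flagged as not ok."""
    if not payload.get("ok"):
        raise ValueError("bad record")


def lambda_handler(event, context):
    for record in event.get("Records", []):
        sequence_number = record["kinesis"]["sequenceNumber"]
        try:
            # Kinesis record data is base64-encoded in the event.
            payload = json.loads(base64.b64decode(record["kinesis"]["data"]))
            handle(payload)
        except Exception:
            # Stop here: ordering means later records must wait for this one.
            return {"batchItemFailures": [{"itemIdentifier": sequence_number}]}
    return {"batchItemFailures": []}
```

Processing stops at the first failure on purpose: continuing past it would break the per-shard ordering guarantee, and the records after the checkpoint are redelivered anyway.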

Other notable sources (briefly)

Layers and extensions

Lambda layers

A layer is a .zip of libraries, a custom runtime, or other content that you attach to a function so it is extracted alongside your code (into /opt). Layers let you share dependencies across many functions and keep each function’s own package small.

| Aspect | Detail |
| --- | --- |
| Limit | Up to 5 layers per function |
| Size | The 250 MB unzipped package limit includes the function code plus all layers combined |
| Versioning | Layers are immutable, versioned artefacts; functions pin a specific version ARN |
| Sharing | Layers can be shared across accounts via resource policies; AWS and third parties publish public layers |
| Mount point | Contents appear under /opt (e.g. /opt/python is on the Python path automatically) |

Layers are excellent for common SDKs, shared utility code, or pulling a binary (like a media-processing tool) out of every function. They are not a packaging silver bullet — they count against the same 250 MB ceiling, and overusing them can make dependency versions hard to track. For very large or system-level dependencies, a container image is often cleaner.

Lambda extensions

Extensions run inside the execution environment alongside your function code, hooking into the lifecycle (init, invoke, shutdown). They come in two kinds: internal extensions, which run inside the runtime process, and external extensions, which run as separate processes in parallel with your function and are often delivered as a layer. Typical uses: shipping logs/metrics to an observability vendor, fetching and caching secrets/parameters, or pre-fetching configuration. The Parameters and Secrets Lambda Extension is a first-party example that caches Secrets Manager and SSM lookups locally so you are not calling those APIs on every invocation. Extensions add capability but also share the environment’s CPU and memory, so a heavy extension can affect function latency.

Concurrency in full

This is the section interviewers love, because the two kinds of concurrency are constantly confused. Get the vocabulary exact.

Concurrency is the number of in-flight executions at a single moment. If each request takes 200 ms and you receive 100 requests per second, your steady-state concurrency is about 20 (100 × 0.2). Lambda scales concurrency automatically.
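The estimate is simply request rate multiplied by average duration, which makes it easy to sanity-check a workload against a limit. The helper names below are hypothetical, and the 1,000 figure is used only as an example limit.

```python
import math


def estimated_concurrency(requests_per_second, avg_duration_ms):
    """Steady-state concurrency = arrival rate x average time in flight."""
    return requests_per_second * (avg_duration_ms / 1000)


def slots_needed(requests_per_second, avg_duration_ms):
    # Round away float noise before taking the ceiling.
    return math.ceil(round(estimated_concurrency(requests_per_second, avg_duration_ms), 6))


def fits(limit, requests_per_second, avg_duration_ms):
    return slots_needed(requests_per_second, avg_duration_ms) <= limit


print(slots_needed(100, 200))   # the example from the text
print(fits(1000, 1500, 900))    # a slower, busier workload against a limit of 1,000
```

Note how sensitive the result is to duration: the same request rate needs several times the concurrency if a slow downstream stretches each invocation, which is one reason a generous timeout can exhaust a limit.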

Account limits and burst

| Limit | Default | Notes |
| --- | --- | --- |
| Account concurrency limit | 1,000 concurrent executions per Region (soft; raise via a quota request) | Shared across all functions in the account/Region |
| Burst concurrency | Now up to 1,000 per 10 seconds per function | Each function can scale by up to 1,000 every 10 seconds, independently, until the account limit is reached |

When demand exceeds available concurrency, Lambda throttles — returning a 429 TooManyRequestsException to synchronous callers, and (for async/stream sources) retrying per that source’s rules. The account limit is a shared pool: one runaway function can starve every other function in the Region, which is exactly why reserved concurrency exists.

Reserved concurrency

Reserved concurrency carves out a guaranteed slice of the account pool for one function and simultaneously caps that function at the reserved number.

| Property | Effect |
| --- | --- |
| Guarantees | The function can always reach up to its reserved value |
| Caps | The function can never exceed its reserved value (set it to 0 to disable a function entirely) |
| Subtracts | The reserved amount is removed from the unreserved pool available to all other functions |
| Cost | Free to configure |

Use reserved concurrency to (a) protect a function from being starved by noisy neighbours, (b) throttle a function so it cannot overwhelm a fragile downstream (a relational database that allows only N connections, for example), or (c) hard-stop a misbehaving function by setting it to 0. The trade-off is that reserving capacity reduces what is left for everything else: reserve too aggressively across functions and you can throttle the rest of the account.

Provisioned concurrency

Provisioned concurrency keeps a set number of execution environments initialised and warm so they respond with no cold start.

| Property | Effect |
| --- | --- |
| Eliminates | Cold starts for the provisioned number of concurrent executions |
| Applies to | A specific version or alias (not $LATEST) |
| Cost | You pay for it whether or not it is used: for the kept-warm capacity, plus normal invocation charges |
| Autoscaling | Can be scaled on a schedule or via Application Auto Scaling target tracking |

This is the lever for latency-sensitive workloads (user-facing APIs, anything with a strict p99) where cold-start jitter is unacceptable. Because it bills for idle warm capacity, you provision it where it pays for itself and often schedule it to match traffic (warm during business hours, scaled down overnight). Beyond the provisioned number, additional demand still cold-starts normally. The deeper question of when provisioned concurrency beats other cold-start tactics — and where SnapStart fits — is the subject of the companion lesson, Optimizing AWS Lambda Performance: Cold Starts, Provisioned Concurrency, SnapStart, and Memory Tuning.

The reserved vs provisioned distinction (the interview answer)

Reserved concurrency is a limit — it guarantees and caps how many environments a function may have, and it is free. Provisioned concurrency is pre-warmed capacity — it keeps a number of environments initialised so they skip the cold start, and you pay for it. They are orthogonal: you can set reserved concurrency to bound a function and provisioned concurrency to pre-warm part of that bound.

Cold starts (the short version)

A cold start is the latency added when Lambda must create a fresh execution environment: provision the microVM, download your package or image, start the runtime, and run your init code (imports, client construction, static config). Warm invocations reuse an existing environment and skip all of that. The init phase is where you have the most leverage — trim package size, do expensive set-up once outside the handler so it is reused, and prefer lighter runtimes (Go/Rust/Node over a cold JVM). For the full performance treatment — measuring init duration, Lambda Power Tuning for memory, provisioned concurrency economics, SnapStart for Java/Python/.NET, and connection reuse at scale — see the companion lesson linked above; here it is enough to know what a cold start is and that init code is the lever.

The execution role

Every function runs as an IAM role — its execution role — and that role’s policies decide what AWS APIs the function may call. This is not the same as the resource policy (next section): the execution role is outbound (what your code can do), the resource policy is inbound (who may invoke the function).

At minimum the execution role needs permission to write logs:

{
  "Version": "2012-10-17",
  "Statement": [{
    "Effect": "Allow",
    "Action": [
      "logs:CreateLogGroup",
      "logs:CreateLogStream",
      "logs:PutLogEvents"
    ],
    "Resource": "arn:aws:logs:*:*:*"
  }]
}

The managed policy AWSLambdaBasicExecutionRole grants exactly these CloudWatch Logs permissions and is the right starting point. Add only what the function actually uses — dynamodb:PutItem on one table, s3:GetObject on one bucket — following least privilege. For VPC functions you also need AWSLambdaVPCAccessExecutionRole (network-interface permissions). The role’s trust policy must allow lambda.amazonaws.com to assume it; the console wires this up automatically when it creates a role for you.

The resource-based policy (who may invoke)

The resource-based policy is attached to the function and names the principals and services allowed to invoke it. When you add a trigger in the console, AWS adds a statement here for you — for example, granting s3.amazonaws.com permission to invoke the function from a specific bucket, or apigateway.amazonaws.com from a specific API. You manage it explicitly with aws lambda add-permission:

aws lambda add-permission \
  --function-name order-processor \
  --statement-id s3invoke \
  --action lambda:InvokeFunction \
  --principal s3.amazonaws.com \
  --source-arn arn:aws:s3:::my-upload-bucket \
  --source-account 111122223333

This is also how cross-account invocation is granted: a statement that allows another account’s principal to call lambda:InvokeFunction. The mental split worth keeping: execution role = what the function can do; resource policy = who can run the function.

VPC access (the Hyperplane story)

By default a Lambda function runs in an AWS-managed network with internet access but no route into your VPC — it cannot reach a private RDS database or an internal service. Attaching the function to a VPC (choosing subnets and security groups) places it on your private network so it can reach those resources.

How this works has changed for the better. Lambda uses Hyperplane ENIs — shared, network-interface infrastructure — so that creating the network plumbing is fast and ENIs are shared across invocations rather than created per concurrent execution. The old pain (slow per-invocation ENI creation, ENI exhaustion under scale) is gone; VPC attachment no longer meaningfully worsens cold starts.

The settings and the classic gotcha:

| Setting | What it does | Gotcha |
| --- | --- | --- |
| Subnets | Which private subnets the function’s ENIs live in | Choose subnets in multiple AZs for resilience |
| Security groups | Firewall rules for the function’s ENIs | The function’s SG must be allowed by the target’s SG (e.g. the database’s inbound rule) |
| Internet egress | A VPC-attached function loses default internet access | To reach the public internet and your VPC, route the subnets’ traffic through a NAT gateway (or use VPC endpoints for AWS services like S3/DynamoDB to avoid NAT entirely) |

The single most common VPC-Lambda mistake: attaching a function to a VPC and then being surprised it can no longer reach the internet or call an AWS API. A VPC-attached Lambda has only the routes its subnets provide — give those subnets a NAT gateway for outbound internet, and use gateway/interface VPC endpoints to reach AWS services privately (cheaper and more secure than routing AWS API traffic through NAT). Only attach to a VPC when you genuinely need to reach private resources; if you do not, leave the function out of the VPC and keep the default internet egress.

Versions and aliases

A function is mutable while you work on it, but production wants immutable, named points you can roll back to. That is what versions and aliases provide.

Aliases also support weighted routing for safe, gradual deploys:

# Publish the new code as a version
aws lambda publish-version --function-name order-processor

# Send 10% of traffic to version 8, 90% still on version 7
aws lambda update-alias \
  --function-name order-processor \
  --name prod \
  --function-version 7 \
  --routing-config AdditionalVersionWeights={"8"=0.1}

This is the foundation of canary / linear deployments (often orchestrated by AWS CodeDeploy or SAM): shift a small percentage to the new version, watch the alarms, then ramp to 100% — or roll back instantly by zeroing the weight. Provisioned concurrency is also configured per version/alias, so point it at the alias your production traffic uses.

Failure handling: DLQs, destinations, and retries

What happens to an event your function cannot process depends on the invocation model, and you must configure the safety net deliberately.

| Mechanism | Applies to | What it does |
| --- | --- | --- |
| Maximum retry attempts (async) | Asynchronous invokes | 0–2 automatic retries (default 2) before the event is failed |
| Maximum event age (async) | Asynchronous invokes | Discard events older than 60 s–6 h still waiting in the internal queue |
| Dead-letter queue (DLQ) | Asynchronous invokes | An SQS queue or SNS topic that receives the event after all retries fail |
| On-failure destination | Async and stream/poll | A richer target (SQS, SNS, EventBridge, or another Lambda) receiving the event plus context (error, response, request ID) |
| On-success destination | Asynchronous invokes | A target that receives successful results; useful for chaining |

Destinations are the modern, preferred mechanism over the older DLQ: they carry more context (the request/response and error details, not just the raw event), support success as well as failure, and offer more target types. For stream sources (Kinesis/DynamoDB Streams) an on-failure destination receives metadata about batches you gave up on after the retry/age limits. The non-negotiable takeaway: asynchronous invocations silently drop failed events unless you configure a DLQ or destination — wire one up for any async function whose events matter.

Lambda anatomy & event sources

The diagram below ties the pieces together: the event sources on the left, the three invocation models, the execution environment with its memory, CPU, /tmp, and role, and the failure paths to DLQs and destinations on the right.

[Figure: AWS Lambda anatomy & event sources]

Trace one path through it. An object lands in S3; S3 invokes the function asynchronously; Lambda creates (or reuses) an execution environment running as the execution role; the handler processes the event; on repeated failure the event flows to the on-failure destination. Now contrast that with API Gateway calling synchronously — no Lambda retries, the error returns to the caller — and SQS being polled by an event source mapping in batches. The same function, three very different control flows.

Hands-on lab

You will create a small function from scratch, invoke it synchronously, change its memory and timeout, publish a version behind an alias, and clean up. Everything here is within the AWS Free Tier — Lambda’s free tier includes 1 million requests and 400,000 GB-seconds per month.

Run these as an administrator (not the root user). Replace the account ID 111122223333 with your own. The lab assumes the AWS CLI v2 is configured.

Step 1 — Create an execution role Lambda can assume.

cat > trust.json <<'EOF'
{ "Version": "2012-10-17",
  "Statement": [{ "Effect": "Allow",
    "Principal": { "Service": "lambda.amazonaws.com" },
    "Action": "sts:AssumeRole" }] }
EOF

aws iam create-role \
  --role-name lab-lambda-role \
  --assume-role-policy-document file://trust.json

aws iam attach-role-policy \
  --role-name lab-lambda-role \
  --policy-arn arn:aws:iam::aws:policy/service-role/AWSLambdaBasicExecutionRole

Step 2 — Write and package a trivial handler.

cat > app.py <<'EOF'
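def lambda_handler(event, context):
    # Fall back to "world" when the event carries no "name" key.
    name = event.get("name", "world")
    return {"message": f"hello, {name}",
            # int() keeps the value numeric even if the runtime reports it as a string.
            "memory_mb": int(context.memory_limit_in_mb),
            "remaining_ms": context.get_remaining_time_in_millis()}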
EOF

zip function.zip app.py
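Before uploading, the handler can be exercised locally. The sketch below repeats the handler logic and calls it with a stand-in context; FakeContext is a hypothetical helper that mimics only the two members the handler reads, and it assumes the memory limit may arrive as a string, hence the int() conversion.

```python
import time


def lambda_handler(event, context):
    # Mirrors the logic of app.py in Step 2.
    name = event.get("name", "world")
    return {"message": f"hello, {name}",
            "memory_mb": int(context.memory_limit_in_mb),
            "remaining_ms": context.get_remaining_time_in_millis()}


class FakeContext:
    """Hypothetical stand-in exposing only what the handler uses."""

    def __init__(self, memory_mb=128, timeout_s=5):
        # Assumption: stored as a string to mimic a value read from the environment.
        self.memory_limit_in_mb = str(memory_mb)
        self._deadline = time.monotonic() + timeout_s

    def get_remaining_time_in_millis(self):
        return max(0, int((self._deadline - time.monotonic()) * 1000))


result = lambda_handler({"name": "Vinod"}, FakeContext())
print(result)
```

This checks the handler's logic only; packaging, the handler string and IAM are still verified by the real invoke in Step 4.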

Step 3 — Create the function.

aws lambda create-function \
  --function-name lab-hello \
  --runtime python3.13 \
  --handler app.lambda_handler \
  --architectures arm64 \
  --memory-size 128 \
  --timeout 5 \
  --role arn:aws:iam::111122223333:role/lab-lambda-role \
  --zip-file fileb://function.zip

If this fails with an error saying the role cannot be assumed by Lambda, the new IAM role has probably not finished propagating; wait a few seconds and run the command again.

Step 4 — Invoke it synchronously and read the result.

aws lambda invoke \
  --function-name lab-hello \
  --payload '{"name":"Vinod"}' \
  --cli-binary-format raw-in-base64-out \
  response.json

cat response.json

Expected output: response.json contains {"message": "hello, Vinod", "memory_mb": 128, "remaining_ms": <~4900>} and the invoke command returns "StatusCode": 200. Note memory_mb reflects your setting and remaining_ms is just under the 5,000 ms timeout — the context object in action.

Step 5 — Change memory and timeout (a post-creation config update).

aws lambda update-function-configuration \
  --function-name lab-hello \
  --memory-size 512 \
  --timeout 10

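# Wait for the update to finish, then re-invoke to confirm memory_mb is now 512
aws lambda wait function-updated --function-name lab-hello

aws lambda invoke \
  --function-name lab-hello \
  --payload '{"name":"Vinod"}' \
  --cli-binary-format raw-in-base64-out \
  response.json

cat response.json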

Step 6 — Publish a version and point a prod alias at it.

VER=$(aws lambda publish-version --function-name lab-hello \
        --query Version --output text)

aws lambda create-alias \
  --function-name lab-hello \
  --name prod \
  --function-version "$VER"

Step 7 — Validation checklist.

Cleanup (so nothing lingers):

aws lambda delete-alias --function-name lab-hello --name prod
aws lambda delete-function --function-name lab-hello
aws iam detach-role-policy --role-name lab-lambda-role \
  --policy-arn arn:aws:iam::aws:policy/service-role/AWSLambdaBasicExecutionRole
aws iam delete-role --role-name lab-lambda-role
rm -f app.py function.zip response.json trust.json

Cost note: at this scale the lab is effectively free — a handful of invocations at 128–512 MB is a rounding error against the 1M-request / 400,000 GB-second monthly free tier. The only ways this could cost money are forgetting provisioned concurrency turned on (it bills for idle warm capacity) or leaving a function attached to a VPC with a NAT gateway running — neither of which this lab creates. Deleting the function and role removes any residual footprint.
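The "rounding error" claim is easy to check with arithmetic. In the sketch below the free-tier figures come from the lesson, while the 20 invocations at 100 ms each are assumed numbers chosen only to make the sum concrete.

```python
FREE_GB_SECONDS = 400_000   # monthly free tier, from the lesson
FREE_REQUESTS = 1_000_000   # monthly free tier, from the lesson


def gb_seconds(memory_mb, duration_ms, invocations):
    """Billed compute: memory in GB x duration in seconds x number of invocations."""
    return (memory_mb / 1024) * (duration_ms / 1000) * invocations


# Assumed lab usage: 20 invocations of 100 ms each at 512 MB.
lab_usage = gb_seconds(512, 100, 20)
share = lab_usage / FREE_GB_SECONDS
print(f"{lab_usage:.2f} GB-seconds used, {share:.6%} of the monthly allowance")

# How many seconds of run time the allowance buys at a given memory size.
seconds_at_128 = FREE_GB_SECONDS / (128 / 1024)
seconds_at_512 = FREE_GB_SECONDS / (512 / 1024)
print(f"{seconds_at_128:,.0f} s at 128 MB, {seconds_at_512:,.0f} s at 512 MB")
```

The same formula shows why the memory setting matters for cost: quadrupling memory quarters the free run time unless the duration drops to compensate.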

Common mistakes & troubleshooting

Symptom | Likely cause | Fix
--- | --- | ---
Unable to import module / handler not found | The handler string doesn’t match the file/function, or a dependency isn’t packaged | Set handler to file.method matching your code; bundle all deps (or use a layer/container)
Async events seem to “disappear” on failure | No DLQ or destination, so failed async events are dropped after 2 retries | Configure an on-failure destination (or a DLQ) for the function
Function is slow at “only” 128 MB | Memory is CPU: 128 MB is a fraction of a vCPU | Raise memory (often cheaper because duration drops); right-size with Power Tuning
Task timed out after N seconds | Timeout too low, or a downstream is hanging | Raise the timeout to a realistic value; add client-side timeouts on downstream calls; check the remaining time on the context object (context.get_remaining_time_in_millis() in Python)
VPC function can’t reach the internet / an AWS API | VPC attachment removed default egress | Route the subnets through a NAT gateway; use VPC endpoints for AWS services
Throttling (429) under load | Hit account or reserved concurrency limit | Raise the account quota; review reserved settings; add reserved concurrency to protect the function
One bad SQS/stream message blocks the batch | Partial failures not reported | Enable ReportBatchItemFailures (partial batch responses); set max retries / record age; bisect on error
S3-triggered function loops forever | The function writes to the same bucket/prefix that triggers it | Scope the trigger’s prefix/suffix; write outputs to a different prefix or bucket
Access Denied calling another AWS service | Execution role lacks the permission | Add the least-privilege action/resource to the execution role
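For the "one bad message blocks the batch" row, a partial batch response for an SQS-triggered function might look like the sketch below. The batchItemFailures / itemIdentifier response shape is the one Lambda expects once ReportBatchItemFailures is enabled on the event source mapping; the process function and its order_id rule are hypothetical business logic.

```python
import json


def process(body):
    """Hypothetical business logic: reject any message without an order_id."""
    if "order_id" not in body:
        raise ValueError("missing order_id")


def lambda_handler(event, context):
    failures = []
    for record in event.get("Records", []):
        try:
            process(json.loads(record["body"]))
        except Exception:
            # Report only this message; the rest of the batch counts as done.
            failures.append({"itemIdentifier": record["messageId"]})
    return {"batchItemFailures": failures}


sample_event = {"Records": [
    {"messageId": "m-1", "body": json.dumps({"order_id": 42})},
    {"messageId": "m-2", "body": json.dumps({"note": "no order id"})},
    {"messageId": "m-3", "body": "not json"},
]}
response = lambda_handler(sample_event, None)
print(response)
```

Returning an empty batchItemFailures list marks the whole batch as successful, so a handler like this must catch every per-record error rather than let one exception fail the invocation.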

Best practices

Security notes

Interview & exam questions

  1. How does memory relate to CPU in Lambda? They are coupled — you cannot set CPU directly; allocating more memory grants a proportional share of vCPU. You cross roughly one full vCPU near 1,769 MB and reach 6 vCPUs at the 10,240 MB ceiling. More memory often runs cheaper because the function finishes faster.

  2. What is the difference between reserved and provisioned concurrency? Reserved is a limit — it guarantees and caps how many environments a function may have, and it is free. Provisioned is pre-warmed capacity — it keeps environments initialised so they skip the cold start, and you pay for it. They are orthogonal and can be combined.

  3. Name the three invocation models and how each handles errors. Synchronous (caller waits; Lambda does not retry; error returns to caller). Asynchronous (immediate 202; Lambda retries twice; failures go to a DLQ/destination if configured, else are dropped). Event source mapping / poll-based (Lambda polls; stream sources retry the batch until success or record expiry by default, and batches that are given up on are reported to an on-failure destination if one is configured; failed SQS messages return to the queue and are retried until they succeed, expire, or are moved to the queue’s own DLQ by its redrive policy).

  4. Why might events sent to a Lambda “disappear”? They were asynchronous invocations that failed all retries with no DLQ or destination configured, so Lambda dropped them. The fix is to attach an on-failure destination (or DLQ).

  5. A function in a VPC can’t reach the internet. Why, and how do you fix it? Attaching to a VPC removes the default managed-network internet egress; the function now only has the routes its subnets provide. Route those subnets through a NAT gateway for internet, and use VPC endpoints to reach AWS services privately.

  6. What is the maximum timeout, and what do you do for longer work? 15 minutes (900 seconds). For longer or multi-step work, use Step Functions, Fargate, or AWS Batch instead of one long Lambda.

  7. What is the difference between the execution role and the resource-based policy? The execution role is what the function can do (its outbound IAM permissions). The resource-based policy is who may invoke the function (inbound), including cross-account principals and AWS services.

  8. How do versions and aliases enable safe deployment? Publishing a version creates an immutable snapshot of code and config. An alias is a movable pointer to a version; production references the alias. Deploy by repointing the alias (with weighted routing for canaries) and roll back by repointing it again.

  9. How do you stop one bad message from blocking an SQS or stream batch? Enable partial batch responses (ReportBatchItemFailures) so only failed records are retried; for streams also set maximum retry attempts and maximum record age and enable bisect batch on function error to isolate the poison record.

  10. What is a cold start, and where is the biggest lever to reduce it? The latency to create a fresh execution environment and run init code. The biggest lever is the init phase: trim package size and move expensive set-up (clients, connections) outside the handler so warm invocations reuse it. Provisioned concurrency or SnapStart address what remains.

  11. What are layers, and what limit do they share with your code? Reusable, immutable, versioned zips of dependencies/runtime mounted at /opt (up to 5 per function). They count against the same 250 MB unzipped package limit as the function code.

  12. How do you set up cross-account invocation of a Lambda function? Add a statement to the function’s resource-based policy (aws lambda add-permission) allowing the other account’s principal to call lambda:InvokeFunction, scoped with --source-arn/--source-account where applicable.
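The init-phase lever from question 10 can be illustrated locally. In this sketch create_client is a hypothetical stand-in for expensive set-up such as building an SDK client or opening a connection; the counter exists only to show that module-level code runs once while the handler runs on every invocation.

```python
import time

INIT_CALLS = 0


def create_client():
    """Hypothetical stand-in for expensive set-up work."""
    global INIT_CALLS
    INIT_CALLS += 1
    time.sleep(0.05)  # pretend this is slow
    return {"ready": True}


# Module scope: executed once when the execution environment initialises,
# then reused by every warm invocation in that environment.
CLIENT = create_client()


def lambda_handler(event, context):
    return {"client_ready": CLIENT["ready"], "init_calls": INIT_CALLS}


first = lambda_handler({}, None)
second = lambda_handler({}, None)
print(first, second)
```

Had create_client() been called inside the handler, every invocation would pay the set-up cost, not only the cold one.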

Quick check

  1. At roughly what memory setting does a function get one full vCPU?
  2. Which invocation model retries automatically twice before failing?
  3. True or false: provisioned concurrency is free to configure.
  4. What single thing must you configure so failed asynchronous events are not silently dropped?
  5. Why can a VPC-attached function lose access to the public internet, and what restores it?

Answers

  1. About 1,769 MB — below that you share less than a full vCPU.
  2. Asynchronous invocation (3 attempts total: the original plus two retries).
  3. False. Reserved concurrency is free; provisioned concurrency bills for the kept-warm capacity whether used or not.
  4. A dead-letter queue or on-failure destination on the function.
  5. VPC attachment removes the default managed-network egress, leaving only the subnets’ routes; a NAT gateway (and/or VPC endpoints for AWS services) restores outbound access.

Exercise

Build a small event-driven image-thumbnail pipeline and exercise the failure paths:

  1. Create a function (arm64, 512 MB, 30 s timeout) triggered by S3 ObjectCreated events filtered to uploads/ and suffix .jpg. Have it read the object and (conceptually) write a thumbnail to a different prefix — explaining why writing back to uploads/ would loop.
  2. Give the execution role least-privilege s3:GetObject on the source prefix and s3:PutObject on the destination prefix only.
  3. Configure an on-failure destination (an SQS queue) and prove it works by uploading a deliberately corrupt file and observing the failed event land in the queue with error context.
  4. Publish a version, create a prod alias, and configure provisioned concurrency = 1 on the alias; confirm a freshly invoked alias has no cold start, then remove provisioned concurrency so you stop paying.
  5. Bonus: enable AWS X-Ray active tracing and read the init-vs-invoke breakdown for a cold versus warm invocation.
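A possible starting point for step 1 is sketched below. It only works out where each thumbnail should be written and skips anything outside uploads/, which is the guard against the trigger loop. The thumbnails/ prefix and the function name thumbnail_targets are assumptions; reading the object and resizing the image are left out.

```python
from urllib.parse import unquote_plus

SOURCE_PREFIX = "uploads/"
DEST_PREFIX = "thumbnails/"   # hypothetical output prefix, outside the trigger scope


def thumbnail_targets(event):
    """Return (bucket, source_key, destination_key) for each eligible S3 record."""
    targets = []
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        # Keys in S3 event notifications are URL-encoded, with spaces sent as "+".
        key = unquote_plus(record["s3"]["object"]["key"])
        # Defensive re-check of the trigger filter: never process our own output.
        if not key.startswith(SOURCE_PREFIX) or not key.lower().endswith(".jpg"):
            continue
        targets.append((bucket, key, DEST_PREFIX + key[len(SOURCE_PREFIX):]))
    return targets


sample_event = {"Records": [
    {"s3": {"bucket": {"name": "lab-images"},
            "object": {"key": "uploads/cat+photo.jpg"}}},
    {"s3": {"bucket": {"name": "lab-images"},
            "object": {"key": "thumbnails/cat+photo.jpg"}}},
]}
print(thumbnail_targets(sample_event))
```

Repeating the prefix check in code is deliberate: if the trigger filter is ever loosened by mistake, the function still refuses to reprocess its own output.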

Certification mapping

Exam | Objective area this supports
--- | ---
DVA-C02 (Developer – Associate) | Development with AWS services: authoring Lambda functions, runtimes/handlers, environment variables, layers, versions/aliases, event source mappings, and concurrency; deployment with canary/linear traffic shifting.
SAA-C03 (Solutions Architect – Associate) | Design cost-optimised and resilient architectures: choosing Lambda for event-driven workloads, the three invocation models, concurrency limits and throttling, VPC integration, and failure handling with DLQs/destinations.
SOA-C02 (SysOps Administrator – Associate) | Monitoring and reliability: CloudWatch metrics/logs for Lambda, concurrency and throttling alarms, and the operational settings (timeout, memory, retries).

Glossary

Next steps

Continue the course with Amazon S3, In Depth: Storage Classes, Versioning, Lifecycle, Encryption & Access Control — fitting, since an object landing in S3 is one of the most common ways a Lambda function gets invoked. Then go deeper on Lambda itself with:


Vinod is a Senior Cloud Architect (22+ yrs) — available for Azure / AWS / GCP architecture, landing zones, and migrations.
