S3 Access Points, Object Lambda, and Multi-Region Access Points for Shared Data at Scale

A shared data lake bucket starts clean and ends as a 20 KB bucket policy that nobody dares to edit. Forty teams, each needing a different prefix, a different VPC restriction, a different account — all crammed into one JSON document with a hard 20 KB ceiling and a single point of failure. S3 Access Points exist precisely to break that document apart: each consumer gets its own named endpoint to the bucket, with its own policy, and the bucket policy shrinks to one line that says “trust access points.” On top of that, S3 Object Lambda lets you transform objects on the read path without copying data, and Multi-Region Access Points (MRAP) give you one global endpoint over replicated buckets. This guide wires all three together the way a platform team actually does it.

The reason this matters in production terms: a shared bucket is a contention point, a blast-radius concentrator, and a correctness hazard all at once. One bad edit to the shared policy breaks every consumer; one team’s prefix wildcard accidentally grants another team’s data; one nightly Glue job that writes a redacted copy doubles your storage and introduces staleness the moment it runs. Access points, Object Lambda, and MRAP each attack one of those failure modes — single-purpose policies kill contention and blast radius, in-flight transforms kill the duplicate copy, and a global endpoint over replicated buckets kills the multi-region routing/divergence problem.

By the end you will stop treating the bucket policy as the place where authorization happens. You will know how to decompose it into N small, auditable access-point policies; when to reach for a VPC-bound access point versus a cross-account one; how to insert a Lambda into the GET path safely (including the Range-request trap that breaks naive transforms); and how to stand up a global active-active endpoint with the SigV4A signing that every first MRAP integration forgets. Every design decision below comes with the limit that constrains it, the error you’ll see when you get it wrong, and the exact aws command to confirm the fix.

What problem this solves

A single S3 bucket policy is one JSON document with a 20 KB hard size limit, edited by one change process, governing every principal that touches the bucket. That works for a private bucket with three IAM roles. It collapses for a shared data lake. The pain shows up in four distinct ways, none of which a bigger policy fixes: the size ceiling, the shared blast radius, centralized change ownership, and per-tenant network conditions piled into one document.

Who hits this: every platform team that owns a shared bucket fronting a data lake, a media library, a partner-export surface, or a multi-tenant SaaS storage tier. It bites hardest the moment you cross account boundaries (cross-account sharing crammed into a bucket policy), the moment PII enters a shared dataset (the duplicate-copy trap), and the moment you go multi-region (routing and divergence). Here is the field this article covers, the question each feature forces, and where you reach first:

| Capability | What it actually does | First question it forces | Where you configure it | Most common first mistake |
| --- | --- | --- | --- | --- |
| Access Point | A named endpoint with its own policy + BPA on one bucket | Should the bucket policy authorize, or just delegate? | aws s3control create-access-point | Forgetting the bucket-policy delegation statement |
| VPC-bound Access Point | An access point reachable only via an interface/gateway endpoint in one VPC | Must this data ever have a public path? | --vpc-configuration at create | No matching VPC endpoint → all requests fail |
| Object Lambda Access Point (OLAP) | A Lambda inserted into the GET path; transforms bytes per request | Is the transform length-preserving (range-safe)? | create-access-point-for-object-lambda | Not handling Range/partNumber headers |
| Multi-Region Access Point | One global endpoint routing to the closest healthy bucket | Is the data actually replicated, or just routed? | create-multi-region-access-point | Routing over divergent (un-replicated) data |
| Cross-account Access Point | An access-point policy granting a principal in another account | Did both sides (AP policy + consumer IAM) allow it? | put-access-point-policy | Granting only one side of the cross-account pair |

Learning objectives

By the end of this article you can:

Prerequisites & where this fits

You should already be comfortable with the S3 fundamentals (storage classes, versioning, lifecycle, and encryption), because access points sit on top of a normal bucket and inherit its versioning and encryption behaviour. You should understand IAM policy evaluation (users, roles, policies, and how Allow/Deny resolve), since the whole access-point model is a layering of resource policies, and you should know the least-privilege and permission-boundary patterns that make per-tenant policies safe. Cross-account sharing assumes familiarity with AWS Organizations, SCPs, and delegated administration. For the VPC-bound path you’ll want the VPC deep dive (subnets, routing, IGW, NAT, and endpoints), and Object Lambda assumes the Lambda deep dive (runtimes, triggers, layers, and concurrency).

This sits in the Storage & data-access track, one layer above plain S3 and one layer below a full S3 data protection and governance at scale program. Think of it as the access-plane design for a shared bucket: the bucket itself (and its replication, encryption, lifecycle) is the data plane; access points, OLAP, and MRAP are how you expose, transform, and globalize that plane without copying it. A quick map of who owns what during a shared-bucket design, so you pull in the right person:

| Layer | What lives here | Who usually owns it | What it can break |
| --- | --- | --- | --- |
| Bucket policy | Delegation statement only | Platform / data-lake team | Every consumer (blast radius) if it authorizes directly |
| Access-point policy | Per-tenant fine-grained grants | The consuming team (delegated) | One tenant’s access only |
| BPA (account + AP) | Public-access guardrails | Security / platform | Re-exposure if loosened (it can’t be, by design) |
| VPC + endpoints | Interface/gateway endpoint, endpoint policy | Network team | VPC-bound AP unreachable if endpoint missing |
| Lambda transform | The OLAP function + its role | App/data team | Wrong/empty bytes; latency on every GET |
| Replication + MRAP | CRR rules, RTC, routing dials | Platform / SRE | Divergent data served as “one” global object |

Core concepts

Five mental models make every later decision obvious.

An access point is a named door, not a copy. An S3 access point is a named network endpoint attached to a single bucket, with its own resource policy, its own Block Public Access (BPA) settings, and optionally a VPC restriction. It is not a copy of the data and not a new storage location; it is an alternate front door into the same objects, with its own lock. Requests through it still reach the bucket, so the bucket owner delegates to access points with one statement and the access-point policy does the fine-grained work.

The bucket policy becomes a delegation document, not an authorization document. This is the single biggest shift. Instead of “allow role/finance-etl to s3:GetObject on finance/*,” the bucket policy says “allow any request that arrived via an access point owned by this account” (s3:DataAccessPointAccount). The authorization moves into N small access-point policies you can actually reason about. You go from one unauditable monolith to many single-responsibility policies.

Block Public Access composes as the most-restrictive of the layers. Every access point carries its own BPA, and the effective setting is the most restrictive of the access-point setting and the account/bucket setting. You can never use an access point to loosen public access the account locked down — if the account blocks public policies, no access point can re-expose the bucket. This is a one-way ratchet by design, and it’s why VPC-bound access points are the strongest network control S3 offers for a shared bucket.

Object Lambda transforms on the read path, so you keep exactly one authoritative copy. When a client GETs through an Object Lambda Access Point, S3 invokes your Lambda, hands it a pre-signed URL to the original object, and your function returns transformed bytes via WriteGetObjectResponse — without writing a derived copy back to S3. Redaction, PII masking, row filtering, watermarking, format conversion all happen per request, based on who is asking. The cost is a Lambda invocation per GET and the latency it adds.

MRAP routes; replication copies — they are different jobs. A Multi-Region Access Point is one global hostname that routes each request to the lowest-latency healthy underlying bucket using AWS Global Accelerator anycast under the hood. It does not move data between regions. “The object exists in the other region” is your responsibility via Cross-Region Replication (CRR). An MRAP over un-replicated buckets is latency routing over divergent data — a correctness bug waiting to happen. And because a single global request can be served from any region, it cannot be signed with region-scoped SigV4; it requires SigV4A.

The vocabulary in one table

Pin down every moving part before the deep sections. The glossary repeats these for lookup; this is the model side by side:

| Term | One-line definition | Where it lives | Why it matters |
| --- | --- | --- | --- |
| Access point (AP) | Named endpoint + policy + BPA on one bucket | Account, region of the bucket | Decomposes the bucket policy |
| Supporting access point | A plain AP that an OLAP points at | On the bucket | OLAP’s data source; must be the full ARN |
| Object Lambda AP (OLAP) | Lambda inserted into the GET path | References a supporting AP | Transform on read, no copy |
| MRAP | One global endpoint over multi-region buckets | Global (no region segment) | Latency routing + failover |
| CRR | Cross-Region Replication between buckets | Replication config on buckets | Keeps MRAP data consistent |
| RTC | Replication Time Control (15-min SLA) | CRR rule option | Bounded replication lag |
| s3:DataAccessPointAccount | Condition key: “request came via an AP in this account” | Bucket policy condition | The delegation statement |
| s3:DataAccessPointArn | Condition key: scope to specific AP ARNs | Bucket policy condition | Tighter delegation |
| BPA | Block Public Access (4 flags) | Account + each AP | Most-restrictive wins |
| SigV4A | Multi-region request signing variant | Client/SDK signer | Mandatory for MRAP |
| WriteGetObjectResponse | API the OLAP Lambda calls to return bytes | Lambda code + IAM | How transformed bytes reach the caller |
| VPC configuration | Binds an AP to one VPC | AP at create time | No public path, ever |

And the access-point flavors side by side (three regional ones plus the MRAP). Pick by what job you need the door to do:

| Flavor | Primary job | Adds over a bucket | Signing | Reach for it when |
| --- | --- | --- | --- | --- |
| Standard AP | Per-tenant policy + BPA | Decomposed authorization | SigV4 | Many teams/prefixes on one bucket |
| VPC-bound AP | No public path | Network isolation | SigV4 | Data must never be internet-reachable |
| Object Lambda AP | Transform on read | In-flight redaction/convert | SigV4 | Two views of one object (raw vs masked) |
| MRAP | Global routing + failover | Multi-region single endpoint | SigV4A | Active-active low-latency reads |

The mental model: an access point is a named door, not a copy

An S3 access point is a named network endpoint attached to a single bucket, with its own resource policy, its own Block Public Access settings, and optionally a VPC restriction. It is not a copy of the data and it is not a new storage location; it is an alternate front door into the same objects, with its own lock.

Three facts drive every design decision below:

Internalize this: the bucket policy becomes a delegation document (“allow access via my access points”), and the access point policies become the authorization documents. You move from one unauditable monolith to many small, single-responsibility policies you can actually reason about.

Access points use a distinct ARN shape and a distinct hostname, so application code addresses the access point, not the bucket:

arn:aws:s3:us-east-1:111122223333:accesspoint/finance-reports-ap
https://finance-reports-ap-111122223333.s3-accesspoint.us-east-1.amazonaws.com

The ARN and hostname forms differ across the three access-point types — using the wrong one is the most common “it worked in the console but not in code” error. Keep this table open while you write client code:

| Access-point type | ARN form | Hostname / how clients address it | Region segment? |
| --- | --- | --- | --- |
| Standard AP | arn:aws:s3:<region>:<acct>:accesspoint/<name> | <name>-<acct>.s3-accesspoint.<region>.amazonaws.com | Yes |
| AP object ref (in policy) | …:accesspoint/<name>/object/<key-or-prefix> | n/a (policy Resource) | Yes |
| Object Lambda AP | arn:aws:s3-object-lambda:<region>:<acct>:accesspoint/<name> | pass the ARN as --bucket to get-object | Yes |
| MRAP | arn:aws:s3::<acct>:accesspoint/<alias> (the alias has the form <id>.mrap) | <alias>.accesspoint.s3-global.amazonaws.com | No (global) |
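The forms in the table above can be captured in a few string helpers so client code never hand-assembles an ARN. This is a minimal sketch: the function names are hypothetical, and it assumes the MRAP alias already carries its `.mrap` suffix, as the identifier table in this article describes.

```python
def access_point_arn(region: str, account_id: str, name: str) -> str:
    """ARN of a standard (regional) access point."""
    return f"arn:aws:s3:{region}:{account_id}:accesspoint/{name}"


def access_point_hostname(region: str, account_id: str, name: str) -> str:
    """Hostname clients use to reach a standard access point."""
    return f"{name}-{account_id}.s3-accesspoint.{region}.amazonaws.com"


def object_lambda_arn(region: str, account_id: str, name: str) -> str:
    """ARN of an Object Lambda Access Point (note the different service segment)."""
    return f"arn:aws:s3-object-lambda:{region}:{account_id}:accesspoint/{name}"


def mrap_arn(account_id: str, alias: str) -> str:
    """ARN of an MRAP: no region segment. Assumes alias looks like '<id>.mrap'."""
    return f"arn:aws:s3::{account_id}:accesspoint/{alias}"


def mrap_hostname(alias: str) -> str:
    """Global hostname of an MRAP. Assumes alias looks like '<id>.mrap'."""
    return f"{alias}.accesspoint.s3-global.amazonaws.com"


print(access_point_arn("us-east-1", "111122223333", "finance-reports-ap"))
print(access_point_hostname("us-east-1", "111122223333", "finance-reports-ap"))
```

The two printed values match the example ARN and hostname shown earlier in this section.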

And the naming/ARN identifiers you’ll juggle, with their constraints — names are not arbitrary, and collisions are scoped by account:

| Identifier | Format / rule | Constraint | Collision scope |
| --- | --- | --- | --- |
| AP name | lowercase, hyphens, 3–50 chars | No underscores, no uppercase | Per account + region |
| AP alias | auto-generated, S3-style | Read-only; used in some hostnames | Globally unique |
| MRAP name | 3–50 chars, lowercase | Cannot be reused after delete for a while | Per account |
| MRAP alias | auto-generated <id>.mrap | Immutable; this is what clients use | Globally unique |
| OLAP name | lowercase, hyphens | Same rules as AP | Per account + region |
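A name check in CI catches a bad access-point name before the create call fails. The sketch below encodes the rules from the table (3 to 50 characters, lowercase letters, digits, and hyphens, no underscores). The extra rule that a name may not start or end with a hyphen is an assumption added here, and `is_valid_ap_name` is a hypothetical helper, not part of any AWS SDK.

```python
import re

# 3-50 chars total: one leading alphanumeric, 1-48 middle chars, one trailing alphanumeric.
# Assumption: names may not begin or end with a hyphen.
_AP_NAME = re.compile(r"[a-z0-9][a-z0-9-]{1,48}[a-z0-9]")


def is_valid_ap_name(name: str) -> bool:
    """Return True if the name satisfies the access-point naming rules sketched above."""
    return _AP_NAME.fullmatch(name) is not None


for candidate in ("finance-reports-ap", "Finance_Reports", "ab"):
    print(candidate, is_valid_ap_name(candidate))
```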

Why bucket policies break down at scale

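A single bucket policy has a 20 KB size limit. That sounds generous until you have dozens of consumers, each needing a Condition block for their VPC, their aws:PrincipalOrgID, their prefix, and their allowed actions. You also hit three operational problems that have nothing to do with size: one bad edit can break every consumer, a central team has to gatekeep every change, and per-tenant network conditions pile up in a single document.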

Access points solve all four: separate policies (separate size budgets), per-access-point blast radius, independent change ownership, and per-access-point VPC binding. The bucket policy collapses to a delegation statement:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "DelegateToAccessPoints",
      "Effect": "Allow",
      "Principal": { "AWS": "arn:aws:iam::111122223333:root" },
      "Action": "s3:*",
      "Resource": [
        "arn:aws:s3:::datalake-shared-prod",
        "arn:aws:s3:::datalake-shared-prod/*"
      ],
      "Condition": {
        "StringEquals": { "s3:DataAccessPointAccount": "111122223333" }
      }
    }
  ]
}

That s3:DataAccessPointAccount condition is the key: it says “permit any request that arrived via an access point owned by this account.” The bucket stops making fine-grained decisions and lets the access points do it. (You can also use s3:DataAccessPointArn to scope to specific access points.)
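To keep the delegation statement reproducible, it can be generated rather than hand-edited, with a size check alongside it. The following is a sketch, not an official tool: `delegation_policy` and `policy_size_bytes` are hypothetical names, and the size is measured as compact UTF-8 JSON, which is only an approximation of how the service counts the limit.

```python
import json

POLICY_LIMIT_BYTES = 20 * 1024  # the 20 KB ceiling discussed above


def delegation_policy(bucket: str, account_id: str) -> dict:
    """Bucket policy that only delegates to access points owned by one account."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DelegateToAccessPoints",
                "Effect": "Allow",
                "Principal": {"AWS": f"arn:aws:iam::{account_id}:root"},
                "Action": "s3:*",
                "Resource": [
                    f"arn:aws:s3:::{bucket}",
                    f"arn:aws:s3:::{bucket}/*",
                ],
                "Condition": {
                    "StringEquals": {"s3:DataAccessPointAccount": account_id}
                },
            }
        ],
    }


def policy_size_bytes(policy: dict) -> int:
    """Approximate policy size: compact JSON, UTF-8 encoded."""
    return len(json.dumps(policy, separators=(",", ":")).encode("utf-8"))


policy = delegation_policy("datalake-shared-prod", "111122223333")
print(policy_size_bytes(policy), "of", POLICY_LIMIT_BYTES, "bytes used")
```

The same size check is useful on each access-point policy, since every one of them has its own budget.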

Here is the honest before/after — the monolith versus the decomposed model — across the dimensions that actually bite a platform team:

| Dimension | Monolithic bucket policy | Access-point model |
| --- | --- | --- |
| Size budget | One 20 KB document for everyone | 20 KB per access point, plus a tiny bucket policy |
| Blast radius | One edit can break all consumers | One AP policy affects one consumer |
| Change ownership | Central team gatekeeps every edit | Each team owns its AP policy (delegated) |
| Per-tenant network | Conditions piled into one doc | VpcConfiguration per access point |
| Auditability | One giant doc, hard to reason about | N single-purpose docs, each auditable |
| Revocation | Edit and hope you didn’t break others | Delete the access point for a clean cut |
| Cross-account | Crammed in with Principal + conditions | Lives in one AP policy, bucket untouched |

The two delegation condition keys are not interchangeable — pick by how much you trust the access-point layer:

| Condition key | What it permits | Use when | Risk if misused |
| --- | --- | --- | --- |
| s3:DataAccessPointAccount | Any AP owned by the named account | You trust every AP in the account equally | A rogue AP in the account inherits trust |
| s3:DataAccessPointArn | Only the listed AP ARNs (supports wildcards) | You want to allow-list specific access points | Easy to forget to add a new AP → access denied |
| Both, combined | Account and ARN pattern must match | Tightest: account-scoped + ARN-scoped | More to maintain |
| Neither (direct grants) | Classic per-principal authorization | You are not using access points | The monolith problem returns |
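A small local model helps when reviewing which access points a delegation statement would trust. This sketch is only a reasoning aid and does not reproduce IAM's evaluation engine: `delegation_trusts` is a hypothetical function, and `fnmatchcase` stands in for the wildcard matching of an ARN pattern (it additionally treats `[` and `]` specially, which IAM does not).

```python
from fnmatch import fnmatchcase
from typing import Optional, Sequence


def arn_account(ap_arn: str) -> str:
    """Extract the account ID (fifth colon-separated field) from an ARN."""
    return ap_arn.split(":")[4]


def delegation_trusts(
    ap_arn: str,
    trusted_account: Optional[str] = None,
    arn_patterns: Sequence[str] = (),
) -> bool:
    """Model of the delegation conditions: every condition that is set must match."""
    if trusted_account is None and not arn_patterns:
        return False  # no delegation condition at all
    if trusted_account is not None and arn_account(ap_arn) != trusted_account:
        return False
    if arn_patterns and not any(fnmatchcase(ap_arn, p) for p in arn_patterns):
        return False
    return True


finance = "arn:aws:s3:us-east-1:111122223333:accesspoint/finance-reports-ap"
partner = "arn:aws:s3:us-east-1:111122223333:accesspoint/partner-share-ap"
pattern = "arn:aws:s3:us-east-1:111122223333:accesspoint/finance-*"

print(delegation_trusts(finance, "111122223333"))             # account-scoped
print(delegation_trusts(partner, "111122223333", [pattern]))  # account + ARN-scoped
```

With only the account condition, both access points are trusted; adding the ARN pattern narrows trust to the finance access point, which is the trade-off the table describes.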

Creating access points with scoped policies

Create an access point per application. The first one is internet-routable but still gated by its policy and BPA; you scope it down with the policy:

aws s3control create-access-point \
  --account-id 111122223333 \
  --name finance-reports-ap \
  --bucket datalake-shared-prod

Now attach a policy that confines this access point to one prefix and one set of actions. Note the access point ARN in Resource and the /object/ segment used to reference objects through the access point:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "FinanceReadWritePrefix",
      "Effect": "Allow",
      "Principal": {
        "AWS": "arn:aws:iam::111122223333:role/finance-etl"
      },
      "Action": ["s3:GetObject", "s3:PutObject"],
      "Resource": "arn:aws:s3:us-east-1:111122223333:accesspoint/finance-reports-ap/object/finance/*"
    }
  ]
}

Apply it:

aws s3control put-access-point-policy \
  --account-id 111122223333 \
  --name finance-reports-ap \
  --policy file://finance-ap-policy.json
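With many tenants, the per-access-point policy is usually templated rather than written by hand. A minimal sketch of such a generator follows; `access_point_policy` and its parameters are hypothetical, and the output mirrors the finance policy above so it can be written to `finance-ap-policy.json`.

```python
import json
from typing import Sequence


def access_point_policy(
    region: str,
    account_id: str,
    ap_name: str,
    principal_arn: str,
    prefix: str,
    actions: Sequence[str],
    sid: str = "TenantPrefixAccess",
) -> dict:
    """Build an access-point policy confined to one principal, one prefix, one action set."""
    resource = (
        f"arn:aws:s3:{region}:{account_id}:accesspoint/{ap_name}"
        f"/object/{prefix.strip('/')}/*"
    )
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": sid,
                "Effect": "Allow",
                "Principal": {"AWS": principal_arn},
                "Action": list(actions),
                "Resource": resource,
            }
        ],
    }


finance_policy = access_point_policy(
    region="us-east-1",
    account_id="111122223333",
    ap_name="finance-reports-ap",
    principal_arn="arn:aws:iam::111122223333:role/finance-etl",
    prefix="finance",
    actions=["s3:GetObject", "s3:PutObject"],
    sid="FinanceReadWritePrefix",
)
print(json.dumps(finance_policy, indent=2))
```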

The create-access-point call accepts a focused set of parameters; knowing the default and the gotcha for each removes the guesswork:

| Parameter | What it sets | Default | When to set it | Gotcha |
| --- | --- | --- | --- | --- |
| --name | The access-point name (→ hostname) | required | always | Immutable; lowercase, no underscores |
| --bucket | The bucket it fronts | required | always | One bucket per AP; cannot be changed |
| --bucket-account-id | Owner account of the bucket | the caller | cross-account bucket | Needed when AP and bucket are in different accounts |
| --vpc-configuration | Binds the AP to one VPC | none (internet) | private-only data | Requests from outside the VPC are rejected once set; needs a VPC endpoint |
| --public-access-block-configuration | The four BPA flags on this AP | all true | rarely loosen | Cannot loosen below account BPA |

The four BPA flags, what each blocks, and why you set all of them unless audited otherwise:

| BPA flag | Blocks | Default on AP | When you’d ever set false |
| --- | --- | --- | --- |
| BlockPublicAcls | New public ACLs on PUT | true | Legacy ACL-based workflow (avoid) |
| IgnorePublicAcls | Honoring existing public ACLs | true | Almost never |
| BlockPublicPolicy | Public bucket/AP policy statements | true | Never on a shared data lake |
| RestrictPublicBuckets | Cross-account/anonymous via public policy | true | Never on a shared data lake |

VPC-only access points

For an access point that must never be reachable from the internet, bind it to a VPC at creation. This is the single strongest network control S3 offers for a shared bucket: S3 rejects any request to the access point that does not originate from that VPC.

aws s3control create-access-point \
  --account-id 111122223333 \
  --name analytics-internal-ap \
  --bucket datalake-shared-prod \
  --vpc-configuration VpcId=vpc-0abc123def456 \
  --public-access-block-configuration \
    BlockPublicAcls=true,IgnorePublicAcls=true,BlockPublicPolicy=true,RestrictPublicBuckets=true

A VPC-bound access point is reachable only through an S3 interface endpoint (or gateway endpoint) in that VPC. Combine it with an endpoint policy and you have a closed loop: traffic stays on the AWS network, and the access point rejects anything from outside vpc-0abc123def456.

The network-origin matrix — what each access-point shape will and won’t answer — is the difference between “locked down” and “accidentally public”:

| Access-point shape | Reachable from internet? | Reachable from VPC (endpoint)? | Reachable cross-account? | Strongest guarantee |
| --- | --- | --- | --- | --- |
| Standard AP (policy-gated) | Yes, if policy allows | Yes (via endpoint) | Yes, if AP policy + IAM allow | Policy is the only gate |
| VPC-bound AP | No (rejected outside the VPC) | Yes (only that VPC) | Only from that VPC’s account context | No public path, ever |
| OLAP (on a supporting AP) | Yes, if policy allows | Yes | Yes | Transform always runs |
| MRAP | Yes (global anycast) | Yes (via endpoint, regional) | Yes, if policies allow | Latency routing + failover |

You can also pair the VPC-bound access point with an S3 gateway or interface endpoint policy for defense in depth. The two endpoint types differ in ways that matter for access points:

| Endpoint type | Cost | Used by | Works with VPC-bound AP | Note |
| --- | --- | --- | --- | --- |
| Gateway endpoint | Free | Route-table prefix list | Yes | No private DNS; same-region only |
| Interface endpoint (PrivateLink) | Hourly + per-GB | ENI + private DNS | Yes | Private IP; cross-region capable |
| No endpoint | n/a | Public internet | No (AP won’t answer) | VPC-bound AP requires an endpoint |

Delegated cross-account access

Access points shine for cross-account sharing because the access point policy can grant to a principal in another account, and that consumer addresses the access point ARN directly — they never see your bucket name. The owning account still controls everything via the access point policy plus the delegating bucket policy.

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "PartnerReadOnly",
      "Effect": "Allow",
      "Principal": { "AWS": "arn:aws:iam::444455556666:role/partner-ingest" },
      "Action": "s3:GetObject",
      "Resource": "arn:aws:s3:us-east-1:111122223333:accesspoint/partner-share-ap/object/exports/*",
      "Condition": {
        "StringEquals": { "aws:PrincipalOrgID": "o-exampleorgid" }
      }
    }
  ]
}

The cross-account principal also needs a matching Allow in its own IAM policy — cross-account always requires both sides. But critically, the bucket policy stays untouched; all the partner-specific logic lives in one small access point policy you can revoke by deleting the access point.

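Cross-account access fails in predictable ways, and the error rarely tells you which side is at fault. This is the four-cornered checklist: the first three must all align or the consumer gets AccessDenied, and the fourth is a guard you should add anyway: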

| Side | What must allow | Owned by | Symptom if missing |
| --- | --- | --- | --- |
| Bucket policy (delegation) | s3:DataAccessPointAccount/Arn for the AP | Producer | AccessDenied even with a perfect AP policy |
| Access-point policy | The consumer principal + action + /object/ resource | Producer | AccessDenied; AP didn’t grant |
| Consumer IAM policy | s3:GetObject on the AP ARN (or *) | Consumer | AccessDenied; consumer side never allowed it |
| Org/condition guard | aws:PrincipalOrgID (optional, recommended) | Producer | Over-broad grant if omitted; confused-deputy risk |
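The checklist can be turned into a tiny diagnostic that names the missing corner instead of leaving you with a bare AccessDenied. This is an illustrative sketch only: `CrossAccountSetup` and `diagnose` are hypothetical, and the booleans are facts you gather by reading the three policies yourself, not something the code discovers.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class CrossAccountSetup:
    bucket_delegates: bool         # bucket policy has the access-point delegation statement
    ap_policy_allows: bool         # access-point policy grants the consumer principal
    consumer_iam_allows: bool      # consumer's own IAM policy allows the action on the AP ARN
    org_guard_present: bool = False  # aws:PrincipalOrgID condition (optional, recommended)


def diagnose(setup: CrossAccountSetup) -> Tuple[bool, List[str]]:
    """Return (access_expected, findings) for the four-cornered checklist."""
    findings = []
    if not setup.bucket_delegates:
        findings.append("bucket policy: missing delegation statement (producer)")
    if not setup.ap_policy_allows:
        findings.append("access-point policy: consumer principal not granted (producer)")
    if not setup.consumer_iam_allows:
        findings.append("consumer IAM: no Allow on the access-point ARN (consumer)")
    allowed = not findings  # the three required sides all align
    if not setup.org_guard_present:
        findings.append("warning: no aws:PrincipalOrgID guard (over-broad grant risk)")
    return allowed, findings


ok, notes = diagnose(CrossAccountSetup(True, True, False))
print(ok)
for note in notes:
    print("-", note)
```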

Block Public Access inheritance and naming patterns

Every access point carries its own BPA configuration, and it is the most restrictive of the access point setting and the account/bucket setting that wins. You cannot use an access point to loosen public-access controls that the account-level BPA has locked down. If the account blocks public policies, no access point can re-expose the bucket. Set all four flags on every access point unless you have an explicit, audited reason not to:

# BPA is set at create time; inspect it:
aws s3control get-access-point \
  --account-id 111122223333 \
  --name finance-reports-ap \
  --query 'PublicAccessBlockConfiguration'

For shared datasets, adopt a naming convention that encodes ownership and intent, because the hostname is derived from the name. A consistent scheme — {team}-{dataset}-{rw|ro}-ap — keeps endpoints self-documenting and makes IAM Resource wildcards predictable:

| Access point name | Purpose | Network |
| --- | --- | --- |
| finance-reports-rw-ap | Finance ETL read/write | Internet + policy |
| analytics-events-ro-ap | Analytics read-only | VPC-only |
| partner-share-ro-ap | Cross-account export | Internet + OrgID |

The ARN and hostname both embed the account ID, which is why two accounts can have an access point named reports-ap over different buckets without collision.

A naming convention is not cosmetic — each token does load-bearing work for IAM wildcards and operational clarity:

| Token | Encodes | Example | Why it pays off |
| --- | --- | --- | --- |
| {team} | Owning team | finance | Group AP policies by team; predictable wildcards |
| {dataset} | What data | reports | Self-documenting endpoint; maps to a prefix |
| {rw\|ro} | Intended access | ro | At-a-glance least-privilege intent |
| -ap suffix | Resource type | -ap / -olap / -mrap | Distinguish AP vs OLAP vs MRAP in logs |
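A convention only pays off if it is enforced, so it helps to build and parse names through one helper. The sketch below implements the `{team}-{dataset}-{rw|ro}-ap` scheme. `build_name` and `parse_name` are hypothetical helpers, and the parser assumes the team token itself contains no hyphen (any extra hyphens are attributed to the dataset).

```python
from typing import Tuple

SUFFIX = "-ap"
MODES = ("rw", "ro")


def build_name(team: str, dataset: str, mode: str) -> str:
    """Compose an access-point name following {team}-{dataset}-{rw|ro}-ap."""
    if mode not in MODES:
        raise ValueError(f"mode must be one of {MODES}, got {mode!r}")
    return f"{team}-{dataset}-{mode}{SUFFIX}"


def parse_name(name: str) -> Tuple[str, str, str]:
    """Split a conforming name back into (team, dataset, mode)."""
    if not name.endswith(SUFFIX):
        raise ValueError(f"{name!r} does not end with {SUFFIX!r}")
    stem = name[: -len(SUFFIX)]
    rest, _, mode = stem.rpartition("-")
    team, _, dataset = rest.partition("-")
    if mode not in MODES or not team or not dataset:
        raise ValueError(f"{name!r} does not follow the naming convention")
    return team, dataset, mode


print(build_name("finance", "reports", "rw"))
print(parse_name("analytics-events-ro-ap"))
```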

The effective-BPA composition is worth a truth table, because “I set the AP to allow X” does not mean X is allowed — the account can still block it:

| Account BPA | Access-point BPA | Effective behavior | Can the AP re-expose? |
| --- | --- | --- | --- |
| All blocked | All blocked | Fully private | No |
| All blocked | Loosened (false) | Still fully private (account wins) | No |
| Loosened | All blocked | Private for this AP | No (AP is stricter) |
| Loosened | Loosened | Public policy possible (audit hard) | Yes; avoid on shared data |
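The truth table reduces to one rule: a flag is effectively on if either layer turns it on. A minimal sketch of that composition, where `effective_bpa` is a hypothetical helper that models the rule locally rather than querying AWS:

```python
from typing import Dict

FLAGS = (
    "BlockPublicAcls",
    "IgnorePublicAcls",
    "BlockPublicPolicy",
    "RestrictPublicBuckets",
)


def effective_bpa(account: Dict[str, bool], access_point: Dict[str, bool]) -> Dict[str, bool]:
    """Most-restrictive-wins: a flag blocks if the account OR the access point sets it."""
    return {flag: account[flag] or access_point[flag] for flag in FLAGS}


all_blocked = {flag: True for flag in FLAGS}
loosened = {flag: False for flag in FLAGS}

# Row 2 of the table: account blocks everything, access point tries to loosen.
print(effective_bpa(all_blocked, loosened))
```

Only the last row of the table, where both layers are loosened, yields a result with any flag set to false.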

S3 Object Lambda: transform on the read path

Object Lambda inserts your own Lambda function into the GET path. When a client reads an object through an Object Lambda Access Point, S3 invokes your function and hands it a pre-signed URL to the original object; your function fetches it and returns the transformed bytes to the caller without writing a derived copy back to S3. This is the right tool for redaction, PII masking, row-level filtering, watermarking, and format conversion, because you keep exactly one authoritative copy and transform per request based on who is asking.

The topology has three layers:

  1. The bucket holds the authoritative object.
  2. A supporting access point (a normal access point) sits on the bucket.
  3. The Object Lambda Access Point points at that supporting access point and names the Lambda transform.

The Lambda receives an event with a pre-signed inputS3Url. It fetches the original, transforms it, and calls WriteGetObjectResponse to stream the result back. Here is a correct PII-masking transform in Python:

import boto3
import re
import urllib3

s3 = boto3.client("s3")
http = urllib3.PoolManager()

# Mask anything that looks like a US SSN.
SSN = re.compile(rb"\b\d{3}-\d{2}-\d{4}\b")

def handler(event, context):
    ctx = event["getObjectContext"]
    # S3 hands us a pre-signed URL to the ORIGINAL object.
    resp = http.request("GET", ctx["inputS3Url"])
    original = resp.data

    transformed = SSN.sub(b"***-**-****", original)

    # Stream the transformed bytes back to the caller.
    s3.write_get_object_response(
        Body=transformed,
        RequestRoute=ctx["outputRoute"],
        RequestToken=ctx["outputToken"],
    )
    return {"status_code": 200}

The function’s execution role needs s3-object-lambda:WriteGetObjectResponse. Because the inputS3Url is pre-signed by S3 itself, the function does not need separate s3:GetObject on the bucket for the standard fetch path — but grant it if your code makes additional S3 calls (e.g., reading a redaction config object).

The event S3 hands your Lambda is small but every field matters. Confuse outputRoute and outputToken and the response goes nowhere:

| Event field | Type | What it is | Used for |
| --- | --- | --- | --- |
| getObjectContext.inputS3Url | string | Pre-signed URL to the ORIGINAL object | Fetch the source bytes |
| getObjectContext.outputRoute | string | Where to send the transformed response | RequestRoute in WriteGetObjectResponse |
| getObjectContext.outputToken | string | One-time token authorizing the write-back | RequestToken in WriteGetObjectResponse |
| userRequest.url | string | The original request URL | Inspect key/query |
| userRequest.headers | map | Caller’s headers (incl. Range) | Decide range handling |
| userIdentity | object | Who made the request | Per-caller transform logic |
| configuration.payload | string | Static payload from the OLAP config | Pass tunables to the function |
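Pulling these fields out in one place makes it harder to swap the route and the token by accident. The sketch below is one way to do that: `TransformContext` and `parse_event` are hypothetical, and the sample event uses placeholder values shaped after the table rather than a captured real invocation.

```python
from typing import NamedTuple, Optional


class TransformContext(NamedTuple):
    input_url: str               # pre-signed URL to the original object
    route: str                   # goes into RequestRoute
    token: str                   # goes into RequestToken
    range_header: Optional[str]  # caller's Range header, if any
    payload: str                 # static payload from the OLAP configuration


def parse_event(event: dict) -> TransformContext:
    """Extract the fields an Object Lambda transform needs from the invocation event."""
    ctx = event["getObjectContext"]
    headers = event.get("userRequest", {}).get("headers") or {}
    # Header names are case-insensitive, so match on the lowercased name.
    range_header = next((v for k, v in headers.items() if k.lower() == "range"), None)
    payload = event.get("configuration", {}).get("payload", "")
    return TransformContext(
        input_url=ctx["inputS3Url"],
        route=ctx["outputRoute"],
        token=ctx["outputToken"],
        range_header=range_header,
        payload=payload,
    )


# Placeholder event for illustration only.
sample_event = {
    "getObjectContext": {
        "inputS3Url": "https://example.invalid/presigned",
        "outputRoute": "route-placeholder",
        "outputToken": "token-placeholder",
    },
    "userRequest": {"url": "/customers/2026/records.csv", "headers": {"range": "bytes=0-99"}},
    "configuration": {"payload": "{}"},
}
print(parse_event(sample_event))
```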

WriteGetObjectResponse takes more than just a body — these are the parameters that turn a transform into a correct HTTP response:

| Parameter | Required | What it controls | Note |
| --- | --- | --- | --- |
| RequestRoute | yes | Routes the response to the caller | From outputRoute |
| RequestToken | yes | Authorizes this write-back | From outputToken; single-use |
| Body | yes (for 200) | The transformed bytes | Stream for large objects |
| StatusCode | no | HTTP status to return | Use 501/416 to reject ranges |
| ContentType | no | MIME of the result | Set when you change format |
| ContentLength | no | Length of the result | Required by some clients |
| ErrorCode / ErrorMessage | no | Structured error to the caller | For rejections |

The execution role’s least-privilege set is small — grant exactly these, nothing broader:

| Permission | On | Why | When to omit |
| --- | --- | --- | --- |
| s3-object-lambda:WriteGetObjectResponse | the OLAP | Stream bytes back to the caller | Never (required) |
| s3:GetObject | the supporting AP / bucket | Only if you call S3 beyond inputS3Url | Omit if you only use the pre-signed URL |
| s3:GetObject (config object) | a small config key | Read a redaction map / allow-list | Omit if config is in env vars |
| logs:CreateLogGroup, logs:CreateLogStream, logs:PutLogEvents (basic exec role) | CloudWatch Logs | Function logging | Never omit |

Wiring Object Lambda, supporting access points, and Range handling

First create the supporting access point (a plain access point), then the Object Lambda Access Point that references it. The Object Lambda Access Point’s SupportingAccessPoint must be the full ARN of the supporting access point:

# 1. Supporting access point on the bucket.
aws s3control create-access-point \
  --account-id 111122223333 \
  --name pii-supporting-ap \
  --bucket datalake-shared-prod

# 2. Object Lambda Access Point referencing it.
aws s3control create-access-point-for-object-lambda \
  --account-id 111122223333 \
  --name pii-redacted-olap \
  --configuration '{
    "SupportingAccessPoint": "arn:aws:s3:us-east-1:111122223333:accesspoint/pii-supporting-ap",
    "TransformationConfigurations": [
      {
        "Actions": ["GetObject"],
        "ContentTransformation": {
          "AwsLambda": {
            "FunctionArn": "arn:aws:lambda:us-east-1:111122223333:function:pii-redactor"
          }
        }
      }
    ]
  }'

Clients then read through the Object Lambda Access Point ARN, and S3 invokes the transform transparently:

aws s3api get-object \
  --bucket arn:aws:s3-object-lambda:us-east-1:111122223333:accesspoint/pii-redacted-olap \
  --key customers/2026/records.csv \
  ./redacted.csv

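The OLAP configuration covers more than plain GetObject. The Actions list selects which API calls invoke the Lambda (GetObject, HeadObject, ListObjects, ListObjectsV2), while Range and partNumber support is opted into separately through the configuration’s AllowedFeatures list. Knowing which of these your client paths use prevents “transform didn’t run” surprises:

| Setting | Kind | When it fires | Common need | If unconfigured |
| --- | --- | --- | --- | --- |
| GetObject | Action | Plain GET | Redaction, conversion | Object returned untransformed |
| GetObject-Range | AllowedFeatures entry | Client sends a Range header | Range-safe slicing | Ranged reads fail, which breaks SDK multipart downloads |
| GetObject-PartNumber | AllowedFeatures entry | Client sends partNumber | Multipart-aware reads | Same as above |
| HeadObject | Action | HEAD request | Adjust Content-Length/metadata | HEAD reflects original, not transformed |
| ListObjects / ListObjectsV2 | Action | Listing through the OLAP | Filter/redact listings | List shows original keys |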
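The JSON passed to `--configuration` can be generated so that the handled actions and range features are explicit and reviewable. This is a sketch under assumptions: `olap_configuration` is a hypothetical helper, and placing `GetObject-Range` and `GetObject-PartNumber` under a top-level `AllowedFeatures` key reflects my reading of the Object Lambda configuration shape, so confirm it against the current API reference before relying on it.

```python
import json
from typing import Sequence


def olap_configuration(
    supporting_ap_arn: str,
    function_arn: str,
    actions: Sequence[str] = ("GetObject",),
    allowed_features: Sequence[str] = (),
) -> dict:
    """Build the configuration document for create-access-point-for-object-lambda."""
    config = {
        "SupportingAccessPoint": supporting_ap_arn,
        "TransformationConfigurations": [
            {
                "Actions": list(actions),
                "ContentTransformation": {"AwsLambda": {"FunctionArn": function_arn}},
            }
        ],
    }
    if allowed_features:
        # Assumption: range/partNumber support is declared here, not in Actions.
        config["AllowedFeatures"] = list(allowed_features)
    return config


config = olap_configuration(
    "arn:aws:s3:us-east-1:111122223333:accesspoint/pii-supporting-ap",
    "arn:aws:lambda:us-east-1:111122223333:function:pii-redactor",
    actions=("GetObject", "HeadObject"),
    allowed_features=("GetObject-Range",),
)
print(json.dumps(config, indent=2))
```

Only enable a range feature once the function behind it is known to be range-safe; the next section explains how to tell.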

Handling Range and partial reads

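This is where naive Object Lambda functions break in production. If a client sends a Range or partNumber header (the AWS SDKs do this constantly for large objects and multipart downloads), your function must handle it. You have two correct options: support the range and return exactly the requested slice of the transformed object, or reject range requests explicitly so the caller knows to issue a full GET.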

A length-preserving transform like fixed-width masking is range-safe; a format conversion (CSV to Parquet, or gzip) is not, because byte offsets in the output no longer map to the input. Know which one you have before you enable range support. A safe rejection looks like this:

    # Header names are case-insensitive, so compare on the lowercased name.
    head = event.get("userRequest", {}).get("headers", {})
    if any(name.lower() == "range" for name in head):
        s3.write_get_object_response(
            StatusCode=501,
            # Error code matches the 501 status: this transform never supports ranges.
            ErrorCode="NotImplemented",
            ErrorMessage="Range requests are not supported by this transform",
            RequestRoute=ctx["outputRoute"],
            RequestToken=ctx["outputToken"],
        )
        return {"status_code": 501}

Decide range strategy by the nature of the transform — this is the decision table to keep next to your function:

| Transform | Changes byte length? | Range-safe? | Strategy | Why |
| --- | --- | --- | --- | --- |
| Fixed-width masking (***-**-****) | No | Yes | Support ranges; transform the slice | Offsets preserved |
| Variable redaction ([REDACTED]) | Yes | No | Reject ranges (501) | Output offsets diverge from input |
| CSV → Parquet | Yes | No | Reject ranges | Whole-object reframe |
| gzip / compression | Yes | No | Reject ranges | Stream is not slice-addressable |
| Watermark image | Maybe | Usually no | Reject ranges | Re-encode changes bytes |
| Row-level filter (drop rows) | Yes | No | Reject ranges | Row boundaries don’t map to byte ranges |
| Header/metadata rewrite only | No | Yes | Support ranges | Body unchanged |

When you reject, pick the status that makes the client do the right thing:

| Status to return | Client behavior | Use for |
| --- | --- | --- |
| 501 Not Implemented | Caller gets an explicit not-supported error and can retry as a full GET | “This transform never supports ranges” |
| 416 Range Not Satisfiable | Client treats range as invalid | The specific requested range is invalid |
| 200 with full body (ignore range) | Client may mis-assemble multipart | Avoid; breaks SDK multipart |
| 206 Partial Content (true range) | Client accepts the slice | Only when genuinely range-safe |
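The two tables above can be folded into one small decision routine. What follows is a minimal sketch, not the article's production function: `parse_range` and `plan_range_response` are hypothetical helper names, only single-range `bytes=` headers are handled, and the whole transformed body is assumed to be in memory (a real function would fetch and transform only the slice it needs).

```python
import re
from typing import Optional, Tuple

_RANGE = re.compile(r"^bytes=(\d*)-(\d*)$")


def parse_range(header: str, size: int) -> Optional[Tuple[int, int]]:
    """Return inclusive (start, end) offsets for a single-range header, or None if invalid."""
    m = _RANGE.match(header.strip())
    if not m or size <= 0:
        return None
    first, last = m.group(1), m.group(2)
    if first == "" and last == "":
        return None
    if first == "":  # suffix form "bytes=-N": the last N bytes
        n = int(last)
        if n == 0:
            return None
        return max(size - n, 0), size - 1
    start = int(first)
    end = size - 1 if last == "" else min(int(last), size - 1)
    if start >= size or start > end:
        return None
    return start, end


def plan_range_response(range_header: Optional[str], transformed: bytes,
                        range_safe: bool) -> Tuple[int, bytes]:
    """Pick the HTTP status and body for a GET that passes through the transform."""
    if range_header is None:
        return 200, transformed  # plain GET: full transformed body
    if not range_safe:
        return 501, b""  # this transform never supports ranges
    span = parse_range(range_header, len(transformed))
    if span is None:
        return 416, b""  # this specific range is invalid
    start, end = span
    return 206, transformed[start:end + 1]
```

The status and body returned here would then be passed to WriteGetObjectResponse. Slicing the transformed output is only valid because a range-safe transform keeps every byte offset where the client expects it.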

Multi-Region Access Points: one global endpoint

A Multi-Region Access Point (MRAP) is a single global endpoint that routes requests to whichever underlying bucket — across multiple Regions — is closest and healthy. You attach buckets in different Regions, wire S3 Cross-Region Replication (CRR) between them for active-active, and clients use one hostname that ends in .accesspoint.s3-global.amazonaws.com. S3 routes each request to the lowest-latency available bucket using latency-based routing built on AWS Global Accelerator under the hood.

Create the MRAP over two regional buckets:

aws s3control create-multi-region-access-point \
  --account-id 111122223333 \
  --details '{
    "Name": "global-assets-mrap",
    "Regions": [
      { "Bucket": "assets-use1" },
      { "Bucket": "assets-euw1" }
    ]
  }'

This is asynchronous; poll the request token until it reports SUCCEEDED, then read the generated alias (the global hostname prefix):

aws s3control list-multi-region-access-points \
  --account-id 111122223333 \
  --query 'AccessPoints[?Name==`global-assets-mrap`].[Name,Alias,Status]'
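Polling the request token is easy to get wrong by hand, so here is a hedged sketch of the loop in Python. It assumes a boto3 `s3control` client (or any object exposing the same `describe_multi_region_access_point_operation` method) and treats SUCCEEDED and FAILED as the terminal request states; `wait_for_mrap_request` and its defaults are illustrative names, not an AWS-provided waiter.

```python
import time
from typing import Any, Callable


def wait_for_mrap_request(client: Any, account_id: str, request_token_arn: str,
                          poll_seconds: float = 30.0, max_polls: int = 80,
                          sleep: Callable[[float], None] = time.sleep) -> str:
    """Poll an asynchronous MRAP request until it reaches a terminal state."""
    for _ in range(max_polls):
        resp = client.describe_multi_region_access_point_operation(
            AccountId=account_id, RequestTokenARN=request_token_arn)
        status = resp["AsyncOperation"]["RequestStatus"]
        if status in ("SUCCEEDED", "FAILED"):  # assumed terminal states
            return status
        sleep(poll_seconds)
    raise TimeoutError(f"MRAP request still in progress after {max_polls} polls")


# Usage sketch (needs boto3 and credentials; shown as comments so the block runs offline):
#   import boto3
#   s3control = boto3.client("s3control", region_name="us-west-2")
#   token = s3control.create_multi_region_access_point(...)["RequestTokenARN"]
#   print(wait_for_mrap_request(s3control, "111122223333", token))
```

Injecting `sleep` keeps the loop testable and makes the roughly 30-minute provisioning window an explicit budget (80 polls at 30 seconds is 40 minutes) rather than an apparent hang.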

For active-active you must configure two-way replication so a write in either Region propagates to the other. Enable replication on both buckets, turn on bidirectional sync (replica modifications and delete-marker replication as your data model requires), and ideally enable S3 Replication Time Control (RTC) for a 15-minute replication SLA. Without CRR, an MRAP is just latency routing over divergent data — which is a correctness bug waiting to happen.

Failover is automatic for read availability: if S3 detects a Regional impairment, it routes around it. But “the object exists in the other Region” is your responsibility via replication. MRAP routes; it does not copy. Replication copies.

An MRAP goes through several states during creation/deletion; polling the wrong way looks like a hang:

| MRAP status | Meaning | What to do | Typical duration |
| --- | --- | --- | --- |
| CREATING | Anycast endpoint being provisioned | Poll; do not use yet | up to ~30 min |
| READY | Endpoint live and routing | Use it | |
| DELETING | Tear-down in progress | Wait before reusing the name | minutes |
| PARTIALLY_CREATED | Some regions failed to attach | Investigate the failed region | |
| FAILED (status of the async request, not of the MRAP itself) | Creation failed | Read the failure reason; recreate | |

The replication options that make an MRAP correct rather than merely routed — each one closes a specific divergence gap:

| Replication option | What it does | Default | Enable when | Cost impact |
| --- | --- | --- | --- | --- |
| CRR rule (one-way) | Source → destination copy | off | Read-only replica region | Per-GB transfer + request |
| Two-way (bidirectional) | Writes in either region propagate | off | Active-active MRAP | Doubles replication traffic |
| Replica modification sync | Replicate metadata/ACL changes on replicas | off | Active-active correctness | Minor |
| Delete-marker replication | Propagate delete markers | off | If deletes must mirror | Minor |
| RTC (Replication Time Control) | 15-min SLA + metrics | off | You need bounded lag | Per-GB RTC fee |
| Replica ownership override | Destination account owns replicas | off | Cross-account replication | None |
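To make those options concrete, here is a sketch of one direction of a two-way rule expressed as the `ReplicationConfiguration` structure that boto3's `put_bucket_replication` accepts. The helper name `replication_rule_config`, the rule ID, and the role ARN in the usage comment are illustrative assumptions; you would apply the same shape twice, once per bucket, with source and destination swapped.

```python
from typing import Any, Dict


def replication_rule_config(role_arn: str, dest_bucket: str,
                            rule_id: str = "mrap-two-way") -> Dict[str, Any]:
    """Build one direction of an active-active replication setup."""
    return {
        "Role": role_arn,
        "Rules": [{
            "ID": rule_id,
            "Priority": 1,
            "Status": "Enabled",
            "Filter": {},  # empty filter: the rule covers the whole bucket
            # Propagate delete markers so deletes mirror across Regions.
            "DeleteMarkerReplication": {"Status": "Enabled"},
            # Replica modification sync: metadata changes made on replicas flow back.
            "SourceSelectionCriteria": {
                "ReplicaModifications": {"Status": "Enabled"},
            },
            "Destination": {
                "Bucket": f"arn:aws:s3:::{dest_bucket}",
                # RTC: the 15-minute bound, which also requires replication metrics.
                "ReplicationTime": {"Status": "Enabled", "Time": {"Minutes": 15}},
                "Metrics": {"Status": "Enabled", "EventThreshold": {"Minutes": 15}},
            },
        }],
    }


# Usage sketch (hypothetical role ARN; versioning must already be enabled on both buckets):
#   s3.put_bucket_replication(
#       Bucket="assets-use1",
#       ReplicationConfiguration=replication_rule_config(
#           "arn:aws:iam::111122223333:role/crr-role", "assets-euw1"))
#   ...then repeat on assets-euw1 with "assets-use1" as the destination.
```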

Request routing, failover, and SigV4A signing

MRAP supports two routing controls. By default it uses latency-based routing across all active Regions. You can also flip a Region’s routing status to drain it (for maintenance or a controlled failover) using the routing-control API:

# --mrap takes the MRAP ARN (alias form), not the friendly name.
aws s3control submit-multi-region-access-point-routes \
  --account-id 111122223333 \
  --mrap arn:aws:s3::111122223333:accesspoint/mfzwi23gnjvgw.mrap \
  --route-updates '[
    { "Bucket": "assets-euw1", "Region": "eu-west-1", "TrafficDialPercentage": 0 },
    { "Bucket": "assets-use1", "Region": "us-east-1", "TrafficDialPercentage": 100 }
  ]'

Setting TrafficDialPercentage to 0 drains a Region without deleting anything — the canonical way to do a planned, reversible failover.
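If you script drains, it helps to generate the route list rather than hand-edit JSON during an incident. A minimal sketch, assuming you keep a Region-to-bucket mapping yourself (`drain_routes` is a hypothetical helper, not an AWS API):

```python
from typing import Dict, List


def drain_routes(buckets_by_region: Dict[str, str], drain_region: str) -> List[dict]:
    """Build route updates that dial one Region to 0 and every other Region to 100."""
    if drain_region not in buckets_by_region:
        raise ValueError(f"{drain_region} is not part of this MRAP")
    if len(buckets_by_region) < 2:
        raise ValueError("draining the only Region would leave nothing to serve traffic")
    return [
        {
            "Bucket": bucket,
            "Region": region,
            "TrafficDialPercentage": 0 if region == drain_region else 100,
        }
        for region, bucket in buckets_by_region.items()
    ]


routes = drain_routes({"us-east-1": "assets-use1", "eu-west-1": "assets-euw1"}, "eu-west-1")
```

The resulting list has the same shape as the `--route-updates` payload above, and reversing the drain is the same call with the other Region named (or a payload that sets every dial back to 100).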

The routing controls and what each is for — confusing “drain” with “delete” is how people cause outages during maintenance:

| Control | API / setting | Effect | Reversible? | Use for |
| --- | --- | --- | --- | --- |
| Latency routing (default) | none | Closest healthy region serves | n/a | Normal active-active |
| TrafficDialPercentage=0 | submit-...-routes | Drain a region (no new traffic) | Yes (dial back up) | Planned maintenance / failover |
| TrafficDialPercentage=100 | submit-...-routes | Full traffic to a region | Yes | Restore after drain |
| Automatic health failover | built-in | Route around an impaired region | n/a (auto) | Region outage |
| Remove a region | Not a routing operation: submit-multi-region-access-point-routes drains, it never deletes. The bucket list is fixed at creation, so delete and recreate the MRAP | Region leaves the MRAP | No | Permanent topology change |

SigV4A is mandatory

This is the detail that trips up every first MRAP integration. Because a single global request can be served from any Region, it cannot be signed with classic SigV4 (which is Region-scoped). MRAP requests must use Signature Version 4A (SigV4A), the multi-Region signing variant. The recent AWS SDKs and CLI v2 support SigV4A, but you typically must enable the CRT (Common Runtime) auth dependency. For the CLI:

# SigV4A for MRAP requires the CRT signing component.
pip install 'botocore[crt]'   # boto3 / pip-installed CLI; or use a CLI v2 build with CRT bundled

# Address the MRAP by its ARN; the SDK selects SigV4A automatically.
aws s3api get-object \
  --bucket arn:aws:s3::111122223333:accesspoint/mfzwi23gnjvgw.mrap \
  --key images/logo.png \
  ./logo.png

Note the MRAP ARN form: arn:aws:s3::<account>:accesspoint/<alias>.mrap — no Region segment, because it is global. If you see SignatureDoesNotMatch on your first MRAP call, the cause is almost always a SigV4A-incapable signer; install the CRT extra and retry.
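A small pre-flight check catches both halves of this problem before the first request leaves your machine. The sketch below is an assumption-laden convenience, not part of any SDK: `mrap_arn`, `is_mrap_arn`, and `crt_available` are hypothetical helpers, the ARN check covers only the standard `aws` partition, and CRT detection simply looks for the `awscrt` Python package.

```python
import importlib.util


def mrap_arn(account_id: str, alias: str) -> str:
    """Build the global (Region-less) ARN for a Multi-Region Access Point alias."""
    if not alias.endswith(".mrap"):
        alias += ".mrap"
    return f"arn:aws:s3::{account_id}:accesspoint/{alias}"


def is_mrap_arn(arn: str) -> bool:
    """True when the ARN has an empty Region segment and an alias ending in .mrap."""
    parts = arn.split(":", 5)
    return (
        len(parts) == 6
        and parts[:3] == ["arn", "aws", "s3"]
        and parts[3] == ""  # no Region: the endpoint is global
        and parts[5].startswith("accesspoint/")
        and parts[5].endswith(".mrap")
    )


def crt_available() -> bool:
    """True when the awscrt package, which provides SigV4A signing, can be imported."""
    return importlib.util.find_spec("awscrt") is not None


arn = mrap_arn("111122223333", "mfzwi23gnjvgw")
if is_mrap_arn(arn) and not crt_available():
    print("MRAP ARN detected but awscrt is missing: install the CRT before calling S3")
```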

SigV4 vs SigV4A side by side — this is the table that explains the failure you’ll inevitably hit once:

| Property | SigV4 (classic) | SigV4A (multi-region) |
| --- | --- | --- |
| Scope | Single region | Multiple regions (*) |
| Used by | Standard S3, regional APs | MRAP |
| Dependency | Built into all SDKs | Requires CRT (e.g. botocore[crt]) |
| Symptom when wrong | n/a | SignatureDoesNotMatch on first MRAP call |
| Region in signature | The bucket’s region | * (any region) |
| How to enable | default | Install CRT; SDK auto-selects for .mrap ARNs |

Architecture at a glance

Trace the system left to right and it tells the whole story. On the left, two consumers approach the same authoritative data through different doors: an internal analytics role arrives over a VPC-bound access point (no public DNS path, reachable only via the S3 interface endpoint in vpc-0abc…), while an external partner arrives at an Object Lambda Access Point over the public edge. The center is the access plane: every door funnels into the bucket’s delegation policy, which authorizes nothing on its own — it simply trusts s3:DataAccessPointAccount. The partner’s path passes through the OLAP, which invokes the PII-redactor Lambda; that function fetches the original via the pre-signed inputS3Url, masks email and device fields, and streams the result back with WriteGetObjectResponse. The internal path goes straight to objects. On the right sits the data tier: the primary bucket datalake-shared-prod in us-east-1, plus — for the global-assets workload — a Multi-Region Access Point that anycasts reads to the closest healthy regional bucket, kept consistent by two-way Cross-Region Replication with RTC.

Five numbered failure points are marked on the hops where they actually bite. (1) is the bucket-policy delegation: forget s3:DataAccessPointAccount and every access point returns AccessDenied. (2) is the VPC-bound access point with no matching endpoint — every request times out. (3) is the OLAP Range trap — a naive transform mis-serves SDK multipart downloads. (4) is the Lambda write-back — a mismatched outputRoute/outputToken and the bytes go nowhere. (5) is the MRAP signing/replication pair — SigV4A missing gives SignatureDoesNotMatch, and un-replicated buckets make the “one global object” diverge. Read the legend as symptom · confirm · fix and you have the diagnostic map next to the architecture.

[Figure: Architecture of S3 shared-data access. Left: an internal analytics role reaching a VPC-bound access point through an S3 interface endpoint, and an external partner reaching an Object Lambda Access Point at the public edge. Center: the bucket delegation policy and the PII-redactor Lambda. Right: the primary datalake-shared-prod bucket in us-east-1 plus a Multi-Region Access Point over regional buckets with two-way Cross-Region Replication and RTC. Five numbered badges mark the failure points described above.]

Real-world scenario

A media analytics platform team — call it PrismFeed — ran a single customer-events-prod bucket shared by 30+ internal teams plus two external data partners. The bucket policy had grown past 18 KB and was within sight of the 20 KB hard limit; the last partner onboarding had failed because the policy would not save. Worse, the same dataset had to be served two ways: internal analysts got raw event records, but a downstream BI partner was contractually forbidden from seeing raw email addresses and device IDs. The team had been solving this by running a nightly Glue job that wrote a second, redacted copy of every object to a partner/ prefix — doubling storage for 400 TB of events and introducing a 24-hour staleness gap that the partner kept complaining about.

The constraint: stay within the policy size limit, eliminate the duplicate redacted copy, and serve both audiences from one authoritative object — while keeping the partner traffic governed and the raw data inside a specific VPC.

They restructured around access points and Object Lambda. The bucket policy was rewritten to a single 400-byte delegation statement using s3:DataAccessPointAccount. Internal teams each got a VPC-bound access point scoped to their prefix. The partner got an Object Lambda Access Point whose transform masked email and device fields on the fly, eliminating the nightly Glue job and the 400 TB duplicate entirely — and the partner now saw live data with zero staleness. The masking function was deliberately length-preserving so range requests stayed safe:

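import re

EMAIL = re.compile(rb'("email":")([^"]*)(")')
DEVICE = re.compile(rb'("device_id":")([^"]*)(")')

def _stars(m: re.Match) -> bytes:
    # Keep the key and quotes; overwrite each value byte with '*' so the length is unchanged.
    return m.group(1) + b"*" * len(m.group(2)) + m.group(3)

def mask(chunk: bytes) -> bytes:
    """Length-preserving mask: the output has exactly as many bytes as the input."""
    return DEVICE.sub(_stars, EMAIL.sub(_stars, chunk))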

The result: the 18 KB policy became one line, partner onboarding stopped being a policy-size gamble, storage for the event data dropped by roughly half (the 400 TB redacted copy was eliminated), and the staleness complaint disappeared because redaction now happened at read time on the single live object. The one real cost they accepted was Lambda invocation on the partner read path, which, for a partner pulling a few thousand objects a day, was a rounding error next to 400 TB of duplicated storage.

The migration as a before/after ledger, because the deltas are the lesson:

| Dimension | Before (monolith + Glue copy) | After (access points + OLAP) | Net effect |
| --- | --- | --- | --- |
| Bucket policy size | ~18 KB, near the 20 KB wall | ~400 bytes (delegation only) | Onboarding no longer a size gamble |
| Partner data freshness | 24 h stale (nightly job) | Live (read-time redaction) | Complaint eliminated |
| Storage footprint | 400 TB raw + 400 TB redacted copy | 400 TB raw only | ~½ of event storage cut |
| Redaction mechanism | Nightly Glue job | OLAP Lambda per GET | No batch pipeline to operate |
| Per-tenant network | Conditions in one policy | VPC-bound AP per team | Clean isolation |
| New cost introduced | Glue compute nightly | Lambda per partner GET | Far smaller; scales with reads |
| Blast radius of a bad edit | All 30+ teams | One access point | Contained |

Advantages and disadvantages

Decomposing a shared bucket into access points, OLAP, and MRAP is overwhelmingly the right move at scale — but each capability carries a cost you should accept with eyes open:

| Advantages | Disadvantages |
| --- | --- |
| Bucket policy collapses to one delegation line; per-tenant policies get their own 20 KB budget | One more layer to reason about: a request now traverses bucket policy and AP policy and consumer IAM |
| Per-access-point blast radius: a bad edit breaks one consumer, not all | More objects to govern (N access points, OLAPs, MRAPs) and to monitor |
| VPC-bound access points give a true no-public-path network control | A VPC-bound AP with no matching endpoint is silently unreachable |
| Object Lambda kills the duplicate-copy pattern: one authoritative object, transformed per request | Lambda invocation + latency on every GET through the OLAP; a hot read path can dominate cost |
| Cross-account sharing lives in one revocable AP policy; bucket stays untouched | Cross-account still needs both sides aligned (AP policy + consumer IAM); silent AccessDenied if not |
| MRAP gives one global endpoint with automatic read failover | MRAP routes but does not copy: divergence if CRR lags or is misconfigured |
| Access points themselves are free and don’t change S3 request pricing | MRAP adds a per-GB routing charge; CRR adds transfer + (with RTC) an SLA fee |
| Request metrics + CloudTrail attribute every request to the door it came through | SigV4A (CRT) is an easy-to-miss prerequisite: first call fails with SignatureDoesNotMatch |

The model is right for any shared bucket fronting a data lake, a partner-export surface, or a multi-tenant storage tier — anywhere the bucket policy is a contention point or PII forces two views of one dataset. It is overkill for a private bucket with a handful of IAM roles, where a plain bucket policy is simpler. It bites hardest when teams forget the second side of a cross-account grant, ship an OLAP transform that ignores Range, or stand up an MRAP without actually wiring replication — all three are “works in the demo, fails in production” traps this article exists to prevent.

Hands-on lab

Stand up a real access-point decomposition, then an Object Lambda redaction, end to end. The lab is free-tier-friendly (one small object, one tiny Lambda). Run in CloudShell or any shell with the AWS CLI v2 and credentials. Step 1 reads your account ID into $ACCT, so no account number needs to be edited by hand.

Step 1 — Variables and a shared bucket.

ACCT=$(aws sts get-caller-identity --query Account --output text)
REGION=us-east-1
BUCKET=datalake-lab-$ACCT
aws s3api create-bucket --bucket $BUCKET --region $REGION
aws s3api put-public-access-block --bucket $BUCKET \
  --public-access-block-configuration \
  BlockPublicAcls=true,IgnorePublicAcls=true,BlockPublicPolicy=true,RestrictPublicBuckets=true

Expected: the bucket is created and all four BPA flags are on.

Step 2 — Put a sample object with fake PII.

printf '{"name":"Asha","email":"asha@example.com","device_id":"dev-9921","ssn":"123-45-6789"}\n' > rec.json
aws s3api put-object --bucket $BUCKET --key customers/rec.json --body rec.json

Step 3 — Reduce the bucket policy to a delegation statement.

cat > delegate.json <<EOF
{ "Version":"2012-10-17","Statement":[{
  "Sid":"DelegateToAPs","Effect":"Allow",
  "Principal":{"AWS":"arn:aws:iam::$ACCT:root"},
  "Action":"s3:*",
  "Resource":["arn:aws:s3:::$BUCKET","arn:aws:s3:::$BUCKET/*"],
  "Condition":{"StringEquals":{"s3:DataAccessPointAccount":"$ACCT"}}
}]}
EOF
aws s3api put-bucket-policy --bucket $BUCKET --policy file://delegate.json

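Expected: the bucket policy now grants access only to requests that arrive through an access point owned by this account. Same-account principals whose IAM policies allow S3 actions can still reach the bucket directly unless you add an explicit Deny.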
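If you generate this policy from automation rather than a heredoc, the same document can be built and size-checked in a few lines. This is a sketch under stated assumptions: `delegation_policy` and `policy_headroom` are hypothetical helpers, and the 20 KB ceiling is taken as 20 × 1024 bytes of the serialized JSON.

```python
import json

POLICY_LIMIT_BYTES = 20 * 1024  # assumed byte value of the 20 KB bucket-policy ceiling


def delegation_policy(bucket: str, account_id: str) -> str:
    """Serialize the one-statement delegation policy used in Step 3."""
    doc = {
        "Version": "2012-10-17",
        "Statement": [{
            "Sid": "DelegateToAPs",
            "Effect": "Allow",
            "Principal": {"AWS": f"arn:aws:iam::{account_id}:root"},
            "Action": "s3:*",
            "Resource": [f"arn:aws:s3:::{bucket}", f"arn:aws:s3:::{bucket}/*"],
            "Condition": {"StringEquals": {"s3:DataAccessPointAccount": account_id}},
        }],
    }
    return json.dumps(doc, separators=(",", ":"))


def policy_headroom(policy_json: str) -> int:
    """Bytes remaining before the policy would hit the size ceiling."""
    return POLICY_LIMIT_BYTES - len(policy_json.encode("utf-8"))


policy = delegation_policy("datalake-lab-111122223333", "111122223333")
print(len(policy), "bytes used;", policy_headroom(policy), "bytes of headroom")
```

Running the headroom check in CI against every bucket policy is a cheap way to notice a policy drifting back toward the wall before an onboarding fails.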

Step 4 — Create a standard access point and read through it.

aws s3control create-access-point --account-id $ACCT --name lab-ap --bucket $BUCKET
aws s3api get-object \
  --bucket arn:aws:s3:$REGION:$ACCT:accesspoint/lab-ap \
  --key customers/rec.json ./via-ap.json
cat ./via-ap.json   # full record, unredacted

Expected: the object reads back through the access-point ARN — proof the delegation works.

Step 5 — Deploy a redactor Lambda.

cat > app.py <<'EOF'
import boto3, re, urllib3
s3=boto3.client("s3"); http=urllib3.PoolManager()
EMAIL=re.compile(rb'"email":"[^"]*"'); SSN=re.compile(rb"\b\d{3}-\d{2}-\d{4}\b")
def handler(event, context):
    ctx=event["getObjectContext"]
    data=http.request("GET", ctx["inputS3Url"]).data
    data=EMAIL.sub(b'"email":"[REDACTED]"', data)
    data=SSN.sub(b"***-**-****", data)
    s3.write_get_object_response(Body=data,
        RequestRoute=ctx["outputRoute"], RequestToken=ctx["outputToken"])
    return {"status_code":200}
EOF
zip fn.zip app.py
# Assume an execution role 'olap-lab-role' with WriteGetObjectResponse + basic logging exists.
aws lambda create-function --function-name pii-redactor-lab \
  --runtime python3.12 --handler app.handler --zip-file fileb://fn.zip \
  --role arn:aws:iam::$ACCT:role/olap-lab-role --timeout 30

Step 6 — Create the supporting AP and the Object Lambda Access Point.

aws s3control create-access-point --account-id $ACCT --name lab-support-ap --bucket $BUCKET
aws s3control create-access-point-for-object-lambda --account-id $ACCT \
  --name lab-redacted-olap --configuration "{
    \"SupportingAccessPoint\":\"arn:aws:s3:$REGION:$ACCT:accesspoint/lab-support-ap\",
    \"TransformationConfigurations\":[{\"Actions\":[\"GetObject\"],
      \"ContentTransformation\":{\"AwsLambda\":{\"FunctionArn\":
      \"arn:aws:lambda:$REGION:$ACCT:function:pii-redactor-lab\"}}}]}"

Step 7 — Read through the OLAP and confirm masking.

aws s3api get-object \
  --bucket arn:aws:s3-object-lambda:$REGION:$ACCT:accesspoint/lab-redacted-olap \
  --key customers/rec.json ./via-olap.json
diff <(cat ./via-ap.json) <(cat ./via-olap.json) && echo "NO MASKING (BUG)" || echo "MASKED OK"
cat ./via-olap.json   # email + ssn now redacted

Expected: MASKED OK, and the OLAP output shows "email":"[REDACTED]" and ***-**-**** while the raw AP read did not. This is the exact check that catches a misconfigured OLAP topology.

Validation checklist — what each step proved:

| Step | What you did | What it proves |
| --- | --- | --- |
| 3 | Bucket policy → delegation only | Authorization moved out of the bucket |
| 4 | Read via standard AP ARN | The delegation actually works |
| 5–6 | OLAP over a supporting AP | The three-layer topology is wired right |
| 7 | diff raw vs OLAP output | The transform actually changes bytes |

Cleanup (avoid lingering charges):

aws s3control delete-access-point-for-object-lambda --account-id $ACCT --name lab-redacted-olap
aws s3control delete-access-point --account-id $ACCT --name lab-support-ap
aws s3control delete-access-point --account-id $ACCT --name lab-ap
aws lambda delete-function --function-name pii-redactor-lab
aws s3 rm s3://$BUCKET --recursive
aws s3api delete-bucket --bucket $BUCKET

Cost note. One object, a handful of Lambda invocations, and a few requests — this lab costs effectively nothing (well under ₹10) and deleting the bucket + functions stops everything. Access points and OLAPs carry no standing charge.

Common mistakes & troubleshooting

This is the playbook — bookmark it. First as a scannable table, then the full reasoning for the entries that bite hardest. Confirm each layer is doing exactly what you intended before you hand the topology to consumers.

# 1. Access point exists and carries the VPC + BPA you expect.
aws s3control get-access-point \
  --account-id 111122223333 --name analytics-internal-ap \
  --query '{vpc:VpcConfiguration, bpa:PublicAccessBlockConfiguration}'

# 2. The access point policy is the small, scoped one (not the monolith).
aws s3control get-access-point-policy \
  --account-id 111122223333 --name finance-reports-rw-ap

# 3. Object Lambda actually transforms: original vs. transformed bytes differ.
aws s3api get-object --bucket datalake-shared-prod --key customers/2026/records.csv ./raw.csv
aws s3api get-object \
  --bucket arn:aws:s3-object-lambda:us-east-1:111122223333:accesspoint/pii-redacted-olap \
  --key customers/2026/records.csv ./masked.csv
diff <(head -c 200 ./raw.csv) <(head -c 200 ./masked.csv) && echo "NO MASKING" || echo "MASKED OK"

# 4. MRAP is READY and reports both Regions.
aws s3control get-multi-region-access-point \
  --account-id 111122223333 --name global-assets-mrap \
  --query 'AccessPoint.{status:Status, regions:Regions[].Region}'

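# 5. Replication is keeping both buckets in sync (expect low latency, in seconds).
# S3 replication metrics carry three dimensions; <rule-id> is a placeholder for your rule's ID.
aws cloudwatch get-metric-statistics \
  --namespace AWS/S3 --metric-name ReplicationLatency \
  --dimensions Name=SourceBucket,Value=assets-use1 \
               Name=DestinationBucket,Value=assets-euw1 \
               Name=RuleId,Value=<rule-id> \
  --start-time "$(date -u -d '-1 hour' +%FT%TZ)" \
  --end-time "$(date -u +%FT%TZ)" \
  --period 300 --statistics Maximum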

The symptom → root cause → confirm → fix playbook — read at 2am, act in minutes:

| # | Symptom | Root cause | Confirm (exact cmd / path) | Fix |
| --- | --- | --- | --- | --- |
| 1 | Every AP request returns AccessDenied, even with a perfect AP policy | Bucket policy missing the delegation statement | aws s3api get-bucket-policy --bucket <b> shows no s3:DataAccessPointAccount | Add the DelegateToAccessPoints statement |
| 2 | Cross-account consumer gets AccessDenied | Only one side of the pair allows it | get-access-point-policy and the consumer’s IAM policy | Allow on both AP policy and consumer IAM |
| 3 | VPC-bound AP times out / connection refused | No S3 endpoint in that VPC | aws ec2 describe-vpc-endpoints --filters Name=vpc-id,Values=<vpc> | Create a gateway/interface S3 endpoint |
| 4 | OLAP returns the original (unredacted) bytes | OLAP points at the wrong supporting AP, or transform no-op’d | diff raw.csv masked.csv → identical | Fix SupportingAccessPoint ARN; verify transform |
| 5 | SDK multipart download corrupts / fails through OLAP | Transform ignores Range/partNumber | Client sends Range; function doesn’t handle it | Support range (if safe) or reject with 501 |
| 6 | OLAP GetObject returns 500 | Lambda error or missing WriteGetObjectResponse perm | Lambda CloudWatch logs; check exec role | Grant s3-object-lambda:WriteGetObjectResponse; fix code |
| 7 | First MRAP call: SignatureDoesNotMatch | SigV4A signer not available | Using non-CRT CLI/SDK | pip install 'botocore[crt]' or use a CRT-bundled CLI v2 |
| 8 | MRAP serves stale/diverging data between regions | Replication not configured or lagging | get-metric-statistics ReplicationLatency high | Enable two-way CRR; turn on RTC |
| 9 | MRAP stuck in CREATING for ages | Normal async provisioning | list-multi-region-access-points → status | Wait (~30 min); poll the token |
| 10 | New AP can’t be reached despite policy | Account BPA blocks the public policy | get-public-access-block (account) | Use a VPC-bound AP or fix routing; don’t loosen BPA |
| 11 | put-bucket-policy fails: policy too large | Still authorizing in the bucket policy | Policy > 20 KB | Decompose into access points (this whole article) |
| 12 | Object reads fine by bucket name, fails by AP ARN | Wrong ARN/hostname form for the AP type | Compare ARN to the type table | Use the correct ARN form (standard vs OLAP vs MRAP) |

Entry 1 — every access-point request returns AccessDenied. Root cause: the bucket policy still authorizes directly (or was emptied) and lacks the delegation statement, so requests arriving via an access point are never permitted at the bucket. Confirm: aws s3api get-bucket-policy --bucket <b> shows no s3:DataAccessPointAccount/Arn condition. Fix: add the DelegateToAccessPoints statement. This is the number-one access-point onboarding failure — the access-point policy looks perfect, but the bucket never delegated to it.

Entry 3 — VPC-bound access point times out. Root cause: a VPC-bound access point has no public DNS path and is reachable only through an S3 gateway/interface endpoint in that VPC; if the endpoint is missing, every request hangs or is refused. Confirm: aws ec2 describe-vpc-endpoints --filters Name=vpc-id,Values=<vpc> returns nothing for S3. Fix: create a gateway endpoint (free, same-region) or an interface endpoint (PrivateLink) and ensure the route table / DNS is wired.

Entry 4 — Object Lambda returns the original bytes. Root cause: the OLAP’s SupportingAccessPoint points at the wrong access point, or the transform silently no-op’d (regex didn’t match, function returned the input). Confirm: diff the raw read against the OLAP read — if identical, the transform isn’t running. Fix: verify the SupportingAccessPoint is the full ARN of the correct supporting access point and that the Lambda actually mutates the bytes. This is exactly what Step 7 of the lab catches.

Entry 7 — SignatureDoesNotMatch on the first MRAP call. Root cause: MRAP requires SigV4A (region *), and your signer is a classic SigV4-only build. Confirm: you’re on a CLI/SDK without CRT. Fix: pip install 'botocore[crt]' (for boto3 or a pip-installed CLI) or use a CLI v2 build with CRT bundled; the SDK then auto-selects SigV4A for .mrap ARNs.

Entry 8 — MRAP serves divergent data. Root cause: the MRAP routes to the closest region, but the regions hold different data because replication isn’t configured or has fallen behind — “global” over divergent buckets. Confirm: ReplicationLatency (or pending bytes/operations) is high. Fix: enable two-way CRR and turn on RTC for a bounded 15-minute SLA. Remember: MRAP routes, replication copies.

The error and limit reference

The codes and limits you’ll meet, what they mean for these features specifically, how to confirm, and the fix. The non-obvious ones are AccessDenied (which can originate at more than one layer: bucket delegation, access-point policy, or consumer IAM) and SignatureDoesNotMatch (always SigV4A):

| Code / signal | Meaning here | Likely cause | How to confirm | Fix |
| --- | --- | --- | --- | --- |
| AccessDenied (via AP) | Request through an AP was denied | Missing delegation, AP policy, or consumer IAM | Check all three layers | Align bucket delegation + AP policy + IAM |
| AccessDenied (cross-account) | One side didn’t allow it | Consumer IAM or AP policy missing the grant | get-access-point-policy + consumer IAM | Allow on both sides |
| SignatureDoesNotMatch (MRAP) | Region-scoped signature on a global endpoint | SigV4A not available | Using non-CRT signer | Install CRT; SDK uses SigV4A |
| NoSuchAccessPoint | AP ARN doesn’t resolve | Wrong name/region/account in the ARN | list-access-points | Correct the ARN form |
| InvalidRequest (OLAP) | Bad OLAP/transform config | Malformed TransformationConfigurations | get-access-point-for-object-lambda | Fix the configuration JSON |
| 501 (from OLAP) | Transform rejected a range | Your code returns 501 on Range | Inspect WriteGetObjectResponse | Intentional; callers should fall back to a full GET |
| 416 RangeNotSatisfiable | Range invalid for the object | Bad range or non-range-safe transform | Function logic | Reject cleanly; document no-range support |
| Connection timeout (VPC-bound AP) | No reachable endpoint | Missing S3 VPC endpoint | describe-vpc-endpoints | Create the endpoint |
| MalformedPolicy (put-bucket-policy) | Bad delegation policy JSON | Syntax / wrong condition key | put-bucket-policy error text | Fix the JSON / condition key |
| PolicyTooLarge / size error | Bucket policy over 20 KB | Still authorizing in the bucket | Policy size | Decompose into access points |
| MRAP FAILED / PARTIALLY_CREATED | Creation didn’t fully succeed | A region failed to attach | get-multi-region-access-point | Read failure reason; recreate |
| High ReplicationLatency | Replicas lagging | CRR throttled or misconfigured | CloudWatch metric | Enable RTC; check CRR rules |
| KMS.AccessDeniedException (on read) | Consumer can’t decrypt | No kms:Decrypt on the key | CloudTrail KMS event | Grant kms:Decrypt to the principal/OLAP role |
| 503 SlowDown | Request rate too high for a prefix | Hot prefix, not enough key spread | S3 request metrics | Spread keys across prefixes (not APs) |
| InvalidAccessPointAliasError | Alias used where ARN expected | Wrong identifier form | Compare to the ARN/alias table | Use the correct ARN form for the AP type |
| OLAP 502 / empty body | Lambda timed out or returned nothing | Slow/failed transform | Lambda duration + logs | Speed up transform; ensure Body is set |

And the hard limits and quotas that constrain a design — real numbers you should size against, not discover in production:

| Limit / quota | Value | Applies to | Note |
| --- | --- | --- | --- |
| Bucket policy size | 20 KB | Per bucket | The driver for decomposition |
| Access-point policy size | 20 KB | Per access point | Each AP gets its own budget |
| Access points per Region | 10,000 (default) | Per account per Region | Soft-ish; plan naming for scale |
| Access point name length | 3–50 chars | Per AP | Lowercase, hyphens, no underscores |
| GET requests per prefix | 5,500 / s | Bucket key namespace | Scales by prefix, not by AP |
| PUT/COPY/DELETE per prefix | 3,500 / s | Bucket key namespace | Same: spread keys, not APs |
| Object Lambda response size | up to 5 GB stream | Per WriteGetObjectResponse | Stream large bodies |
| Lambda timeout (OLAP) | up to 60 s effective for the GET path | Per request | Keep transforms fast |
| MRAP regions | up to ~17–20 (account/region dependent) | Per MRAP | One bucket per region per MRAP |
| RTC SLA | 99.99% within 15 min | CRR with RTC | Per-GB fee applies |
| Replication | async, no SLA without RTC | CRR | “Eventually” unless RTC |
| MRAP provisioning time | up to ~30 min | Per create | Async; poll the request token |
| OLAP WriteGetObjectResponse timeout | tied to the GET request budget | Per request | A slow transform stalls the caller |
| Supporting AP per OLAP | exactly 1 | Per OLAP | One data source; full ARN required |
| Object key length | up to 1,024 bytes | Per object | Unchanged by access points |
| Versioning requirement (CRR) | versioning must be on | Both buckets | CRR refuses un-versioned buckets |
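Several of these limits are cheap to enforce before you ever call the API. As one example, here is a sketch of a name check for access points that encodes the length and character rules from the table, plus two rules assumed from the S3 naming guidance (no leading or trailing hyphen, and no `-s3alias` suffix). `valid_access_point_name` is a hypothetical helper.

```python
import re

# 3 to 50 characters: lowercase letters, digits, and hyphens,
# starting and ending with a letter or digit.
_NAME = re.compile(r"[a-z0-9][a-z0-9-]{1,48}[a-z0-9]")


def valid_access_point_name(name: str) -> bool:
    """Check a proposed access point name against the naming limits."""
    if _NAME.fullmatch(name) is None:
        return False
    # Assumed reserved suffix: S3 generates aliases that end this way.
    return not name.endswith("-s3alias")


for candidate in ("analytics-internal-ap", "Finance_Reports", "ap"):
    print(candidate, valid_access_point_name(candidate))
```

With one access point per consumer, a naming convention validated in CI (team, purpose, access mode) is what keeps thousands of names auditable.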

Best practices

Security notes

The security controls mapped to the threat each one closes:

| Control | Mechanism | Closes which threat | Also helps |
| --- | --- | --- | --- |
| Per-AP least-privilege policy | scoped Action + /object/<prefix> | Over-broad access via a shared door | Blast-radius containment |
| VPC-bound access point | VpcConfiguration + endpoint policy | Public exposure of sensitive data | Data-exfiltration paths |
| Account BPA ratchet | most-restrictive-wins composition | Accidental re-exposure by an AP | Guardrail across all APs |
| aws:PrincipalOrgID on cross-account | condition key | Confused-deputy / third-party ride-along | Scoping partner access |
| OLAP redaction transform | Lambda on the GET path | A caller class seeing raw PII | Eliminates duplicate redacted copies |
| KMS key policy for consumers | kms:Decrypt grant | Silent AccessDenied on encrypted reads | Cross-account key access |
| CloudTrail data events | per-AP ARN logging | Unattributable access on shared data | Forensics / compliance |

Cost & sizing

A few economics and limits worth internalizing before you build a sprawling topology:

For observability, enable request metrics with an access-point filter so each consumer’s traffic is independently visible, and turn on S3 server access logging or CloudTrail data events — both record the access point ARN, so you can attribute every request to the door it came through. A useful CloudWatch metric-math approach is to alarm per access point on 4xx rate, which surfaces a single broken consumer without noise from the others.
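A per-access-point 4xx-rate alarm can be sketched as two plain dictionaries: the request-metrics filter and the metric-math alarm. The helper names, the alarm name, the 5% threshold, and the three-period evaluation are illustrative assumptions; the dictionaries follow the shapes that boto3's `put_bucket_metrics_configuration` and `put_metric_alarm` accept.

```python
from typing import Any, Dict


def access_point_metrics_config(filter_id: str, access_point_arn: str) -> Dict[str, Any]:
    """MetricsConfiguration that scopes S3 request metrics to one access point."""
    return {"Id": filter_id, "Filter": {"AccessPointArn": access_point_arn}}


def error_rate_alarm(bucket: str, filter_id: str,
                     threshold_pct: float = 5.0) -> Dict[str, Any]:
    """Keyword arguments for put_metric_alarm: alarm on the 4xx share of all requests."""

    def stat(metric_id: str, metric_name: str) -> Dict[str, Any]:
        return {
            "Id": metric_id,
            "ReturnData": False,  # inputs to the expression, not alarmed on directly
            "MetricStat": {
                "Metric": {
                    "Namespace": "AWS/S3",
                    "MetricName": metric_name,
                    "Dimensions": [
                        {"Name": "BucketName", "Value": bucket},
                        {"Name": "FilterId", "Value": filter_id},
                    ],
                },
                "Period": 300,
                "Stat": "Sum",
            },
        }

    return {
        "AlarmName": f"s3-4xx-rate-{filter_id}",
        "ComparisonOperator": "GreaterThanThreshold",
        "Threshold": threshold_pct,
        "EvaluationPeriods": 3,
        "TreatMissingData": "notBreaching",  # an idle access point is not an incident
        "Metrics": [
            stat("errors", "4xxErrors"),
            stat("requests", "AllRequests"),
            {"Id": "rate", "Expression": "100 * errors / requests",
             "Label": "4xx rate (%)", "ReturnData": True},
        ],
    }


# Usage sketch (needs boto3 and credentials):
#   s3.put_bucket_metrics_configuration(Bucket=b, Id=fid,
#       MetricsConfiguration=access_point_metrics_config(fid, ap_arn))
#   cloudwatch.put_metric_alarm(**error_rate_alarm(b, fid))
```

Alarming on the ratio rather than the raw 4xx count keeps a busy, healthy consumer from paging you while still catching a quiet one that is failing every request.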

The cost drivers and what each one buys you, with rough figures:

| Cost driver | What you pay for | Rough figure | What it’s for | Watch-out |
| --- | --- | --- | --- | --- |
| Access points | Nothing (free) | ₹0 | Policy decomposition | No reason not to use them |
| S3 requests | Per-1,000 GET/PUT | ~₹0.03–0.4 / 1k | Normal bucket access | Unchanged by access points |
| Object Lambda invocations | Lambda per GET + duration | scales with reads | In-flight transform | Hot read path dominates |
| WriteGetObjectResponse data | Per-GB returned | per-GB | OLAP response path | Large objects add up |
| MRAP routing | Per-GB routed | per-GB | Global endpoint + failover | On top of S3 + CRR |
| CRR transfer | Per-GB inter-region | per-GB | Keeping replicas in sync | Doubles for two-way |
| RTC | Per-GB + replication metrics | per-GB fee | 15-min SLA | Only if you need bounded lag |
| CloudTrail data events | Per-100k events | per-event | Per-door audit | High-traffic = real volume |

When to reach for each, sized by scale:

| If your situation is… | Reach for… | Because |
| --- | --- | --- |
| One bucket, a few IAM roles | Plain bucket policy | Decomposition is overkill |
| Shared bucket, many teams/prefixes | Access points | Kills the 20 KB / blast-radius / contention problem |
| Two views of one dataset (raw vs redacted) | Object Lambda | One authoritative copy, transformed per request |
| Data that must never be public | VPC-bound access point | No public DNS path |
| Cross-account sharing | Access-point policy + OrgID | Revocable, bucket untouched |
| Global low-latency reads, active-active | MRAP + two-way CRR | One endpoint, replicated data |
| Just need a regional read replica | Plain CRR (no MRAP) | Routing isn’t needed |

Interview & exam questions

1. Why does a single bucket policy fail at scale, and what replaces it? A bucket policy is one 20 KB document with one change process governing every principal — it concentrates blast radius, creates change contention, can’t express clean per-tenant network rules, and eventually won’t save. Access points replace it: each consumer gets a named endpoint with its own 20 KB policy and BPA, and the bucket policy collapses to a one-statement delegation using s3:DataAccessPointAccount.

2. What does the bucket policy look like after you adopt access points? It contains a single Allow statement permitting s3:* on the bucket and its objects, conditioned on s3:DataAccessPointAccount equal to the owning account (or s3:DataAccessPointArn for specific access points). It authorizes nothing directly — all fine-grained authorization moves into the per-access-point policies.

3. How does Block Public Access compose between an access point and the account? The effective setting is the most restrictive of the access-point BPA and the account/bucket BPA. An access point can never loosen public access the account has locked down — it’s a one-way ratchet. If the account blocks public policies, no access point can re-expose the bucket.

4. What is a VPC-bound access point and why is it the strongest network control? An access point created with VpcConfiguration has no public DNS path — it answers only requests arriving over an S3 gateway/interface endpoint in that VPC. There is no policy mistake that can make it public, because the endpoint simply doesn’t exist on the internet. That’s stronger than any condition-based restriction.

5. Explain the three-layer Object Lambda topology. The bucket holds the authoritative object; a supporting access point (a plain AP) sits on the bucket; and the Object Lambda Access Point references that supporting AP’s full ARN and names the Lambda transform. Clients GET through the OLAP ARN, S3 invokes the Lambda with a pre-signed inputS3Url, and the function streams transformed bytes back via WriteGetObjectResponse.

6. Why is Range handling the classic Object Lambda trap? AWS SDKs send Range/partNumber headers constantly for large objects and multipart downloads. If your transform changes byte length (variable redaction, format conversion, compression), output offsets no longer map to input, so serving a “range” corrupts the download. You must either support ranges (only for length-preserving transforms) or reject them with 501 so the client falls back to a full GET.

7. What’s the difference between an MRAP and Cross-Region Replication? An MRAP is one global endpoint that routes each request to the closest healthy regional bucket (Global Accelerator anycast) — it does not move data. CRR copies objects between regions. An MRAP without replication is latency routing over divergent data. Routing and copying are separate jobs: MRAP routes, replication copies.

8. Why must MRAP requests use SigV4A? A single global MRAP request can be served from any region, so it can’t carry a region-scoped classic SigV4 signature. SigV4A signs for region *. Recent SDKs/CLI support it but require the CRT signing dependency; without it the first MRAP call fails with SignatureDoesNotMatch. Address the MRAP by its .mrap ARN and the SDK auto-selects SigV4A.

9. How do you perform a planned, reversible MRAP failover? Use submit-multi-region-access-point-routes to set a region’s TrafficDialPercentage to 0, which drains it without deleting anything; dial it back to 100 to restore. This is the canonical maintenance/failover move — distinct from automatic health-based failover (which routes around an impaired region without you doing anything).

10. A cross-account consumer gets AccessDenied despite a correct access-point policy. What’s missing? Cross-account access always requires both sides: the access-point policy must grant the external principal, and that principal’s own IAM policy must allow s3:GetObject on the access-point ARN. The bucket policy stays untouched because its delegation statement already trusts the access point. Add an aws:PrincipalOrgID condition to guard against confused-deputy access.

11. Do access points multiply your request-rate ceiling? No. The limit of 5,500 GET/HEAD and 3,500 PUT/COPY/POST/DELETE requests per second applies per prefix in the bucket key namespace, not per access point. Fronting a bucket with more access points doesn’t raise the ceiling; spreading keys across more prefixes does. Access points are an access-control and network tool, not a performance lever.

12. When is this whole pattern overkill? For a private bucket with a handful of IAM roles, a plain bucket policy is simpler and sufficient — there’s no contention, blast-radius, or PII-view problem to solve. Reach for access points when the bucket is shared (many teams/prefixes/accounts), for Object Lambda when two audiences need two views of one object, and for MRAP when you need a global active-active read endpoint.
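
The Range trap from question 6 is easiest to see as a guard at the top of the transform. The sketch below is one possible shape, not the definitive handler: it inspects the userRequest portion of the Object Lambda event for a Range header or a partNumber query parameter and, when the transform is not length-preserving, returns the keyword arguments for a 501 WriteGetObjectResponse. The function names, the length_preserving flag, and the "NotImplemented" error code string are assumptions made for illustration.

```python
"""Sketch: guard an Object Lambda transform against Range/partNumber requests."""
from urllib.parse import parse_qs, urlparse


def wants_partial_object(user_request: dict) -> bool:
    """True when the caller asked for a byte range or a single part."""
    headers = {k.lower(): v for k, v in user_request.get("headers", {}).items()}
    query = parse_qs(urlparse(user_request.get("url", "")).query)
    return "range" in headers or "partNumber" in query


def rejection_for(event: dict, length_preserving: bool):
    """Return WriteGetObjectResponse kwargs for a 501, or None to proceed.

    None means "go ahead": either the request is a full GET, or the
    transform keeps byte offsets stable and can honor the range itself.
    """
    if length_preserving or not wants_partial_object(event["userRequest"]):
        return None
    context = event["getObjectContext"]
    return {
        "RequestRoute": context["outputRoute"],
        "RequestToken": context["outputToken"],
        "StatusCode": 501,
        "ErrorCode": "NotImplemented",  # assumed label, not mandated by S3
        "ErrorMessage": "Range and partNumber are not supported by this transform",
    }
```

In a handler you would call rejection_for(event, length_preserving=False) first and, if it returns a dict, pass it straight to s3.write_get_object_response(**kwargs) and return. Only when it returns None does the function fetch inputS3Url and stream the transformed body.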

These questions map primarily to the AWS Certified Solutions Architect – Professional (SAP-C02) (multi-account data sharing, cross-region architectures) and Security – Specialty (SCS-C02) (least-privilege data access, redaction, network isolation), with the storage mechanics relevant to Solutions Architect – Associate (SAA-C03). A compact cert mapping:

Question theme | Primary cert | Objective area
Access-point decomposition, delegation | SAP-C02 / SAA-C03 | Design secure, scalable data access
VPC-bound access points, endpoints | SCS-C02 / SAP-C02 | Network isolation; data perimeter
Object Lambda redaction | SCS-C02 | Data protection; least exposure
Cross-account sharing + OrgID | SAP-C02 | Multi-account architectures
MRAP + replication + SigV4A | SAP-C02 | Multi-region, resilient design
Request-rate scaling per prefix | SAA-C03 | Performant storage design

Quick check

  1. After adopting access points, what is the only thing the bucket policy should do, and which condition key expresses it?
  2. Your custom Object Lambda function returns the original, unredacted bytes through the OLAP. Name the two most likely causes.
  3. True or false: scaling out to more access points raises your bucket’s GET request-per-second ceiling.
  4. Your first MRAP get-object fails with SignatureDoesNotMatch. What’s wrong and what’s the fix?
  5. You need raw data to have no public path whatsoever for one team. Which access-point feature do you use, and what else must exist in the VPC for it to work?

Answers

  1. Delegate. It should contain a single Allow statement conditioned on s3:DataAccessPointAccount (or s3:DataAccessPointArn), permitting requests that arrived via an access point owned by the account. It authorizes nothing directly; the per-access-point policies do the fine-grained work.
  2. Either the OLAP’s SupportingAccessPoint points at the wrong access point, or the transform silently no-op’d (the regex didn’t match, or the function returned its input). Confirm with a diff of the raw read versus the OLAP read — identical output means the transform isn’t running.
  3. False. The limit of 5,500 GET/HEAD and 3,500 PUT/COPY/POST/DELETE requests per second applies per prefix in the bucket key namespace, not per access point. More access points don’t raise the ceiling; spreading keys across more prefixes does.
  4. MRAP requires SigV4A (region *); your signer is classic SigV4-only. Install the CRT signing dependency (pip install 'botocore[crt]' for a pip-installed CLI or Python SDK, or use a CRT-bundled CLI v2) and address the MRAP by its .mrap ARN so the SDK auto-selects SigV4A.
  5. A VPC-bound access point (VpcConfiguration at create time) — it has no public DNS path. For it to be reachable, the VPC must also have an S3 gateway or interface endpoint; without an endpoint the access point answers nothing.
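
The delegation statement from answer 1 can be sketched as a small policy builder. This is an illustration under stated assumptions rather than the exact policy used elsewhere in this guide: the function names and the Sid are made up here, while the s3:* action, the bucket and object resources, and the s3:DataAccessPointAccount condition come from the answer above. The size helper is only a convenience for checking the result against the 20 KB policy ceiling.

```python
"""Sketch: the one-statement bucket policy that delegates to access points."""
import json


def delegation_policy(bucket: str, owner_account_id: str) -> dict:
    """Allow requests only when they arrive via the owner's access points."""
    bucket_arn = f"arn:aws:s3:::{bucket}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DelegateToAccessPoints",  # illustrative name
                "Effect": "Allow",
                "Principal": {"AWS": "*"},
                "Action": "s3:*",
                "Resource": [bucket_arn, f"{bucket_arn}/*"],
                "Condition": {
                    "StringEquals": {"s3:DataAccessPointAccount": owner_account_id}
                },
            }
        ],
    }


def policy_size_bytes(policy: dict) -> int:
    """Size of the compact JSON encoding, for comparison with the 20 KB limit."""
    return len(json.dumps(policy, separators=(",", ":")).encode("utf-8"))
```

Serializing the result with json.dumps gives the document you would hand to put-bucket-policy. Because the statement names no individual principal or prefix, it does not grow as consumers are added; that growth moves into the per-access-point policies.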

Glossary

Next steps

You can now decompose a shared bucket, transform on read, and serve globally without copying. Build outward:

