A mid-tier retail bank in a market that has just mandated open banking gets a hard deadline from its regulator: within nine months it must expose account-information and payment-initiation APIs to licensed third-party providers (TPPs) — budgeting apps, lenders, accounting platforms — and it must do so under the Financial-grade API (FAPI) security profile, the same bar that the UK’s Open Banking and Brazil’s Open Finance hold banks to. The product team hears “expose some REST APIs” and reaches for the usual API gateway with a bearer token. The CISO hears “give regulated third parties direct, consented access to customer accounts and the ability to move money” and goes pale. Both are right. This article is the reference architecture for the second reading — a FAPI-grade open-banking platform on AWS that a TPP can integrate against, a customer can revoke in one tap, and a regulator’s auditor can reconstruct transaction by transaction.
The pressures here are specific to the domain and unforgiving. Regulation means the security profile is not a choice — FAPI 1.0 Advanced (and increasingly FAPI 2.0) prescribes the exact OAuth mechanics: sender-constrained tokens, request-object integrity, and authenticated authorization requests. Trust boundaries mean every caller is an external, only-partly-trusted party holding a certificate issued by a scheme directory, not an internal service. Consent means a human customer must explicitly grant, see, and revoke exactly what a TPP can read or initiate, and that grant is a first-class, auditable object with a lifecycle of its own. And non-repudiation means that when a TPP claims “the customer authorized this £4,000 payment,” the bank must be able to prove it cryptographically, years later. A plain bearer token satisfies none of these.
Why a bearer token is not enough
The naive design — OAuth with a bearer access token over TLS — fails the open-banking threat model in ways that are worth naming, because someone will propose it.
A bearer token is a bearer instrument: whoever holds it can use it. If a token leaks from a TPP’s logs, a proxy, or a compromised dependency, an attacker replays it against the account and payment APIs with no further check. The authorization request travels in the browser URL, so its parameters — scope, redirect_uri, the amount of a payment — can be tampered with or shoulder-surfed before the customer ever sees the consent screen. And nothing binds the token to the client that requested it, so there is no cryptographic link between “the TPP the scheme directory vouched for” and “the credential presented at the account API.”
FAPI closes each gap with a named mechanism, and the architecture below implements all of them:
- mTLS sender-constrained tokens (RFC 8705): the access token is bound to the TPP’s client certificate, so a stolen token is useless without the matching private key.
- Pushed Authorization Requests (PAR, RFC 9126): the TPP pushes the (signed) authorization request to the bank’s back channel first and receives an opaque `request_uri`; the browser only ever carries that reference, so request parameters cannot be tampered with in transit.
- Signed request objects (JAR) and JARM: the authorization request and the authorization response are signed JWTs, giving integrity and non-repudiation on both legs.
- Strong consent: an explicit, revocable, fully-audited grant object that the token is scoped to, stored and versioned independently of the token itself.
Architecture overview
The platform runs three planes that share infrastructure but live on different schedules, and keeping them separate is the first step to operating it: the authorization plane (where a TPP and a customer establish a consented, FAPI-compliant grant), the resource plane (where a consented TPP calls account and payment APIs), and the consent-management plane (where the customer and the bank view, govern, and revoke grants). The defining property of the whole topology is that every external caller is authenticated twice — once at the transport layer by client certificate, once at the application layer by a sender-constrained token — and every state change is written to an immutable audit trail before it takes effect.
Authorization plane, following the control flow:
- A TPP is onboarded out of band: its software statement and certificate come from the open-banking scheme directory, and its client registration lands in Okta acting as the FAPI-compliant authorization server. Okta is configured for the FAPI profile: `private_key_jwt` client authentication, PAR required, request objects required, and mTLS certificate-bound access tokens (`tls_client_certificate_bound_access_tokens` in RFC 8705 terms).
- To start a consent, the TPP calls API Gateway on a dedicated mutual-TLS custom domain. API Gateway terminates mTLS against a trust store of scheme-issued CA certificates in S3 and forwards the validated client-certificate details to the backend, from which the authorizer derives the fingerprint. A first AWS Lambda authorizer rejects any caller whose certificate is not in the directory or has been revoked.
- The TPP first calls the PAR endpoint: it pushes a signed request object (a JWT carrying `scope`, `redirect_uri`, and, for a payment, the consent’s intent and amount) to Okta’s back channel and gets back a short-lived `request_uri`. Nothing sensitive rides the browser.
- The customer is redirected to the bank’s authorization UI carrying only that `request_uri`. They authenticate with strong customer authentication (SCA) and are shown a consent screen rendered from the request object: exactly which accounts, which data clusters (balances, transactions, standing orders), or which single payment, and for how long. Their decision is written as a consent record in DynamoDB before any token is issued.
- On approval, Okta returns the authorization response as a signed JWT (JARM), and the TPP exchanges the code at the token endpoint using `private_key_jwt`, presenting its client certificate. Okta issues an mTLS-bound access token whose `cnf` claim carries the certificate thumbprint, scoped to the consent id.
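As a minimal sketch of the "only a reference rides the browser" step, the redirect can be built from the PAR response alone. The endpoint URL and client id below are hypothetical placeholders, the `request_uri`/`expires_in` response fields follow RFC 9126, and signing the request object itself is left out of scope.

```python
from urllib.parse import parse_qs, urlencode, urlparse

# Hypothetical values for illustration only; neither appears in the platform spec.
AUTHORIZATION_ENDPOINT = "https://auth.examplebank.com/authorize"
CLIENT_ID = "tpp-budgeting-app"


def build_authorization_redirect(par_response: dict) -> str:
    """Build the browser redirect from a PAR response (RFC 9126).

    Only client_id and the opaque request_uri go into the URL; scope,
    redirect_uri and any payment details stay in the pushed request object.
    """
    if par_response.get("expires_in", 0) <= 0:
        raise ValueError("PAR response has no remaining lifetime")
    query = urlencode({
        "client_id": CLIENT_ID,
        "request_uri": par_response["request_uri"],
    })
    return f"{AUTHORIZATION_ENDPOINT}?{query}"


# Example PAR response; the request_uri value is made up.
par_response = {
    "request_uri": "urn:ietf:params:oauth:request_uri:6esc_11ACC5bwc014ltc14eY22c",
    "expires_in": 60,
}
redirect_url = build_authorization_redirect(par_response)
browser_params = parse_qs(urlparse(redirect_url).query)
```

Everything the customer will later see on the consent screen is absent from `browser_params`, which is the property the PAR step is there to provide.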
Resource plane, following a data call:
- The TPP calls a resource API (`GET /accounts/{id}/transactions` or `POST /domestic-payments`) at API Gateway over the same mTLS domain, presenting the access token.
- AWS WAF sits in front, enforcing per-TPP rate-based rules, IP reputation, and request-shape validation; this is the first throttle against a misbehaving or compromised TPP hammering the estate.
- A Lambda token authorizer validates the JWT signature and claims, and, critically, confirms the certificate thumbprint in the token’s `cnf` claim matches the certificate presented on this very connection. This is the proof-of-possession check that makes a stolen token worthless. It then loads the consent record from DynamoDB, checks it is `Authorised` (not revoked, not expired) and that its scope actually covers this resource and action.
- Only then does the request reach the resource Lambda, which calls the bank’s core-banking systems through a private path: a VPC-attached integration over AWS PrivateLink to an internal account/payments service, never the public internet. The few secrets the function needs (the core-banking mTLS client key, signing keys for response objects) are pulled at runtime from HashiCorp Vault via short-lived dynamic leases, so nothing sensitive sits in a Lambda environment variable.
- For a payment, the response (and the customer’s authorization) is captured as a signed, non-repudiable record; for account information, the response is returned per the open-banking schema. Every call — consent created, token issued, resource accessed, payment initiated — is emitted to the audit pipeline.
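The "scope actually covers this resource and action" check in the token authorizer could look roughly like the sketch below. The route table and permission names are hypothetical, chosen only to illustrate a deny-by-default lookup; a real scheme defines its own permission codes, and a fuller check would also confirm the account id is one the customer selected.

```python
import re

# Hypothetical route table: (action, resource pattern, permission required).
# Permission names are illustrative, not taken from a specific scheme.
ROUTE_PERMISSIONS = [
    ("GET", re.compile(r"/accounts/[^/]+/transactions"), "ReadTransactions"),
    ("GET", re.compile(r"/accounts/[^/]+/balances"), "ReadBalances"),
    ("POST", re.compile(r"/domestic-payments"), "InitiateDomesticPayment"),
]


def scope_covers(permissions: set, resource: str, action: str) -> bool:
    """Return True only if the consent's permissions cover this call."""
    for route_action, pattern, required in ROUTE_PERMISSIONS:
        if route_action == action and pattern.fullmatch(resource):
            return required in permissions
    return False  # unknown routes are denied by default


# An account-information consent must not be usable to initiate a payment.
account_info_consent = {"ReadTransactions", "ReadBalances"}
can_read = scope_covers(account_info_consent, "/accounts/acc-1/transactions", "GET")
can_pay = scope_covers(account_info_consent, "/domestic-payments", "POST")
```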
Consent-management plane, customer-facing and independent: the bank’s mobile and web apps call an internal consent API to list, inspect, and revoke active grants. A revocation flips the DynamoDB consent record to Revoked and the very next resource call fails the authorizer’s consent check — revocation is immediate because the authorizer reads consent state on every request, not from a cached token.
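To make the consent lifecycle concrete, here is a small in-memory sketch of the state machine. Only `Authorised` and `Revoked` are named in the text; the other status names, the `Consent` class, and its fields are assumptions for illustration, and a real implementation would persist the record and its history in DynamoDB rather than in a Python object.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta, timezone

# Assumed status names; only "Authorised" and "Revoked" come from the article.
ALLOWED_TRANSITIONS = {
    "AwaitingAuthorisation": {"Authorised", "Rejected"},
    "Authorised": {"Revoked", "Expired"},
}


@dataclass
class Consent:
    consent_id: str
    expires_at: datetime
    status: str = "AwaitingAuthorisation"
    history: list = field(default_factory=list)  # append-only audit entries

    def transition(self, new_status: str, actor: str) -> None:
        """Move to new_status if the lifecycle allows it, recording who did it."""
        if new_status not in ALLOWED_TRANSITIONS.get(self.status, set()):
            raise ValueError(f"{self.status} -> {new_status} is not allowed")
        self.history.append(
            (datetime.now(timezone.utc), self.status, new_status, actor)
        )
        self.status = new_status

    def is_active(self, now: datetime) -> bool:
        """The check the authorizer makes on every resource call."""
        return self.status == "Authorised" and now < self.expires_at


now = datetime.now(timezone.utc)
consent = Consent("consent-123", expires_at=now + timedelta(days=90))
consent.transition("Authorised", actor="customer")
active_before = consent.is_active(now)
consent.transition("Revoked", actor="customer")
active_after = consent.is_active(now)
```

Because `is_active` reads the current status rather than anything carried in the token, the revocation takes effect on the very next check, and terminal states such as `Revoked` have no outgoing transitions.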
Component breakdown
| Component | Service / tool | Role in the platform | Key configuration choices |
|---|---|---|---|
| Edge / CDN | Akamai | TLS for the public consent UI, anycast, WAF/bot mitigation at the perimeter | Bot rules on the auth UI; origin shield to the API front door; not in the mTLS API path |
| API front door | Amazon API Gateway | mTLS termination, routing, request validation for TPP traffic | Mutual-TLS custom domain; S3 trust store of scheme CAs; truststore versioning |
| Edge rate control | AWS WAF | Per-TPP rate-based rules, IP reputation, schema/shape checks | Rate-based rule per client id; managed rule groups; size constraints |
| Authorization server | Okta | FAPI-compliant OAuth: PAR, JAR, JARM, `private_key_jwt`, mTLS binding | FAPI profile on; PAR required; request object signing; certificate-bound access tokens (`tls_client_certificate_bound_access_tokens`) |
| Cert / TPP authorizer | AWS Lambda | Validate scheme cert, check directory revocation, gate onboarding | Reads CRL/OCSP + directory; rejects unknown or revoked TPPs |
| Token authorizer | AWS Lambda | JWT validation, `cnf` thumbprint match, consent lookup + scope check | Proof-of-possession check; DynamoDB consent read on every call |
| Resource APIs | AWS Lambda | Account-information and payment-initiation business logic | Per-API least-privilege role; concurrency caps per endpoint |
| Consent store | Amazon DynamoDB | Consent records, lifecycle state, full audit history | Streams on; PITR on; partition by consent id; TTL on expired grants |
| Core-banking link | AWS PrivateLink / VPC | Private path to internal account and payment systems | Interface endpoints; no public egress from resource functions |
| Secrets | HashiCorp Vault | Core-banking client keys, response-signing keys, introspection secrets | Dynamic short-lived leases; AWS auth method; no secrets in env vars |
| CSPM / posture | Wiz + Wiz Code | Cloud posture, exposure, attack-path analysis; IaC scanning pre-merge | Agentless scan of API GW/DynamoDB/S3; Wiz Code blocks risky Terraform in PRs |
| Runtime security | CrowdStrike Falcon | Runtime threat detection on container/VM workloads and the SOC feed | Sensor on any ECS/EC2 in the estate; detections piped to the SOC |
| Observability | Dynatrace / Datadog | Distributed tracing, FAPI-flow spans, latency and error SLOs | OTel spans across PAR→token→resource; SLOs on p95 and 5xx; anomaly alerts |
| ITSM / approvals | ServiceNow | TPP onboarding approvals, change gates, incident records | Change gate before a TPP goes live; auto-incident on auth-failure spikes |
| CI / IaC | GitHub Actions + Terraform | Pipeline build/test/security-gate; infrastructure as code | OIDC to AWS (no stored creds); Wiz Code + policy gate before deploy |
| Delivery / GitOps | Jenkins / Argo CD | Build pipelines and GitOps deploy of any containerized control-plane services | Argo CD syncs the consent-UI and admin services; Jenkins for legacy build steps |
A few choices deserve the why, because they are the ones teams get wrong.
Why mTLS at API Gateway, not just at the load balancer. FAPI’s sender-constrained tokens only work if the certificate presented at the resource API is the same one bound into the token. That means API Gateway must terminate mTLS itself and pass the verified client-certificate fingerprint to the authorizer, where the cnf-claim comparison happens. Terminating mTLS upstream and forwarding plain HTTP breaks the proof-of-possession chain and silently downgrades you to bearer-token security — the exact failure FAPI exists to prevent.
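A sketch of how the authorizer might derive the value it compares against `cnf`: RFC 8705 defines `x5t#S256` as the unpadded base64url encoding of the SHA-256 digest of the DER-encoded certificate. The event shape below (`requestContext.identity.clientCert.clientCertPem`) is an assumption about what the gateway hands a Lambda authorizer on an mTLS domain and should be confirmed against the API Gateway documentation; the PEM is a placeholder, not a real certificate.

```python
import base64
import hashlib
import ssl


def cert_thumbprint_s256(cert_pem: str) -> str:
    """Return the RFC 8705 x5t#S256 value for a PEM certificate.

    That is base64url(SHA-256(DER-encoded certificate)) with padding removed.
    """
    der = ssl.PEM_cert_to_DER_cert(cert_pem)
    digest = hashlib.sha256(der).digest()
    return base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")


def thumbprint_from_event(event: dict) -> str:
    # Assumed event shape for an authorizer behind a mutual-TLS custom domain.
    pem = event["requestContext"]["identity"]["clientCert"]["clientCertPem"]
    return cert_thumbprint_s256(pem)


# Placeholder bytes standing in for a DER certificate (not a valid X.509 cert).
placeholder_der = b"placeholder certificate bytes"
placeholder_pem = (
    "-----BEGIN CERTIFICATE-----\n"
    + base64.b64encode(placeholder_der).decode("ascii")
    + "\n-----END CERTIFICATE-----\n"
)
event = {"requestContext": {"identity": {"clientCert": {"clientCertPem": placeholder_pem}}}}
presented_thumbprint = thumbprint_from_event(event)
```

The thumbprint is computed from the certificate on the live connection, never from a header the caller could set, which is what keeps the proof-of-possession chain intact.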
Why consent lives in DynamoDB and is read on every call, not baked into the token. A token is a snapshot; consent is a lifecycle. If you encode the grant into a long-lived token, a customer’s “revoke” cannot take effect until the token expires — unacceptable when the customer is trying to cut off a TPP they no longer trust. By keeping consent as an authoritative record the token authorizer reads on every request, revocation is effective on the next call, and the consent’s full history — created, amended, revoked, by whom, when — is an audit object in its own right. DynamoDB Streams ships every change to the audit pipeline; point-in-time recovery and the immutable trail are what an auditor reconstructs from.
Why PAR and signed request objects, not query-string parameters. Without PAR, a payment’s amount and the requested scope travel in the browser address bar, where they can be altered between the TPP and the bank. PAR moves the entire (signed) request to a back-channel call and leaves only an opaque request_uri in the browser, so what the customer consents to on screen is provably what the TPP asked for — integrity the regulator and the customer both depend on.
Implementation guidance
Provision with Terraform, and treat the trust store and mTLS domain as the first deliverable. The order matters: get the certificate plumbing wrong and TPPs get opaque TLS handshake failures with no useful error.
- An S3 truststore holding the scheme directory’s CA bundle, versioned so a bad rotation can be rolled back, with object-lock so earlier bundle versions cannot be deleted or overwritten.
- The API Gateway mutual-TLS custom domain pointed at that truststore.
- The two Lambda authorizers (certificate/TPP and token) and their IAM roles.
- The DynamoDB consent table with Streams and PITR enabled.
- WAF web ACL with per-client rate-based rules associated to the stage.
- The PrivateLink endpoints to core banking and the Vault auth role.
A minimal Terraform shape for the mTLS front door communicates the intent:
```hcl
resource "aws_api_gateway_domain_name" "fapi" {
  domain_name              = "api.openbanking.examplebank.com"
  regional_certificate_arn = aws_acm_certificate.fapi.arn
  security_policy          = "TLS_1_2"

  # Clients must present a certificate that chains to a CA in this bundle.
  mutual_tls_authentication {
    truststore_uri     = "s3://ob-truststore-prod/scheme-ca-bundle.pem"
    truststore_version = aws_s3_object.truststore.version_id # pin + roll safely
  }

  endpoint_configuration {
    types = ["REGIONAL"]
  }
}
```
The pipeline that applies this runs in GitHub Actions, authenticating to AWS via OIDC federation so there is no stored access key to leak — a hard lesson the platform team intends never to repeat. Wiz Code scans the Terraform on every pull request and fails the build if, say, the consent table is created without PITR or an S3 bucket is public, so a misconfiguration never reaches an account. Containerized control-plane services (the consent UI, the admin console) build through Jenkins and deploy via Argo CD GitOps, while the serverless APIs ship straight from the GitHub Actions pipeline.
The token authorizer is where FAPI lives — get it exactly right. Its job, on every resource call, is the proof-of-possession check plus the consent gate. In pseudocode the heart of it is:
```python
def authorize(token, client_cert_thumbprint, resource, action):
    # Pseudocode: verify_jwt, dynamodb, scope_covers, allow and deny are stand-ins.
    claims = verify_jwt(token, okta_jwks)  # signature, iss, aud, exp
    bound = claims.get("cnf", {}).get("x5t#S256")
    if not bound or bound != client_cert_thumbprint:
        return deny("token not bound to this client certificate")  # stolen-token defense
    consent = dynamodb.get(claims["consent_id"])
    if consent is None or consent["status"] != "Authorised":  # revoked/expired -> immediate fail
        return deny("consent not active")
    if not scope_covers(consent["permissions"], resource, action):
        return deny("consent does not grant this resource/action")
    return allow(consent_id=consent["id"], tpp=claims["client_id"])
```
Skipping the cnf comparison is the single most common — and most catastrophic — implementation bug: it turns the whole platform back into bearer-token security while looking compliant on paper.
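One way to guard against that regression is to isolate the proof-of-possession check in its own function and give it explicit negative cases. The sketch below assumes the claims come from an already signature-verified JWT; the `Denied` exception and the helper names are hypothetical, and the thumbprint strings are placeholders rather than real digests.

```python
import hmac


class Denied(Exception):
    """Raised when the authorizer refuses a request."""


def check_proof_of_possession(claims: dict, presented_thumbprint: str) -> None:
    """Deny unless the token's cnf thumbprint matches the presented certificate.

    `claims` is assumed to come from a JWT whose signature was already verified.
    """
    bound = claims.get("cnf", {}).get("x5t#S256")
    # compare_digest avoids leaking how many leading characters matched.
    if not bound or not hmac.compare_digest(bound, presented_thumbprint):
        raise Denied("token not bound to this client certificate")


def is_rejected(claims: dict, presented_thumbprint: str) -> bool:
    """Helper for tests: True when the check denies the request."""
    try:
        check_proof_of_possession(claims, presented_thumbprint)
    except Denied:
        return True
    return False


token_claims = {"client_id": "tpp-a", "cnf": {"x5t#S256": "thumbprint-of-tpp-a"}}
stolen_token_rejected = is_rejected(token_claims, "thumbprint-of-attacker")
unbound_token_rejected = is_rejected({"client_id": "tpp-a"}, "thumbprint-of-tpp-a")
legitimate_call_rejected = is_rejected(token_claims, "thumbprint-of-tpp-a")
```

Running the stolen-token and unbound-token cases in CI means a refactor that drops the comparison fails the build instead of shipping.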
Identity and secrets: federate the TPPs, kill the static keys. TPP clients authenticate with private_key_jwt and their scheme certificate; there are no shared client secrets to leak. Each resource Lambda runs with a least-privilege IAM role granting exactly the DynamoDB and PrivateLink access it needs and nothing else. The residual secrets that are not IAM — the core-banking client key, the response-signing key — live in HashiCorp Vault, leased dynamically with short TTLs and fetched at invocation, never written to an environment variable or layer.
Enterprise considerations
Security & Zero Trust. The architecture is Zero Trust toward every external caller by construction: dual authentication (certificate + sender-constrained token), least-privilege per function, no public path to core banking, and an authoritative consent check on every request. Layer on top: (a) AWS WAF rate-based rules per TPP as the first throttle against a compromised client; (b) Wiz running continuous CSPM and attack-path analysis across API Gateway, DynamoDB, S3 and IAM, alerting the moment a bucket drifts public or a role widens, with Wiz Code catching the same classes of issue in Terraform before merge; (c) CrowdStrike Falcon sensors on any ECS/EC2 control-plane workloads for runtime threat detection feeding the bank’s SOC; (d) a spike in authentication or cnf-mismatch failures auto-raising a ServiceNow incident, because a surge of proof-of-possession failures is a credible token-theft signal, not just noise. The certificate authorizer additionally checks the scheme directory’s CRL/OCSP so a TPP whose certificate the directory revokes is locked out on its next call.
Cost optimization. A serverless FAPI estate is cheap at rest and scales with TPP adoption, but a few levers matter.
| Lever | Mechanism | Typical effect |
|---|---|---|
| Authorizer result caching | Cache the token-authorizer decision per token for a short TTL | Cuts repeat DynamoDB reads and Lambda invocations on chatty TPPs |
| DynamoDB on-demand → provisioned | Start on-demand, move hot consent reads to provisioned + autoscaling | Lower unit cost once traffic is predictable |
| WAF before Lambda | Block abusive/oversized requests at the edge | Stops you paying Lambda + DynamoDB for junk traffic |
| Right-sized Lambda memory | Tune memory to the p95 of each function | Avoids over-provisioning every invocation |
| Tiered logging retention | Hot audit in queryable store, cold in S3/Glacier | Keeps the multi-year audit affordable |
Be deliberate with authorizer caching: cache the authorization decision only briefly, because the whole point of reading consent live is that revocation must be near-immediate — a long cache TTL trades that away and is the wrong call in this domain.
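The tradeoff can be seen in a toy decision cache: the TTL is the worst-case delay before a revocation bites. This is an illustrative in-process sketch with a fake clock, not API Gateway's own authorizer cache; the `DecisionCache` class and its parameters are hypothetical.

```python
import time
from typing import Callable


class DecisionCache:
    """Cache authorizer decisions for at most ttl_seconds.

    The TTL bounds how long a revoked consent can keep working.
    """

    def __init__(self, ttl_seconds: float, clock: Callable[[], float] = time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock
        self._entries = {}  # token id -> (cached_at, decision)

    def get_or_compute(self, token_id: str, compute: Callable[[], bool]) -> bool:
        now = self.clock()
        hit = self._entries.get(token_id)
        if hit is not None and now - hit[0] < self.ttl:
            return hit[1]  # served from cache, consent store not consulted
        decision = compute()
        self._entries[token_id] = (now, decision)
        return decision


# Fake clock and consent state so the behaviour is deterministic.
fake_now = [0.0]
consent_status = {"value": "Authorised"}


def live_consent_check() -> bool:
    return consent_status["value"] == "Authorised"


cache = DecisionCache(ttl_seconds=5, clock=lambda: fake_now[0])
first = cache.get_or_compute("token-1", live_consent_check)  # live read
consent_status["value"] = "Revoked"
fake_now[0] = 3.0
stale = cache.get_or_compute("token-1", live_consent_check)  # within TTL: still allowed
fake_now[0] = 6.0
fresh = cache.get_or_compute("token-1", live_consent_check)  # TTL elapsed: re-read
```

With a five-second TTL the revoked consent is still honoured at the three-second mark, which is why payments may warrant no caching at all.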
Scalability. Each plane scales independently. API Gateway and Lambda scale horizontally with TPP traffic; mind the account concurrency limit and request reserved concurrency for the payment-initiation functions so a flood of account-info reads cannot starve payments. DynamoDB scales on partition design — partitioning by consent id spreads load evenly and avoids a hot partition. The natural ceilings are regional Lambda concurrency and the throughput of the core-banking systems behind PrivateLink, which are almost always the real bottleneck; rate-limit per TPP at WAF and API Gateway so the legacy core is shielded from a thundering herd.
Failure modes, and what each one looks like. Name them before they page you.
- Truststore misconfiguration: a botched CA-bundle rotation makes every TPP fail the mTLS handshake with an opaque error and no application log. Mitigation: version the truststore in S3, roll forward by pinning `truststore_version`, and run a synthetic mTLS handshake check after every change.
- The `cnf` check is skipped or wrong: the platform looks compliant but accepts stolen tokens. Mitigation: make the proof-of-possession assertion a required, separately-tested unit in the authorizer, and add a negative test in CI that a token presented with the wrong certificate is rejected.
- Stale-consent leak: a revoked consent still works because the decision was cached too long. Mitigation: short authorizer-cache TTL (or none for payments) and consent read on every call.
- Core-banking outage: PrivateLink target is down; resource calls hang. Mitigation: tight timeouts, circuit-breaking in the resource function, and a clean `503` to the TPP rather than a silent hang.
- TPP abuse / token theft: one client floods the estate or replays credentials. Mitigation: per-TPP WAF rate limits, the `cnf` binding, CRL/OCSP revocation checks, and a ServiceNow incident on the failure-rate alert.
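For the core-banking outage case, the circuit-breaking mitigation might be sketched as follows. The class, thresholds, and response shape are hypothetical; in a Lambda function this state would live per execution environment, so treat it as an illustration of the pattern rather than a fleet-wide breaker.

```python
import time


class CircuitOpen(Exception):
    """Raised when calls are short-circuited instead of reaching core banking."""


class CircuitBreaker:
    def __init__(self, failure_threshold=3, reset_after=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_after = reset_after
        self.clock = clock
        self.failures = 0
        self.opened_at = None

    def call(self, fn):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_after:
                raise CircuitOpen("core banking unavailable")
            self.opened_at = None  # half-open: allow one trial call
        try:
            result = fn()
        except Exception:
            self.failures += 1
            if self.failures >= self.failure_threshold:
                self.opened_at = self.clock()
            raise
        self.failures = 0
        return result


def handle(breaker, core_call):
    """Return a clean 503 instead of hanging when core banking is down."""
    try:
        return {"statusCode": 200, "body": breaker.call(core_call)}
    except (CircuitOpen, TimeoutError):
        return {"statusCode": 503, "body": "core banking temporarily unavailable"}


calls = {"count": 0}


def failing_core_call():
    calls["count"] += 1
    raise TimeoutError("core banking did not answer in time")


breaker = CircuitBreaker(failure_threshold=3, reset_after=30.0)
responses = [handle(breaker, failing_core_call) for _ in range(5)]
```

After three timeouts the breaker opens, so the fourth and fifth requests are answered with a 503 without touching the struggling backend.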
Reliability & DR (RTO/RPO). Decide the numbers per tier. DynamoDB global tables give multi-region consent state with near-zero RPO and seconds RTO — and consent is the one piece of state you cannot afford to lose or diverge. API Gateway, Lambda, and WAF are regional but stateless, so DR is redeploying the stack (from Terraform) in a paired region and failing over DNS; with global tables already replicating consent, that failover is fast. A pragmatic target for this platform: RTO 15 minutes, RPO near-zero for the consent and authorization planes, with the audit trail durably replicated to S3 with object-lock so it is recoverable and tamper-evident regardless of regional state. Health checks at the edge drive failover for the public consent UI.
Observability. Instrument the FAPI flow span end to end in Dynatrace (or Datadog, per the team’s standard) with OpenTelemetry: one trace covering PAR → authorization → token issuance → resource call, with the consent id, TPP id, and outcome on each hop. Emit the metrics the business and the regulator actually care about — consent grant/revoke rates, authentication-failure and cnf-mismatch counts, payment-initiation success rate, per-TPP call volume and error rate, and p95 latency on the account and payment APIs (open-banking schemes publish performance SLAs the bank must meet and report). Anomaly detection on the auth-failure metric is an early token-theft tripwire, and a sustained breach auto-raises a ServiceNow incident so security has a ticket, not just a dashboard.
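As a deliberately simplified stand-in for the anomaly detection described here, a fixed-threshold sliding window over `cnf`-mismatch events shows the shape of the tripwire. The class name, window, and threshold are assumptions; a production setup would rely on the observability platform's own baselining rather than a hard-coded count.

```python
from collections import deque


class FailureSpikeDetector:
    """Flag when too many cnf-mismatch failures land inside a time window."""

    def __init__(self, window_seconds: float, threshold: int):
        self.window = window_seconds
        self.threshold = threshold
        self.events = deque()  # timestamps of recent failures, oldest first

    def record(self, timestamp: float) -> bool:
        """Record one failure; return True when the window count hits the threshold."""
        self.events.append(timestamp)
        # Drop failures that have aged out of the window.
        while self.events and self.events[0] <= timestamp - self.window:
            self.events.popleft()
        return len(self.events) >= self.threshold


detector = FailureSpikeDetector(window_seconds=60, threshold=3)
# Three failures within 20 seconds, then an isolated one much later.
alerts = [detector.record(t) for t in (0.0, 10.0, 20.0, 100.0)]
```

The third failure in the burst trips the alert, while the isolated failure at the 100-second mark does not, which is the distinction between a token-theft signal and background noise.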
Governance. Every TPP goes live only after a ServiceNow change approval that records the directory entry, the scopes granted, and sign-off — giving compliance a documented gate. Pin the FAPI profile settings in Okta as code so the security mechanics cannot drift, and keep the API schemas (account-information and payment-initiation, per the open-banking standard) version-controlled and contract-tested in CI. Wiz is the independent check that the controls — PITR, no public buckets, least-privilege IAM, encryption — are actually holding, and Wiz Code keeps them from regressing in Terraform. Internally, the platform team onboards through structured enablement — runbooks plus a short Moodle course on the FAPI flow, the consent model, and the incident playbook — so on-call engineers understand why the cnf check and PAR exist before they are paged about them. Where the bank fronts legacy partner integrations or non-cloud security controls, those run as virtual appliances in the VPC, kept in the same Terraform/Wiz governance envelope as everything else.
Explicit tradeoffs
Accept these or do not build it. FAPI is genuinely more complex than OAuth-with-a-bearer-token, and the complexity is load-bearing: mTLS plumbing, PAR, signed request and response objects, and proof-of-possession at the resource API are all moving parts you must implement and test negatively, because a silently-skipped check looks compliant while being insecure. Latency picks up a back-channel PAR hop and a per-call consent read — the price of integrity and immediate revocation. The mTLS trust-store lifecycle is operationally fiddly, and a bad rotation fails every TPP at once with an unhelpful error. And reading consent live on every request trades a little per-call cost and latency for the property that makes the whole thing trustworthy — near-immediate revocation — which in this domain is not optional.
The alternatives, and when they win. If you are not under an open-banking mandate and your APIs are first-party only, plain OAuth 2.0 with bearer tokens behind your gateway is simpler and entirely appropriate — FAPI’s machinery is overkill for internal traffic. If a managed open-banking platform-as-a-service (a scheme-certified vendor that hosts the FAPI authorization server and consent dashboard) fits your timeline and budget, it gets you to market faster than building this — at the cost of control over the consent UX and lock-in to their roadmap. And if you only need account information and will never touch payment initiation, you can drop the non-repudiation and signed-payment machinery and ship a meaningfully smaller surface. Graduate to this full self-built platform when payment initiation, scheme certification, scale, or sovereignty over the consent experience demand it.
The shape of the win
For the bank, the payoff is not “some APIs.” It is that a licensed TPP integrates against a certified, FAPI-compliant platform; a customer grants a budgeting app read-only access to two accounts for ninety days, sees exactly that on a consent screen rendered from a signed request, and revokes it in one tap with the next API call blocked instantly; and when a regulator’s auditor asks the bank to prove that a specific £4,000 payment was authorized by the named customer, the answer is a signed, non-repudiable record pulled from an immutable trail. Everything upstream — the mTLS-bound tokens, the PAR back channel, the JARM-signed responses, the live DynamoDB consent gate, the Vault-held core-banking keys, the Wiz posture scanning, the Dynatrace FAPI span — exists to make a regulator, a CISO, and a partner TPP each say yes. The architecture here is the destination; start with account information if you must, but a regulated, at-scale open-banking platform has to land here.