A relational database has a hard ceiling on concurrent connections, and that ceiling is far lower than the concurrency your serverless and container fleets can generate. A db.r6g.large PostgreSQL instance defaults to roughly 850 max_connections; a single Lambda burst can ask for thousands of fresh TCP sessions in seconds, each one a forked backend process with real memory cost. The database does not gracefully shed that load. It runs out of connection slots, new sessions get FATAL: remaining connection slots are reserved, and your healthy application starts throwing 500s because it cannot open a socket — not because the query was slow.
RDS Proxy sits between the application and the database, maintains a warm pool of database connections, and multiplexes a large number of short-lived client connections onto a small number of long-lived database connections. It also holds client sockets open during a failover and routes them to the new writer, which shrinks application-observed failover time dramatically. This article is the production build: provisioning, the connection pinning trap that silently destroys multiplexing, tuning the pool under load, IAM authentication, Lambda and VPC wiring, and the CloudWatch signals that tell you whether the proxy is actually helping.
1. The connection-exhaustion problem and what multiplexing buys you
The math is unforgiving. Postgres allocates a backend process per connection; even idle, each one holds its own process memory (catalog and plan caches, per-session buffers) and a slot in the shared process array. The relationship between application concurrency and database connections is the whole game:
- Without a proxy: every application instance opens its own pool. 200 Lambda containers x a pool of 5 = 1000 database connections demanded, most of them idle most of the time.
- With a proxy: 1000 client connections borrow from a shared pool. At any instant only the connections actively running a query need a backend; the rest return to the pool. The proxy can serve those 1000 clients from 50-100 database connections.
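The arithmetic in these two bullets can be sketched in a few lines of Python. The function names and the busy_percent parameter (the share of clients that are inside a transaction at the same instant) are illustrative assumptions, not measurements from a real workload.

```python
import math


def direct_connections(instances: int, pool_size: int) -> int:
    """Backends demanded when every instance keeps its own pool."""
    return instances * pool_size


def proxied_backends(clients: int, busy_percent: int) -> int:
    """Backends needed if only clients inside a transaction hold one.

    busy_percent is an assumed share of clients that are mid-transaction
    at the same instant; measure it for your own workload.
    """
    if not 0 <= busy_percent <= 100:
        raise ValueError("busy_percent must be between 0 and 100")
    return math.ceil(clients * busy_percent / 100)


demand = direct_connections(instances=200, pool_size=5)
print(demand)                          # 1000
print(proxied_backends(demand, 5))     # 50
print(proxied_backends(demand, 10))    # 100
```

The 50-100 range quoted above corresponds to 5-10% of clients being busy at once; a workload with longer transactions needs proportionally more backends.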
The proxy does this through transaction-level multiplexing: a database connection is borrowed from the pool when a client begins a transaction and returned when the transaction commits or rolls back. Between transactions the client holds no backend. This is the same idea as PgBouncer in transaction mode, but managed, VPC-native, and integrated with Secrets Manager and IAM.
The caveat that defines everything downstream: multiplexing only works when the database connection is safe to hand to a different client after each transaction. Some session state makes that unsafe, and the proxy responds by pinning — dedicating a backend to one client for the life of its session. Pinning is multiplexing’s off switch. Section 3 is about not tripping it.
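To make borrow, return, and pin concrete, here is a toy model in Python. It is a sketch of the behavior described above, not the proxy's implementation; the class ToyProxy and its method names are invented for illustration.

```python
class ToyProxy:
    """Toy model of transaction-level multiplexing with pinning."""

    def __init__(self, max_backends: int) -> None:
        self.free = list(range(max_backends))  # idle backend ids
        self.held = {}                         # client -> backend id
        self.pinned = set()                    # clients that keep their backend

    def begin(self, client: str) -> int:
        """Borrow a backend at transaction start (reuse it if already held)."""
        if client not in self.held:
            if not self.free:
                raise RuntimeError("pool exhausted: client would wait for a backend")
            self.held[client] = self.free.pop()
        return self.held[client]

    def set_session_state(self, client: str) -> None:
        """Session state (e.g. a runtime SET) makes the backend unsafe to share."""
        self.begin(client)
        self.pinned.add(client)

    def commit(self, client: str) -> None:
        """Return the backend to the pool unless the client is pinned."""
        if client in self.held and client not in self.pinned:
            self.free.append(self.held.pop(client))

    def disconnect(self, client: str) -> None:
        """A pinned backend is only released when the client disconnects."""
        self.pinned.discard(client)
        if client in self.held:
            self.free.append(self.held.pop(client))


proxy = ToyProxy(max_backends=2)
for name in ("a", "b", "c"):       # three clients share two backends
    proxy.begin(name)
    proxy.commit(name)
print(len(proxy.free))             # 2: every backend is back in the pool

proxy.set_session_state("a")       # client "a" pins a backend
proxy.commit("a")
print(len(proxy.free))             # 1: the pinned backend stays with "a"
```

With enough pinned clients the free list empties and begin() fails for everyone else, which is the degraded passthrough mode that section 3 warns about.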
RDS Proxy supports MySQL and PostgreSQL on both Aurora and RDS. It is a managed, autoscaling fleet inside your VPC; you pay per vCPU-hour of the underlying database instance class it fronts, which is why the cost trade-off in section 8 is real and not an afterthought.
2. Provisioning the proxy: secrets, IAM, and target groups
The proxy needs three things wired correctly: a Secrets Manager secret holding the database credentials it uses to open backend connections, an IAM role granting it read access to that secret, and a target pointing at your cluster or instance.
First, the secret. RDS Proxy authenticates to the database with credentials from Secrets Manager — never inline. The secret JSON must use these keys:
{
  "username": "app_proxy_user",
  "password": "REDACTED",
  "host": "prod-aurora.cluster-abc123.us-east-1.rds.amazonaws.com",
  "port": 5432,
  "engine": "postgres",
  "dbname": "appdb"
}
Store it:
aws secretsmanager create-secret \
--name prod/aurora/proxy-user \
--secret-string file://proxy-secret.json
The proxy’s IAM role needs to read that secret and decrypt it with the KMS key:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "ReadProxySecret",
      "Effect": "Allow",
      "Action": ["secretsmanager:GetSecretValue"],
      "Resource": "arn:aws:secretsmanager:us-east-1:111122223333:secret:prod/aurora/proxy-user-*"
    },
    {
      "Sid": "DecryptSecret",
      "Effect": "Allow",
      "Action": ["kms:Decrypt"],
      "Resource": "arn:aws:kms:us-east-1:111122223333:key/KMS-KEY-ID",
      "Condition": {
        "StringEquals": { "kms:ViaService": "secretsmanager.us-east-1.amazonaws.com" }
      }
    }
  ]
}
The role’s trust policy must allow rds.amazonaws.com to assume it. Now create the proxy. Terraform is the durable way to express this:
resource "aws_db_proxy" "aurora" {
  name                   = "prod-aurora-proxy"
  engine_family          = "POSTGRESQL"
  role_arn               = aws_iam_role.proxy.arn
  vpc_subnet_ids         = var.private_subnet_ids
  vpc_security_group_ids = [aws_security_group.proxy.id]
  require_tls            = true
  idle_client_timeout    = 1800  # seconds; reap idle clients
  debug_logging          = false # enable temporarily to capture SQL in pinning logs

  auth {
    auth_scheme = "SECRETS"
    secret_arn  = aws_secretsmanager_secret.proxy_user.arn
    iam_auth    = "REQUIRED" # force IAM token auth from clients
    description = "app proxy user"
  }
}

resource "aws_db_proxy_default_target_group" "aurora" {
  db_proxy_name = aws_db_proxy.aurora.name

  connection_pool_config {
    max_connections_percent      = 75
    max_idle_connections_percent = 50
    connection_borrow_timeout    = 120
  }
}

resource "aws_db_proxy_target" "aurora" {
  db_proxy_name         = aws_db_proxy.aurora.name
  target_group_name     = aws_db_proxy_default_target_group.aurora.name
  db_cluster_identifier = aws_rds_cluster.aurora.id # Aurora cluster (use db_instance_identifier for RDS)
}
Two distinctions that bite people:
- iam_auth = REQUIRED controls how clients authenticate to the proxy (IAM tokens). The secret_arn is how the proxy authenticates to the database. These are independent layers: the proxy always uses the secret for backend connections regardless of how clients authenticate.
- For an Aurora cluster target, the proxy automatically exposes a writer endpoint and supports adding read-only endpoints (section 5). For a single RDS instance, point at db_instance_identifier.
3. Connection pinning: the silent killer of multiplexing
Pinning is the single most important RDS Proxy concept to internalize. When a client does something that makes a backend connection unsafe to share, the proxy stops multiplexing that connection and dedicates it to the client until the client disconnects. Enough pinning and your “pooled” proxy degrades into a 1:1 passthrough — you pay for the proxy and get none of the multiplexing.
For PostgreSQL, common pinning triggers include:
- Using SET statements that change session-level parameters (e.g. SET search_path, SET TIME ZONE, SET application_name); these mutate session state the next client would inherit.
- Creating temporary tables (session-scoped objects).
- Holding session-level advisory locks (pg_advisory_lock), which outlive a transaction.
- Using prepared statements at the session protocol level (PostgreSQL PREPARE / unnamed-statement reuse via the extended query protocol), and LISTEN/NOTIFY.
- Any open transaction that spans the borrow window in a way the proxy cannot reset.
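A quick way to audit application SQL for these triggers is a pattern scan. The sketch below is a heuristic built only from the list above; the regexes and the function name find_pinning_risks are illustrative assumptions, not the rules the proxy itself applies, and the naive split on semicolons will misread semicolons inside string literals.

```python
import re

# Heuristic patterns for the pinning triggers listed in the text.
PINNING_PATTERNS = {
    "session SET": re.compile(r"^\s*SET\s", re.IGNORECASE),
    "temporary table": re.compile(
        r"\bCREATE\s+(?:TEMP|TEMPORARY)\s+TABLE\b", re.IGNORECASE),
    "session advisory lock": re.compile(
        r"\bpg_(?:try_)?advisory_lock(?:_shared)?\s*\(", re.IGNORECASE),
    "PREPARE": re.compile(r"^\s*PREPARE\s", re.IGNORECASE),
    "LISTEN": re.compile(r"^\s*LISTEN\s", re.IGNORECASE),
}


def find_pinning_risks(sql: str) -> list[str]:
    """Return the names of trigger patterns matched by any statement in sql."""
    hits = []
    for statement in sql.split(";"):   # naive split; fine for a quick audit
        for name, pattern in PINNING_PATTERNS.items():
            if pattern.search(statement) and name not in hits:
                hits.append(name)
    return hits


print(find_pinning_risks("SET search_path = orders; SELECT pg_advisory_lock(42);"))
# ['session SET', 'session advisory lock']
print(find_pinning_risks("SELECT pg_advisory_xact_lock(42)"))
# []
```

Run it over ORM-generated SQL captured in a test environment; treat a hit as a prompt to check the proxy's pinning logs, not as proof of pinning.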
You detect pinning two ways. The DatabaseConnectionsCurrentlySessionPinned CloudWatch metric shows how many connections are pinned right now; if it tracks close to your client connection count, multiplexing is effectively off. For root cause, enable proxy logging and read the pinning reason in the logs:
fields @timestamp, @message
| filter @message like /pinned/
| sort @timestamp desc
| limit 50
A pinning log line names the cause, for example a session variable being set, so you can map it back to a query pattern.
Practical fixes:
- Move static session config into the database role or postgres parameter group instead of per-session SET. For search_path, set it on the role with ALTER ROLE app_user SET search_path = ... so the proxy does not see a runtime SET.
- Prefer server-side functions or schema-qualified names over SET search_path at runtime.
- For drivers/ORMs, disable server-side prepared statements when you need maximum multiplexing. With the PostgreSQL JDBC driver set prepareThreshold=0; with libraries built on the extended protocol, force simple query mode. This is a genuine trade-off (you lose plan caching), so measure before disabling.
- Keep transactions short and avoid session-scoped temp tables in hot paths; use CTEs or ON COMMIT DROP temp tables only where unavoidable, knowing they pin.
Pinning is not a bug, it is correctness. The proxy refuses to leak one client’s session state into another’s. The job is to write query patterns that do not require session state to survive across transactions.
4. Tuning the pool: MaxConnectionsPercent, idle, and borrow timeout
Three knobs in connection_pool_config govern behavior under load.
max_connections_percent caps the proxy’s database connections as a percentage of the target’s max_connections. At 75% against an 850-slot instance, the proxy will open up to ~637 backends. Leave headroom: administrative tools, replication, and direct connections also consume slots. If you front one database with multiple proxies or also allow direct app connections, their percentages must sum to under 100 with margin.
max_idle_connections_percent sets how many backends the proxy keeps warm when demand drops. Higher means faster response to the next burst (no cold connect) at the cost of holding idle backends. For spiky serverless traffic, keeping this meaningfully above zero (e.g. 50%) avoids re-establishing connections on every burst; for steady traffic you can run it lower to free slots. It must be less than or equal to max_connections_percent.
connection_borrow_timeout is how long a client waits for a backend when the pool is saturated before the proxy returns an error. This is your backpressure valve. Under a connection storm with the pool maxed, clients queue here. A short timeout fails fast and sheds load (good for Lambda, where a hung invocation burns billed duration and concurrency); a longer timeout absorbs brief spikes without errors. Tune it against your client’s own timeout so the proxy errors before the client gives up, giving you a clean signal.
A starting point for a Lambda-heavy workload:
connection_pool_config {
  max_connections_percent      = 75
  max_idle_connections_percent = 50
  connection_borrow_timeout    = 30 # fail fast; let Lambda retry rather than hang
}
The signal to watch while tuning is DatabaseConnectionsBorrowLatency (section 8). If it climbs, clients are queuing for backends and you are either pinning too aggressively or max_connections_percent is too low for the offered load.
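The constraints from this section are easy to check before you apply a config. The following Python sketch encodes them; the helper names, the client_timeout_s parameter, and the assumption that the backend cap rounds down are mine, not documented proxy behavior.

```python
def backend_cap(max_connections: int, max_connections_percent: int) -> int:
    """Upper bound on backends the proxy may open (rounding down is assumed)."""
    return max_connections * max_connections_percent // 100


def check_pool_config(max_pct: int, idle_pct: int, borrow_timeout_s: int,
                      client_timeout_s: int, other_pct: int = 0) -> list[str]:
    """Return human-readable problems; an empty list means the checks pass.

    other_pct is the share of max_connections claimed by other proxies or
    direct connections to the same database (an assumed input).
    """
    problems = []
    if idle_pct > max_pct:
        problems.append("max_idle_connections_percent exceeds max_connections_percent")
    if max_pct + other_pct >= 100:
        problems.append("combined connection share leaves no headroom")
    if borrow_timeout_s >= client_timeout_s:
        problems.append("client gives up before the proxy reports a borrow timeout")
    return problems


print(backend_cap(850, 75))                                    # 637
print(check_pool_config(75, 50, 30, client_timeout_s=45))      # []
print(len(check_pool_config(75, 80, 120, client_timeout_s=60, other_pct=30)))  # 3
```

For Lambda, client_timeout_s would typically be the function timeout or the driver's connect timeout, whichever fires first.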
5. Failover acceleration and reader endpoints
The second reason to run RDS Proxy — independent of pooling — is failover behavior. During an Aurora or RDS Multi-AZ failover, the writer’s DNS flips to a new instance. Applications connecting directly must detect the broken connection, re-resolve DNS (subject to TTL and cached resolvers), and reconnect — and a fleet doing this simultaneously is a reconnect storm that can knock over the freshly promoted writer.
RDS Proxy changes the failure mode. It holds the client’s connection open, absorbs the reconnect against the database internally, and routes the held client connection to the new writer once promotion completes. The application sees a brief pause on in-flight transactions rather than a flood of connection errors. This is the single biggest lever for shrinking application-observed failover time, and it is why the related Aurora HA build puts the proxy in front of the writer as a baseline.
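Clients should still retry the operations that were in flight when the pause hit. A common general-purpose pattern, not something this article's setup prescribes, is exponential backoff with full jitter so that a fleet does not retry in lockstep. The helper names and default values below are illustrative assumptions.

```python
import random
import time


def backoff_delays(count: int, base: float = 0.1, cap: float = 5.0,
                   rng=random.random):
    """Yield 'full jitter' delays: uniform in [0, min(cap, base * 2**n))."""
    for n in range(count):
        yield rng() * min(cap, base * 2 ** n)


def call_with_retry(fn, retriable=(ConnectionError,), attempts: int = 5,
                    sleep=time.sleep):
    """Call fn(), retrying on retriable errors; re-raise after the last attempt."""
    delays = list(backoff_delays(attempts - 1))
    for attempt in range(attempts):
        try:
            return fn()
        except retriable:
            if attempt == attempts - 1:
                raise
            sleep(delays[attempt])


# Example: a function that fails twice, then succeeds.
calls = {"n": 0}

def flaky():
    calls["n"] += 1
    if calls["n"] < 3:
        raise ConnectionError("connection reset")
    return "ok"

print(call_with_retry(flaky, sleep=lambda s: None))   # ok
```

With psycopg2 you would pass retriable=(psycopg2.OperationalError,) and make sure fn is safe to repeat, since a transaction interrupted by failover may or may not have committed.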
For Aurora clusters, create a read-only proxy endpoint so read traffic uses the proxy too and benefits from the same pooling and failover handling:
aws rds create-db-proxy-endpoint \
--db-proxy-name prod-aurora-proxy \
--db-proxy-endpoint-name prod-aurora-proxy-ro \
--target-role READ_ONLY \
--vpc-subnet-ids subnet-aaa subnet-bbb
Route writes at the default (read-write) endpoint and reads at the READ_ONLY endpoint. Two operational notes:
- Read-only proxy endpoints require the Aurora cluster to have at least one reader; with no reader, the read-only endpoint has no targets.
- The proxy does not magically make stale reads consistent — reader endpoints still serve from replicas with replica lag. Pooling does not change replication semantics.
6. Enforcing TLS and IAM database authentication
Long-lived database passwords sitting in application config or environment variables are the credential you most want to eliminate. RDS Proxy lets you replace them with IAM authentication: the application calls AWS to mint a short-lived (15-minute) token and uses it as the database password. No static secret in the app; access is governed by IAM and fully logged.
With iam_auth = REQUIRED and require_tls = true on the proxy, the path is:
- The application’s IAM principal must hold rds-db:connect for the specific database user it logs in as:
{
  "Version": "2012-10-17",
  "Statement": [{
    "Effect": "Allow",
    "Action": "rds-db:connect",
    "Resource": "arn:aws:rds-db:us-east-1:111122223333:dbuser:prx-0abc123def456/app_user"
  }]
}
The resource ARN uses the proxy resource ID (prx-..., from describe-db-proxies), not the database instance ID, and the trailing segment is the database username. Scope it to exactly the user(s) the principal may assume.
- The application generates a token at connect time and uses it as the password:
export PGPASSWORD="$(aws rds generate-db-auth-token \
--hostname prod-aurora-proxy.proxy-abc123.us-east-1.rds.amazonaws.com \
--port 5432 \
--username app_user \
--region us-east-1)"
psql "host=prod-aurora-proxy.proxy-abc123.us-east-1.rds.amazonaws.com \
port=5432 user=app_user dbname=appdb sslmode=verify-full \
sslrootcert=/etc/ssl/certs/rds-combined-ca-bundle.pem"
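- The database user must exist with a password the proxy can present. IAM authentication here covers only the client-to-proxy hop; as section 2 notes, the proxy always logs in to the database with the credentials from its secret, so a secret registered on the proxy must carry this same username and password. Do not grant rds_iam to this user: once a PostgreSQL role has rds_iam, it authenticates only via IAM tokens, which would reject the proxy’s password login. Reserve rds_iam for users that connect to the database directly with IAM:
CREATE USER app_user WITH PASSWORD 'REDACTED'; -- same value as the Secrets Manager secret for app_user
GRANT CONNECT ON DATABASE appdb TO app_user;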
Use sslmode=verify-full (not require) with the RDS CA bundle so the client verifies the proxy’s certificate and hostname, defeating man-in-the-middle. The token is generated with SigV4 — possession of it requires valid IAM credentials at that moment, and it expires in 15 minutes, so a leaked token is short-lived. The result: the app holds no database password, only an IAM identity.
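The ARN rule from the first step (proxy resource ID, then database username) is a frequent source of silent access-denied errors, so it can help to generate the policy rather than hand-edit it. A minimal sketch, with hypothetical helper names and the example identifiers used above:

```python
import json


def rds_db_connect_arn(region: str, account_id: str,
                       proxy_resource_id: str, db_user: str) -> str:
    """Build the rds-db:connect resource ARN for a user behind a proxy."""
    if not proxy_resource_id.startswith("prx-"):
        raise ValueError("expected a proxy resource ID (prx-...)")
    return f"arn:aws:rds-db:{region}:{account_id}:dbuser:{proxy_resource_id}/{db_user}"


def connect_policy(resource_arns: list[str]) -> str:
    """Render an IAM policy document scoped to the given database users."""
    return json.dumps({
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            "Action": "rds-db:connect",
            "Resource": resource_arns,
        }],
    }, indent=2)


arn = rds_db_connect_arn("us-east-1", "111122223333",
                         "prx-0abc123def456", "app_user")
print(connect_policy([arn]))
```

The prefix check catches the most common mistake, pasting a database instance resource ID where the proxy resource ID belongs.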
7. Lambda integration, VPC networking, and cold-start storms
RDS Proxy and Lambda are made for each other, but the wiring has sharp edges.
VPC and security groups. The proxy lives in private subnets. The chain of security groups must allow: Lambda SG -> proxy SG on 5432, and proxy SG -> database SG on 5432. The proxy’s own SG is the trust boundary for the database; the database should accept connections from the proxy SG, not from the Lambda SG directly, so that the proxy is the only path in.
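The security group chain can be expressed as a small check. This Python sketch models rules as plain tuples rather than calling any AWS API; the Rule shape, the function name, and the sg-... identifiers are hypothetical.

```python
Rule = tuple[str, str, int]  # (source SG, destination SG, TCP port)


def check_sg_chain(rules: set[Rule], lambda_sg: str, proxy_sg: str,
                   db_sg: str, port: int = 5432) -> list[str]:
    """Verify Lambda -> proxy -> database, with no Lambda -> database shortcut."""
    problems = []
    if (lambda_sg, proxy_sg, port) not in rules:
        problems.append("Lambda SG cannot reach the proxy SG")
    if (proxy_sg, db_sg, port) not in rules:
        problems.append("proxy SG cannot reach the database SG")
    if (lambda_sg, db_sg, port) in rules:
        problems.append("Lambda SG can bypass the proxy")
    return problems


rules = {("sg-lambda", "sg-proxy", 5432), ("sg-proxy", "sg-db", 5432)}
print(check_sg_chain(rules, "sg-lambda", "sg-proxy", "sg-db"))   # []
```

The third check is the one that matters for the trust boundary: a leftover direct rule lets a misconfigured function skip the pool entirely.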
Get the connection lifecycle right. The classic Lambda mistake is opening a connection inside the handler and never reusing it. Open the connection (and fetch the IAM token) in the module/init scope so it is reused across warm invocations on the same execution environment:
import os

import boto3
import psycopg2

rds = boto3.client("rds")
HOST = os.environ["PROXY_HOST"]
USER = os.environ["DB_USER"]


def _connect():
    # Mint a short-lived IAM token and use it as the password.
    token = rds.generate_db_auth_token(
        DBHostname=HOST, Port=5432, DBUsername=USER, Region=os.environ["AWS_REGION"]
    )
    connection = psycopg2.connect(
        host=HOST, port=5432, user=USER, dbname=os.environ["DB_NAME"],
        password=token, sslmode="verify-full",
        sslrootcert="/var/task/rds-combined-ca-bundle.pem",
    )
    # psycopg2 opens a transaction implicitly on the first statement; autocommit
    # keeps a warm environment from idling in a transaction and holding a backend.
    connection.autocommit = True
    return connection


def _query(connection):
    with connection.cursor() as cur:
        cur.execute("SELECT 1")
        return cur.fetchone()[0]


conn = _connect()  # module scope: reused across warm invocations


def handler(event, context):
    global conn
    try:
        return _query(conn)
    except (psycopg2.OperationalError, psycopg2.InterfaceError):
        conn = _connect()  # reconnect on a dropped or closed connection (e.g. after failover)
        return _query(conn)
Why the proxy specifically helps cold starts. When Lambda scales out hard, hundreds of new execution environments each open a connection in init. Without the proxy that is a direct connection storm against the database. The proxy absorbs it: client connections land on the proxy and borrow from the warm pool, so the database sees a bounded number of backends no matter how wide Lambda fans out. Keep max_idle_connections_percent high enough that the warm pool is ready for the next burst rather than cold-connecting under the spike.
Token, not password, in the environment. Note there is no PGPASSWORD env var holding a secret — only the IAM token minted per connection. The Lambda execution role carries the rds-db:connect permission from section 6.
Enterprise scenario
A retail platform team ran order-processing on Lambda against a provisioned Aurora PostgreSQL cluster (db.r6g.2xlarge, ~3400 max_connections). They had already put RDS Proxy in front of the writer and considered the connection problem solved. During a flash sale, throughput tripled, and they watched DatabaseConnectionsBorrowLatency spike into the seconds while the database’s own connection count sat far below capacity — the pool was saturated even though the database was not. Clients were timing out on connection_borrow_timeout and Lambda was retrying, amplifying the load.
The constraint: they could not simply raise max_connections_percent, because the backends were not idle and available — they were pinned. DatabaseConnectionsCurrentlySessionPinned was tracking nearly 1:1 with client connections, so the proxy had silently degraded to passthrough and every Lambda invocation was effectively holding a dedicated backend for its whole lifetime.
Enabling debug logging surfaced the cause in the pinning logs: the ORM issued SET search_path on every connection, and an audit path used a session-level advisory lock. Both pin. The fix was three moves, none of them “buy a bigger database”:
-- 1. Move search_path off the runtime SET and onto the role so the proxy never sees it
ALTER ROLE app_user SET search_path = orders, public;

# 2. Restore headroom now that connections multiplex again
connection_pool_config {
  max_connections_percent      = 80
  max_idle_connections_percent = 60
  connection_borrow_timeout    = 20 # fail fast, let Lambda retry, shed load cleanly
}
They also replaced the session advisory lock with a transaction-scoped pg_advisory_xact_lock, which releases at commit and does not pin. After the change, DatabaseConnectionsCurrentlySessionPinned dropped to near zero, borrow latency fell back into single-digit milliseconds, and the same cluster absorbed the next sale at double the concurrency without touching the instance class. The lesson the team took away: with RDS Proxy, the metric that predicts an outage is pinning, not CPU.
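The advisory-lock swap is the least obvious of the three moves, so here is what it might look like from application code. This is a sketch assuming a psycopg2-style connection where `with conn:` wraps one transaction (commit on success, rollback on error); the function name and lock key are hypothetical.

```python
def run_with_xact_lock(conn, lock_key: int, work):
    """Run work(cursor) while holding a transaction-scoped advisory lock.

    pg_advisory_xact_lock is released automatically at commit or rollback,
    so no session-level lock outlives the transaction.
    """
    with conn:                       # one transaction: commit or roll back
        with conn.cursor() as cur:
            cur.execute("SELECT pg_advisory_xact_lock(%s)", (lock_key,))
            return work(cur)
```

A caller would pass the audit write as work, for example a function that executes the INSERT on the supplied cursor. Because the lock and the write share one transaction, the backend goes back to the pool as soon as the block exits.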
Verify
Confirm the proxy is healthy, multiplexing, and enforcing auth before you route production traffic.
Target health and endpoints:
aws rds describe-db-proxy-targets --db-proxy-name prod-aurora-proxy \
--query "Targets[].{Target:RdsResourceId,State:TargetHealth.State,Reason:TargetHealth.Reason}"
aws rds describe-db-proxy-endpoints --db-proxy-name prod-aurora-proxy \
--query "DBProxyEndpoints[].{Name:DBProxyEndpointName,Role:TargetRole,Status:Status}"
TargetHealth.State should be AVAILABLE. An UNAVAILABLE target with reason AUTH_FAILURE means the secret credentials are wrong or the role cannot read the secret.
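If you run this check from a deployment script, the same response can be filtered in Python. The sketch below works on the dictionary shape returned by the describe-db-proxy-targets call (Targets, RdsResourceId, TargetHealth.State, TargetHealth.Reason); the function name is hypothetical and the sample data is made up.

```python
def unhealthy_targets(response: dict) -> list[tuple[str, str, str]]:
    """Return (resource id, state, reason) for every target not AVAILABLE."""
    bad = []
    for target in response.get("Targets", []):
        health = target.get("TargetHealth", {})
        state = health.get("State", "UNKNOWN")
        if state != "AVAILABLE":
            bad.append((target.get("RdsResourceId", "?"), state,
                        health.get("Reason", "")))
    return bad


# Made-up sample in the shape of the API response.
sample = {
    "Targets": [
        {"RdsResourceId": "writer-1", "TargetHealth": {"State": "AVAILABLE"}},
        {"RdsResourceId": "reader-1",
         "TargetHealth": {"State": "UNAVAILABLE", "Reason": "AUTH_FAILURE"}},
    ]
}
print(unhealthy_targets(sample))   # [('reader-1', 'UNAVAILABLE', 'AUTH_FAILURE')]
```

In a real script the response would come from boto3.client("rds").describe_db_proxy_targets(DBProxyName="prod-aurora-proxy"); a non-empty result is a reason to stop the rollout.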
IAM auth path works (and direct password auth is rejected):
# Should succeed with a freshly minted token over TLS
PGPASSWORD="$(aws rds generate-db-auth-token --hostname $PROXY_HOST --port 5432 \
--username app_user --region us-east-1)" \
psql "host=$PROXY_HOST user=app_user dbname=appdb sslmode=verify-full \
sslrootcert=rds-combined-ca-bundle.pem" -c "select current_user;"
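Multiplexing is actually happening. Generate load, then compare client connections to database connections; the ratio is the proof. These are CloudWatch metrics in the AWS/RDS namespace under the ProxyName dimension, not log fields, so pull them from CloudWatch (the date syntax below assumes GNU date):
for m in ClientConnections DatabaseConnectionsCurrentlyBorrowed DatabaseConnectionsCurrentlySessionPinned; do
  aws cloudwatch get-metric-statistics --namespace AWS/RDS --metric-name "$m" \
    --dimensions Name=ProxyName,Value=prod-aurora-proxy --statistics Maximum --period 60 \
    --start-time "$(date -u -d '-15 minutes' +%FT%TZ)" --end-time "$(date -u +%FT%TZ)"
done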
ClientConnections should be well above DatabaseConnectionsCurrentlyBorrowed, and DatabaseConnectionsCurrentlySessionPinned should be near zero. If pinned tracks client connections, you are not multiplexing — go back to section 3.
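Once you have the three numbers, the judgment can be made mechanical. In this Python sketch the function name and the 10% pinned-share threshold are assumptions chosen for illustration; pick a threshold that matches your own tolerance.

```python
def multiplexing_report(clients: float, borrowed: float, pinned: float,
                        max_pinned_share: float = 0.10) -> dict:
    """Summarize whether the proxy is multiplexing, from three metric values."""
    ratio = clients / borrowed if borrowed else float("inf")
    pinned_share = pinned / clients if clients else 0.0
    return {
        "clients_per_backend": ratio,
        "pinned_share": pinned_share,
        "multiplexing": ratio > 1.0 and pinned_share < max_pinned_share,
    }


healthy = multiplexing_report(clients=1000, borrowed=50, pinned=5)
print(healthy["clients_per_backend"], healthy["multiplexing"])    # 20.0 True

degraded = multiplexing_report(clients=1000, borrowed=980, pinned=970)
print(degraded["pinned_share"], degraded["multiplexing"])         # 0.97 False
```

The second case is the passthrough pattern from the enterprise scenario: almost one backend per client and nearly every one of them pinned.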
Failover behaves. Trigger a controlled failover and watch the application:
aws rds failover-db-cluster --db-cluster-identifier prod-aurora
Through the proxy you should see a brief stall on in-flight transactions and a fast recovery, not a flood of connection-refused errors across the fleet.