A commercial weather-and-climate data company has spent fifteen years building the thing nobody else has: a fused, quality-controlled feed of radar, satellite, lightning, and proprietary hyperlocal model output. Insurers price crop policies off it, logistics firms reroute fleets with it, and an agritech startup just asked to pull it into a farmer-facing app. The CEO’s mandate to the platform team is blunt: “We are a data company that ships PDFs and SFTP drops. Our competitors ship APIs that customers self-serve and we invoice automatically. Catch us up before renewal season.” The constraint is that the business cannot run on goodwill and spreadsheets — a partner who calls the forecast API two million times in a billing cycle must be metered to the call, rate-limited so one customer cannot starve another, onboarded without a sales engineer holding their hand, and billed in a way the finance team trusts at audit. This article is the reference architecture for turning that data feed into a monetized API product on Google Cloud’s Apigee — a self-service, identity-gated, metered, and billable platform that a CFO and a head of partnerships will both sign.
The pressures here are different from a typical internal API program. Revenue means the gateway is now a cash register: a dropped usage record is lost money, and a double-counted one is a refund and a trust problem. Self-service means developers must discover, try, and subscribe to an API without a human in the loop, or the funnel never fills. Multi-tenancy means hundreds of external partners on shared infrastructure, each isolated by quota so a runaway client cannot degrade everyone else. And trust means the usage that drives an invoice has to be metered at a point the finance team and an external auditor will both accept as the source of truth. API monetization is the pattern that satisfies all four: it wraps your backend in a managed gateway that authenticates callers, enforces per-plan quotas, meters every call, and emits the usage events a billing system turns into money.
Why not the obvious shortcuts
The naive fixes each fail in a way that costs revenue or a customer, and naming them matters because someone on the project will propose all three.
Counting requests in the backend’s own logs seems free — you already have the access logs. But application logs are lossy by design (sampling, rotation, the occasional crash before flush), they have no concept of a billable unit versus a health-check, and no auditor will accept “we grep our app logs” as the basis for an invoice. Building a homegrown API key check and a rate limiter in the application means every backend team reimplements auth, throttling, and metering slightly differently, and you still have no developer portal, no plan catalog, and no billing feed. Putting a plain load balancer or a thin proxy in front and billing on a flat monthly fee ignores actual consumption, so your heaviest users are subsidized by your lightest, and you cannot offer the pay-as-you-go and tiered plans partners now expect.
A monetization platform threads the needle. The gateway authenticates the developer’s credential, resolves it to an API product and the rate plan they subscribed to, enforces the plan’s quota and protects the backend with spike arrest, and records a metered usage event for every billable call at the one place all traffic must pass through. The developer portal turns onboarding into self-service. And the usage stream becomes the single, defensible input to billing — the same number the partner sees on their dashboard is the number on their invoice.
Architecture overview
The platform runs three planes that share infrastructure but live on different schedules, and keeping them distinct in your head is the first step to operating it well: a synchronous runtime plane where partner traffic is authenticated, throttled, and metered; a developer-experience plane where humans discover APIs, subscribe to plans, and watch their usage; and an asynchronous monetization plane where metered usage is aggregated, rated against plans, and turned into invoices.
The defining property of the whole topology is the one finance cares about most: every billable call is metered at the Apigee gateway, and the gateway is the only path to the backend. The backend services have no public ingress; they accept traffic exclusively from Apigee over private networking. There is no side door a partner could call to consume data without being counted, which is exactly what makes the usage record trustworthy enough to invoice from.
Runtime path, following the control flow:
- A partner’s application calls the published API host. Traffic first hits Akamai at the edge for TLS termination, global anycast routing, WAF, and bot mitigation — so credential-stuffing and scraping bots are absorbed before they ever reach the gateway or burn a partner’s quota.
- The request lands on Apigee (the API gateway and management plane on Google Cloud). The proxy’s first policies authenticate the caller — a VerifyAPIKey for simple plans or an OAuthV2 / VerifyJWT for partners using OAuth client credentials — and resolve the credential to the developer app, the API product it is subscribed to, and therefore its rate plan.
- Apigee enforces consumption controls in order: a SpikeArrest policy smooths bursts to protect the backend (e.g.
30ps— thirty requests per second, evened to one every ~33 ms), then a Quota policy enforces the plan’s allowance (e.g. 1,000,000 calls per month for the Growth tier), returning429with the remaining count when exhausted. - For any secrets the proxy needs that should not live in Apigee KVM — a backend mTLS client key, a partner-specific HMAC signing secret — a service callout retrieves them at runtime from HashiCorp Vault, leased dynamically, so sensitive material is short-lived and never baked into a proxy revision.
- The proxy routes to the backend — the weather data services running on GKE (a private cluster) or Cloud Run — over private Google networking, never the public internet. The backend returns the forecast payload.
- On the response, a DataCapture / monetization step records the billable unit. Apigee’s built-in monetization increments the usage counter for that developer + product + plan; in parallel, the proxy emits a structured usage event to Pub/Sub for the independent metering pipeline (below). The response — enriched with
X-RateLimit-Remainingheaders so the developer’s client can self-throttle — streams back through Akamai to the partner.
Developer-experience path, where the funnel lives: a developer arrives at the Apigee integrated developer portal (or a Drupal-based portal for heavier customization). They authenticate through Okta as the external-identity provider — Okta handles partner SSO, social/business login, MFA, and the B2B org structure — federated to the portal over OIDC so the portal sees a first-class identity without managing partner passwords. They browse the API catalog, read the OpenAPI-rendered reference, try calls in the live API console, register an app to mint credentials, pick a rate plan, and (for paid tiers) enter payment details. Crucially, self-service subscription is a control-plane action that provisions the developer app and its product association in Apigee, so the very next runtime call is already metered against the right plan.
Monetization path, asynchronous and the part that becomes money: the Pub/Sub usage stream lands in BigQuery as the immutable, append-only record of every billable call — partner, product, plan, timestamp, response class, and any custom dimension (data tiles served, forecast horizon requested). A scheduled job aggregates usage per developer per plan per cycle, rates it against the plan’s pricing (per-call, tiered, or revenue-share), and produces a billable line. That line is handed to the billing engine — Stripe for card-billed self-service developers, or an event into the enterprise billing/AR system for invoiced partners — and the resulting invoices and entitlement changes are reflected back so the portal dashboard, the customer’s invoice, and BigQuery all agree on the same number.
Component breakdown
| Component | Service / tool | Role in the platform | Key configuration choices |
|---|---|---|---|
| Edge | Akamai | TLS, anycast, WAF, bot mitigation before the gateway | Bot rules for scraping/credential-stuffing; origin shield to Apigee’s host |
| API gateway & mgmt | Apigee (GCP) | Auth, spike arrest, quota, routing, usage metering, monetization | API products + rate plans; SpikeArrest + Quota policies; private backend target |
| Developer identity | Okta (federated to portal) | Partner SSO, MFA, B2B orgs, social/business login | OIDC federation to portal; group claims drive plan visibility; no partner passwords stored |
| Developer portal | Apigee integrated portal / Drupal | Catalog, docs, live console, app registration, plan subscription | OpenAPI-rendered reference; self-service app + plan provisioning |
| Secrets | HashiCorp Vault | Backend mTLS keys, partner HMAC secrets, signing keys | Dynamic leases; short-lived; retrieved by proxy service callout, not stored in KVM |
| Backend | GKE (private) / Cloud Run | The weather/climate data services | Private ingress only from Apigee; no public endpoint |
| Usage stream | Pub/Sub | Decoupled, durable transport for metered usage events | At-least-once delivery; dedup key = request id |
| Usage record | BigQuery | Immutable append-only billable-event store and rating source | Partitioned by event date; clustered by developer/product |
| Rating & aggregation | Scheduled job (Cloud Run / Dataflow) | Aggregate → rate against plan → emit billable lines | Idempotent per cycle; reconciles against Apigee counters |
| Billing | Stripe + enterprise AR | Card billing (self-service) + invoiced enterprise partners | Metered/usage prices in Stripe; AR event for invoiced tier |
| CSPM / posture | Wiz + Wiz Code | Cloud posture, exposed-secret + attack-path detection; IaC scanning | Agentless scan of GCP project; Wiz Code blocks risky Terraform in PR |
| Runtime security | CrowdStrike Falcon | Runtime threat detection on GKE nodes and backend compute | Sensor on node pool; detections piped to the SOC |
| Observability | Dynatrace / Datadog | Distributed tracing, latency/error SLOs, business KPIs | OneAgent/agent on GKE; trace spans the proxy→backend hop; revenue dashboards |
| ITSM / approvals | ServiceNow | Plan-change approvals, partner onboarding, incident records | Change gate before a new public product/plan goes live; auto-ticket on metering drift |
| CI / IaC | GitHub Actions + Argo CD + Terraform | Proxy bundle build/test/deploy; GitOps; infra as code | WIF to GCP (no stored keys); promotion gates; Apigee config as code |
A few of these choices deserve the why, because they are the ones teams get wrong.
Why API products are the unit of monetization, not individual proxies. A common mistake is to sell access to a proxy. Apigee’s model is deliberately one level up: an API product is a bundle of API resources (specific paths/verbs across one or more proxies) plus the quota and access scope you want to sell, and a rate plan attaches pricing to that product. The same underlying forecast proxy can therefore appear in a free “Hobby” product (100 calls/day, current conditions only), a “Growth” product (1M calls/month, 48-hour forecast), and an “Enterprise” product (custom quota, full 14-day horizon, revenue-share plan) — without forking the backend. Package the business offer as a product; keep the proxy a clean technical artifact.
Why spike arrest and quota are different controls you need both of. Teams conflate them and then wonder why the backend still falls over. Quota is a business limit — “your plan allows a million calls this month” — counted over long windows and enforced for billing fairness. Spike arrest is an infrastructure protection — “no more than thirty per second, smoothed” — that shields the backend from a thundering herd or a buggy client retry-storm regardless of whether the monthly quota has room. A partner with 900,000 calls left in their quota can still take down your data service with a tight loop; spike arrest is what stops them.
<SpikeArrest name="SA-protect-backend">
<Rate>30ps</Rate> <!-- smoothed to ~1 req / 33ms -->
</SpikeArrest>
<Quota name="Q-enforce-plan">
<Allow countRef="verifyapikey.plan.quotaLimit"/> <!-- per-product plan limit -->
<Interval>1</Interval>
<TimeUnit>month</TimeUnit>
<Identifier ref="verifyapikey.developer.app.id"/> <!-- count per app -->
</Quota>
Why metering belongs at the gateway and is mirrored into BigQuery. Apigee’s own monetization counters are what enforce the plan in real time, but for billing you want an independent, immutable record you can re-rate, audit, and reconcile. Emitting each billable event to Pub/Sub → BigQuery gives finance an append-only ledger keyed by request id, so a re-run is idempotent and a disputed invoice can be traced to the exact call. The two systems cross-check each other: if the gateway’s counter and the BigQuery aggregate diverge, that is a metering incident — and it auto-raises a ServiceNow ticket before it becomes a customer’s refund.
Implementation guidance
Provision with Terraform and treat Apigee config as code. The reproducible parts — the Apigee organization/environment, the network peering to the backend VPC, the API products and rate plans, the portal, and the BigQuery dataset — are Terraform. Proxy bundles and policies live in Git and deploy through CI, never hand-edited in the console, so a rate-plan change is a reviewed pull request with an audit trail, not a click someone forgets.
- The Apigee org and environment(s), with private connectivity (PSC / VPC peering) to the backend VPC so the gateway reaches GKE/Cloud Run privately.
- The API products and their rate plans as declarative resources — quota limits, included resources, and pricing per tier.
- The developer portal, with Okta wired as the OIDC identity provider.
- The BigQuery dataset and the Pub/Sub topic/subscription for the metering pipeline.
- The backend on a private GKE cluster (or Cloud Run) with ingress locked to Apigee only.
A minimal rate-plan shape communicates the intent — a metered Growth tier with a per-call rate above an included bundle:
# growth-rate-plan (illustrative)
apiProduct: weather-forecast-growth
monetizationConfig:
billingType: PREPAID
fixedRecurringFee: { units: 199, period: MONTH } # $199/mo platform fee
includedTransactions: 1000000 # 1M calls included
rateCards:
- start: 1000001 # overage
fee: { nanos: 200000 } # $0.0002 / extra call
The pipeline that applies this runs in GitHub Actions, authenticating to Google Cloud via Workload Identity Federation so there is no stored service-account key to leak — a hard lesson the platform team intends never to repeat. Application and config promotion across environments is GitOps via Argo CD, so the live state of the cluster and the desired state in Git are continuously reconciled, and Terraform / Ansible handle the underlying infrastructure and any VM-based components (for example, partner-facing virtual appliances some enterprise customers require for private connectivity into the platform). Wiz Code scans the Terraform in the pull request and blocks a plan that would, say, expose the backend publicly or commit a secret, before it can merge.
Identity: federate the partners, never store their passwords. External developers authenticate through Okta, which owns partner SSO, MFA, the B2B organization model (a partner company with many developer users), and social/business login. The portal trusts Okta over OIDC, and the group/org claims in the token drive what each developer sees — a partner in the “enterprise” org sees enterprise products and plans; a self-service signup sees the public catalog. At runtime, partner applications authenticate to the gateway with their minted API key or OAuth client credentials, distinct from the human SSO that governs the portal. The secrets the platform itself needs — backend mTLS keys, partner HMAC secrets for webhook signing — live in HashiCorp Vault, leased dynamically and fetched by the proxy at call time, so nothing sensitive is frozen into a proxy revision or a key-value map.
Portal and docs wiring. Treat the OpenAPI spec as the contract and the portal’s source of truth: publish each product’s spec so the reference docs, the live “try it” console, and the SDK snippets are all generated from one artifact that ships in the same pull request as the proxy. Carry a stable, human-readable product name and a clear plan comparison on the catalog page, because the subscription funnel is a conversion surface, not an afterthought — a developer who cannot self-serve a trial key in two minutes churns before they ever generate a billable call. Use Moodle to host partner enablement: structured onboarding courses and a short certification (“Integrating the Forecast API”) that reduces support load and gives enterprise partners a credential their own auditors like to see.
Enterprise considerations
Security & Zero Trust. The platform is Zero Trust by construction: no public backend surface, every call authenticated and resolved to a plan at the gateway, least-privilege service identities. Layer on top: (a) Akamai WAF and bot mitigation at the edge to absorb credential-stuffing and scraping before they reach the gateway; (b) Apigee policies for JWT/OAuth verification, payload threat protection (JSON/XML bombs, injection), and per-app spike arrest as the abuse backstop; © HashiCorp Vault so backend and partner secrets are short-lived and never embedded in config; (d) Wiz running continuous CSPM and attack-path analysis across the GCP project, alerting the moment a resource drifts to public exposure or a service account is over-privileged, with Wiz Code shifting that check left into the pull request; (e) CrowdStrike Falcon sensors on the GKE node pool and backend compute for runtime threat detection feeding the SOC; (f) a metering anomaly or an abuse spike auto-raises a ServiceNow incident so security and finance get a ticket, not just a log line. Because the usage record drives revenue, its integrity is a security property, not just a billing one.
Cost optimization. Two cost surfaces matter and they pull in opposite directions: the platform’s own run cost, and the partner’s bill (which must be predictable enough to sell).
| Lever | Mechanism | Typical effect |
|---|---|---|
| Plan packaging | Free/Growth/Enterprise products off one proxy; overage rate cards | Captures heavy users without forking infrastructure |
| Response caching | Apigee ResponseCache for identical, non-personalized forecast tiles |
Cuts backend calls (and your compute) on hot data |
| Spike arrest + quota | Protect backend; cap runaway clients before they cost you | Avoids over-provisioning the data service for abuse |
| BigQuery partitioning | Partition usage by date, cluster by developer/product | Cheap rating queries even at billions of events |
| Right-sized backend | Autoscale GKE/Cloud Run on real metered demand | Pay for actual consumption, not peak guesswork |
Meter consumption per partner and per product, pipe the metric to Dynatrace / Datadog, and put revenue-per-product and cost-per-thousand-calls on the dashboard the CFO sees — the same telemetry that proves the platform’s gross margin.
Scalability. Each plane scales independently. Apigee’s managed runtime scales horizontally with traffic; the backend on GKE scales pods on concurrency and nodes via the cluster autoscaler, driven by real metered demand. The metering pipeline scales by design — Pub/Sub absorbs bursts, BigQuery ingests append-only at effectively unbounded volume, and the rating job runs on a schedule against partitioned data, so a 10× partner-growth quarter changes a query’s cost, not its architecture. The natural ceilings to plan for are Apigee environment throughput (add environments/regions behind Akamai) and any per-partner quota you have promised in contracts.
Failure modes, and what each one looks like. Name them before they page you.
- A lost usage event — Pub/Sub or the consumer drops a billable call, and a partner is under-billed. Mitigation: at-least-once delivery, request-id dedup so re-delivery is safe, and a daily reconciliation of gateway counters against the BigQuery aggregate that flags any gap.
- A double-counted event — a retry or redelivery bills a call twice, producing a refund and a trust hit. Mitigation: idempotent rating keyed on request id; the reconciliation job catches divergence either way.
- Quota counter staleness across regions — distributed quota can briefly let a partner slightly exceed their limit at the edges. Mitigation: accept a small tolerance for protection-style limits, and rate billing off the authoritative BigQuery record, not the live counter.
- Backend overload despite quota — a client with quota to spare hammers the backend. Mitigation: spike arrest sized to backend capacity, independent of monthly quota.
- Portal/identity outage — Okta or the portal is down, blocking new signups and logins. Mitigation: runtime auth (API keys/OAuth) is independent of the portal, so existing partners keep calling and billing continues even while onboarding is degraded.
- Regional outage — see DR below.
Reliability & DR (RTO/RPO). Decide the numbers per plane, and notice they differ. The runtime plane is the one partners feel: target RTO 15 minutes by running Apigee across regions behind Akamai health-checked failover, with the backend deployed in a paired region. The monetization plane can tolerate a longer RTO but demands a tight RPO — losing usage data is losing money — so Pub/Sub’s durability plus BigQuery as the system of record gives effectively RPO near zero for billable events; the rating job can always be re-run from the immutable ledger. A pragmatic target: runtime RTO 15 min, billing RPO ~0 with rating recomputable from BigQuery within the cycle.
Observability. Instrument the proxy→backend span end to end in Dynatrace / Datadog with distributed tracing, and emit the metrics the business cares about, not just infrastructure health: calls per product and per partner, quota-exhaustion (429) rate (a partner hitting their cap is an upsell signal), p95 gateway latency (the number partners hold you to in an SLA), revenue per product, and the reconciliation delta between gateway counters and the BigQuery ledger — the single most important operational metric, because a non-zero delta is money going wrong. A new public product or a rate-plan change passes through a ServiceNow change approval before it goes live, giving finance and partnerships a documented gate on anything that alters what customers are charged.
Governance. Version proxy bundles and rate plans in Git so a pricing change is reviewable and instantly revertable, and pin published OpenAPI specs so the contract a partner integrated against does not drift under them. Apply org policy to deny any backend resource created with public ingress, with Wiz as the independent check that the control holds. Keep an audit log of every plan subscription, plan change, and the metered events behind each invoice — both because finance needs it at audit and because a disputed bill must be traceable to the call. Communicate API deprecations on a published schedule through the portal; breaking a partner’s integration without notice is how a monetized program loses the revenue it just built.
Explicit tradeoffs
Accept these or do not build it. Monetization adds real moving parts — a product/plan catalog to maintain, a metering pipeline to keep lossless, a rating job to keep correct, and a billing integration to keep reconciled. The usage record is now financially load-bearing, which means a metering bug is a revenue or refund event, not a cosmetic one, and you will spend ongoing effort on reconciliation you would not need for a free internal API. The self-service portal is a product in its own right — docs, a live console, an onboarding funnel — that someone has to own. And the Okta-federated, multi-tenant, quota-isolated posture that lets you safely put hundreds of external partners on shared infrastructure is overhead you can skip for a five-partner pilot billed by hand, and absolutely cannot skip once invoices are generated automatically.
The alternatives, and when they win. If you have a handful of strategic partners and a salesperson who negotiates each contract, manual onboarding plus flat-fee billing is simpler and may be right until volume justifies automation — graduate to this platform when the partner count or the billing complexity outgrows a spreadsheet. If you are deep in a single cloud and want native gateway plus billing primitives, the equivalent managed stacks — AWS API Gateway usage plans with AWS Marketplace metering, or Azure API Management with its developer portal and an Azure Marketplace SaaS billing offer — are credible substitutes and worth choosing if your data and team already live there; Apigee earns its place when you want a best-in-class developer portal, mature built-in monetization, and a gateway that is comfortable spanning hybrid and multi-cloud backends. And if your goal is purely internal API governance with no external billing, you want an API management program, not a monetization one — skip the rating and billing planes entirely and keep the gateway, portal, and policies.
The shape of the win
For the weather-data company, the payoff is not “an API.” It is that the agritech startup’s developer lands on the portal at 11pm, reads the forecast reference, mints a trial key, makes a live call from the console inside two minutes, subscribes to the Growth plan with a card, and starts integrating — while the platform meters every call, enforces their quota, protects the backend, and, at the end of the cycle, hands finance a usage-backed invoice the startup’s own dashboard already agrees with. No sales engineer, no SFTP credentials, no spreadsheet. Everything upstream — the Akamai edge, the Apigee products and plans, the spike-arrest and quota policies, the Okta-federated portal, the Vault-held secrets, the Pub/Sub-to-BigQuery ledger, the Wiz posture scanning, the Dynatrace span, the reconciliation gate — exists so that a partner can self-serve their way to a billable integration, and so the CFO can trust the number on the invoice. That last clause is the one that turns a data feed into a business. Start narrower if you must, but this is where a real API monetization program has to land.