The first thing every IoT architecture diagram on Google Cloud gets wrong is the box on the far left. It says “IoT Core,” and IoT Core has not existed since August 2023. Google retired its managed device-connectivity service — the MQTT broker, the device registry, the per-device auth — and did not replace it. So the modern Google Cloud IoT reference architecture starts with a hole where the front door used to be, and the single most important design decision you make is how you fill it: which MQTT broker terminates a million long-lived device connections, authenticates each device, and bridges their telemetry into Pub/Sub — because everything to the right of that broker is the part Google does superbly and the part this article spends most of its words on.
The second thing those diagrams get wrong is treating IoT like clickstream. It is not. Clickstream is bursty human traffic that you analyse after the fact. A device fleet is millions of always-on emitters producing dense, regular time-series, and the questions you ask of it are two completely different shapes: “show me the last reading and current status of this one asset, right now, in single-digit milliseconds” (a point lookup against a key), and “chart the temperature trend of these 50,000 assets over the last quarter and tell me which models are drifting” (an analytical scan over history). Those two access patterns do not belong in the same store. The whole architecture below exists to fan a single ingest stream into Bigtable for the hot point-lookup and BigQuery for the analytical scan — and to do command-and-control back down to the devices on the same backbone. This is the reference, built end to end on a broker tier, Pub/Sub, Dataflow, Bigtable, BigQuery, and Looker.
The business scenario
The shape of the problem is identical whether you operate 800 connected devices or 8 million; only the broker sizing and the autoscaling ceilings change. So picture an operator who manufactures and runs physical things in the field that emit data and currently cannot see them in aggregate or in real time.
A commercial refrigeration and HVAC operator is the clean example; the same pattern fits connected vehicles, smart meters, industrial machinery, agricultural sensors, or medical devices. They have tens of thousands of refrigeration units deployed across supermarkets, restaurants, and cold-chain warehouses. Each unit has a controller that knows its compressor temperature, door-open events, power draw, defrost cycles, and fault codes. Today that data lives on the unit. A technician sees it only on a site visit. When a compressor is about to fail, the first signal the business gets is a 2 a.m. phone call about a freezer full of spoiled stock, a loss that can run into lakhs per incident.
What the business actually needs comes down to five requirements; the first two are the access shapes that drive the entire design:
- Operational, per-device, now. A field-service dispatcher needs to pull up unit #4471 in the Pune warehouse and see its current temperature and live status in milliseconds, and the on-call platform needs to alert the instant a unit crosses a fault threshold. This is a key lookup and a last-value read, not an analytical query.
- Analytical, fleet-wide, over time. Reliability engineering needs to chart degradation curves across every unit of a given compressor model over 18 months, correlate failures with ambient conditions, and feed a predictive-maintenance model. This is a partitioned scan over billions of rows.
- Command-and-control, downlink. Having seen a problem, an operator needs to act on the device — push a new temperature setpoint, trigger a manual defrost, schedule a firmware update — and know the device acknowledged it. Data does not only flow up.
- Correctness under field conditions. Devices go offline and replay a backlog when connectivity returns, so events arrive late, out of order, and duplicated. Clocks drift. Payloads occasionally corrupt. The pipeline must not drop, double-count, or stall.
- Cost that scales with a fleet, not a demo. At millions of devices and billions of readings a day, storage layout and retention tiers are the difference between a sane bill and an absurd one.
The architecture below meets all five with managed, mostly serverless Google Cloud services behind one self-managed broker tier — so a small team runs a fleet of thousands without a streaming-infrastructure group, and the same design scales to millions of devices without a redesign.
Architecture overview
Read the data path as up (telemetry ingest), across (fan-out to the hot store, the warehouse, and an alerts topic), and down (command-and-control), with one connectivity tier you operate and everything else managed.
The connectivity tier (devices → MQTT broker → Pub/Sub). Because IoT Core is gone, devices connect over MQTT to a broker you run: the managed ClearBlade IoT Core (a drop-in successor that keeps the IoT Core device registry and MQTT API), or a self-hosted EMQX / HiveMQ cluster on GKE for full control. The broker is the part that terminates a million long-lived TLS connections, authenticates each device by per-device certificate or JWT, enforces per-device topic ACLs, and handles last-will/keep-alive for unreliable links. A thin bridge — the broker’s native Pub/Sub connector, or a sidecar — republishes each device’s telemetry onto Pub/Sub topics keyed by message type (telemetry, events, state). From this point on, the workload is pure Google Cloud and the broker is just an edge that feeds the log.
Stage 1 — Durable ingest (Pub/Sub). Pub/Sub is the decoupling buffer: it absorbs the thundering herd when a regional network blip reconnects thousands of devices at once, retains messages (default 7 days, up to 31) so any consumer can replay, and lets multiple downstream consumers read the same stream independently. Device messages carry the device ID and an event timestamp as attributes so downstream can order by event time, not arrival time. Pub/Sub schemas (Avro/Protobuf) attached to topics reject malformed payloads at publish.
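As a rough illustration of the attribute convention described above, the sketch below builds the payload and attributes a bridge might hand to a Pub/Sub publish call. The attribute names device_id and sample_time_ms and the JSON payload shape are assumptions for illustration, not something Pub/Sub or any broker prescribes; the one fixed constraint used here is that Pub/Sub attribute values are strings.

```python
import json


def build_pubsub_message(device_id: str, sample_time_ms: int, reading: dict):
    """Return (data, attributes) in the shape a Pub/Sub publish call expects.

    The attribute names are hypothetical; pick whatever your bridge emits.
    """
    if not device_id:
        raise ValueError("device_id is required")
    if sample_time_ms < 0:
        raise ValueError("sample_time_ms must be non-negative")
    # Payload: the reading itself, serialised deterministically.
    data = json.dumps(reading, sort_keys=True).encode("utf-8")
    # Attribute values must be strings, so the timestamp travels as text.
    attributes = {
        "device_id": device_id,
        "sample_time_ms": str(sample_time_ms),
    }
    return data, attributes


def event_time_ms(attributes: dict) -> int:
    """Recover the device's sample time (event time), not the publish time."""
    return int(attributes["sample_time_ms"])
```

Carrying the sample time as an attribute lets a downstream consumer order by event time without parsing the payload first.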
Stage 2 — Stream processing and fan-out (Dataflow). A single Dataflow streaming job (Apache Beam) subscribes to the telemetry topics and is the brain of the pipeline. It parses and validates, deduplicates on (device_id, sample_time) so a device’s reconnect-replay doesn’t double-count, resolves event-time windowing with watermarks and allowed lateness so a reading that was buffered on the device for an hour lands in the correct historical window rather than “now,” enriches each reading with device metadata (model, site, install date, warranty tier) joined from a reference store, computes rolling aggregates and threshold breaches, and then fans the same record out to three sinks: it writes the latest reading and live status to Bigtable for millisecond point lookups, appends the full validated history to BigQuery via the Storage Write API (exactly-once), and emits alert events back onto a Pub/Sub topic when a threshold is crossed. Anything that fails validation goes to a dead-letter topic and a BigQuery errors table.
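The deduplication step can be sketched in plain Python to show the idea, under the assumption that a reading is uniquely identified by (device_id, sample_time_ms). This is not Beam code: in a real pipeline the same logic would live in per-key state with timers. The class name and the horizon parameter are hypothetical.

```python
class ReplayDeduplicator:
    """Drops readings whose (device_id, sample_time_ms) key was already seen.

    Keys older than horizon_ms behind the newest sample seen are evicted so
    memory stays bounded. Readings older than that horizon are therefore NOT
    deduplicated by this sketch; size the horizon to the longest replay you
    expect.
    """

    def __init__(self, horizon_ms: int):
        self.horizon_ms = horizon_ms
        self._seen = set()
        self._newest_ms = 0

    def accept(self, device_id: str, sample_time_ms: int) -> bool:
        key = (device_id, sample_time_ms)
        if key in self._seen:
            return False  # duplicate from a reconnect replay
        self._seen.add(key)
        if sample_time_ms > self._newest_ms:
            # Event time advanced: evict keys that fell out of the horizon.
            self._newest_ms = sample_time_ms
            cutoff = self._newest_ms - self.horizon_ms
            self._seen = {k for k in self._seen if k[1] >= cutoff}
        return True


dedup = ReplayDeduplicator(horizon_ms=60_000)
replayed = [("unit-1", 1000), ("unit-1", 2000), ("unit-1", 1000), ("unit-2", 1000)]
kept = [r for r in replayed if dedup.accept(*r)]
print(kept)  # [('unit-1', 1000), ('unit-1', 2000), ('unit-2', 1000)]
```

Keying on the device's own sample time rather than arrival time is what makes a replayed backlog collapse onto the readings already stored.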
Stage 3a — Hot operational store (Bigtable). Bigtable holds the time-series telemetry for the operational access pattern: “give me the recent readings and current state of this device” in single-digit milliseconds. The row-key design (device_id#reverse_timestamp) puts the newest reading first, so a “latest state” read is the first row of a one-device scan, and a “last 24 hours” read is a tight contiguous scan. Bigtable serves the dispatcher console, the live-status API, and the alerting checks — none of which should ever hit a scan engine.
Stage 3b — Analytical warehouse (BigQuery). BigQuery is the analytical store of record: time-partitioned (by sample day/hour) and clustered (by model, site, device_id) tables that hold months-to-years of every reading for fleet-wide trend analysis, reliability engineering, and BigQuery ML predictive-maintenance models scored in place. Materialized views and scheduled queries roll raw readings into marts (uptime by model, fault rate by site, energy by region).
Stage 4 — Command-and-control (downlink). Operators don’t only watch; they act. A command (new setpoint, defrost, firmware target) is published to a commands Pub/Sub topic, the bridge/broker delivers it to the device’s downlink MQTT topic (MQTT QoS 1 for acknowledged delivery), and the device’s ACK flows back up the same telemetry path and is recorded. This closes the loop without a second control plane.
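One detail worth making concrete: MQTT QoS 1 is at-least-once, so a command or its ACK can arrive more than once, and the recording side has to be idempotent. The sketch below shows one way to track that, assuming every command carries a unique command_id. The class, the states, and the field names are hypothetical and not part of any broker API.

```python
class CommandTracker:
    """Tracks downlink commands and records device ACKs idempotently."""

    def __init__(self):
        self._commands = {}  # command_id -> {"device_id", "action", "state"}

    def issue(self, command_id: str, device_id: str, action: str) -> None:
        if command_id in self._commands:
            raise ValueError(f"command {command_id} already issued")
        self._commands[command_id] = {
            "device_id": device_id,
            "action": action,
            "state": "SENT",
        }

    def record_ack(self, command_id: str, device_id: str) -> bool:
        """Return True only for the first valid ACK of a command."""
        entry = self._commands.get(command_id)
        if entry is None or entry["device_id"] != device_id:
            return False  # unknown command, or ACK from the wrong device
        first_ack = entry["state"] != "ACKED"
        entry["state"] = "ACKED"
        return first_ack

    def pending(self) -> list:
        """Command IDs that were sent but not yet acknowledged."""
        return sorted(
            cid for cid, e in self._commands.items() if e["state"] == "SENT"
        )
```

A command that stays in pending() past a timeout is the signal to retry or to page someone, which is the operational meaning of "know the device acknowledged it".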
Stage 5 — Serving and presentation (Looker). Looker is the semantic and dashboard layer over BigQuery — fleet health, model-level reliability, energy and SLA reporting — with LookML defining metrics once and governing per-customer/per-region row-level access. The live single-device console reads Bigtable directly (via a small API) for its millisecond freshness; Looker handles the fleet analytics. BI Engine can accelerate the heavy Looker dashboards.
The flow in one breath: devices → MQTT broker (ClearBlade / EMQX on GKE) → Pub/Sub (durable buffer) → Dataflow (validate, dedupe, window, enrich, fan-out) → {Bigtable for hot point lookups, BigQuery for analytical history, Pub/Sub for alerts} → Looker + live console, with a commands topic running the other direction back through the broker to the device, and a dead-letter branch off Dataflow for anything malformed. One stream in; two stores plus an alerts topic out; a control path back down.
Component breakdown
| Component | Role in the path | Key configuration choices |
|---|---|---|
| MQTT broker (ClearBlade IoT Core or EMQX/HiveMQ on GKE) | Device connectivity, per-device auth, MQTT termination, Pub/Sub bridge | Per-device X.509 certs or JWT; per-device topic ACLs; MQTT QoS 1 for telemetry + commands; last-will + keep-alive; native Pub/Sub connector or sidecar bridge; HA cluster sized to concurrent connections |
| Pub/Sub topics | Durable ingest buffer; absorbs reconnect storms; decouples broker from processing | Topic per message type (telemetry/events/state/commands); attach Avro/Protobuf schema; device ID + event time as attributes; retention 7→31 days for replay |
| Pub/Sub subscriptions | Deliver to Dataflow; isolate consumers; dead-letter | One subscription per consumer; dead-letter topic + max delivery attempts; tune ack deadline for long Dataflow bundles |
| Dataflow (Beam) streaming job | Validate, dedupe, event-time window, enrich, aggregate, fan-out to 3 sinks | Streaming Engine on; autoscaling; dedupe on (device_id, sample_time); event-time windowing + watermarks + allowed lateness (devices replay backlogs); side-input/Bigtable enrichment with device metadata; BigtableIO + BigQueryIO (Storage Write API, exactly-once) + alert PubsubIO; dead-letter branch |
| Bigtable | Hot time-series store for per-device millisecond lookups + live state | Row key device_id#reverse_timestamp (newest first); tall-narrow rows; per-column-family GC policy (e.g. raw maxage=90d, latest-state maxversions=1); SSD; autoscaling nodes; separate app-profile for serving vs. ingest |
| BigQuery | Analytical warehouse: months/years of full history for fleet-wide trends | Time-unit partitioning on sample time; clustering on (model, site, device_id); partition expiration for tiered retention; materialized views + scheduled queries for marts |
| BigQuery ML | Predictive maintenance in-warehouse | Failure-prediction / anomaly models on the reading history; ML.PREDICT in scheduled queries; no data movement |
| Looker (LookML) | Fleet semantic model + governed dashboards | Metrics (uptime, MTBF, fault rate) defined once; row-level access by customer/region/site; PDTs for heavy rollups; optional BI Engine acceleration |
| Commands path | Acknowledged downlink to devices | commands Pub/Sub topic → broker bridge → device downlink topic (QoS 1); device ACK flows back up telemetry and is recorded |
A few choices deserve the why, because they are where an IoT architecture differs from a generic streaming one.
Bigtable and BigQuery are not redundant — they answer opposite questions, and the fan-out is the point. This is the decision that defines the whole design. BigQuery is a magnificent scan engine and a terrible key-value store: a point lookup of one device’s latest reading is a full query with seconds of latency and a per-query cost, which is unacceptable for a console refresh or an alert check fired thousands of times a second. Bigtable is the opposite: a constant-low-latency key-value/range store with no good story for “scan a billion rows and GROUP BY model.” So the operational, per-asset, now-questions go to Bigtable (point lookups, last-value, recent-range), and the analytical, fleet-wide, over-time questions go to BigQuery (partitioned scans, aggregates, ML). Dataflow writes the same validated record to both in one pass. Skipping Bigtable and serving the live console from BigQuery is the most common and most expensive mistake in Google Cloud IoT.
The Bigtable row key is the entire performance story. device_id#reverse_timestamp — where the reverse timestamp is LONG_MAX − sample_millis — does three things at once: it co-locates one device’s readings contiguously (fast single-device range scan), it sorts newest-first so “current state” is the first row read (a one-row lookup, not a sort), and prefixing with device_id (high-cardinality, well-distributed) avoids the hotspotting that a raw-timestamp prefix would cause by funnelling all writes to one tablet. Use tall, narrow rows (one logical reading per row, few columns) and set per-column-family garbage-collection policies so raw samples auto-expire (maxage=90d) while a latest-state family keeps only the current version (maxversions=1). Get this key wrong and Bigtable is either hotspotted on write or slow on read.
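The key construction is small enough to sketch. The version below assumes sample times are non-negative epoch milliseconds and that device IDs never contain the # separator. Zero-padding the reverse timestamp to a fixed width is an assumption of this sketch (the text above does not specify an encoding), but some fixed-width encoding is needed because Bigtable sorts row keys as bytes, not as numbers.

```python
LONG_MAX = 2**63 - 1  # same value as Java's Long.MAX_VALUE


def row_key(device_id: str, sample_millis: int) -> bytes:
    """Build device_id#reverse_timestamp so newer readings sort first."""
    if "#" in device_id:
        raise ValueError("device_id must not contain '#'")
    if not 0 <= sample_millis <= LONG_MAX:
        raise ValueError("sample_millis out of range")
    reverse_ts = LONG_MAX - sample_millis
    # Pad to 19 digits so lexicographic byte order matches numeric order.
    return f"{device_id}#{reverse_ts:019d}".encode("utf-8")


def device_prefix(device_id: str) -> bytes:
    """Prefix for a one-device scan; the first row returned is the latest."""
    return f"{device_id}#".encode("utf-8")


keys = sorted(row_key("unit-4471", t) for t in (1000, 3000, 2000))
print(keys[0] == row_key("unit-4471", 3000))  # True: newest reading sorts first
```

The trailing # in the prefix matters: it keeps a scan for unit-44 from also matching unit-4471.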
Dataflow exists to do what a direct Pub/Sub→BigQuery subscription cannot — and IoT needs every bit of it. The zero-code BigQuery subscription is real and useful, but it is at-least-once only, applies no transformation, and writes to exactly one destination. IoT specifically needs the four things it can’t do: deduplicate a device’s reconnect-replay, window a late buffered reading into the correct historical bucket (not “now”), enrich a bare (device_id, value) with model/site/warranty metadata, and fan out to Bigtable and BigQuery and an alert topic from one stream. Beam’s event-time windowing with watermarks and allowed lateness is exactly the mechanism that makes “the 14:05 reading” correct even when a device that was offline uploads it at 16:30.
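To make the late-data behaviour concrete, here is a simplified model of fixed event-time windows with allowed lateness. It is a sketch of the concept rather than Beam's exact semantics (real triggers and watermark estimation have more moving parts), and the function names and the three outcome labels are hypothetical.

```python
def at(hour: int, minute: int) -> int:
    """Milliseconds since midnight, to keep the example readable."""
    return (hour * 60 + minute) * 60_000


def window_start_ms(event_ms: int, size_ms: int) -> int:
    """Start of the fixed window that contains the event time."""
    return event_ms - (event_ms % size_ms)


def classify(event_ms: int, watermark_ms: int, size_ms: int,
             allowed_lateness_ms: int) -> str:
    """Decide what happens to a reading given the current watermark."""
    window_end = window_start_ms(event_ms, size_ms) + size_ms
    if watermark_ms < window_end:
        return "on_time"
    if watermark_ms < window_end + allowed_lateness_ms:
        return "late_accepted"  # still lands in its original window
    return "dropped"  # window already expired


one_minute = 60_000
# A 14:05 reading uploaded when the watermark has reached 16:30.
print(window_start_ms(at(14, 5), one_minute) == at(14, 5))              # True
print(classify(at(14, 5), at(16, 30), one_minute, 3 * 60 * one_minute))  # late_accepted
print(classify(at(14, 5), at(16, 30), one_minute, 15 * one_minute))      # dropped
```

The last line is the design point: if the lateness bound is shorter than the time a device spends offline, the reading no longer reaches its window's aggregates, so the bound has to be sized to realistic offline gaps.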
The broker is the one box you own, so treat its sizing as capacity planning, not autoscaling magic. Unlike everything to its right, the broker holds stateful long-lived connections; you size it to concurrent connections and message rate, run it HA (multi-zone GKE or managed ClearBlade), and per-device auth + topic ACLs are non-negotiable so a compromised device can only publish its own topic and receive its own commands.
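The per-device ACL rule can be expressed as a small check. The topic layout devices/{device_id}/{channel} is an assumption of this sketch, not a layout any particular broker mandates, and real brokers express the same rule in their own ACL configuration rather than in application code.

```python
PUBLISH_CHANNELS = {"telemetry", "events", "state"}
SUBSCRIBE_CHANNELS = {"commands"}


def is_allowed(device_id: str, action: str, topic: str) -> bool:
    """Allow a device to touch only topics under its own ID.

    Wildcard subscriptions such as devices/+/commands fail the check because
    the second path segment must equal the authenticated device ID exactly.
    """
    parts = topic.split("/")
    if len(parts) != 3 or parts[0] != "devices" or parts[1] != device_id:
        return False
    if action == "publish":
        return parts[2] in PUBLISH_CHANNELS
    if action == "subscribe":
        return parts[2] in SUBSCRIBE_CHANNELS
    return False


print(is_allowed("unit-4471", "publish", "devices/unit-4471/telemetry"))  # True
print(is_allowed("unit-4471", "publish", "devices/unit-9999/telemetry"))  # False
print(is_allowed("unit-4471", "subscribe", "devices/+/commands"))         # False
```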
Implementation guidance
Project, region, and instance layout. Put the platform in a dedicated Google Cloud project (or one per environment: iot-dev, iot-prod) under your landing-zone folder hierarchy, on a Shared VPC from the network host project. Co-locate the GKE broker cluster (if self-hosted), Dataflow workers, Bigtable instance, and BigQuery dataset in one region (e.g. asia-south1) to avoid cross-region latency and egress; pin the region to your data-residency requirement. Bigtable clusters are zonal and the BigQuery dataset is regional here, so keep both in the same region as the Dataflow job.
Infrastructure as code (Terraform). Provision the whole right-hand side declaratively so it is reproducible and reviewable; the broker is the only piece needing app-level deployment (Helm/manifests on GKE, or ClearBlade console + API).
- google_pubsub_schema + google_pubsub_topic per message type, with google_pubsub_subscription carrying dead_letter_policy.
- google_bigtable_instance (SSD, autoscaling cluster) and google_bigtable_table with column families; set GC policies via google_bigtable_gc_policy (e.g. max_age 90d on raw, max_version 1 on latest-state).
- google_bigquery_dataset + google_bigquery_table with time_partitioning on sample time, clustering on ["model","site","device_id"], and require_partition_filter = true so no one scans all history by accident.
- The Dataflow job via google_dataflow_flex_template_job (or a CI step running gcloud dataflow flex-template run) pointing at your packaged Beam pipeline, with enableStreamingEngine and bounded autoscaling.
- google_bigquery_data_transfer_config (scheduled queries) and materialized-view tables for marts; the Looker (Google Cloud core) instance with its LookML project in Git and a dedicated BigQuery service account.
- For self-hosted brokers: google_container_cluster + node pools sized for connection count, with the EMQX/HiveMQ Helm release and the Pub/Sub bridge configured via Workload Identity.
Keep the Beam pipeline in its own repo with unit tests on the transforms — Beam’s TestStream lets you assert dedup and late-data/windowing behaviour deterministically — and a CI pipeline that builds the Flex Template image and updates the streaming job.
The Dataflow pipeline, concretely. A streaming Beam pipeline (Java or Python) that: reads the telemetry subscription, extracting device_id and sample_time from message attributes for event time; deduplicates on (device_id, sample_time) with a stateful keyed dedup over a bounded window; applies fixed event-time windows with a watermark and generous allowed_lateness (devices buffer offline); enriches via a slowly-refreshed side input of device metadata (or a per-key Bigtable lookup for very large fleets); computes threshold breaches and rolling aggregates; branches invalid records to a dead-letter PubsubIO + BigQuery errors table with a reason; and then writes to three sinks in the same pipeline — BigtableIO.write() keyed device_id#reverse_timestamp for the hot store, BigQueryIO.write() with the Storage Write API (exactly-once) for history, and PubsubIO.write() to the alerts topic on breach. Prefer the Storage Write API over legacy streaming inserts — it is the current path, cheaper, and supports exactly-once.
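The validate, enrich, and fan-out logic can be sketched as a pure function, independent of Beam, to show how one input record turns into outputs for each sink. The field names, the metadata shape, the single temperature threshold, and the choice to dead-letter readings from unknown devices are all assumptions for illustration; in the real pipeline each output list would correspond to a tagged output feeding BigtableIO, BigQueryIO, or PubsubIO.

```python
REQUIRED_FIELDS = ("device_id", "sample_time_ms", "compressor_temp_c")


def route(record: dict, device_metadata: dict, max_temp_c: float) -> dict:
    """Return the per-sink outputs for one incoming record."""
    outputs = {"bigtable": [], "bigquery": [], "alerts": [], "dead_letter": []}

    def reject(reason: str) -> dict:
        # Keep the raw record and a reason so the failure can be debugged.
        outputs["dead_letter"].append({"raw": record, "reason": reason})
        return outputs

    missing = [f for f in REQUIRED_FIELDS if f not in record]
    if missing:
        return reject("missing fields: " + ", ".join(missing))
    if not isinstance(record["compressor_temp_c"], (int, float)):
        return reject("compressor_temp_c is not numeric")
    meta = device_metadata.get(record["device_id"])
    if meta is None:
        return reject("unknown device")

    # Enrich the bare reading with reference metadata (model, site, ...).
    enriched = {**record, **meta}
    outputs["bigtable"].append(enriched)   # hot store: latest state
    outputs["bigquery"].append(enriched)   # warehouse: full history
    if enriched["compressor_temp_c"] > max_temp_c:
        outputs["alerts"].append({
            "device_id": enriched["device_id"],
            "sample_time_ms": enriched["sample_time_ms"],
            "compressor_temp_c": enriched["compressor_temp_c"],
        })
    return outputs
```

Keeping this logic in a plain function also makes it easy to unit-test separately from the streaming runner.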
Networking. Run Dataflow workers in the Shared VPC subnet with Private Google Access and --no_use_public_ips, so they reach Pub/Sub, Bigtable, and BigQuery over Google’s private network with no public egress. The broker is the only public ingress: front a self-hosted EMQX/HiveMQ with a TCP/TLS Load Balancer (MQTT over TLS on 8883) and tight firewall rules; managed ClearBlade exposes its own secured endpoint. Lock data services inside a VPC Service Controls perimeter so telemetry cannot be exfiltrated from Bigtable or BigQuery even with valid credentials. The live-status API (reading Bigtable) sits behind a global HTTPS LB with Cloud Armor; Looker reaches BigQuery over Google’s network and its UI is gated by IAP/allowlist.
Identity and access (least privilege). Each component gets its own service account. The broker bridge SA gets pubsub.publisher on telemetry topics and pubsub.subscriber on commands only. The Dataflow worker SA gets dataflow.worker on the project, pubsub.subscriber on its subscriptions, bigtable.user on the instance, bigquery.dataEditor on the target dataset, and pubsub.publisher on the alerts topic. The live-status API SA gets bigtable.reader on the serving table and nothing else; IAM binds to the instance or table rather than to an app profile, so the API simply routes its reads through the serving app profile. The Looker SA gets bigquery.dataViewer + bigquery.jobUser on the serving dataset only. Devices authenticate at the broker with per-device certs/JWT and never hold Google Cloud credentials. Humans get access through groups mapped to IAM roles. Use column-level policy tags (Dataplex/Data Catalog) to mask any customer-identifying fields in BigQuery, and Looker row-level access so each customer/region sees only its own fleet.
Enterprise considerations
Security and Zero Trust. Trust is established by identity and context at every hop, not by network position. Devices are the largest and least trustworthy population, so each authenticates individually at the broker (per-device cert/JWT), is constrained by per-device topic ACLs (it can publish only its telemetry and receive only its commands), and is never granted a Google Cloud identity — a stolen device key compromises one device, not the fleet, and is revoked at the broker. Inside Google Cloud, no service account has standing broad access; each is scoped to exactly the topics/tables it needs, and VPC Service Controls wrap Bigtable and BigQuery in a data perimeter so even a leaked key can’t move telemetry out. Dataflow runs without public IPs. The command path uses QoS 1 acknowledged delivery so control actions are confirmed, and every command is logged in Cloud Audit Logs — critical when “who told that freezer to stop cooling?” is an auditable question. Pub/Sub schemas plus the dead-letter branch contain malformed or hostile payloads instead of propagating them.
Cost optimization. IoT economics live and die on storage layout and retention, because the data volume is enormous and most of it is read rarely. The meters and the levers:
- Bigtable: billed on nodes (provisioned) + storage, not per-query. It is the priciest idle component, so size nodes to the serving load (autoscale on CPU), and crucially use GC policies to expire raw samples after the hot window (e.g. 90 days) so the hot store stays small. Bigtable holds recent data for fast access; it is not your archive.
- BigQuery: billed on bytes scanned (on-demand) or slots, plus storage. Partition + cluster so a model-level trend query scans a few partitions, not the whole table; require a partition filter; serve dashboards from materialized views and BI Engine so analysts never re-scan raw history. Use partition expiration to drop old raw data while keeping aggregates; partitions left unmodified for 90 days move to long-term storage pricing automatically.
- Dataflow: billed on worker time; Streaming Engine + autoscaling means you pay for current ingest rate. Cap maxNumWorkers.
- Pub/Sub: billed on throughput; cheap relative to the stores. Retention length is the lever.
- Broker: fixed GKE node cost (or ClearBlade per-device/connection pricing); size to peak concurrent connections.
The decisive move is tiering by access pattern: hot recent data in a small, GC-bounded Bigtable; full history in cheap BigQuery storage scanned only by partitioned analytical queries; dashboards served from MVs/BI Engine. Curiosity does not generate a scan bill, and you are not paying Bigtable node hours to store two years of cold samples.
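The effect of a GC-bounded hot store is simple arithmetic, sketched below. The bytes-per-reading figure is a hypothetical placeholder, not a measured Bigtable number, and the calculation ignores compression, replication, and index overhead; it is meant only to show how retention length drives the size of the hot tier.

```python
def hot_store_bytes(readings_per_day: int, bytes_per_reading: int,
                    retention_days: int) -> int:
    """Approximate raw bytes held when samples expire after retention_days."""
    return readings_per_day * bytes_per_reading * retention_days


# Hypothetical inputs: 1 million readings/day at an assumed 200 bytes each.
ninety_days = hot_store_bytes(1_000_000, 200, 90)
two_years = hot_store_bytes(1_000_000, 200, 730)
print(ninety_days)                        # 18000000000
print(round(two_years / ninety_days, 1))  # 8.1
```

Under these assumptions, keeping two years hot instead of 90 days multiplies the hot store by roughly eight, which is the cost the GC policy avoids.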
Scalability. Every tier scales independently. The broker scales horizontally (more GKE nodes / managed capacity) to more concurrent connections — this is the tier you actively plan. Pub/Sub is effectively unbounded throughput and is precisely what absorbs a reconnect storm when a region’s devices come back online and replay buffers simultaneously: the log just grows and drains rather than overwhelming the processors. Dataflow autoscaling adds workers as backlog grows. Bigtable scales by adding nodes (linear throughput) and resharding tablets — provided the row key avoids hotspots. BigQuery scales storage and (with editions autoscaling) compute without capacity planning. The same design runs at thousands of devices for a mid-market operator and millions for a global one; the broker fleet and the autoscaling ceilings change, the shape does not.
Reliability and DR (RTO/RPO). The durable Pub/Sub log is the backbone of recoverability: with retention configured you can replay from a timestamp to rebuild a derived table or re-hydrate a store after a logic bug or bad deploy, and the dead-letter topic preserves anything that failed. Dataflow checkpoints state in Streaming Engine (a worker loss doesn’t lose in-flight data) and exactly-once writes mean a retry doesn’t double-count in BigQuery. RPO: within the Pub/Sub retention window, effective loss approaches zero — design RPO inside that window (e.g. 7 days); for the device side, MQTT QoS 1 + on-device buffering means a disconnected device’s readings arrive (late) when it reconnects rather than vanishing. RTO: Bigtable supports cross-region replication for an active-active or warm-standby serving store (the live console survives a regional fault); BigQuery can be a multi-region location or have critical datasets replicated; and because the entire Google Cloud side is reproducible via Terraform, you can stand the pipeline back up in a paired region in well under an hour. Most operators target minutes of RTO for the live console (replicated Bigtable), an hour or two for fleet analytics (BigQuery), and seconds of effective RPO inside the retention window. The broker tier needs its own HA/multi-zone design and a documented failover for the device endpoint (DNS/anycast) — it is the only component without a free Google-managed failover.
Observability. Cloud Monitoring + Logging give the operational picture across the whole path. The leading indicators of trouble are Pub/Sub subscription backlog / oldest-unacked-message age (processing falling behind a reconnect storm), Dataflow system lag / data freshness / watermark, Bigtable CPU utilization and p99 read/write latency (the signal a hot row key or undersized cluster is forming), BigQuery bytes scanned and slot utilization, and dead-letter depth (should be near zero). On the device side, track connected-device count and disconnect rate at the broker — a cliff there means a connectivity or auth incident, not a data one. Set SLOs on end-to-end freshness (device sample → queryable / → Bigtable-visible) and on live-console read latency, and alert on backlog age, dead-letter growth, and Bigtable p99 — those catch most real incidents early.
Governance. Dataplex / Data Catalog provide the catalog, lineage, and policy tags that drive any column masking, and describe the telemetry as discoverable data products. Looker’s LookML is the governed semantic layer — uptime, MTBF, fault rate are defined once, version-controlled in Git, and access-scoped — so reliability engineering and the customer-facing SLA report compute the same numbers. A device registry (ClearBlade’s, or your own table) is the governance backbone for the fleet: which device exists, its model/firmware/owner, its certificate state, and its decommission status. IAM through groups, audit logs (including the command path), and VPC Service Controls round out a setup that satisfies SOC 2 / ISO and data-residency requirements.
Reference enterprise example
Frostline Systems is a mid-market commercial-refrigeration operator: roughly 1,100 employees, about ₹3,000 crore in annual revenue, and a fleet of 62,000 connected units (supermarket display cases, restaurant walk-ins, and cold-chain warehouse systems) across India and the Gulf. Each unit samples compressor temperature, door state, power draw, defrost cycle, and fault codes every 10 seconds — roughly 535 million readings a day, peaking around 9,000 messages/sec with reconnect bursts to 40,000/sec when a regional ISP flaps. Their old model was reactive: the first signal of a failing compressor was spoiled stock and an emergency call-out, and a single warehouse spoilage event could cost ₹6–8 lakh plus the customer relationship.
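The volume figures follow directly from the fleet size and sample interval, and the check is worth doing when sizing your own fleet. The sketch below reproduces the arithmetic from the numbers quoted above; the function names are hypothetical.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def readings_per_day(units: int, sample_interval_s: int) -> int:
    """Readings per day if every unit samples once per interval."""
    return units * SECONDS_PER_DAY // sample_interval_s


def average_rate_per_s(units: int, sample_interval_s: int) -> float:
    """Steady-state message rate across the fleet."""
    return units / sample_interval_s


units, interval_s = 62_000, 10
print(readings_per_day(units, interval_s))                       # 535680000
print(average_rate_per_s(units, interval_s))                     # 6200.0
print(round(40_000 / average_rate_per_s(units, interval_s), 2))  # 6.45
```

The steady-state rate is about 6,200 messages/sec, so the quoted 40,000/sec reconnect burst is roughly 6.5 times the average. That ratio is the headroom the Pub/Sub buffer and the Dataflow autoscaling ceiling have to cover.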
They built exactly this architecture in asia-south1 (Mumbai) for residency, and chose ClearBlade IoT Core as the managed broker successor so the existing units’ IoT-Core-style MQTT clients connected with minimal firmware change — keeping the per-device certificate auth and device registry they already had. The broker bridges three Pub/Sub topics (telemetry, events, commands), each with an Avro schema and a dead-letter topic. A single Dataflow streaming job (Streaming Engine, autoscaling 5→60 workers) deduplicates on (device_id, sample_time), applies 1-minute event-time windows with 15 minutes of allowed lateness (units buffer readings when offline and replay on reconnect), enriches each reading with model/site/warranty metadata, routes ~0.2% malformed messages to a dead-letter table, and fans out to all three sinks: Bigtable (row key device_id#reverse_timestamp, raw column family maxage=90d, latest-state family maxversions=1, autoscaling on a 3-node minimum), BigQuery (partitioned by sample-hour, clustered on model, site, device_id, via Storage Write API exactly-once), and an alerts topic on threshold breach.
The dispatcher console and the alerting service read Bigtable directly through a small Cloud Run API; reliability engineering and the customer SLA reports run on BigQuery, where a BigQuery ML failure-prediction model scores each compressor’s recent trend nightly. Looker (Google Cloud core) serves fleet-health, model-reliability, and per-customer SLA dashboards, with row-level access so each supermarket chain sees only its own units. Operators push setpoint changes and manual-defrost commands from Looker actions through the commands topic; the broker delivers them at QoS 1 and the device ACK is recorded.
The outcome after two quarters:
- Live console freshness: ~3–4 seconds device-sample-to-visible in Bigtable; single-device read p99: ~9 ms — a dispatcher pulls up any of 62,000 units instantly.
- Analytical history: the full 535M-readings/day stream lands in BigQuery and stays queryable for 18 months; a model-level degradation query over a quarter scans tens of GB (partitioned + clustered), not the multi-TB table.
- Predictive maintenance working: the BQML model now flags ~70% of compressor failures 5–10 days ahead, converting emergency call-outs into scheduled service. Spoilage incidents on monitored units dropped sharply — the single biggest line on the business case.
- Reconnect storms absorbed: a regional ISP outage that reconnected ~8,000 units at once produced a 40,000/sec replay burst; Pub/Sub buffered it and Dataflow drained it in minutes with zero data loss and no double-counting, thanks to dedup + windowing.
- Cost: the platform runs about ₹14–17 lakh/month — the ClearBlade broker and Bigtable nodes are the largest fixed lines; BigQuery cost stayed bounded because dashboards hit MVs and analytical queries are partitioned, and Bigtable stayed small because GC policies expire raw samples at 90 days while history lives cheaply in BigQuery.
- Operability: a six-person platform team runs it on four alerts that matter — Pub/Sub backlog age, dead-letter depth, Bigtable p99, and broker connected-device count — and rebuilt the Bigtable hot store once after a key-design fix by replaying Pub/Sub, with no loss.
The decision that paid off most was refusing to serve the live console from BigQuery. Fanning Dataflow out to Bigtable (for the millisecond per-device lookup) and BigQuery (for the fleet-wide scan) — instead of forcing both questions onto one store — is the entire difference between a console a dispatcher actually uses and a per-query bill that scales with every refresh.
When to use it
Use this architecture when you operate a fleet of devices that emit time-series telemetry and you need both per-device millisecond operational lookups and fleet-wide analytical history, and you need an acknowledged way to send commands back to devices. It is the reference for connected products, fleet/asset telemetry, smart metering, industrial and agricultural IoT, predictive maintenance, and cold-chain/medical-device monitoring. It scales from a few hundred devices to millions because every tier behind the broker is managed and elastic, and the broker tier scales horizontally.
The trade-offs and anti-patterns:
- Don’t serve point lookups from BigQuery. A live single-device console or a high-frequency alert check against BigQuery is slow and expensive per query. That access pattern belongs in Bigtable. Conversely, don’t try fleet-wide GROUP BY model scans in Bigtable; that’s BigQuery’s job. Putting the wrong question on the wrong store is the defining IoT mistake.
- Don’t skip Dataflow with a direct BigQuery subscription if you need more than raw landing. The zero-code subscription is at-least-once, single-destination, and transform-free. IoT needs dedup of reconnect-replays, correct late-data windowing, enrichment, and fan-out to two stores plus alerts; that’s Dataflow.
- Don’t treat the broker as an afterthought. It is the one stateful, self-managed, capacity-planned tier and the entire security perimeter for untrusted devices. Per-device auth and topic ACLs are mandatory; HA and a device-endpoint failover plan are mandatory.
- Don’t get the Bigtable row key wrong. A timestamp-prefixed key hotspots writes onto one tablet and throttles the whole fleet. Prefix with the high-cardinality device_id and reverse the timestamp.
- Don’t store everything hot forever. Bigtable is the recent store; set GC policies and let BigQuery be the cheap archive. Skipping retention tiering is how the bill explodes.
Alternatives, and when they fit better. If your fleet is small (thousands, not millions) and your needs are purely analytical with no millisecond console, you can drop Bigtable and run the simpler Pub/Sub → Dataflow → BigQuery real-time-analytics pattern, or even the zero-code Pub/Sub→BigQuery subscription for raw landing; both are cheaper and less to operate. If you need sub-second time-series with built-in downsampling and a Prometheus-style query layer rather than a key-value store, a managed time-series database is a closer fit than Bigtable for some metrics workloads. If your team standardises on Kafka semantics and open-source streaming, Google Cloud Managed Service for Apache Kafka or self-managed Kafka on GKE, paired with Flink (for example on Dataproc), substitutes for the Pub/Sub + Dataflow pair at the cost of more operations (Pub/Sub Lite is deprecated and no longer a sensible starting point). And if you want to push intelligence to the edge by filtering, aggregating, or running ML on the device before sending, pair this with on-device inference and an edge gateway so only meaningful events traverse the network, which materially cuts Pub/Sub and Dataflow volume for very large or bandwidth-constrained fleets. But for a managed, scalable Google Cloud IoT platform that answers both the per-device and the fleet-wide question from one ingest stream, broker → Pub/Sub → Dataflow → {Bigtable, BigQuery} → Looker is the reference to reach for first.