GCP Lesson 17 of 98

Google Cloud Firestore, In Depth: Native vs Datastore Mode, Documents, Indexes & Queries

Most applications need somewhere to keep structured data that is not a spreadsheet and not a full relational schema — user profiles, chat messages, game state, shopping carts, IoT readings, the mutable “state of the app right now”. On Google Cloud the default answer for that shape of problem is Firestore: a fully managed, serverless, horizontally-scalable document database that stores JSON-like documents, scales reads and writes automatically with no servers to size, replicates synchronously across zones (and optionally regions) for very high availability, and — its signature feature — pushes real-time updates to connected clients so your UI re-renders the instant the data changes.

“Serverless document database” sounds simple, and the happy path genuinely is: you write a document, you read it back, it just works at any scale. But Firestore makes a small number of decisions for you that are very hard to undo, and it enforces a query model that is unfamiliar to anyone coming from SQL. The single hardest choice — Native mode versus Datastore mode — is fixed for the life of the database. The data model rewards (and the query engine requires) thinking in indexes from day one, because in Firestore “every query is served by an index” is not advice, it is the architecture. And the security story is genuinely two stories — Security Rules for apps that talk to Firestore directly from a phone or browser, IAM for trusted backends — that beginners routinely conflate and get badly wrong.

This is the exhaustive version. We will pin down Native versus Datastore mode with a comparison table and the rule for choosing; walk the full data model (collections, documents, fields, subcollections, references, and every data type); explain indexing completely (automatic single-field indexes, composite indexes, single-field exemptions, and the “queries need indexes” rule that surprises everyone); cover the query model and every limitation that bites (no native joins, the inequality-field rule, ordering and pagination with cursors); separate Security Rules from IAM precisely; cover transactions, batched writes, real-time listeners, TTL, backups/PITR, and how to choose a location; and finish with the architect’s decision: Firestore versus Bigtable versus Cloud SQL. The commands are real gcloud firestore commands, run against current Firestore (2026), with the console called out so you can follow along either way.

Learning objectives

By the end of this lesson you will be able to:

Prerequisites & where this fits

You should be comfortable with the Google Cloud resource hierarchy (organisation, folders, projects) and basic IAM, have the gcloud CLI installed and initialised, and understand JSON (Firestore documents are essentially typed JSON). A little experience with any database — SQL or NoSQL — helps the comparisons land, but no prior NoSQL knowledge is assumed; every term is defined. This is the Databases track of the GCP Zero-to-Hero course, sitting alongside the relational deep dive Google Cloud SQL, In Depth (gcp-cloud-sql-deep-dive-engines-ha-replicas-backups) — read that one for the relational side of the same decisions. After this, the in-memory caching companion Google Cloud Memorystore, In Depth (gcp-memorystore-deep-dive-redis-memcached-clusters) covers the cache layer that often sits in front of Firestore.

Core concepts: the mental model

Before any settings, fix the vocabulary — most Firestore confusion is vocabulary confusion, and the SQL terms you already know map only loosely.

The single most important idea: Firestore is a tree of collections and documents, and you query it through indexes, not scans. Model your data around the queries you will run and the access path you will use (client vs server), and the rest follows.

Native mode vs Datastore mode: the permanent choice

This is the first and hardest decision, so we tackle it first. Both modes are the same underlying storage and scaling engine (“Firestore”), but they expose different APIs and feature sets, and the mode is fixed for the life of the database — you cannot flip a database from one to the other (you migrate data to a new database). Choose deliberately.

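Native mode is the modern, full-featured experience. It is what you want for almost every new project. It gives you mobile and web client SDKs, real-time listeners, offline persistence, and Security Rules for client access alongside IAM for servers.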

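Datastore mode exists for server-side workloads and for backward compatibility with the older Cloud Datastore API. It runs on the same modern Firestore backend (so it inherits strong consistency and the newer scaling), but it deliberately omits the client-facing features: there are no mobile/web client SDKs, no real-time listeners, no offline persistence and no Security Rules (access control is IAM only).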

Dimension | Native mode | Datastore mode
Intended for | Mobile/web apps and servers; greenfield | Server-side only; lift-and-shift from legacy Cloud Datastore
Client SDKs (mobile/web) | Yes | No
Real-time listeners | Yes | No
Offline persistence | Yes | No
Access control | Security Rules (client) + IAM (server) | IAM only
Data model exposed | Collections / documents / fields | Entities / kinds / keys (Datastore API)
Consistency | Strongly consistent | Strongly consistent (modern backend)
Aggregation (count/sum/avg) | Yes | Yes (modern backend)
API | Firestore API | Datastore API
Mutable later? | No (mode is fixed at creation) | No (mode is fixed at creation)

The rule for choosing: unless you are explicitly maintaining a legacy Cloud Datastore application, choose Native mode. It is the strict superset for new work — anything Datastore mode does for servers, Native mode also does, plus the client SDKs, real-time listeners and Security Rules. Pick Datastore mode only when you are migrating an existing App Engine / Cloud Datastore codebase that depends on the Datastore API and entity model. The choice is per-database and permanent, so getting it right at creation matters; if you chose wrong, you create a new database in the correct mode and migrate (export/import or Dataflow).

# Native mode (the default and recommended choice)
gcloud firestore databases create \
  --database='(default)' \
  --location=nam5 \
  --type=firestore-native

# Datastore mode (only for legacy Datastore compatibility)
gcloud firestore databases create \
  --database=legacy-ds \
  --location=us-central1 \
  --type=datastore-mode

The rest of this lesson is written for Native mode, which is what you will almost always use.

The data model: collections, documents, fields

Firestore stores a tree. Understanding the shape prevents most modelling mistakes.

The hierarchy

(root)
 └── users                    ← collection
       └── u_abc123           ← document (id = u_abc123)
             ├─ displayName: "Asha"          ← field (string)
             ├─ createdAt: <timestamp>       ← field (timestamp)
             ├─ prefs: { theme: "dark" }     ← field (map / nested object)
             ├─ roles: ["editor","admin"]    ← field (array)
             └── orders               ← subcollection (under the document)
                   └── o_001          ← document
                         ├─ total: 4999       ← field (integer)
                         └─ items: [...]      ← field (array of maps)

The rules of the tree:

Data types

A Firestore field can hold any of these typed values. Knowing them — and how they sort — matters because ordering and range queries depend on type order.

Type | What it is | Notes / gotchas
String | UTF-8 text | Up to ~1 MiB (within the document limit); sorts lexicographically by UTF-8 byte.
Integer | 64-bit signed | Sorts in numeric order, interleaved with floating-point values.
Floating-point | 64-bit double | NaN sorts in a defined position; mixing int/float is fine, because they sort together as numbers.
Boolean | true / false | (none)
Timestamp | Date+time, microsecond precision | The correct type for “when”; use server timestamps to avoid client-clock skew.
Map | Nested object of fields | Nesting is allowed up to 20 levels deep; you can index and query nested fields by dotted path (prefs.theme).
Array | Ordered list of values | Query with array-contains / array-contains-any; arrays cannot be nested directly inside arrays; you cannot range-query inside an array.
Null | Absence of value | Sorts first; ==-queryable.
Bytes | Raw binary blob | Up to ~1 MiB; for small binaries only. Large blobs belong in Cloud Storage with a reference field.
Reference | Pointer to another document (a path) | First-class type; lets you “link” documents. Resolving it is a separate read; Firestore does not auto-join.
Geographical point | Latitude/longitude pair | Stored as a type; note that Firestore has no native geo-radius query, so use geohash techniques.

Firestore defines a global type ordering (null < boolean < number < timestamp < string < bytes < reference < geopoint < array < map). When a field holds mixed types across documents, this ordering governs how they sort in a query — a subtle source of surprise, so prefer consistent types per field.
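To make the mixed-type sorting concrete, here is a minimal sketch of a comparator that ranks values by type before comparing within a type. It is an illustration under stated assumptions, not Firestore's implementation: bytes, references and geopoints are left out, JavaScript Date stands in for a timestamp, strings compare in UTF-16 order (which matches UTF-8 byte order only for simple text), and arrays and maps are compared by type only.

```javascript
// Rank of each value type in a simplified version of the ordering described above.
// Assumption: bytes, references and geopoints are omitted to keep the sketch short.
const TYPE_RANK = { null: 0, boolean: 1, number: 2, timestamp: 3, string: 4, array: 5, map: 6 };

function typeOf(value) {
  if (value === null) return "null";
  if (typeof value === "boolean") return "boolean";
  if (typeof value === "number") return "number";
  if (value instanceof Date) return "timestamp";
  if (typeof value === "string") return "string";
  if (Array.isArray(value)) return "array";
  return "map";
}

// Compare two field values: first by type rank, then within the type.
// Arrays and maps are only compared by type here (a deliberate simplification).
function compareValues(a, b) {
  const ta = typeOf(a);
  const tb = typeOf(b);
  if (ta !== tb) return TYPE_RANK[ta] - TYPE_RANK[tb];
  if (ta === "number") return a - b;
  if (ta === "boolean") return Number(a) - Number(b);
  if (ta === "timestamp") return a.getTime() - b.getTime();
  if (ta === "string") return a < b ? -1 : a > b ? 1 : 0;
  return 0;
}

// A field that holds different types in different documents.
const mixed = ["10", 9, null, true, 2.5, new Date(0)];
const sorted = [...mixed].sort(compareValues);
console.log(sorted.map(typeOf).join(" < "));
```

Note how the string "10" lands after the number 9: the type decides first, which is exactly the surprise that consistent typing per field avoids.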

Modelling patterns (and anti-patterns)

Indexes: the heart of Firestore querying

This is the concept that surprises everyone from SQL. In Firestore, there is no table scan: every query reads from an index. If a query needs an index that does not exist, the query fails with an error that includes a link to create the exact index needed. So indexing is not an optimisation; it is how queries run at all.

There are two kinds of index, and one kind of exemption.

Single-field indexes (automatic)

By default Firestore automatically creates and maintains single-field indexes for every field in every document — actually two per field: one ascending and one descending (plus an array-contains index for array fields). This is why you can immediately filter or order by any single field with no setup. It is also why writes cost more than you might expect: each write updates the index entries for every indexed field, so a document with many fields incurs many index writes.

You generally leave single-field indexing on. You change it only through exemptions (below) when a field should not be indexed.

Composite indexes (you declare these)

A composite index spans multiple fields and is required whenever a query combines conditions that a single-field index cannot serve. The most common cases are a query that filters on one field and orders by another, a query that filters on two or more fields (although equality-only combinations can sometimes be served by merging single-field indexes instead), and a query that combines an equality with a range. Composite indexes are not created automatically, because the combinatorial space is too large; you declare them.
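The reason a "filter on one field, order by another" query needs its own index can be sketched in a few lines. The in-memory model below is an illustration only (the documents, ids and helper names are hypothetical, and a real index seeks rather than scanning for the start position): once entries are sorted by (status ascending, createdAt descending), the query result is a single contiguous slice.

```javascript
// Hypothetical in-memory documents; ids and values are made up for illustration.
const docs = [
  { id: "o1", status: "paid", createdAt: 30 },
  { id: "o2", status: "pending", createdAt: 20 },
  { id: "o3", status: "paid", createdAt: 10 },
  { id: "o4", status: "paid", createdAt: 50 },
  { id: "o5", status: "failed", createdAt: 40 },
];

// Build index entries sorted by (status ASC, createdAt DESC).
function buildIndex(documents) {
  return documents
    .map((d) => ({ status: d.status, createdAt: d.createdAt, id: d.id }))
    .sort((a, b) =>
      a.status < b.status ? -1 : a.status > b.status ? 1 : b.createdAt - a.createdAt
    );
}

// Serve "status == X, orderBy createdAt desc, limit N" as one contiguous slice.
function queryByStatus(index, status, limit) {
  const start = index.findIndex((e) => e.status === status); // a real index would seek here
  if (start === -1) return [];
  const out = [];
  for (let i = start; i < index.length && index[i].status === status && out.length < limit; i++) {
    out.push(index[i].id);
  }
  return out;
}

const index = buildIndex(docs);
console.log(queryByStatus(index, "paid", 2)); // [ 'o4', 'o1' ]
```

Neither single-field index has this property on its own: the status index is not ordered by createdAt within a status, and the createdAt index interleaves every status.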

You will rarely write them by hand. The normal workflow is:

  1. Run the query in development.
  2. If it needs a composite index, Firestore returns an error containing a direct console link that pre-fills the exact index definition.
  3. Click it (or run the CLI), and the index builds. While building, the query fails; once enabled, it serves.

You can also define them declaratively in a firestore.indexes.json file and deploy with the Firebase CLI, which is the right approach for source-controlled, repeatable environments:

{
  "indexes": [
    {
      "collectionGroup": "orders",
      "queryScope": "COLLECTION",
      "fields": [
        { "fieldPath": "status",    "order": "ASCENDING" },
        { "fieldPath": "createdAt", "order": "DESCENDING" }
      ]
    }
  ],
  "fieldOverrides": []
}

# Deploy declared indexes (Firebase CLI)
firebase deploy --only firestore:indexes

# Or list / inspect indexes via gcloud
gcloud firestore indexes composite list --database='(default)'

A few composite-index rules worth memorising:

Index exemptions (single-field overrides)

Sometimes automatic single-field indexing is wrong for a field:

An index exemption (a single-field “field override”) lets you turn off ascending, descending and/or array-contains indexing for a specific field — reducing storage and write cost, at the price of not being able to query/order on that field.

# Exempt a field from automatic indexing (stop indexing the 'rawPayload' field)
gcloud firestore indexes fields update rawPayload \
  --collection-group=events \
  --database='(default)' \
  --disable-indexes

Index type | Created by | Spans | Use it for | Cost lever
Single-field | Automatic (asc + desc, + array-contains) | One field | Filtering/ordering by any single field out of the box | Each adds write cost; exempt to reduce
Composite | You declare (console link / JSON) | Two or more fields | Filter+order on different fields, multi-field filters, equality+range | Storage + write cost per index
Exemption (field override) | You declare | One field (to disable) | Big/unused fields, large arrays/maps | Reduces storage + write cost

The mental rule: single-field indexes are automatic and need no setup (though every indexed field adds to write cost); composite indexes you create on demand when a query asks for one; exemptions you create to stop indexing fields you never query.

Queries: filters, ordering, pagination — and the limits

Firestore queries read from indexes and return whole documents. The API is expressive but deliberately constrained so that every query stays fast regardless of dataset size — query latency depends on the size of the result set, not the size of the collection. The constraints are exactly what trips up SQL users, so learn them.

What you can do

A query in the client SDK (illustrative, JavaScript):

import { collection, query, where, orderBy, limit, startAfter, getDocs } from "firebase/firestore";

const q = query(
  collection(db, "orders"),
  where("status", "==", "paid"),
  where("total", ">=", 1000),        // range on 'total'
  orderBy("total", "desc"),          // first order MUST be the range field
  orderBy("createdAt", "desc"),
  limit(25)
);
const snap = await getDocs(q);       // needs a composite index on (status, total, createdAt)
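As a rough guide to what index a query like this will ask for, the sketch below derives a definition in the firestore.indexes.json shape from a query's equality fields and its orderBy clauses. The helper name and the "equality fields first, then orderBy fields" heuristic are assumptions made for illustration; the console link in the error message remains the authoritative definition.

```javascript
// Hypothetical helper: derive a composite index definition from a query shape.
// Assumption: equality fields come first (ascending), then the orderBy fields in order.
function compositeIndexFor(collectionGroup, equalityFields, orderBy) {
  const fields = [
    ...equalityFields.map((fieldPath) => ({ fieldPath, order: "ASCENDING" })),
    ...orderBy.map(([fieldPath, dir]) => ({
      fieldPath,
      order: dir === "desc" ? "DESCENDING" : "ASCENDING",
    })),
  ];
  return { collectionGroup, queryScope: "COLLECTION", fields };
}

// The query above: status == "paid", total >= 1000, ordered by total desc, createdAt desc.
const indexDef = compositeIndexFor("orders", ["status"], [
  ["total", "desc"],
  ["createdAt", "desc"],
]);
console.log(JSON.stringify(indexDef, null, 2));
```

The printed object can be compared against the entry the console generates before it is committed to source control.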

The limitations — every one that bites

This is the part to know cold for interviews and to internalise before you design.

Pagination: use cursors, not offsets

Because there is no cheap offset, paginate by remembering the last document of a page and starting the next page after it:

// Page 1
let q = query(collection(db, "orders"), orderBy("createdAt", "desc"), limit(25));
let snap = await getDocs(q);
const lastDoc = snap.docs[snap.docs.length - 1];

// Page 2 — start after the last document of page 1 (cheap, index-anchored)
q = query(collection(db, "orders"), orderBy("createdAt", "desc"), startAfter(lastDoc), limit(25));
snap = await getDocs(q);

Cursor pagination is O(page size) regardless of how deep you are; offset pagination is O(offset) in both cost and latency. Always use cursors.
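The cost difference can be seen in a small simulation. The sketch below is an in-memory model, not Firestore code: the 10,000-entry array stands in for an index ordered by createdAt descending, and "scanned" counts how many entries each strategy touches to return the same page.

```javascript
// Hypothetical sorted index of 10,000 entries (newest first).
const entries = Array.from({ length: 10000 }, (_, i) => ({ id: `o${i}`, createdAt: 10000 - i }));

// Offset pagination: walk past `offset` entries, then read `pageSize`.
function pageByOffset(index, offset, pageSize) {
  let scanned = 0;
  const page = [];
  for (const entry of index) {
    scanned++;
    if (scanned > offset) page.push(entry.id);
    if (page.length === pageSize) break;
  }
  return { page, scanned };
}

// Cursor pagination: binary-search the position after the cursor, then read `pageSize`.
function pageByCursor(index, lastCreatedAt, pageSize) {
  let lo = 0;
  let hi = index.length;
  while (lo < hi) {
    // Find the first entry strictly older than the cursor value.
    const mid = (lo + hi) >> 1;
    if (index[mid].createdAt >= lastCreatedAt) lo = mid + 1;
    else hi = mid;
  }
  const page = index.slice(lo, lo + pageSize).map((e) => e.id);
  return { page, scanned: page.length };
}

const byOffset = pageByOffset(entries, 5000, 25);
const byCursor = pageByCursor(entries, entries[4999].createdAt, 25);
console.log(byOffset.scanned, byCursor.scanned); // 5025 25
```

Both calls return the same 25 ids; only the amount of work differs, and the gap widens with every page.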

Security: Security Rules vs IAM (two gates, two paths)

This is the section that prevents data breaches, so be precise. Firestore has two completely separate access-control systems, and which one applies depends on how the caller connects.

Security Rules — for client SDKs

When a mobile or web app talks to Firestore directly (Firebase client SDKs), there is no trusted backend in the path — the code runs on the user’s device and cannot be trusted. Access is governed by Security Rules: a declarative language you deploy to the database that evaluates every read and write against the request, the authenticated user (via Firebase Authentication), and even the data being written. IAM does not gate client access in this model; Security Rules do, and they default to deny.

rules_version = '2';
service cloud.firestore {
  match /databases/{database}/documents {

    // A user can read and write only their own profile document.
    match /users/{uid} {
      allow read, write: if request.auth != null && request.auth.uid == uid;
    }

    // Orders are readable by their owner; writes must keep the owner field honest.
    match /users/{uid}/orders/{orderId} {
      allow read:  if request.auth != null && request.auth.uid == uid;
      allow create: if request.auth != null
                    && request.auth.uid == uid
                    && request.resource.data.total is int
                    && request.resource.data.total >= 0;
      allow update, delete: if false;          // immutable once created
    }
  }
}

Security Rules can match document paths (with wildcards), inspect request.auth (the signed-in user and custom claims), compare against existing data (resource.data) and incoming data (request.resource.data), call helper functions, and even perform get()/exists() lookups of other documents to make authorisation decisions. They therefore do two jobs at once: validation and authorisation. Deploy them with the Firebase CLI:

firebase deploy --only firestore:rules
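Before deploying, it helps to list the cases a rule should allow and deny. The sketch below restates the order-creation condition from the rules above as a plain JavaScript predicate so the cases can be enumerated. It is a reasoning aid with hypothetical names, not how Firestore evaluates rules, and JavaScript cannot distinguish an integer from a float with a zero fraction the way the rules language does.

```javascript
// Plain-JavaScript restatement of the `allow create` condition for
// /users/{uid}/orders/{orderId}. Hypothetical helper for reasoning only.
function canCreateOrder(auth, uid, data) {
  return (
    auth !== null &&
    auth.uid === uid &&
    Number.isInteger(data.total) && // approximates `total is int`
    data.total >= 0
  );
}

const cases = [
  { name: "owner, valid total", auth: { uid: "u1" }, uid: "u1", data: { total: 4999 }, expect: true },
  { name: "signed out", auth: null, uid: "u1", data: { total: 4999 }, expect: false },
  { name: "other user", auth: { uid: "u2" }, uid: "u1", data: { total: 4999 }, expect: false },
  { name: "negative total", auth: { uid: "u1" }, uid: "u1", data: { total: -1 }, expect: false },
  { name: "total is a string", auth: { uid: "u1" }, uid: "u1", data: { total: "5" }, expect: false },
];

for (const c of cases) {
  const got = canCreateOrder(c.auth, c.uid, c.data);
  console.log(`${got === c.expect ? "ok  " : "FAIL"} ${c.name}`);
}
```

The same table of cases translates directly into tests against the real rules once an emulator or test harness is in place.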

Key truths about Security Rules:

IAM — for server SDKs and admins

When a trusted backend (the Admin SDK, a Cloud Function, a Cloud Run service, a server using a service account) or an operator (gcloud, the console) accesses Firestore, access is governed by IAM, and Security Rules are bypassed entirely — the Admin SDK has full data access subject only to the IAM role of its identity. The relevant predefined roles:

Role | Grants | Typical holder
roles/datastore.user | Read/write access to data in any database | Application service accounts (backends, Cloud Functions)
roles/datastore.viewer | Read-only data access | Reporting/read-only services
roles/datastore.owner | Full data + admin (indexes, etc.) | Admins / setup automation
roles/datastore.importExportAdmin | Run managed export/import | Backup automation
roles/datastore.indexAdmin | Manage indexes only | CI that deploys indexes

(Firestore’s IAM permissions live under the historical datastore.* namespace regardless of mode — a naming quirk, not a behaviour difference.)

The rule to never get wrong

Security Rules protect the untrusted path (client SDKs with end-user identity); IAM protects the trusted path (server/Admin SDK with a service-account identity). Datastore-mode databases have only the IAM path — no Security Rules exist there. The most common production breach in this area is shipping allow read, write: if true; rules to a Native-mode database that real users hit directly. Write least-privilege rules, and remember that a Cloud Function using the Admin SDK is not governed by your rules — secure it with IAM and your own code.

Transactions and batched writes

Firestore gives you two atomicity primitives.

Batched writes

A batched write groups up to 500 write operations (set/update/delete) into a single atomic commit — all succeed or all fail. There is no read involved and no contention check; it is simply “apply these N writes together”. Use it for fan-out denormalisation (update a username on the user doc and on every cached copy) and bulk imports.

import { writeBatch, doc, increment } from "firebase/firestore";
// 'db' (the Firestore instance) and 'uid' are assumed to be defined by the surrounding app.
const batch = writeBatch(db);
batch.set(doc(db, "users", uid), { displayName: "Asha" }, { merge: true });
batch.update(doc(db, "stats", "global"), { userCount: increment(1) });
batch.delete(doc(db, "temp", "draft"));
await batch.commit();   // all-or-nothing, up to 500 writes
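For a fan-out or import larger than one batch, the writes have to be split into groups. A minimal sketch of that chunking follows, using the 500-operation figure quoted above as the default; the write objects are hypothetical placeholders and no SDK call is made.

```javascript
// Split a list of pending writes into groups no larger than the batch limit.
// Assumption: 500 is the per-batch limit quoted above.
function chunkWrites(writes, maxPerBatch = 500) {
  const batches = [];
  for (let i = 0; i < writes.length; i += maxPerBatch) {
    batches.push(writes.slice(i, i + maxPerBatch));
  }
  return batches;
}

// 1,234 hypothetical writes become three groups.
const writes = Array.from({ length: 1234 }, (_, i) => ({
  path: `users/u${i}`,
  data: { active: true },
}));
const batches = chunkWrites(writes);
console.log(batches.map((b) => b.length)); // [ 500, 500, 234 ]
```

Atomicity holds within each group only: if the second commit fails, the first has already been applied, so a multi-batch job should be safe to re-run.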

Transactions

A transaction is a read-then-write unit with optimistic concurrency control. You read some documents, compute new values, and write — and Firestore guarantees the documents you read did not change between your read and your commit; if they did, it retries the transaction automatically. This is how you implement correct counters, inventory decrements, transfers and any “read-modify-write” that must not race.

import { runTransaction, doc } from "firebase/firestore";
await runTransaction(db, async (tx) => {
  const ref = doc(db, "inventory", "sku-42");
  const snap = await tx.get(ref);              // read inside the transaction
  if (!snap.exists()) throw new Error("unknown SKU");  // data() is undefined for a missing document
  const qty = snap.data().qty;
  if (qty < 1) throw new Error("out of stock"); // throwing aborts the transaction
  tx.update(ref, { qty: qty - 1 });            // write, only commits if 'ref' unchanged
});
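The retry behaviour described above can be modelled without the SDK. The sketch below is a simplified in-memory stand-in (the store, the version counter and the `interfere` hook are all hypothetical): a commit succeeds only if the document's version is unchanged since it was read, and the update function is re-run on conflict.

```javascript
// Hypothetical in-memory store: each document carries a version that bumps on every write.
const store = new Map([["inventory/sku-42", { version: 1, data: { qty: 3 } }]]);

function commitIfUnchanged(path, readVersion, newData) {
  const current = store.get(path);
  if (current.version !== readVersion) return false; // someone else wrote first
  store.set(path, { version: current.version + 1, data: newData });
  return true;
}

// Run `update` against a snapshot, retrying when the commit detects a conflict.
// `interfere` is a test hook that simulates a concurrent writer on a given attempt.
function runOptimistic(path, update, interfere = () => {}, maxAttempts = 5) {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const snap = store.get(path); // read
    const newData = update({ ...snap.data }); // compute
    interfere(attempt); // a concurrent write may land here
    if (commitIfUnchanged(path, snap.version, newData)) return attempt;
  }
  throw new Error("too much contention");
}

const attempts = runOptimistic(
  "inventory/sku-42",
  (d) => ({ qty: d.qty - 1 }),
  (attempt) => {
    // A rival decrement sneaks in during the first attempt only.
    if (attempt === 1) commitIfUnchanged("inventory/sku-42", 1, { qty: 2 });
  }
);
console.log(attempts, store.get("inventory/sku-42").data.qty); // 2 1
```

Both decrements survive (3 becomes 1) because the losing attempt re-reads and recomputes. This is also why the update function must be free of side effects: it may run more than once.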

Transaction rules and gotchas:

Both batched writes and transactions are atomic and strongly consistent, and both can span multiple collections (unlike some databases, there is no “same partition” restriction for transactional writes).

Real-time listeners

The feature that sets Firestore apart: instead of polling, a client subscribes to a document or query and Firestore pushes changes as they happen. The listener fires once with the current state, then again on every change, delivering only the deltas (which documents were added/modified/removed).

import { collection, query, where, onSnapshot } from "firebase/firestore";
const q = query(collection(db, "messages"), where("room", "==", "general"));
const unsubscribe = onSnapshot(q, (snap) => {
  snap.docChanges().forEach((change) => {
    if (change.type === "added")    renderMessage(change.doc);
    if (change.type === "modified") updateMessage(change.doc);
    if (change.type === "removed")  removeMessage(change.doc);
  });
});
// later: unsubscribe();
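What "delivering only the deltas" means for client state can be sketched as follows. The change records here use a simplified, hypothetical shape ({ type, id, data }) rather than the SDK's objects: the first callback carries the current state as "added" changes, and later callbacks carry only what changed.

```javascript
// Apply a batch of change records to a local Map keyed by document id.
// The change shape ({ type, id, data }) is a hypothetical simplification of docChanges().
function applyChanges(state, changes) {
  for (const change of changes) {
    if (change.type === "removed") state.delete(change.id);
    else state.set(change.id, change.data); // "added" and "modified" both upsert
  }
  return state;
}

const messages = new Map();

// First callback: the current state arrives as "added" changes.
applyChanges(messages, [
  { type: "added", id: "m1", data: { text: "hi" } },
  { type: "added", id: "m2", data: { text: "hello" } },
]);

// Later callback: only the deltas.
applyChanges(messages, [
  { type: "modified", id: "m1", data: { text: "hi (edited)" } },
  { type: "removed", id: "m2" },
  { type: "added", id: "m3", data: { text: "welcome" } },
]);

console.log([...messages.keys()]); // [ 'm1', 'm3' ]
```

Keeping local state keyed by document id is what lets the UI patch individual rows instead of re-rendering the whole list on every snapshot.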

Listener facts:

TTL (time-to-live): automatic expiry

A TTL policy tells Firestore to automatically delete documents in a collection once a timestamp field you nominate passes. It is the clean way to expire sessions, ephemeral tokens, soft-deleted records or old events without writing a cleanup job.

# Delete documents in 'sessions' once their 'expireAt' timestamp is reached
gcloud firestore fields ttls update expireAt \
  --collection-group=sessions \
  --database='(default)' \
  --enable-ttl
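The policy only acts on documents that actually carry the nominated timestamp field, so the writer has to set it. A minimal sketch of computing that value follows; the helper name and the userId field are hypothetical, and it assumes the document is written through an SDK that stores a JavaScript Date as a timestamp.

```javascript
// Compute the expiry instant for a session that should live `ttlMinutes` from `now`.
// Assumption: the SDK used for the write stores a JavaScript Date as a timestamp field.
function expireAtFrom(now, ttlMinutes) {
  return new Date(now.getTime() + ttlMinutes * 60 * 1000);
}

const now = new Date("2026-06-15T09:00:00Z");
const session = {
  userId: "u_abc123", // hypothetical field
  expireAt: expireAtFrom(now, 30), // field name matches the TTL policy above
};
console.log(session.expireAt.toISOString()); // 2026-06-15T09:30:00.000Z
```

TTL deletion is a background process rather than an exact timer, so code that reads sessions should still compare expireAt with the current time instead of assuming expired documents are already gone.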

TTL truths:

Backups, point-in-time recovery and exports

Three different durability tools — know what each protects against.

Scheduled backups & PITR

Firestore offers managed backups: you define a backup schedule (daily, or weekly with a retention window) and Firestore takes consistent backups of the whole database that you can restore into a new database. Separately, point-in-time recovery (PITR) retains a rolling window (up to 7 days) of fine-grained versions, letting you read or restore the database as of any minute within that window — the tool for “undo the bad bulk write from 40 minutes ago”.

# Enable PITR on a database
gcloud firestore databases update --database='(default)' \
  --enable-pitr

# Create a daily backup schedule retained for 7 days
gcloud firestore backups schedules create \
  --database='(default)' \
  --recurrence=daily \
  --retention=7d

# Restore a backup into a NEW database (restore is never in place)
gcloud firestore databases restore \
  --source-backup=projects/PRJ/locations/nam5/backups/BACKUP_ID \
  --destination-database=restored-db

Like Cloud SQL, restore creates a new database — it never overwrites the live one — so recovery is non-destructive: you restore to a fresh database, verify, then migrate or repoint.

Managed export/import

The older, still-useful tool is managed export/import: Firestore writes the database (or selected collections) to a Cloud Storage bucket, and you can import it back (into the same or a different database/project) or load it into BigQuery for analytics. A plain export is not guaranteed to be a transactionally consistent, point-in-time snapshot of the whole database, because documents can change while it runs; with PITR enabled you can export from a fixed snapshot time instead. Prefer backups/PITR for disaster recovery and use export for migration, cross-project copies and analytics.

gcloud firestore export gs://my-firestore-exports/$(date +%F) \
  --database='(default)'

Tool | Protects against | Restores to | Note
PITR | Recent logical errors (bad write/delete) | A new database, any minute in last 7 days | Rolling 7-day window
Scheduled backups | Data loss / corruption | A new database | Daily/weekly, retained per policy
Export/import | Migration, cross-project copy, analytics | Same/other DB, or BigQuery | Not a substitute for backups

Location and multi-region: the durability/latency choice

When you create a database you pick a location, and like the mode it is permanent for that database. Two flavours:

Property | Regional | Multi-region
Failure domain tolerated | A zone in the region | An entire region
Replication | Synchronous across zones | Synchronous across regions
Availability SLA | 99.99% (high) | 99.999% (highest)
Write latency | Lower | Higher (cross-region consensus)
Cost | Lower | Higher
Choose for | Single-region apps, cost-sensitive | Global, mission-critical

Two rules: the location is immutable (to move regions you create a new database and migrate via export/import), and for multi-database projects each database has its own location, so you can mix (a regional analytics DB next to a multi-region production DB). Pick the region nearest your users (and your other GCP services) to minimise latency and egress.

Figure: Google Cloud Firestore deep dive

The diagram captures the whole model in one frame: the collection → document → field tree with a subcollection hanging off a document; the two access paths converging on the database — client SDKs gated by Security Rules on the left and server / Admin SDK gated by IAM on the right; the index layer underneath (automatic single-field plus declared composite) that every query reads from; and the durability stack — PITR, scheduled backups, export — with the regional vs multi-region replication choice. Keep this picture: a tree you query through indexes, two security gates for two callers, and a layered durability story.

Hands-on lab

We will create a Native-mode Firestore database, write and query documents with the CLI, trigger and create a composite index, set a TTL policy, enable PITR, and clean up. Firestore has a generous Always Free tier (a daily allowance of reads/writes/deletes and 1 GiB stored), so this lab typically costs nothing; we still clean up. Use a sandbox project.

0. Set context and enable the API.

gcloud config set project YOUR_SANDBOX_PROJECT
gcloud services enable firestore.googleapis.com

1. Create a Native-mode database in a region.

gcloud firestore databases create \
  --database='(default)' \
  --location=us-central1 \
  --type=firestore-native

Expected: the command reports the database created in firestore-native mode. Verify:

gcloud firestore databases describe --database='(default)' \
  --format="yaml(type, locationId, pointInTimeRecoveryEnablement)"

Expected: type: FIRESTORE_NATIVE, locationId: us-central1.

2. Write a few documents. The gcloud firestore surface for document writes is limited, so this lab calls the Firestore REST endpoint with curl, using an access token from gcloud. You can create the same documents in the console if you prefer.

ACCESS_TOKEN=$(gcloud auth print-access-token)
PROJECT=$(gcloud config get-value project)
curl -s -X POST \
  "https://firestore.googleapis.com/v1/projects/${PROJECT}/databases/(default)/documents/orders" \
  -H "Authorization: Bearer ${ACCESS_TOKEN}" -H "Content-Type: application/json" \
  -d '{ "fields": {
        "status":    { "stringValue": "paid" },
        "total":     { "integerValue": "4999" },
        "createdAt": { "timestampValue": "2026-06-15T09:00:00Z" } } }'

Expected: JSON describing the created document with an auto-generated name (path). Repeat with different status/total values to have data to query.
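Writing the typed JSON by hand gets tedious when you repeat this step. The sketch below generates the request body from a plain object. The helper names are hypothetical, and it assumes every JavaScript integer should be sent as integerValue (as a string, as in the request above) and every other number as doubleValue; bytes, references and geopoints are not handled.

```javascript
// Convert a plain JavaScript value into the typed Value JSON used by the Firestore REST API.
// Assumption: integers are sent as integerValue strings, other numbers as doubleValue.
function toValue(v) {
  if (v === null) return { nullValue: null };
  if (typeof v === "boolean") return { booleanValue: v };
  if (typeof v === "number") {
    return Number.isInteger(v) ? { integerValue: String(v) } : { doubleValue: v };
  }
  if (typeof v === "string") return { stringValue: v };
  if (v instanceof Date) return { timestampValue: v.toISOString() };
  if (Array.isArray(v)) return { arrayValue: { values: v.map(toValue) } };
  if (typeof v === "object") return { mapValue: { fields: toFields(v) } };
  throw new TypeError(`unsupported value: ${String(v)}`);
}

// Convert every property of an object into a typed field.
function toFields(obj) {
  return Object.fromEntries(Object.entries(obj).map(([key, val]) => [key, toValue(val)]));
}

const body = {
  fields: toFields({
    status: "paid",
    total: 4999,
    createdAt: new Date("2026-06-15T09:00:00Z"),
  }),
};
console.log(JSON.stringify(body));
```

The printed string can be passed to curl with -d in place of the hand-written body, changing status and total for each additional document.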

3. List the documents (a simple read).

curl -s \
  "https://firestore.googleapis.com/v1/projects/${PROJECT}/databases/(default)/documents/orders" \
  -H "Authorization: Bearer ${ACCESS_TOKEN}" | head -40

Expected: the documents you wrote.

4. Force a composite-index requirement. A query that filters on status and orders by createdAt needs a composite index. Run such a structured query and observe the error that hands you the index. (Using the runQuery REST endpoint.)

curl -s -X POST \
  "https://firestore.googleapis.com/v1/projects/${PROJECT}/databases/(default)/documents:runQuery" \
  -H "Authorization: Bearer ${ACCESS_TOKEN}" -H "Content-Type: application/json" \
  -d '{ "structuredQuery": {
        "from": [{ "collectionId": "orders" }],
        "where": { "fieldFilter": { "field": { "fieldPath": "status" },
                   "op": "EQUAL", "value": { "stringValue": "paid" } } },
        "orderBy": [{ "field": { "fieldPath": "createdAt" }, "direction": "DESCENDING" }] } }'

Expected: an error of type FAILED_PRECONDITION whose message contains a console URL to create the exact composite index (status ASC, createdAt DESC). Create it via CLI:

gcloud firestore indexes composite create \
  --database='(default)' \
  --collection-group=orders \
  --field-config=field-path=status,order=ascending \
  --field-config=field-path=createdAt,order=descending

Expected: the index is created and begins building. List it:

gcloud firestore indexes composite list --database='(default)' \
  --format="table(name.basename(), state)"

Expected: the index reaching state READY. Re-run the query from step 4 — it now succeeds.

5. Set a TTL policy. Suppose orders carry an expireAt timestamp; expire them automatically.

gcloud firestore fields ttls update expireAt \
  --collection-group=orders --database='(default)' --enable-ttl
gcloud firestore fields ttls list --collection-group=orders --database='(default)'

Expected: a TTL configuration on expireAt in state ACTIVE.

6. Enable point-in-time recovery.

gcloud firestore databases update --database='(default)' --enable-pitr
gcloud firestore databases describe --database='(default)' \
  --format="value(pointInTimeRecoveryEnablement)"

Expected: POINT_IN_TIME_RECOVERY_ENABLED.

7. Cleanup. Delete the documents you created (delete by path), remove the composite index, and — if this is a throwaway project — delete the database.

# Delete the composite index
gcloud firestore indexes composite list --database='(default)' --format="value(name)" \
  | xargs -I{} gcloud firestore indexes composite delete {} --quiet

# Delete the database entirely (only on a sandbox you own)
gcloud firestore databases delete --database='(default)' --quiet

Cost note. This lab fits inside the Firestore Always Free daily allowance (tens of thousands of reads/writes/deletes per day and 1 GiB storage), so it is normally free. Stored data, reads/writes beyond the free tier, PITR retention and multi-region replication are the things that cost money in production; the free tier easily covers a small lab. Deleting the database stops all storage charges.

Common mistakes & troubleshooting

Symptom | Likely cause | Fix
Query fails with FAILED_PRECONDITION: The query requires an index | No composite index for that filter+order combination | Click the console link in the error (or gcloud firestore indexes composite create); wait for READY
“I filtered on a range and ordered by another field and it errors” | The inequality/range field must be the first orderBy | orderBy the range field first, then the others; add the matching composite index
Client app can read data it shouldn’t (or all data) | Security Rules left in test mode (if true) or too permissive | Write least-privilege rules; remember rules deny by default and are not filters
Admin SDK / Cloud Function ignores my Security Rules | IAM, not rules, governs server/Admin SDK access | Secure the service account’s IAM role (roles/datastore.user) and validate in code
Writes are slow / a document is a bottleneck | A single hot document (counter) hits the ~1 write/sec per-document limit | Use a sharded distributed counter; spread writes; use random IDs
Document write rejected as too large | Hit the 1 MiB document limit (unbounded array/map) | Move the growing data into a subcollection of small documents
Deleted a parent document but child data remains | Subcollections are not cascaded on parent delete | Delete subcollection documents explicitly (recursive delete / TTL / bulk delete)
Surprisingly high write bill | Every field is auto-indexed (asc + desc), so each write updates many index entries | Exempt large/unused fields from single-field indexing (gcloud firestore indexes fields update ... --disable-indexes)
Stale results in a real-time listener | Listener served from local cache (offline / fromCache) before server response | Check the fromCache flag; the server snapshot follows
not-in / != query missing some documents | These operators exclude documents where the field is absent | Ensure the field exists, or model with a sentinel value
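The sharded distributed counter mentioned in the table works by spreading increments over several documents and summing them on read. The sketch below is an in-memory model with hypothetical names; in a real database each shard would be its own document and each increment a write to one of them.

```javascript
// Hypothetical in-memory stand-in for N shard documents under one counter.
function createCounter(numShards) {
  return Array.from({ length: numShards }, () => ({ count: 0 }));
}

// Each increment touches one randomly chosen shard, spreading the write load.
function increment(shards, random = Math.random) {
  const i = Math.floor(random() * shards.length);
  shards[i].count += 1;
  return i;
}

// Reading the total means reading every shard and summing.
function total(shards) {
  return shards.reduce((sum, s) => sum + s.count, 0);
}

const shards = createCounter(10);
for (let n = 0; n < 1000; n++) increment(shards);
console.log(total(shards)); // 1000
```

The trade-off is explicit: with N shards the sustainable write rate scales roughly N times, while every read of the total costs N document reads, so pick the smallest shard count that absorbs your peak write rate.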

Best practices

Security notes

Cost & sizing

Firestore bills on operations and storage, not provisioned servers — there is nothing to size, so the levers are about how much you do:

  1. Document reads / writes / deletes. The primary cost. Every document a query returns is a read; every changed document a listener delivers is a read; every document write/delete is billed. Denormalise carefully (fan-out writes cost), scope queries and listeners tightly, and use aggregation queries (count) instead of reading documents just to count them.
  2. Index writes (hidden in write cost). Each indexed field adds index-entry writes per document write. Exempt fields you never query to cut this.
  3. Stored data. GiB-months of documents and indexes (indexes can be a large fraction of storage). PITR and backups add storage.
  4. Network egress. Reads to clients in other regions/continents incur egress; co-locate.
  5. Multi-region. Costs more than regional for both storage and writes (cross-region replication) — pay for it only where you need region-failure survival.
  6. Free tier. A daily Always Free allowance of reads/writes/deletes plus 1 GiB storage covers small apps and all labs.

Right-sizing in Firestore means modelling for fewer operations (the right denormalisation, the right indexes, cursor pagination, scoped listeners) rather than picking a machine.
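
To make the "index writes" lever concrete, here is a rough back-of-envelope model in Python. It assumes only what this lesson states, namely that each automatically indexed field contributes one ascending and one descending index entry per document write; it deliberately ignores arrays, maps and composite indexes, and the function name is hypothetical. Treat it as an estimate of relative effect, not a billing formula.

```python
def single_field_index_entries(fields, exempt=()):
    """Estimate index entries touched by one document write.

    Assumes two entries (ascending + descending) per indexed field and
    ignores arrays, maps and composite indexes.
    """
    indexed = [name for name in fields if name not in set(exempt)]
    return 2 * len(indexed)


message_fields = ["authorId", "text", "createdAt", "expireAt"]

before = single_field_index_entries(message_fields)
after = single_field_index_entries(message_fields, exempt=["text"])
print(before, after)  # 8 6
```

Under these assumptions, exempting one never-queried field removes a quarter of the index entries written for every message.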

Interview & exam questions

  1. What is the difference between Firestore Native mode and Datastore mode, and can you switch? Both run on the same Firestore backend, but Native mode exposes client SDKs, real-time listeners, offline persistence and Security Rules; Datastore mode is server-only (IAM-only, no listeners/clients) and keeps the legacy Datastore entity/API for backward compatibility. The mode is fixed at database creation and cannot be changed — you migrate to a new database. Choose Native for new work.

  2. Why does a Firestore query sometimes fail asking for an index? Because every query is served by an index; there are no table scans. Single-field indexes are automatic, but a query that filters and orders on different fields (or combines an equality filter with a range filter on another field) needs a composite index, which is not auto-created. Firestore returns a FAILED_PRECONDITION error with a link to build the exact index.

  3. Explain the inequality/range-and-order rule. If a query uses a range/inequality filter on a field and also orders results, the first orderBy must be on that same field, and a matching composite index must exist. Historically range filters were limited to one field per query; modern Firestore relaxed that, but the order-the-inequality-field-first discipline (and the index) remains.

  4. Security Rules vs IAM — which applies when? Security Rules govern client SDK access (mobile/web with Firebase Auth identities) and run at the Firestore boundary, default-deny. IAM governs server/Admin SDK and gcloud access; the Admin SDK bypasses Security Rules and is limited only by its IAM role. Datastore mode has IAM only.

  5. Are Security Rules a query filter? No. Rules allow or deny an operation; they do not narrow results. A query that could return documents the rules forbid is rejected entirely, so you must constrain the query to only request permitted documents.

  6. Batched write vs transaction? A batched write atomically applies a set of writes (historically capped at 500 per batch; check the current limits) with no reads and no contention check. A transaction is a read-then-write with optimistic concurrency: it re-runs if any read document changed before commit, guaranteeing correct read-modify-write. Use transactions for counters/inventory; batches for fan-out and bulk writes.

  7. What is the per-document write limit, and how do you scale a hot counter past it? A single document supports roughly one sustained write per second. For a high-write counter, use a sharded/distributed counter (split it into N documents and sum on read) to multiply throughput.

  8. How do you paginate efficiently, and why not use offsets? Use cursors (startAfter with the last document or a field value). Offsets still read and bill the skipped entries (O(offset)); cursors are O(page size) and anchored in the index.

  9. Why can’t you store an ever-growing chat log in one document’s array? The 1 MiB document limit — and every write rewrites the whole document. Put each message in a subcollection of small documents instead.

  10. What does TTL do and how immediate is it? A TTL policy auto-deletes documents once a nominated timestamp field passes. Deletion is best-effort within ~24 hours, not instant — enforce hard expiry in queries/rules too. TTL deletes still bill and fire listeners/triggers.

  11. Regional vs multi-region Firestore? Regional replicates synchronously across zones in one region (zone-failure tolerant, lower cost/latency). Multi-region replicates synchronously across regions (region-failure tolerant, 99.999% SLA, higher cost/latency). The location is permanent.

  12. When would you choose Firestore over Bigtable or Cloud SQL? Firestore for app/document data needing real-time sync, flexible schema, strong consistency and direct client access at moderate per-key throughput. Bigtable for massive, write-heavy, low-latency wide-column workloads (time-series, IoT, analytics keyed by row) at huge scale. Cloud SQL when you need a relational model with joins, transactions across rows, and SQL on a single-primary managed engine.

  13. How do you recover from a bad bulk write made an hour ago? Use point-in-time recovery (rolling 7-day window) to restore the database as of a minute before the mistake into a new database, verify, then repoint — or restore a scheduled backup. Restore is never in place.

  14. Does deleting a document delete its subcollections? No. Subcollections are independent; you must delete their documents explicitly (recursive delete, bulk delete, or a TTL policy).
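
The cursor-versus-offset answer (question 8) can be shown with a toy index. The sketch below is plain Python, not Firestore SDK code: the sorted list stands in for an index on createdAt, and the function names are hypothetical. It counts how many index entries each approach walks to return the same page.

```python
from bisect import bisect_right

# Hypothetical index: createdAt values sorted ascending, one entry per document.
index = list(range(1, 1001))
PAGE_SIZE = 20


def page_by_offset(offset):
    """Walk from the start of the index; skipped entries are still scanned."""
    scanned = index[: offset + PAGE_SIZE]
    return scanned[offset:], len(scanned)


def page_by_cursor(last_value):
    """Seek directly past the last value seen, then read one page."""
    start = 0 if last_value is None else bisect_right(index, last_value)
    page = index[start : start + PAGE_SIZE]
    return page, len(page)


offset_page, offset_cost = page_by_offset(900)
cursor_page, cursor_cost = page_by_cursor(900)
print(offset_page == cursor_page, offset_cost, cursor_cost)  # True 920 20
```

Both calls return the same 20 entries, but the offset version walks 920 to get there, which is the O(offset) cost the answer describes.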

Quick check

  1. You are building a new mobile app that needs offline support and live updates — Native mode or Datastore mode?
  2. A query filters where('status','==','open') and orderBy('createdAt','desc') and errors. What do you create, and why?
  3. True/false: Security Rules act as a filter that trims a query’s results to the documents a user may see.
  4. Which atomicity primitive guarantees a correct read-modify-write of a counter under contention?
  5. You need to keep ever-growing per-user event records. Where do they go, and why not in the user document?

Answers

  1. Native mode — it is the only mode with client SDKs, offline persistence, real-time listeners and Security Rules.
  2. A composite index on (status ASC, createdAt DESC) — every Firestore query is served by an index, and a filter-plus-order on different fields needs a composite one (not auto-created).
  3. False — rules allow or deny an operation as a whole; they are not filters. A query that could return forbidden documents is rejected; you must constrain the query.
  4. A transaction (optimistic concurrency, automatic retry on conflict). For very high write rates, shard the counter.
  5. In a subcollection of small documents under the user (users/{uid}/events/{eventId}). A single document caps at 1 MiB, so an unbounded array would eventually exceed it, and every append rewrites the whole document.
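
Answer 4 relies on optimistic concurrency with automatic retry. As a minimal sketch of that mechanism (an in-memory simulation with hypothetical names, not the Firestore transaction API), the code below versions a document, rejects a commit if the version changed since the read, and re-runs the update function when that happens.

```python
class Conflict(Exception):
    """Raised when a document changed between read and commit."""


class VersionedDoc:
    def __init__(self, value):
        self.value = value
        self.version = 0

    def read(self):
        return self.value, self.version

    def commit(self, new_value, expected_version):
        # Reject the write if someone else committed since our read.
        if self.version != expected_version:
            raise Conflict("document changed since it was read")
        self.value = new_value
        self.version += 1


def run_transaction(doc, update, max_attempts=5, before_commit=None):
    """Read, compute, commit; re-run the whole function on conflict."""
    for attempt in range(1, max_attempts + 1):
        value, version = doc.read()
        new_value = update(value)
        if before_commit is not None:
            before_commit(attempt)  # test hook to inject a rival write
        try:
            doc.commit(new_value, version)
            return attempt
        except Conflict:
            continue
    raise RuntimeError("transaction failed after retries")


counter = VersionedDoc(10)


def rival_write(attempt):
    # Simulate another client committing during our first attempt only.
    if attempt == 1:
        counter.commit(counter.value + 100, counter.version)


attempts = run_transaction(counter, lambda v: v + 1, before_commit=rival_write)
print(attempts, counter.value)  # 2 111
```

The first attempt computed 11 from a stale read and was rejected; the retry read 110 and committed 111, so neither increment was lost.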

Exercise

Design and partially build a small “team chat” data model in a sandbox project, Native mode:

  1. Create a Native-mode database in a region near you; enable PITR and a daily backup schedule.
  2. Model workspaces/{wsId}/channels/{chId}/messages/{msgId} with messages carrying authorId, text, createdAt, and an expireAt for ephemeral messages.
  3. Write Security Rules so a user can read a channel only if they are a member (use a get() lookup of a membership document) and can create messages only as themselves with a server timestamp; make messages immutable.
  4. Add the composite index needed to list a channel’s messages by createdAt filtered by authorId, deploying it from a firestore.indexes.json.
  5. Exempt the text field from single-field indexing (you never query message text) and note the storage/write-cost effect.
  6. Add a TTL policy on expireAt for ephemeral messages, and a sharded counter for channels/{chId} unread-message counts.
  7. Implement a transaction that decrements a user’s “daily message quota” document atomically when they post.
  8. Use PITR to “undo” a deliberately bad bulk delete by restoring to a new database and comparing.

Write a short paragraph for each of the two security gates (Rules vs IAM) explaining which callers each protects and what would break if you relied on the wrong one.
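
As a starting point for steps 4 and 5, the following Python sketch prints a candidate firestore.indexes.json. The field choices come from the exercise (authorId, createdAt, text), but the sort directions are an assumption about how you want to list messages, so adjust them to match your actual query. An empty "indexes" list in a field override is used here to switch off single-field indexing for text.

```python
import json

index_config = {
    "indexes": [
        {
            # Composite index for "messages by one author, newest first".
            "collectionGroup": "messages",
            "queryScope": "COLLECTION",
            "fields": [
                {"fieldPath": "authorId", "order": "ASCENDING"},
                {"fieldPath": "createdAt", "order": "DESCENDING"},
            ],
        }
    ],
    "fieldOverrides": [
        # No single-field indexes for message text, which is never queried.
        {"collectionGroup": "messages", "fieldPath": "text", "indexes": []}
    ],
}

print(json.dumps(index_config, indent=2))
```

Save the printed JSON as firestore.indexes.json and deploy it with your usual tooling; verify in the console that the composite index builds and that text shows as exempted.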

Certification mapping

Glossary

Next steps

