Cloud Storage durability (eleven nines) protects you from disk failure. It does nothing for the threats that actually destroy data in production: a fat-fingered gsutil rm -r, a compromised service account running objects.delete across a bucket, a compliance auditor who needs to prove an object could not have been altered, or a regional outage during a sensitive batch window. Each of those is a different failure mode, and Cloud Storage gives you a different control for each. The mistake teams make is treating them as interchangeable. They are not. Versioning, soft delete, retention policies, object holds, and Object Retention are overlapping but distinct safety nets, and a serious design uses several of them at once. This guide walks through each control, where it stops and the next one starts, and how to compose them into a ransomware-resilient bucket whose recovery you have actually rehearsed.
1. Bucket retention policies and locking for compliance immutability
A bucket-level retention policy sets a minimum duration that every object in the bucket must persist before it can be deleted or replaced. It is a floor on object lifetime, enforced server-side. While the policy is in force, an object whose age is below the retention period cannot be deleted or overwritten, regardless of IAM.
# Apply a 7-year retention policy (in seconds) to a bucket
gcloud storage buckets update gs://kv-compliance-archive \
--retention-period=220752000s
# Inspect it
gcloud storage buckets describe gs://kv-compliance-archive \
--raw --format="yaml(retentionPolicy)"
A retention policy on its own is mutable: an admin can shorten or remove it. For regulated immutability (SEC 17a-4, FINRA, many internal “legal hold infrastructure” requirements) you must lock it. Locking is irreversible. Once locked, the retention period can be increased but never decreased or removed, and the bucket itself cannot be deleted while it holds objects under retention.
# Irreversible. There is no undo. Increasing the period later is the only edit allowed.
gcloud storage buckets update gs://kv-compliance-archive --lock-retention-period
Treat --lock-retention-period like a one-way door in a change ticket. I require a second approver and a written confirmation of the period, because the only way to “fix” an over-long locked period is to abandon the bucket once objects age out. Test the whole flow in a throwaway bucket first.
A subtle point that trips people up: the retention clock is based on each object’s creation time, not when the policy was applied. Applying a 7-year policy today does not protect a 6-year-old object for 7 more years; that object is protected for one more year, until it reaches 7 years of age. An object that is already older than 7 years satisfies the floor immediately and is deletable.
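The clock arithmetic is easy to check by hand. The sketch below is a simplified model, not a call into any Cloud Storage API: the function names are hypothetical, and it assumes 365-day years, which is what the 220752000s value used earlier corresponds to.

```python
from datetime import datetime, timedelta, timezone

SECONDS_PER_DAY = 86_400


def years_to_seconds(years: int) -> int:
    """Retention period in seconds, assuming 365-day years (no leap days)."""
    return years * 365 * SECONDS_PER_DAY


def retention_expiry(created: datetime, retention_seconds: int) -> datetime:
    """Earliest moment the object may be deleted: creation time plus the period."""
    return created + timedelta(seconds=retention_seconds)


def is_deletable(created: datetime, retention_seconds: int, now: datetime) -> bool:
    """True once the object's age has reached the retention floor."""
    return now >= retention_expiry(created, retention_seconds)


period = years_to_seconds(7)  # 220752000
policy_applied = datetime(2026, 1, 1, tzinfo=timezone.utc)  # hypothetical date
six_years_old = policy_applied - timedelta(days=6 * 365)
eight_years_old = policy_applied - timedelta(days=8 * 365)

# The 6-year-old object still has one year left; the 8-year-old one has none.
print(is_deletable(six_years_old, period, policy_applied))
print(retention_expiry(six_years_old, period) - policy_applied)
print(is_deletable(eight_years_old, period, policy_applied))
```

The date the policy was applied never enters the calculation; only creation time and the period do.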
2. Object versioning vs soft delete: overlapping but distinct safety nets
Both protect against deletion and overwrite, but they answer different questions.
Object versioning keeps a noncurrent version every time you overwrite or delete a live object. It is opt-in, has no fixed expiry (versions live until lifecycle or you remove them), and is your primary tool for intentional version history and rollback of overwrites.
gcloud storage buckets update gs://kv-app-state --versioning
Soft delete is on by default for every new bucket and retains deleted and overwritten objects for a configurable window (default 7 days; settable from 7 to 90 days, or 0 to disable it). Crucially, it covers objects even in buckets without versioning, and it survives gcloud storage rm. It is your “oops” net for accidental and malicious deletes.
# Set a 30-day soft delete retention window
gcloud storage buckets update gs://kv-app-state \
--soft-delete-duration=30d
# List soft-deleted objects and restore one
gcloud storage ls --soft-deleted gs://kv-app-state/
gcloud storage restore gs://kv-app-state/path/to/object.parquet
| Property | Versioning | Soft delete |
|---|---|---|
| Default state | Off | On (7 days) |
| Expiry | None (until lifecycle/manual) | 7-90 day window (0 disables) |
| Covers overwrite | Yes (noncurrent version) | Yes |
| Covers rm of live object | Yes | Yes |
| Cost model | You store all versions indefinitely | You store deleted bytes for the window |
| Best for | History, rollback | Accidental/malicious delete recovery |
They stack. Run both: versioning for deliberate history, soft delete as a time-boxed safety net that catches the case where someone deletes all versions.
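To make the "time-boxed" part concrete, here is a minimal sketch of the soft delete window. It is a local model only; the function names are hypothetical, and it assumes the documented limits of 0 (disabled) or 7 to 90 days.

```python
from datetime import datetime, timedelta, timezone


def validate_soft_delete_days(days: int) -> int:
    """Accept 0 (soft delete disabled) or a window of 7 to 90 days."""
    if days == 0 or 7 <= days <= 90:
        return days
    raise ValueError(f"soft delete window must be 0 or 7-90 days, got {days}")


def purge_time(soft_deleted_at: datetime, window_days: int) -> datetime:
    """Moment after which the soft-deleted object is gone for good."""
    return soft_deleted_at + timedelta(days=window_days)


def restorable(soft_deleted_at: datetime, window_days: int, now: datetime) -> bool:
    """True while a restore is still possible."""
    return window_days > 0 and now < purge_time(soft_deleted_at, window_days)


deleted_at = datetime(2026, 3, 1, tzinfo=timezone.utc)  # hypothetical incident time
window = validate_soft_delete_days(30)

# Noticed after 10 days: still recoverable. Noticed after 45 days: too late.
print(restorable(deleted_at, window, deleted_at + timedelta(days=10)))
print(restorable(deleted_at, window, deleted_at + timedelta(days=45)))
```

The practical consequence is that the window should be longer than your realistic detection time, not your ideal one.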
3. Object holds: event-based and temporary holds for legal preservation
Holds are a per-object flag that blocks deletion and overwrite while set, independent of any retention period. They are how you preserve a specific object indefinitely (litigation, investigation) without imposing a bucket-wide policy.
- Temporary hold: a simple on/off latch. While set, the object cannot be deleted or replaced. Cleared manually.
- Event-based hold: also blocks deletion, and additionally resets the object’s retention period when the hold is released, so the retention clock starts from release time. This is the building block for record-keeping where the clock should start at an event (account closure, contract end), not at object creation.
# Place a temporary hold (e.g., a legal preservation request landed)
gcloud storage objects update gs://kv-records/case-4471/contract.pdf \
--temporary-hold
# Release it later
gcloud storage objects update gs://kv-records/case-4471/contract.pdf \
--no-temporary-hold
# Default new objects in a bucket to event-based hold on upload
gcloud storage buckets update gs://kv-records --default-event-based-hold
An object with any hold set will not be deleted even after its retention period expires. Holds win.
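The interaction between holds and the retention clock can be sketched as a small state model. This is an illustration of the rules described above, not the service's implementation; the class and function names are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional


@dataclass
class StoredObject:
    created: datetime
    temporary_hold: bool = False
    event_based_hold: bool = False
    # Set when an event-based hold is released; restarts the retention clock.
    clock_reset_at: Optional[datetime] = None

    def clock_start(self) -> datetime:
        return self.clock_reset_at or self.created


def release_event_based_hold(obj: StoredObject, now: datetime) -> None:
    """Releasing an event-based hold starts the retention period from 'now'."""
    obj.event_based_hold = False
    obj.clock_reset_at = now


def can_delete(obj: StoredObject, retention: timedelta, now: datetime) -> bool:
    """Any hold blocks deletion; otherwise the retention floor decides."""
    if obj.temporary_hold or obj.event_based_hold:
        return False
    return now >= obj.clock_start() + retention


utc = timezone.utc
one_year = timedelta(days=365)
record = StoredObject(created=datetime(2020, 1, 1, tzinfo=utc), event_based_hold=True)

# Six years old, but still held: not deletable.
print(can_delete(record, one_year, datetime(2026, 1, 1, tzinfo=utc)))

# Account closes: release the hold, and the one-year clock starts now.
release_event_based_hold(record, datetime(2026, 1, 1, tzinfo=utc))
print(can_delete(record, one_year, datetime(2026, 6, 1, tzinfo=utc)))
print(can_delete(record, one_year, datetime(2027, 1, 2, tzinfo=utc)))
```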
4. Lifecycle management: storage class transitions and noncurrent cleanup
Versioning without lifecycle is a slow-motion budget leak: every overwrite accretes a noncurrent version that you keep paying for indefinitely. Lifecycle rules tier aging data down and reap old versions. The key conditions for a protection-tuned policy are daysSinceNoncurrentTime (age of a noncurrent version) and numNewerVersions (how many newer versions exist).
{
"rule": [
{
"action": {"type": "SetStorageClass", "storageClass": "NEARLINE"},
"condition": {"age": 30, "matchesStorageClass": ["STANDARD"]}
},
{
"action": {"type": "SetStorageClass", "storageClass": "COLDLINE"},
"condition": {"age": 90, "matchesStorageClass": ["NEARLINE"]}
},
{
"action": {"type": "Delete"},
"condition": {
"daysSinceNoncurrentTime": 30,
"numNewerVersions": 3,
"isLive": false
}
}
]
}
gcloud storage buckets update gs://kv-app-state --lifecycle-file=lifecycle.json
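The third rule deletes a noncurrent version only when both conditions hold: it has at least three newer versions and it has been noncurrent for at least 30 days. Because numNewerVersions counts the live version, this keeps the live object plus the two most recent noncurrent versions, however old they are. Tune numNewerVersions to your recovery point objective for overwrites.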
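Because the two conditions in the Delete rule are ANDed, it helps to dry-run the rule against a version history before applying it. The sketch below is a local simulation under the assumption that numNewerVersions counts every newer generation, including the live one; the Version class and function name are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Version:
    generation: int
    days_noncurrent: Optional[int]  # None means this is the live version


def versions_to_delete(
    versions: List[Version],
    days_since_noncurrent: int = 30,
    num_newer_versions: int = 3,
) -> List[int]:
    """Generations the Delete rule would match (all conditions must hold)."""
    newest_first = sorted(versions, key=lambda v: v.generation, reverse=True)
    matched = []
    for newer_count, version in enumerate(newest_first):
        if version.days_noncurrent is None:
            continue  # isLive: false excludes the live version
        old_enough = version.days_noncurrent >= days_since_noncurrent
        enough_newer = newer_count >= num_newer_versions
        if old_enough and enough_newer:
            matched.append(version.generation)
    return matched


history = [
    Version(6, None),  # live
    Version(5, 5),
    Version(4, 20),
    Version(3, 40),
    Version(2, 60),
    Version(1, 90),
]
print(versions_to_delete(history))  # [3, 2, 1]
```

In this hypothetical history the survivors are generation 6 (live) plus generations 5 and 4, which is the behavior to keep in mind when sizing numNewerVersions.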
Lifecycle and retention interact with a hard guarantee: a Delete action will never remove an object that is still under an active retention policy or hold. Retention always wins over lifecycle, so you can run aggressive cleanup rules on a locked bucket without fear of violating immutability. Also note that Autoclass is the hands-off alternative to manual class transitions, but by default it stops at Nearline (Archive as the terminal class is opt-in) and it bills a per-object management fee; use explicit rules when you need deterministic transitions or predictable cost.
5. Dual-region buckets and turbo replication RPO/RTO characteristics
Region resilience is a location property set at bucket creation. A dual-region bucket stores data in two specific regions (a predefined pair such as nam4 = us-central1 + us-east1, or a configurable dual-region you choose). Default replication is asynchronous and best-effort: the design target is 99.9% of newly written objects replicated within one hour and the rest within 12 hours, with no shorter recovery point guaranteed.
Turbo replication adds an RPO guarantee: GCS targets replicating newly written objects to the second region within 15 minutes (a Recovery Point Objective, not an RTO). It is available only on dual-region buckets and can be set at creation or enabled later on an existing dual-region bucket with --rpo=ASYNC_TURBO.
# Create a configurable dual-region bucket with turbo replication
gcloud storage buckets create gs://kv-critical-pipeline \
--location=us \
--placement=us-central1,us-east1 \
--rpo=ASYNC_TURBO \
--uniform-bucket-level-access
# Inspect the bucket's RPO setting and placement configuration
gcloud storage buckets describe gs://kv-critical-pipeline \
--raw --format="yaml(rpo,customPlacementConfig)"
Mental model for the geo controls:
- RPO (data-loss window): standard async is best-effort; turbo targets 15 minutes for new writes.
- RTO (time to serve from the other region): effectively zero from the application’s view because the bucket is a single namespace fronting both regions; a read served from the surviving region is transparent. There is no failover step to perform.
- Turbo only protects objects written after it is enabled and only on dual-region. Backfilling an existing single-region bucket means a new bucket plus a copy (gcloud storage cp or Storage Transfer Service).
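The RPO bullet translates into a simple worst-case question during an incident: which writes landed too close to the outage to be assumed replicated? A rough sketch, assuming the 15-minute turbo target as a bound rather than a measurement (the function name and timestamps are hypothetical):

```python
from datetime import datetime, timedelta, timezone
from typing import List

TURBO_RPO = timedelta(minutes=15)


def writes_at_risk(
    write_times: List[datetime],
    outage_start: datetime,
    rpo: timedelta = TURBO_RPO,
) -> List[datetime]:
    """Writes inside the RPO window before the outage.

    These are the objects that may not have reached the second region yet.
    It is a worst-case bound: most of them will usually have replicated.
    """
    window_start = outage_start - rpo
    return [t for t in write_times if window_start < t <= outage_start]


utc = timezone.utc
outage = datetime(2026, 5, 4, 12, 0, tzinfo=utc)
batch_writes = [
    datetime(2026, 5, 4, 11, 30, tzinfo=utc),
    datetime(2026, 5, 4, 11, 50, tzinfo=utc),
    datetime(2026, 5, 4, 11, 59, tzinfo=utc),
]

for t in writes_at_risk(batch_writes, outage):
    print(t.isoformat())
```

Those are the objects a batch job should be prepared to re-verify or rewrite after the outage; anything written earlier falls outside the exposure window under the stated assumption.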
6. Object Retention Lock for per-object WORM requirements
Bucket retention applies one floor to the whole bucket. Object Retention (Object Retention Lock) sets a retain-until timestamp on individual objects, so different objects in the same bucket can carry different WORM durations. The bucket must have the object-retention capability enabled at creation; it cannot be turned on later.
# Object retention must be enabled at bucket creation
gcloud storage buckets create gs://kv-mixed-records \
--enable-per-object-retention \
--uniform-bucket-level-access
# Set an Unlocked retention until a date (can be shortened/removed while Unlocked)
gcloud storage objects update gs://kv-mixed-records/report-q1.pdf \
--retain-until=2031-01-01T00:00:00Z \
--retention-mode=Unlocked
# Promote to Locked: now it can only be extended, never reduced
gcloud storage objects update gs://kv-mixed-records/report-q1.pdf \
--retention-mode=Locked
Unlocked lets you correct mistakes (shorten or clear the date, which gcloud only permits with the explicit --override-unlocked-retention flag); Locked is true per-object WORM that can only be lengthened. This is the right tool when a single bucket mixes records with heterogeneous retention durations and a one-size bucket policy would be wrong.
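The Unlocked and Locked rules amount to a small transition table. The sketch below models which updates would be accepted; it is a simplification for reasoning about change requests, and the Retention class, update_allowed, and the override parameter are hypothetical names rather than API objects.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass
class Retention:
    mode: str  # "Unlocked" or "Locked"
    retain_until: datetime


def update_allowed(
    current: Retention,
    new_mode: str,
    new_until: datetime,
    override_unlocked: bool = False,
) -> bool:
    """Return True if the requested change respects the retention mode."""
    if current.mode == "Locked":
        # Locked never goes back to Unlocked, and the date only moves later.
        return new_mode == "Locked" and new_until >= current.retain_until
    # Unlocked: extending or locking is fine; shortening needs an explicit override.
    if new_until >= current.retain_until:
        return True
    return override_unlocked


def rfc3339(moment: datetime) -> str:
    """Format a timestamp the way --retain-until expects it."""
    return moment.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


utc = timezone.utc
unlocked = Retention("Unlocked", datetime(2031, 1, 1, tzinfo=utc))
locked = Retention("Locked", datetime(2031, 1, 1, tzinfo=utc))
earlier = datetime(2030, 1, 1, tzinfo=utc)
later = datetime(2032, 1, 1, tzinfo=utc)

print(update_allowed(unlocked, "Unlocked", earlier))        # shorten without override
print(update_allowed(unlocked, "Unlocked", earlier, True))  # shorten with override
print(update_allowed(locked, "Locked", earlier))            # shorten a Locked object
print(update_allowed(locked, "Locked", later))              # extend a Locked object
print(rfc3339(later))  # 2032-01-01T00:00:00Z
```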
7. IAM and signed URLs: scoping access without weakening protection
Protection controls govern deletion and mutation; IAM governs who can do anything at all. They are independent layers, and a common error is to assume a retention policy compensates for sloppy IAM. It does not: an over-privileged principal can still read, exfiltrate, or (where retention does not apply) overwrite.
Principles I enforce:
- Uniform bucket-level access on protected buckets. It disables per-object ACLs so access is auditable purely through IAM. Required for most org-policy guardrails anyway.
- Split the delete permission out. The dangerous verb is storage.objects.delete. Grant writers roles/storage.objectCreator (create only, no overwrite) rather than roles/storage.objectAdmin wherever the workload only appends.
- No standing delete at the bucket level. Bucket deletion and policy changes belong to a separate, heavily audited admin identity, ideally gated behind VPC Service Controls.
# Append-only writer: can create objects but not delete or overwrite existing ones
gcloud storage buckets add-iam-policy-binding gs://kv-app-state \
--member="serviceAccount:ingest@kv-prod.iam.gserviceaccount.com" \
--role="roles/storage.objectCreator"
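Why objectCreator amounts to append-only comes down to one fact: replacing an existing object needs storage.objects.delete as well as storage.objects.create. The model below illustrates that with deliberately partial permission sets; the real roles contain more permissions than listed here, and the operation names are hypothetical labels.

```python
from typing import Iterable

# Partial, simplified permission sets for illustration only.
ROLE_PERMISSIONS = {
    "roles/storage.objectCreator": {"storage.objects.create"},
    "roles/storage.objectViewer": {"storage.objects.get", "storage.objects.list"},
    "roles/storage.objectAdmin": {
        "storage.objects.create",
        "storage.objects.delete",
        "storage.objects.get",
        "storage.objects.list",
        "storage.objects.update",
    },
}

# Permissions each operation needs. Overwriting replaces an existing object,
# so it requires delete in addition to create.
REQUIRED = {
    "create_new": {"storage.objects.create"},
    "overwrite": {"storage.objects.create", "storage.objects.delete"},
    "delete": {"storage.objects.delete"},
    "read": {"storage.objects.get"},
}


def allowed(roles: Iterable[str], operation: str) -> bool:
    """True if the combined roles grant everything the operation needs."""
    granted = set().union(*(ROLE_PERMISSIONS[role] for role in roles))
    return REQUIRED[operation] <= granted


ingest_roles = ["roles/storage.objectCreator"]
for operation in ("create_new", "overwrite", "delete", "read"):
    print(operation, allowed(ingest_roles, operation))
```

A stolen ingest credential under this binding can add objects but cannot encrypt in place or remove anything, which is exactly the blast radius you want for a writer.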
For sharing specific objects with external parties, use signed URLs rather than broadening IAM. A signed URL grants time-bounded access to a single operation on a single object and expires on its own. Scope it to GET and a short TTL.
# Read-only, 15-minute access to one object, no IAM change required
gcloud storage sign-url gs://kv-records/case-4471/contract.pdf \
--http-verb=GET \
--duration=15m \
--impersonate-service-account=url-signer@kv-prod.iam.gserviceaccount.com
The signed URL never grants delete and never touches the bucket’s protection posture. That is the point: distribute access narrowly without loosening the controls in steps 1-6.
8. Designing a ransomware-resilient bucket and validating recovery
Compose the layers. A ransomware actor with a stolen writer credential will try to encrypt-in-place (overwrite) and then delete originals and versions. Defeat that with overlapping nets:
gcloud storage buckets create gs://kv-resilient \
--location=us \
--placement=us-central1,us-east1 \
--rpo=ASYNC_TURBO \
--uniform-bucket-level-access \
--enable-per-object-retention
# Versioning (deliberate history) + a long soft-delete window (malicious-delete net)
gcloud storage buckets update gs://kv-resilient \
--versioning \
--soft-delete-duration=90d
- Overwrite attempt -> versioning preserves the prior version; the attacker cannot remove history without objects.delete, which append-only writers do not have.
- Delete-all-versions attempt -> soft delete retains them for 90 days; restoration is a gcloud storage restore away.
- Permanent-record subset -> Object Retention Locked makes those objects un-deletable for their full term even by an admin.
- Region failure during the attack window -> turbo replication keeps the second region within 15 minutes RPO, served transparently.
The control nobody tests is the one that fails in the incident. Rehearse it.
Verify
Confirm each net actually catches what it should:
# 1. Retention floor blocks early deletes (expect a failure on a young object)
gcloud storage rm gs://kv-compliance-archive/some-recent-object.bin
# 2. Soft delete caught a deleted object and you can restore it
gcloud storage ls --soft-deleted gs://kv-resilient/ | head
gcloud storage restore gs://kv-resilient/path/to/object.parquet
# 3. Versioning kept a prior generation; list and restore a specific generation
gcloud storage ls --all-versions gs://kv-resilient/path/to/object.parquet
gcloud storage cp \
"gs://kv-resilient/path/to/object.parquet#1718000000000000" \
gs://kv-resilient/path/to/object.parquet
# 4. Object Retention is Locked with the expected retain-until
gcloud storage objects describe gs://kv-mixed-records/report-q1.pdf \
--raw --format="yaml(retention)"
# 5. Turbo replication is configured and placement is correct
gcloud storage buckets describe gs://kv-resilient \
--raw --format="yaml(rpo,customPlacementConfig,softDeletePolicy,versioning)"
If gcloud storage rm on the young object in step 1 succeeds, the retention policy is not what you think it is; stop and re-inspect before declaring the bucket compliant.
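Manual checks drift, so it can be worth turning the posture check into something a pipeline can run. The sketch below audits a bucket metadata dictionary shaped like the JSON API's bucket resource (roughly what a describe with raw JSON output returns). The function name, the 30-day threshold, and the sample metadata are all hypothetical.

```python
import json
from typing import List

SECONDS_PER_DAY = 86_400


def audit_bucket(meta: dict, min_soft_delete_days: int = 30) -> List[str]:
    """Return a list of findings; an empty list means every net is in place."""
    findings = []
    if not meta.get("versioning", {}).get("enabled", False):
        findings.append("versioning is off")
    # The API reports the duration as a string of seconds.
    soft_seconds = int(meta.get("softDeletePolicy", {}).get("retentionDurationSeconds", 0))
    if soft_seconds < min_soft_delete_days * SECONDS_PER_DAY:
        findings.append(f"soft delete window is under {min_soft_delete_days} days")
    if meta.get("rpo") != "ASYNC_TURBO":
        findings.append("turbo replication is not enabled")
    ubla = meta.get("iamConfiguration", {}).get("uniformBucketLevelAccess", {})
    if not ubla.get("enabled", False):
        findings.append("uniform bucket-level access is off")
    if meta.get("objectRetention", {}).get("mode") != "Enabled":
        findings.append("per-object retention is not enabled")
    return findings


# Hypothetical metadata for a correctly configured bucket.
sample = json.loads("""
{
  "name": "kv-resilient",
  "rpo": "ASYNC_TURBO",
  "versioning": {"enabled": true},
  "softDeletePolicy": {"retentionDurationSeconds": "7776000"},
  "iamConfiguration": {"uniformBucketLevelAccess": {"enabled": true}},
  "objectRetention": {"mode": "Enabled"}
}
""")

print(audit_bucket(sample))  # []
print(audit_bucket({"name": "kv-forgotten", "rpo": "DEFAULT"}))
```

An empty list for the resilient bucket and a full list of findings for the bare one is the signal to wire into CI or a scheduled job.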
Enterprise scenario
A fintech platform team I worked with had a single regional us-central1 bucket holding both transactional document records (SEC 17a-4: 6-year WORM) and high-churn ML feature exports that were rewritten hourly. They had naively applied a bucket-wide 6-year locked retention policy to satisfy the auditor. It passed compliance and quietly created a six-figure cost problem: every hourly feature export became another object that the locked policy refused to let lifecycle delete, so the bucket grew without bound, full of data nobody needed for six years. They also had no region resilience for the records, which their DR standard now required.
The constraint: they could not loosen WORM on the records, could not co-mingle the cost profiles, and could not afford downtime to re-architect ingestion.
The fix was to stop using one bucket as one policy. They created a dual-region bucket with per-object retention enabled and turbo replication, moved the records into it, and set Locked retain-until per object so each record carried exactly its own 6-year clock and nothing else did. The feature exports moved to a separate, cheaply lifecycled bucket with no bucket-wide retention. The records bucket gained a 15-minute RPO across regions as part of the same migration, and the runaway cost vanished because the feature data was no longer trapped under a blanket policy.
# Records bucket: dual-region, turbo, per-object WORM, no blanket policy
gcloud storage buckets create gs://kv-records-worm \
--location=us \
--placement=us-central1,us-east1 \
--rpo=ASYNC_TURBO \
--enable-per-object-retention \
--uniform-bucket-level-access
# Each record carries its own locked 6-year clock
gcloud storage objects update gs://kv-records-worm/2026/acct-88102.pdf \
--retain-until=2032-06-08T00:00:00Z \
--retention-mode=Locked
The lesson: bucket-wide retention is a blunt instrument. When durations are heterogeneous, per-object retention is both the compliant and the cost-correct answer.