Terraform Module: AWS Timestream — a reusable serverless time-series store with tiered retention

Quick take: Wrap aws_timestreamwrite_database and aws_timestreamwrite_table in a reusable Terraform module with KMS encryption, memory and magnetic retention tiers, and S3 rejection logging for production IoT and metrics workloads. New here? Jump to the Quickstart below to deploy it in minutes; read on for how it works and when to reach for it.

Quickstart (copy-paste)

Minimal, runnable configuration — drop this in a .tf file and fill in the "..." placeholders (each required input is commented):

provider "aws" {
  region = "us-east-1"
}

module "timestream" {
  source = "git::https://dev.azure.com/teknohut/kloudvin/_git/terraform-modules//terraform-module-aws-timestream?ref=v1.0.0"

  database_name = "..."  # Timestream database (namespace) name; 3-256 chars of `[…
  tables        = {}     # Map of table name to config: `memory_retention_hours`, …
}

Then run terraform init && terraform apply. With tables left as {}, only the database is created; add entries to the map to provision tables. Every other input has a default; see Inputs below to override behaviour.

What this module is

Amazon Timestream for LiveAnalytics is a serverless, purpose-built time-series database. Instead of stuffing timestamped points into a relational table or an over-provisioned cluster, you write records that each carry a time, one or more dimensions (the metadata you filter and group by — device_id, region, host), and measure values. Timestream automatically routes recent data into a fast in-memory store for low-latency queries and ages it into a cheaper magnetic store for long-term analytics, so you pay for the storage tier the data actually lives in rather than a fixed instance running 24/7.

The two resources that matter for provisioning are aws_timestreamwrite_database (the namespace and KMS encryption boundary) and aws_timestreamwrite_table (the actual table that holds points, owns the memory/magnetic retention windows, and optionally writes rejected records to S3). Click-ops here is dangerous in a subtle way: the memory store retention and magnetic store retention are the single biggest cost and query-latency levers in the whole service, and they are easy to set wrong by hand and forget. A 30-day memory window on a high-ingest IoT table can cost an order of magnitude more than a 12-hour window, and you only discover it on the bill.

Wrapping the pair in a module pins those retention windows, the customer-managed KMS key, and the magnetic-store rejection log location as reviewed, version-controlled inputs. Every team that calls the module gets the same encryption, the same naming, and the same magnetic_store_write_properties for capturing late-arriving records — instead of each squad inventing its own retention policy.

When to use it

Skip Timestream (and this module) if your access pattern is random point updates/deletes, you need multi-row transactions, or your data is low-volume and relational — a normal RDS/Aurora table is simpler and cheaper. Timestream is for write-heavy, time-ordered, mostly-immutable data.

Module structure

terraform-module-aws-timestream/
├── versions.tf
├── main.tf
├── variables.tf
└── outputs.tf

versions.tf

terraform {
  required_version = ">= 1.5.0"

  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5.0"
    }
  }
}

main.tf

locals {
  # Common tags merged onto every resource.
  base_tags = merge(
    var.tags,
    {
      ManagedBy = "terraform"
      Module    = "terraform-module-aws-timestream"
    },
  )

  # Build the magnetic_store_write_properties block only when a
  # rejection-report S3 destination is supplied.
  enable_magnetic_rejection_log = var.magnetic_store_rejection_s3_bucket != null
}

resource "aws_timestreamwrite_database" "this" {
  database_name = var.database_name

  # When null, Timestream encrypts with a Timestream-managed KMS key in
  # your account; pass a CMK ARN for customer-managed encryption and
  # key-policy control.
  kms_key_id = var.kms_key_id

  tags = local.base_tags
}

resource "aws_timestreamwrite_table" "this" {
  for_each = var.tables

  database_name = aws_timestreamwrite_database.this.database_name
  table_name    = each.key

  retention_properties {
    # Hot, in-memory store — drives low-latency queries; priced highest.
    memory_store_retention_period_in_hours = each.value.memory_retention_hours
    # Cheap long-term store — data ages here automatically.
    magnetic_store_retention_period_in_days = each.value.magnetic_retention_days
  }

  magnetic_store_write_properties {
    # Allow late-arriving records to be written straight to the magnetic
    # store (records older than the memory window). Essential for backfills
    # and out-of-order IoT data.
    enable_magnetic_store_writes = each.value.enable_magnetic_store_writes

    # Capture records that fail magnetic-store ingestion to S3 for replay.
    dynamic "magnetic_store_rejected_data_location" {
      for_each = local.enable_magnetic_rejection_log ? [1] : []
      content {
        s3_configuration {
          bucket_name       = var.magnetic_store_rejection_s3_bucket
          object_key_prefix = "${var.magnetic_store_rejection_s3_prefix}/${each.key}"
          encryption_option = var.magnetic_store_rejection_kms_key_id != null ? "SSE_KMS" : "SSE_S3"
          kms_key_id        = var.magnetic_store_rejection_kms_key_id
        }
      }
    }
  }

  # Co-locate measures into a single partition by a high-cardinality
  # dimension (e.g. device_id) for faster, cheaper partition pruning.
  dynamic "schema" {
    for_each = each.value.partition_key_dimension != null ? [1] : []
    content {
      composite_partition_key {
        type                  = "DIMENSION"
        name                  = each.value.partition_key_dimension
        enforcement_in_record = each.value.partition_key_enforced ? "REQUIRED" : "OPTIONAL"
      }
    }
  }

  tags = merge(local.base_tags, each.value.tags)
}
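
One gap in the conditional logic above: if a caller passes magnetic_store_rejection_kms_key_id but leaves magnetic_store_rejection_s3_bucket null, the whole magnetic_store_rejected_data_location block is skipped and the key is silently ignored. A minimal sketch of a guard, assuming Terraform 1.5+ check blocks (the version versions.tf already requires). The check name is hypothetical and this block is not part of the published module. A failed check only emits a warning; it does not block the apply.

```hcl
# Hypothetical addition to main.tf (not in the published module).
# Warns when a rejection KMS key is supplied without a rejection bucket.
check "rejection_kms_requires_bucket" {
  assert {
    condition = (
      var.magnetic_store_rejection_kms_key_id == null ||
      var.magnetic_store_rejection_s3_bucket != null
    )
    error_message = "magnetic_store_rejection_kms_key_id is set but magnetic_store_rejection_s3_bucket is null, so the key is ignored."
  }
}
```

A check is used here rather than a variable validation because, under the module's minimum Terraform version, a validation block cannot reference a second variable.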

variables.tf

variable "database_name" {
  description = "Name of the Timestream database (namespace). Letters, numbers, dashes, dots and underscores; 3-256 chars."
  type        = string

  validation {
    condition     = can(regex("^[a-zA-Z0-9_.-]{3,256}$", var.database_name))
    error_message = "database_name must be 3-256 chars: letters, numbers, '_', '.' or '-'."
  }
}

variable "kms_key_id" {
  description = "ARN (or alias ARN) of a customer-managed KMS key for the database. Null uses a Timestream-managed KMS key in your account."
  type        = string
  default     = null

  validation {
    condition     = var.kms_key_id == null || can(regex("^arn:aws[a-zA-Z-]*:kms:", var.kms_key_id))
    error_message = "kms_key_id must be a KMS key/alias ARN (arn:aws:kms:...) or null."
  }
}

variable "tables" {
  description = "Map of table_name => table config. The map key becomes the Timestream table name."
  type = map(object({
    memory_retention_hours       = number
    magnetic_retention_days      = number
    enable_magnetic_store_writes = optional(bool, true)
    partition_key_dimension      = optional(string)
    partition_key_enforced       = optional(bool, false)
    tags                         = optional(map(string), {})
  }))

  validation {
    # Memory store: 1 hour to 8766 hours (~1 year).
    condition = alltrue([
      for t in values(var.tables) :
      t.memory_retention_hours >= 1 && t.memory_retention_hours <= 8766
    ])
    error_message = "memory_retention_hours must be between 1 and 8766 for every table."
  }

  validation {
    # Magnetic store: 1 day to 73000 days (200 years).
    condition = alltrue([
      for t in values(var.tables) :
      t.magnetic_retention_days >= 1 && t.magnetic_retention_days <= 73000
    ])
    error_message = "magnetic_retention_days must be between 1 and 73000 for every table."
  }

  validation {
    # Memory window must not exceed the magnetic window (hours vs days).
    condition = alltrue([
      for t in values(var.tables) :
      t.memory_retention_hours <= t.magnetic_retention_days * 24
    ])
    error_message = "memory_retention_hours cannot exceed magnetic_retention_days * 24 for any table."
  }
}

variable "magnetic_store_rejection_s3_bucket" {
  description = "S3 bucket name to capture records rejected by magnetic-store writes. Null disables rejection logging."
  type        = string
  default     = null
}

variable "magnetic_store_rejection_s3_prefix" {
  description = "Key prefix under the rejection bucket. The table name is appended automatically."
  type        = string
  default     = "timestream-rejected"
}

variable "magnetic_store_rejection_kms_key_id" {
  description = "KMS key ARN for SSE-KMS on the rejection bucket objects. Null falls back to SSE-S3."
  type        = string
  default     = null
}

variable "tags" {
  description = "Tags applied to the database and all tables."
  type        = map(string)
  default     = {}
}
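
Because main.tf builds the object key as "${prefix}/${table}", a prefix with a leading or trailing slash would produce awkward keys such as "logs//device_telemetry". A minimal sketch of a stricter version of the prefix variable follows; the validation block is an assumption added for illustration and is not in the published module.

```hcl
variable "magnetic_store_rejection_s3_prefix" {
  description = "Key prefix under the rejection bucket. The table name is appended automatically."
  type        = string
  default     = "timestream-rejected"

  # Hypothetical validation (not in the published module): the prefix must
  # be non-empty and must not start or end with "/".
  validation {
    condition     = can(regex("^[^/](.*[^/])?$", var.magnetic_store_rejection_s3_prefix))
    error_message = "magnetic_store_rejection_s3_prefix must be non-empty and must not start or end with '/'."
  }
}
```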

outputs.tf

output "database_name" {
  description = "Name of the Timestream database."
  value       = aws_timestreamwrite_database.this.database_name
}

output "database_arn" {
  description = "ARN of the Timestream database (use in IAM policies and Grafana/Athena data sources)."
  value       = aws_timestreamwrite_database.this.arn
}

output "kms_key_id" {
  description = "KMS key ID/ARN protecting the database (the Timestream-managed key if none was supplied)."
  value       = aws_timestreamwrite_database.this.kms_key_id
}

output "table_names" {
  description = "Map of table_name => Timestream table name."
  value       = { for k, t in aws_timestreamwrite_table.this : k => t.table_name }
}

output "table_arns" {
  description = "Map of table_name => table ARN, for scoping write/query IAM permissions per table."
  value       = { for k, t in aws_timestreamwrite_table.this : k => t.arn }
}

How to use it

module "timestream" {
  source = "git::https://dev.azure.com/teknohut/kloudvin/_git/terraform-modules//terraform-module-aws-timestream?ref=v1.0.0"

  database_name = "kv-iot-prod"
  kms_key_id    = aws_kms_key.timestream.arn

  tables = {
    device_telemetry = {
      # 24h hot for live dashboards, 18 months cheap for trend analysis.
      memory_retention_hours  = 24
      magnetic_retention_days = 545
      partition_key_dimension = "device_id"
      partition_key_enforced  = true
    }
    fleet_health = {
      memory_retention_hours  = 12
      magnetic_retention_days = 90
    }
  }

  magnetic_store_rejection_s3_bucket  = aws_s3_bucket.ts_rejected.id
  magnetic_store_rejection_kms_key_id = aws_kms_key.timestream.arn

  tags = {
    environment = "prod"
    team        = "iot-platform"
    cost_center = "8841"
  }
}

# Downstream: grant the ingestion Lambda write access to one table only,
# using the per-table ARN output from the module.
resource "aws_iam_role_policy" "ingest_write" {
  name = "timestream-write-telemetry"
  role = aws_iam_role.ingest_lambda.id

  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [
      {
        Effect = "Allow"
        Action = [
          "timestream:WriteRecords",
          "timestream:DescribeTable",
        ]
        Resource = module.timestream.table_arns["device_telemetry"]
      },
      {
        # WriteRecords also needs the endpoint-discovery action, which
        # is account-wide and cannot be resource-scoped.
        Effect   = "Allow"
        Action   = ["timestream:DescribeEndpoints"]
        Resource = "*"
      },
    ]
  })
}
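
The example above references aws_kms_key.timestream, aws_s3_bucket.ts_rejected and aws_iam_role.ingest_lambda without defining them. A minimal sketch of those prerequisites follows, so the configuration can be planned end to end. Every name and setting here is an illustrative assumption, not part of the module.

```hcl
# Hypothetical supporting resources for the usage example.
resource "aws_kms_key" "timestream" {
  description         = "CMK for the Timestream database and rejected-record objects"
  enable_key_rotation = true
}

resource "aws_s3_bucket" "ts_rejected" {
  # Assumed name; S3 bucket names must be globally unique.
  bucket = "kv-iot-prod-ts-rejected"
}

resource "aws_s3_bucket_public_access_block" "ts_rejected" {
  bucket                  = aws_s3_bucket.ts_rejected.id
  block_public_acls       = true
  block_public_policy     = true
  ignore_public_acls      = true
  restrict_public_buckets = true
}

# Trust policy that lets Lambda assume the ingestion role.
data "aws_iam_policy_document" "lambda_assume" {
  statement {
    actions = ["sts:AssumeRole"]

    principals {
      type        = "Service"
      identifiers = ["lambda.amazonaws.com"]
    }
  }
}

resource "aws_iam_role" "ingest_lambda" {
  name               = "timestream-ingest-lambda"
  assume_role_policy = data.aws_iam_policy_document.lambda_assume.json
}
```

Whether the key policy or bucket policy needs extra statements for Timestream and your writers depends on the account setup, so treat this as a starting point rather than a hardened baseline.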

With Terragrunt

Terragrunt keeps this module DRY across environments — define the backend and provider once in a root config, then a thin terragrunt.hcl per environment supplies only the inputs that differ.

1. Root config: live/terragrunt.hcl (inherited by every module):

remote_state {
  backend = "s3"
  generate = { path = "backend.tf", if_exists = "overwrite" }
  config = {
    # ...S3 state bucket + key per path...
  }
}

2. Module config: live/prod/timestream/terragrunt.hcl:

include "root" {
  path = find_in_parent_folders()
}

terraform {
  source = "git::https://dev.azure.com/teknohut/kloudvin/_git/terraform-modules//terraform-module-aws-timestream?ref=v1.0.0"
}

inputs = {
  database_name = "..."
  tables = {}
}

3. Deploy one environment, or roll out all modules together:

(cd live/prod/timestream && terragrunt apply)   # this module only
(cd live/prod && terragrunt run-all apply)      # every module under live/prod
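
The root config in step 1 shows only the remote state. To keep the provider in one place as well, Terragrunt's generate block can write a provider.tf into each module directory. A minimal sketch, assuming the same region as the Quickstart; the block label and file name are illustrative choices.

```hcl
# live/terragrunt.hcl (continued): hypothetical provider generation.
generate "provider" {
  path      = "provider.tf"
  if_exists = "overwrite_terragrunt"
  contents  = <<EOF
provider "aws" {
  region = "us-east-1" # assumption: same region as the Quickstart
}
EOF
}
```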

Why Terragrunt here: the backend and provider live in one place instead of being copy-pasted into every module; inputs is overridden per environment (dev / stage / prod) without forking the module; and run-all orchestrates dependencies across modules. Reach for it once you have more than one environment or more than a handful of modules — for a single stack, the plain Quickstart above is enough.
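
To make the per-environment override concrete, here is a sketch of what a dev counterpart to the prod config might look like. The path, database name and retention values are hypothetical and chosen only to show shorter, cheaper windows in a non-production environment.

```hcl
# live/dev/timestream/terragrunt.hcl (hypothetical path and values)
include "root" {
  path = find_in_parent_folders()
}

terraform {
  source = "git::https://dev.azure.com/teknohut/kloudvin/_git/terraform-modules//terraform-module-aws-timestream?ref=v1.0.0"
}

inputs = {
  database_name = "kv-iot-dev"

  tables = {
    device_telemetry = {
      # Shorter windows than prod: dev data is disposable.
      memory_retention_hours  = 6
      magnetic_retention_days = 30
      partition_key_dimension = "device_id"
      partition_key_enforced  = true
    }
  }

  tags = {
    environment = "dev"
  }
}
```

Only the inputs differ from prod; the module source and version pin stay identical, which is the point of the pattern.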

Inputs

| Name | Type | Default | Required | Description |
|------|------|---------|----------|-------------|
| database_name | string | - | Yes | Timestream database (namespace) name; 3-256 chars of [a-zA-Z0-9_.-]. |
| kms_key_id | string | null | No | Customer-managed KMS key ARN/alias for the database; null uses a Timestream-managed key in your account. |
| tables | map(object) | - | Yes | Map of table name to config: memory_retention_hours, magnetic_retention_days, enable_magnetic_store_writes, partition_key_dimension, partition_key_enforced, tags. |
| magnetic_store_rejection_s3_bucket | string | null | No | S3 bucket for records rejected during magnetic-store writes; null disables rejection logging. |
| magnetic_store_rejection_s3_prefix | string | "timestream-rejected" | No | Key prefix under the rejection bucket; the table name is appended automatically. |
| magnetic_store_rejection_kms_key_id | string | null | No | KMS key ARN for SSE-KMS on rejection objects; null uses SSE-S3. |
| tags | map(string) | {} | No | Tags applied to the database and every table. |

Outputs

| Name | Description |
|------|-------------|
| database_name | Name of the Timestream database. |
| database_arn | ARN of the Timestream database, for IAM policies and Grafana/Athena data sources. |
| kms_key_id | KMS key ID/ARN protecting the database. |
| table_names | Map of table key to Timestream table name. |
| table_arns | Map of table key to table ARN, for per-table IAM scoping. |
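
As an illustration of per-table scoping with these outputs on the read side, here is a sketch of a query-only policy for a dashboard role. The role aws_iam_role.dashboard and the policy name are assumptions; the table key matches the device_telemetry table from the usage example.

```hcl
# Hypothetical read-only policy; aws_iam_role.dashboard is assumed to exist.
resource "aws_iam_role_policy" "dashboard_query" {
  name = "timestream-query-telemetry"
  role = aws_iam_role.dashboard.id

  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [
      {
        # Table-scoped read actions, limited to one table via the module output.
        Effect = "Allow"
        Action = [
          "timestream:Select",
          "timestream:DescribeTable",
          "timestream:ListMeasures",
        ]
        Resource = module.timestream.table_arns["device_telemetry"]
      },
      {
        # These actions do not support resource-level scoping.
        Effect = "Allow"
        Action = [
          "timestream:DescribeEndpoints",
          "timestream:SelectValues",
          "timestream:CancelQuery",
        ]
        Resource = "*"
      },
    ]
  })
}
```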

Enterprise scenario

A connected-vehicle platform ingests ~80,000 telemetry points per second from a global EV fleet via IoT Core rules that call WriteRecords into the device_telemetry table. The 24-hour memory window backs the live fleet dashboard and the over-temperature alerting pipeline, while the 18-month magnetic store feeds the data-science team’s battery-degradation models through Athena federated queries. Because enable_magnetic_store_writes is on and rejected records land in a KMS-encrypted S3 bucket, the platform team can replay any late-arriving batches from vehicles that were offline in a tunnel without silently losing data.
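
The ingestion path in this scenario could be wired roughly as follows. This is a sketch only: the topic layout, measure names, rule name and aws_iam_role.iot_rule are all assumptions, and the role is assumed to carry a write policy like the one shown under How to use it.

```hcl
# Hypothetical IoT Core rule that writes telemetry into the module's table.
resource "aws_iot_topic_rule" "telemetry_to_timestream" {
  name        = "telemetry_to_timestream"
  enabled     = true
  sql         = "SELECT temperature, battery_soc FROM 'vehicles/+/telemetry'"
  sql_version = "2016-03-23"

  timestream {
    database_name = module.timestream.database_name
    table_name    = module.timestream.table_names["device_telemetry"]
    role_arn      = aws_iam_role.iot_rule.arn # assumed role, not defined here

    # Use the second topic segment as the device_id dimension.
    # "$${...}" escapes Terraform interpolation so IoT Core receives
    # the literal substitution template ${topic(2)}.
    dimension {
      name  = "device_id"
      value = "$${topic(2)}"
    }
  }
}
```

Deriving device_id from the topic keeps the dimension consistent with the enforced partition key on device_telemetry, so records without it are not silently accepted.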

Best practices
