Terraform Module: AWS Kinesis Data Stream — on-demand or provisioned shards with KMS encryption baked in

Quick take — A reusable Terraform module for aws_kinesis_stream supporting both ON_DEMAND and PROVISIONED capacity, KMS-at-rest encryption, tunable retention, and enhanced monitoring for production-grade streaming. New here? Jump to the Quickstart below to deploy it in minutes; read on for how it works and when to reach for it.

Quickstart (copy-paste)

Minimal, runnable configuration — drop this in a .tf file and fill in the "..." placeholders (each required input is commented):

provider "aws" {
  region = "us-east-1"
}

module "kinesis" {
  source = "git::https://dev.azure.com/teknohut/kloudvin/_git/terraform-modules//terraform-module-aws-kinesis?ref=v1.0.0"

  name = "..."  # Name of the Kinesis data stream; unique per account per…
}

Then terraform init && terraform apply. Every other input has a sensible default — see Inputs below to override behaviour.

What this module is

Amazon Kinesis Data Streams is a serverless, durable log for real-time data: producers PutRecord/PutRecords into ordered partitions called shards, and consumers (Lambda, Kinesis Client Library applications, Firehose, Flink) read those records within a configurable retention window. It is the backbone for clickstream ingestion, change-data-capture fan-out, IoT telemetry, and event pipelines where you need replayable, ordered-by-partition-key data at scale.

The raw aws_kinesis_stream resource has a handful of foot-guns that teams tend to hit independently: encryption is left off, retention stays at the 24-hour default without anyone deciding that, shard_count is hard-coded even where on-demand would be cheaper, and shard-level metrics are skipped so nobody can tell when a stream is throttling. Wrapping the resource in a module lets you set a secure, observable baseline once: server-side KMS encryption on by default (the AWS-managed alias/aws/kinesis key unless you pass a customer-managed key), a validated retention period, a default set of enhanced shard metrics, and a stream_mode_details block that toggles cleanly between ON_DEMAND and PROVISIONED. Consuming teams then supply only a name, a capacity mode, and tags.

When to use it

You are standing up multiple Kinesis streams across services or environments and want one encrypted, monitored baseline instead of copy-pasted resource blocks.
You want to start a workload in ON_DEMAND mode (no capacity planning, pay-per-throughput) and later flip specific high-volume streams to PROVISIONED shards for cost control — without rewriting the module call.
You need server-side KMS encryption enabled by default for compliance (PCI, HIPAA, internal data-classification policy), with a one-line switch from the AWS-managed key to a customer-managed KMS key.
You want shard-level enhanced metrics (IncomingBytes, IteratorAgeMilliseconds, WriteProvisionedThroughputExceeded) emitted to CloudWatch so alerting and autoscaling have something to act on.
Skip it for a one-off throwaway stream in a sandbox — but the moment it touches a real pipeline, the module pays for itself.
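
As a rough sketch of the mode flip mentioned above: moving a stream from ON_DEMAND to PROVISIONED should only require changing stream_mode and supplying shard_count in the existing module call. The stream name and the shard count of 4 are hypothetical values chosen for illustration, and AWS limits how often a stream may switch modes, so check the current service quotas before planning frequent flips.

```hcl
module "kinesis_orders" {
  source = "git::https://dev.azure.com/teknohut/kloudvin/_git/terraform-modules//terraform-module-aws-kinesis?ref=v1.0.0"

  # Hypothetical stream name, for illustration only.
  name = "prod-orders-events"

  # Previously "ON_DEMAND"; every other input stays as it was.
  stream_mode = "PROVISIONED"

  # Hypothetical value; the module only passes shard_count to AWS in PROVISIONED mode.
  shard_count = 4

  tags = {
    Environment = "prod"
  }
}
```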

Module structure

terraform-module-aws-kinesis/
├── versions.tf
├── main.tf
├── variables.tf
└── outputs.tf

# versions.tf
terraform {
  required_version = ">= 1.5.0"

  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5.0"
    }
  }
}

# main.tf

locals {
  # In ON_DEMAND mode AWS manages capacity, so shard_count must be null.
  # In PROVISIONED mode it must be a positive integer.
  effective_shard_count = var.stream_mode == "PROVISIONED" ? var.shard_count : null

  base_tags = {
    "ManagedBy"  = "terraform"
    "Module"     = "terraform-module-aws-kinesis"
    "StreamMode" = var.stream_mode
  }
}

resource "aws_kinesis_stream" "this" {
  name             = var.name
  shard_count      = local.effective_shard_count
  retention_period = var.retention_period

  # Shard-level CloudWatch metrics. Empty list = no enhanced metrics (cheapest).
  shard_level_metrics = var.shard_level_metrics

  # When true, registered (enhanced fan-out) consumers are deregistered before the stream is deleted.
  enforce_consumer_deletion = var.enforce_consumer_deletion

  encryption_type = var.encryption_type
  # KMS key is only valid (and required) when encryption_type = "KMS".
  kms_key_id = var.encryption_type == "KMS" ? var.kms_key_id : null

  stream_mode_details {
    stream_mode = var.stream_mode
  }

  tags = merge(local.base_tags, var.tags)
}

# variables.tf

variable "name" {
  description = "Name of the Kinesis data stream. Must be unique per account per region."
  type        = string

  validation {
    condition     = can(regex("^[a-zA-Z0-9_.-]{1,128}$", var.name))
    error_message = "name must be 1-128 chars and may only contain letters, numbers, underscores, hyphens and periods."
  }
}

variable "stream_mode" {
  description = "Capacity mode for the stream: ON_DEMAND (auto-scaling, pay-per-use) or PROVISIONED (fixed shards)."
  type        = string
  default     = "ON_DEMAND"

  validation {
    condition     = contains(["ON_DEMAND", "PROVISIONED"], var.stream_mode)
    error_message = "stream_mode must be either ON_DEMAND or PROVISIONED."
  }
}

variable "shard_count" {
  description = "Number of shards (PROVISIONED mode only). Ignored when stream_mode is ON_DEMAND. Each shard supports 1 MB/s or 1000 records/s ingest."
  type        = number
  default     = 1

  validation {
    condition     = var.shard_count >= 1 && var.shard_count <= 10000
    error_message = "shard_count must be between 1 and 10000."
  }
}

variable "retention_period" {
  description = "Data retention in hours. Valid range is 24 to 8760 (365 days). Anything above 24 incurs extended-retention charges."
  type        = number
  default     = 24

  validation {
    condition     = var.retention_period >= 24 && var.retention_period <= 8760
    error_message = "retention_period must be between 24 and 8760 hours."
  }
}

variable "encryption_type" {
  description = "Server-side encryption type: KMS (customer/AWS-managed key) or NONE."
  type        = string
  default     = "KMS"

  validation {
    condition     = contains(["KMS", "NONE"], var.encryption_type)
    error_message = "encryption_type must be either KMS or NONE."
  }
}

variable "kms_key_id" {
  description = "KMS key GUID, ARN, alias name (alias/...) or alias ARN used when encryption_type is KMS. Defaults to the AWS-managed alias/aws/kinesis key."
  type        = string
  default     = "alias/aws/kinesis"
}

variable "shard_level_metrics" {
  description = "List of shard-level CloudWatch metrics to enable. Common: IncomingBytes, IncomingRecords, OutgoingBytes, IteratorAgeMilliseconds, ReadProvisionedThroughputExceeded, WriteProvisionedThroughputExceeded. Empty list disables enhanced metrics."
  type        = list(string)
  default     = ["IncomingBytes", "OutgoingBytes", "IteratorAgeMilliseconds"]

  validation {
    condition = alltrue([
      for m in var.shard_level_metrics : contains([
        "IncomingBytes",
        "IncomingRecords",
        "OutgoingBytes",
        "OutgoingRecords",
        "WriteProvisionedThroughputExceeded",
        "ReadProvisionedThroughputExceeded",
        "IteratorAgeMilliseconds",
      ], m)
    ])
    error_message = "shard_level_metrics contains an unsupported metric name."
  }
}

variable "enforce_consumer_deletion" {
  description = "If true, registered enhanced fan-out consumers are deleted automatically when the stream is destroyed (prevents destroy errors)."
  type        = bool
  default     = true
}

variable "tags" {
  description = "Additional tags to apply to the stream, merged over the module's base tags."
  type        = map(string)
  default     = {}
}

# outputs.tf

output "id" {
  description = "The stream ID as tracked by Terraform (the AWS provider sets this to the stream ARN)."
  value       = aws_kinesis_stream.this.id
}

output "name" {
  description = "The name of the Kinesis data stream."
  value       = aws_kinesis_stream.this.name
}

output "arn" {
  description = "The ARN of the Kinesis data stream, for IAM policies and event source mappings."
  value       = aws_kinesis_stream.this.arn
}

output "stream_mode" {
  description = "The effective capacity mode of the stream (ON_DEMAND or PROVISIONED)."
  value       = var.stream_mode
}

output "shard_count" {
  description = "The configured shard count (null in ON_DEMAND mode, where AWS manages capacity)."
  value       = aws_kinesis_stream.this.shard_count
}

output "encryption_type" {
  description = "The server-side encryption type applied to the stream."
  value       = aws_kinesis_stream.this.encryption_type
}
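
The shard_count description above quotes the per-shard ingest limits (1 MB/s or 1000 records/s). A minimal sketch of how a caller might turn expected peak traffic into a shard_count follows; the two peak figures are hypothetical, and the calculation covers write capacity only, not read throughput or headroom.

```hcl
locals {
  # Per-shard ingest limits, as stated in the shard_count variable description.
  shard_ingest_mb_per_sec      = 1
  shard_ingest_records_per_sec = 1000

  # Hypothetical peak traffic for the workload being sized.
  peak_ingest_mb_per_sec      = 3.5
  peak_ingest_records_per_sec = 6200

  # A stream needs enough shards to satisfy whichever limit is hit first.
  required_shards = max(
    ceil(local.peak_ingest_mb_per_sec / local.shard_ingest_mb_per_sec),
    ceil(local.peak_ingest_records_per_sec / local.shard_ingest_records_per_sec),
  )
}

output "required_shards" {
  description = "Minimum shard count that covers the assumed peak ingest."
  value       = local.required_shards
}
```

With these numbers the byte rate needs 4 shards and the record rate needs 7, so required_shards evaluates to 7. Passing local.required_shards as shard_count keeps the sizing assumption visible in code rather than in someone's head.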

How to use it

# A customer-managed KMS key so encryption isn't tied to the shared AWS-managed key.
resource "aws_kms_key" "kinesis" {
  description             = "CMK for clickstream Kinesis data stream"
  deletion_window_in_days = 14
  enable_key_rotation     = true
}

module "kinesis_data_stream" {
  source = "git::https://dev.azure.com/teknohut/kloudvin/_git/terraform-modules//terraform-module-aws-kinesis?ref=v1.0.0"

  name             = "prod-clickstream-events"
  stream_mode      = "ON_DEMAND"
  retention_period = 168 # 7 days of replay for reprocessing pipelines

  encryption_type = "KMS"
  kms_key_id      = aws_kms_key.kinesis.arn

  shard_level_metrics = [
    "IncomingBytes",
    "OutgoingBytes",
    "IteratorAgeMilliseconds",
  ]

  tags = {
    Environment = "prod"
    Team        = "data-platform"
    CostCenter  = "analytics"
  }
}

# Downstream: wire the stream into a Lambda consumer using the module's ARN output.
# aws_lambda_function.processor is assumed to be defined elsewhere in your configuration.
resource "aws_lambda_event_source_mapping" "clickstream_processor" {
  event_source_arn  = module.kinesis_data_stream.arn
  function_name     = aws_lambda_function.processor.arn
  starting_position = "LATEST"
  batch_size        = 200

  # Tolerate poison-pill records without blocking the shard.
  maximum_retry_attempts         = 3
  bisect_batch_on_function_error = true
  maximum_record_age_in_seconds  = 3600
}
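
Because the stream above is encrypted with a customer-managed key, a consumer needs permission on both the stream and the key. The following is a sketch of a read policy for the Lambda consumer, not a complete execution role: aws_iam_role.processor is a hypothetical role assumed to exist elsewhere, and the action list is a reasonable minimum for polling a single stream that you should check against your consumer's actual needs.

```hcl
data "aws_iam_policy_document" "clickstream_consumer" {
  # Read access scoped to this one stream via the module's ARN output.
  statement {
    sid = "ReadStream"
    actions = [
      "kinesis:DescribeStream",
      "kinesis:DescribeStreamSummary",
      "kinesis:GetRecords",
      "kinesis:GetShardIterator",
      "kinesis:ListShards",
    ]
    resources = [module.kinesis_data_stream.arn]
  }

  # Records are encrypted at rest, so the consumer must be able to decrypt them.
  statement {
    sid       = "DecryptRecords"
    actions   = ["kms:Decrypt"]
    resources = [aws_kms_key.kinesis.arn]
  }
}

resource "aws_iam_role_policy" "clickstream_consumer" {
  name = "clickstream-consumer-read"

  # Hypothetical execution role for the processor function.
  role   = aws_iam_role.processor.id
  policy = data.aws_iam_policy_document.clickstream_consumer.json
}
```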

With Terragrunt

Terragrunt keeps this module DRY across environments — define the backend and provider once in a root config, then a thin terragrunt.hcl per environment supplies only the inputs that differ.

1. Root config — live/terragrunt.hcl (inherited by every module):

remote_state {
  backend = "s3"
  generate = { path = "backend.tf", if_exists = "overwrite" }
  config = {
    # ...s3 state bucket/container + key per path...
  }
}

2. Module config — live/prod/kinesis/terragrunt.hcl:

include "root" {
  path = find_in_parent_folders()
}

terraform {
  source = "git::https://dev.azure.com/teknohut/kloudvin/_git/terraform-modules//terraform-module-aws-kinesis?ref=v1.0.0"
}

inputs = {
  name = "..."
}

3. Deploy one environment, or roll out all modules together:

(cd live/prod/kinesis && terragrunt apply)    # this module only
(cd live/prod && terragrunt run-all apply)    # every module under live/prod

Why Terragrunt here: the backend and provider live in one place instead of being copy-pasted into every module; inputs is overridden per environment (dev / stage / prod) without forking the module; and run-all orchestrates dependencies across modules. Reach for it once you have more than one environment or more than a handful of modules — for a single stack, the plain Quickstart above is enough.
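
The root config shown earlier only covers remote state. One way to also define the provider once, sketched here under the assumption that every module under live/ targets the same region as the Quickstart, is a Terragrunt generate block in the same root terragrunt.hcl:

```hcl
# live/terragrunt.hcl (addition): writes provider.tf into each module's working directory.
generate "provider" {
  path      = "provider.tf"
  if_exists = "overwrite_terragrunt"

  contents = <<EOF
provider "aws" {
  region = "us-east-1"
}
EOF
}
```

If environments live in different regions, the region would need to come from a per-environment value rather than a literal; that wiring is left out of this sketch.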

Inputs

Name	Type	Default	Required	Description
`name`	`string`	n/a	yes	Name of the Kinesis data stream; unique per account per region (1-128 chars).
`stream_mode`	`string`	`"ON_DEMAND"`	no	Capacity mode: `ON_DEMAND` (auto-scaling) or `PROVISIONED` (fixed shards).
`shard_count`	`number`	`1`	no	Number of shards in PROVISIONED mode (1-10000); ignored in ON_DEMAND.
`retention_period`	`number`	`24`	no	Data retention in hours (24-8760); above 24 incurs extended-retention cost.
`encryption_type`	`string`	`"KMS"`	no	Server-side encryption: `KMS` or `NONE`.
`kms_key_id`	`string`	`"alias/aws/kinesis"`	no	KMS key GUID/ARN/alias used when `encryption_type = "KMS"`.
`shard_level_metrics`	`list(string)`	`["IncomingBytes","OutgoingBytes","IteratorAgeMilliseconds"]`	no	Enhanced shard-level CloudWatch metrics to enable; empty list disables them.
`enforce_consumer_deletion`	`bool`	`true`	no	Auto-delete enhanced fan-out consumers on stream destroy.
`tags`	`map(string)`	`{}`	no	Additional tags merged over the module’s base tags.

Outputs

Name	Description
`id`	The stream ID as tracked by Terraform (the AWS provider sets this to the stream ARN).
`name`	The name of the Kinesis data stream.
`arn`	The stream ARN, for IAM policies and event source mappings.
`stream_mode`	The effective capacity mode (`ON_DEMAND` or `PROVISIONED`).
`shard_count`	Configured shard count (`null` in ON_DEMAND mode).
`encryption_type`	The server-side encryption type applied to the stream.

Enterprise scenario

A retail analytics platform ingests checkout and browse events from 40+ regional storefronts into a single prod-clickstream-events stream running in ON_DEMAND mode, so Black Friday traffic spikes scale automatically without a capacity-planning fire drill. Records are encrypted at rest with a team-owned KMS CMK to satisfy the company’s PCI data-classification policy, retention is set to 7 days so the data-science team can replay a full week to backfill a new feature, and the IteratorAgeMilliseconds shard metric feeds a CloudWatch alarm that pages on-call when a downstream Flink consumer falls behind. Reusing the module meant the storefront, fraud, and recommendations teams each got the same encrypted, monitored baseline by supplying nothing more than a stream name and tags.
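
The paging alarm in this scenario could be sketched roughly as follows. This is an illustration under assumptions, not the platform's actual configuration: the SNS topic, the 5-minute threshold, and the evaluation window are hypothetical, and the alarm uses the stream-level GetRecords.IteratorAgeMilliseconds metric so that a single alarm covers every shard. Alarming on the shard-level IteratorAgeMilliseconds metric would additionally need a ShardId dimension per shard.

```hcl
# Hypothetical notification target for the on-call rotation.
resource "aws_sns_topic" "oncall" {
  name = "data-platform-oncall"
}

resource "aws_cloudwatch_metric_alarm" "iterator_age" {
  alarm_name          = "prod-clickstream-events-iterator-age"
  alarm_description   = "Consumers are more than 5 minutes behind the tip of the stream."
  namespace           = "AWS/Kinesis"
  metric_name         = "GetRecords.IteratorAgeMilliseconds"
  statistic           = "Maximum"
  period              = 60
  evaluation_periods  = 5
  comparison_operator = "GreaterThanThreshold"
  threshold           = 300000
  treat_missing_data  = "notBreaching"
  alarm_actions       = [aws_sns_topic.oncall.arn]

  # Uses the module's name output from the "How to use it" example.
  dimensions = {
    StreamName = module.kinesis_data_stream.name
  }
}
```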

Best practices

Encrypt with a customer-managed key, not the default. The module defaults to the AWS-managed alias/aws/kinesis key, whose key policy you cannot edit. For regulated data, pass your own CMK ARN with key rotation enabled so that you control the key policy, grants, and audit trail.
Match capacity mode to traffic shape. Use ON_DEMAND for spiky or unpredictable workloads to avoid throttling and over-provisioning; switch a stream to PROVISIONED once throughput is steady and high enough that fixed shards are materially cheaper than per-GB on-demand pricing.
Right-size retention deliberately. Every hour beyond the 24-hour default is billed; only extend retention_period to the actual replay window your consumers need (e.g. 168h for a week of reprocessing) rather than reflexively maxing it out.
Always enable IteratorAgeMilliseconds and alarm on it. A rising iterator age is the clearest single signal that consumers are falling behind producers. Pair it with WriteProvisionedThroughputExceeded to catch write throttling early; the module's default metric list does not include it, so add it to shard_level_metrics explicitly.
Name streams with environment and domain prefixes. A convention like prod-clickstream-events keeps IAM resource ARNs least-privilege-scoped (arn:aws:kinesis:*:*:stream/prod-*) and makes cost allocation by tag and stream name unambiguous.
Keep enforce_consumer_deletion = true (the module default) for streams with enhanced fan-out. Otherwise a terraform destroy fails when registered consumers still exist, leaving orphaned resources and a half-torn-down environment.
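
To make the naming-convention point concrete, here is a minimal sketch of a producer policy scoped by the prod- prefix. The data source name and statement ID are hypothetical, and the policy assumes producers only need to write records. Producers writing to a KMS-encrypted stream also need kms:GenerateDataKey on the key, which is omitted here to keep the focus on ARN scoping.

```hcl
data "aws_iam_policy_document" "prod_stream_producer" {
  statement {
    sid     = "WriteProdStreams"
    actions = ["kinesis:PutRecord", "kinesis:PutRecords"]

    # Matches any stream whose name starts with "prod-", in any region and account.
    # Tighten the region and account fields where you can.
    resources = ["arn:aws:kinesis:*:*:stream/prod-*"]
  }
}

output "prod_stream_producer_policy_json" {
  description = "Rendered policy JSON, ready to attach to a producer role."
  value       = data.aws_iam_policy_document.prod_stream_producer.json
}
```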

Written by Vinod
