Quick take — A reusable Terraform module for aws_kinesis_stream supporting both ON_DEMAND and PROVISIONED capacity, KMS-at-rest encryption, tunable retention, and enhanced monitoring for production-grade streaming. New here? Jump to the Quickstart below to deploy it in minutes; read on for how it works and when to reach for it.
Quickstart (copy-paste)
Minimal, runnable configuration — drop this in a .tf file and fill in the "..." placeholders (each required input is commented):
provider "aws" {
region = "us-east-1"
}
module "kinesis" {
source = "git::https://dev.azure.com/teknohut/kloudvin/_git/terraform-modules//terraform-module-aws-kinesis?ref=v1.0.0"
name = "..." # Name of the Kinesis data stream; unique per account per…
}
Then terraform init && terraform apply. Every other input has a sensible default — see Inputs below to override behaviour.
What this module is
Amazon Kinesis Data Streams is a serverless, durable log for real-time data: producers PutRecord/PutRecords into ordered partitions called shards, and consumers (Lambda, Kinesis Client Library applications, Firehose, Flink) read those records within a configurable retention window. It is the backbone for clickstream ingestion, change-data-capture fan-out, IoT telemetry, and event pipelines where you need replayable, ordered-by-partition-key data at scale.
The raw aws_kinesis_stream resource has a handful of foot-guns that are easy to get wrong per-team: people forget to enable encryption, leave retention at the 24-hour default, hard-code shard_count even when on-demand would be cheaper, or skip shard-level metrics so they can’t tell a stream is throttling. Wrapping it in a module lets you bake in a secure, observable default once: KMS server-side encryption on by default (the AWS-managed alias/aws/kinesis key unless you pass a customer-managed one), range-validated retention, a default set of enhanced shard metrics, and a stream_mode_details block that toggles cleanly between ON_DEMAND and PROVISIONED. Consuming teams then supply only a name, a capacity mode, and tags.
When to use it
- You are standing up multiple Kinesis streams across services or environments and want one encrypted, monitored baseline instead of copy-pasted resource blocks.
- You want to start a workload in ON_DEMAND mode (no capacity planning, pay-per-throughput) and later flip specific high-volume streams to PROVISIONED shards for cost control, without rewriting the module call.
- You need KMS server-side encryption on by default, with a customer-managed key one input away, for compliance (PCI, HIPAA, internal data-classification policy).
- You want shard-level enhanced metrics (IncomingBytes, IteratorAgeMilliseconds, WriteProvisionedThroughputExceeded) emitted to CloudWatch so alerting and autoscaling have something to act on.
- Skip it for a one-off throwaway stream in a sandbox. The moment it touches a real pipeline, though, the module pays for itself.
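As a rough illustration of the capacity-mode switch described in the list above, the call below pins a stream to fixed shards. The module label, stream name, and shard count are hypothetical placeholders, not values from the module's own docs; only stream_mode and shard_count differ from an ON_DEMAND call.

```hcl
# Sketch only: "orders_stream" and "prod-orders-events" are hypothetical names.
module "orders_stream" {
  source = "git::https://dev.azure.com/teknohut/kloudvin/_git/terraform-modules//terraform-module-aws-kinesis?ref=v1.0.0"

  name        = "prod-orders-events"
  stream_mode = "PROVISIONED"

  # Illustrative sizing: at 1 MB/s (or 1000 records/s) ingest per shard,
  # 4 shards give roughly 4 MB/s of write capacity.
  shard_count = 4

  # Throttling on fixed shards shows up in this metric, so enable it here.
  shard_level_metrics = [
    "IncomingBytes",
    "WriteProvisionedThroughputExceeded",
    "IteratorAgeMilliseconds",
  ]
}
```

Flipping back is a matter of setting stream_mode to "ON_DEMAND"; the module then passes a null shard_count to the resource.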
Module structure
terraform-module-aws-kinesis/
├── versions.tf
├── main.tf
├── variables.tf
└── outputs.tf
# versions.tf
terraform {
required_version = ">= 1.5.0"
required_providers {
aws = {
source = "hashicorp/aws"
version = "~> 5.0"
}
}
}
# main.tf
locals {
# In ON_DEMAND mode AWS manages capacity, so shard_count must be null.
# In PROVISIONED mode it must be a positive integer.
effective_shard_count = var.stream_mode == "PROVISIONED" ? var.shard_count : null
base_tags = {
"ManagedBy" = "terraform"
"Module" = "terraform-module-aws-kinesis"
"StreamMode" = var.stream_mode
}
}
resource "aws_kinesis_stream" "this" {
name = var.name
shard_count = local.effective_shard_count
retention_period = var.retention_period
# Shard-level CloudWatch metrics. Empty list = no enhanced metrics (cheapest).
shard_level_metrics = var.shard_level_metrics
# Deregister any registered consumers when the stream is deleted, so destroy does not fail.
enforce_consumer_deletion = var.enforce_consumer_deletion
encryption_type = var.encryption_type
# KMS key is only valid (and required) when encryption_type = "KMS".
kms_key_id = var.encryption_type == "KMS" ? var.kms_key_id : null
stream_mode_details {
stream_mode = var.stream_mode
}
tags = merge(local.base_tags, var.tags)
}
# variables.tf
variable "name" {
description = "Name of the Kinesis data stream. Must be unique per account per region."
type = string
validation {
condition = can(regex("^[a-zA-Z0-9_.-]{1,128}$", var.name))
error_message = "name must be 1-128 chars and may only contain letters, numbers, underscores, hyphens and periods."
}
}
variable "stream_mode" {
description = "Capacity mode for the stream: ON_DEMAND (auto-scaling, pay-per-use) or PROVISIONED (fixed shards)."
type = string
default = "ON_DEMAND"
validation {
condition = contains(["ON_DEMAND", "PROVISIONED"], var.stream_mode)
error_message = "stream_mode must be either ON_DEMAND or PROVISIONED."
}
}
variable "shard_count" {
description = "Number of shards (PROVISIONED mode only). Ignored when stream_mode is ON_DEMAND. Each shard supports 1 MB/s or 1000 records/s ingest."
type = number
default = 1
validation {
condition = var.shard_count >= 1 && var.shard_count <= 10000
error_message = "shard_count must be between 1 and 10000."
}
}
variable "retention_period" {
description = "Data retention in hours. Valid range is 24 to 8760 (365 days). Anything above 24 incurs extended-retention charges."
type = number
default = 24
validation {
condition = var.retention_period >= 24 && var.retention_period <= 8760
error_message = "retention_period must be between 24 and 8760 hours."
}
}
variable "encryption_type" {
description = "Server-side encryption type: KMS (customer/AWS-managed key) or NONE."
type = string
default = "KMS"
validation {
condition = contains(["KMS", "NONE"], var.encryption_type)
error_message = "encryption_type must be either KMS or NONE."
}
}
variable "kms_key_id" {
description = "KMS key GUID, ARN, alias name (alias/...) or alias ARN used when encryption_type is KMS. Defaults to the AWS-managed alias/aws/kinesis key."
type = string
default = "alias/aws/kinesis"
}
variable "shard_level_metrics" {
description = "List of shard-level CloudWatch metrics to enable. Common: IncomingBytes, IncomingRecords, OutgoingBytes, IteratorAgeMilliseconds, ReadProvisionedThroughputExceeded, WriteProvisionedThroughputExceeded. Empty list disables enhanced metrics."
type = list(string)
default = ["IncomingBytes", "OutgoingBytes", "IteratorAgeMilliseconds"]
validation {
condition = alltrue([
for m in var.shard_level_metrics : contains([
"IncomingBytes",
"IncomingRecords",
"OutgoingBytes",
"OutgoingRecords",
"WriteProvisionedThroughputExceeded",
"ReadProvisionedThroughputExceeded",
"IteratorAgeMilliseconds",
], m)
])
error_message = "shard_level_metrics contains an unsupported metric name."
}
}
variable "enforce_consumer_deletion" {
description = "If true, registered enhanced fan-out consumers are deleted automatically when the stream is destroyed (prevents destroy errors)."
type = bool
default = true
}
variable "tags" {
description = "Additional tags to apply to the stream, merged over the module's base tags."
type = map(string)
default = {}
}
# outputs.tf
output "id" {
description = "The unique stream ID (same as the stream name)."
value = aws_kinesis_stream.this.id
}
output "name" {
description = "The name of the Kinesis data stream."
value = aws_kinesis_stream.this.name
}
output "arn" {
description = "The ARN of the Kinesis data stream, for IAM policies and event source mappings."
value = aws_kinesis_stream.this.arn
}
output "stream_mode" {
description = "The effective capacity mode of the stream (ON_DEMAND or PROVISIONED)."
value = var.stream_mode
}
output "shard_count" {
description = "The configured shard count (null in ON_DEMAND mode, where AWS manages capacity)."
value = aws_kinesis_stream.this.shard_count
}
output "encryption_type" {
description = "The server-side encryption type applied to the stream."
value = aws_kinesis_stream.this.encryption_type
}
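The validation blocks above check each input in isolation, so nothing stops a caller from setting encryption_type to "KMS" while passing a null or empty kms_key_id. One way this could be guarded, assuming you are free to extend main.tf, is a lifecycle precondition on the resource. This is a sketch of a possible addition, not part of the published module.

```hcl
# Hypothetical addition to main.tf: a cross-variable guard.
resource "aws_kinesis_stream" "this" {
  # ... same arguments as in main.tf above ...

  lifecycle {
    precondition {
      # Passes when encryption is off, or when a non-empty key reference is supplied.
      condition     = var.encryption_type != "KMS" || (var.kms_key_id != null && var.kms_key_id != "")
      error_message = "kms_key_id must be set when encryption_type is KMS."
    }
  }
}
```

Resource preconditions need Terraform 1.2 or later, which the module's required_version of ">= 1.5.0" already satisfies.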
How to use it
# A customer-managed KMS key so encryption isn't tied to the shared AWS-managed key.
resource "aws_kms_key" "kinesis" {
description = "CMK for clickstream Kinesis data stream"
deletion_window_in_days = 14
enable_key_rotation = true
}
module "kinesis_data_stream" {
source = "git::https://dev.azure.com/teknohut/kloudvin/_git/terraform-modules//terraform-module-aws-kinesis?ref=v1.0.0"
name = "prod-clickstream-events"
stream_mode = "ON_DEMAND"
retention_period = 168 # 7 days of replay for reprocessing pipelines
encryption_type = "KMS"
kms_key_id = aws_kms_key.kinesis.arn
shard_level_metrics = [
"IncomingBytes",
"OutgoingBytes",
"IteratorAgeMilliseconds",
]
tags = {
Environment = "prod"
Team = "data-platform"
CostCenter = "analytics"
}
}
# Downstream: wire the stream into a Lambda consumer using the module's ARN output.
resource "aws_lambda_event_source_mapping" "clickstream_processor" {
event_source_arn = module.kinesis_data_stream.arn
function_name = aws_lambda_function.processor.arn
starting_position = "LATEST"
batch_size = 200
# Tolerate poison-pill records without blocking the shard.
maximum_retry_attempts = 3
bisect_batch_on_function_error = true
maximum_record_age_in_seconds = 3600
}
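The event source mapping above only works if the Lambda's execution role can read the stream and decrypt records written under the customer-managed key. A minimal sketch of those permissions follows; aws_iam_role.processor is a hypothetical role name, and the action list is an assumption you should trim or extend to match your consumer.

```hcl
# Assumption: aws_iam_role.processor is the execution role of aws_lambda_function.processor.
data "aws_iam_policy_document" "processor_kinesis" {
  statement {
    sid = "ReadStream"
    actions = [
      "kinesis:DescribeStream",
      "kinesis:DescribeStreamSummary",
      "kinesis:GetRecords",
      "kinesis:GetShardIterator",
      "kinesis:ListShards",
    ]
    resources = [module.kinesis_data_stream.arn]
  }

  statement {
    # Consumers of a KMS-encrypted stream need to decrypt with the stream's key.
    sid       = "DecryptRecords"
    actions   = ["kms:Decrypt"]
    resources = [aws_kms_key.kinesis.arn]
  }
}

resource "aws_iam_role_policy" "processor_kinesis" {
  name   = "read-prod-clickstream-events"
  role   = aws_iam_role.processor.id
  policy = data.aws_iam_policy_document.processor_kinesis.json
}
```

Scoping both statements to the module's arn output and the single key keeps the role from reading any other stream in the account.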
With Terragrunt
Terragrunt keeps this module DRY across environments — define the backend and provider once in a root config, then a thin terragrunt.hcl per environment supplies only the inputs that differ.
1. Root config — live/terragrunt.hcl (inherited by every module):
remote_state {
backend = "s3"
generate = { path = "backend.tf", if_exists = "overwrite" }
config = {
# ...s3 state bucket/container + key per path...
}
}
2. Module config — live/prod/kinesis/terragrunt.hcl:
include "root" {
path = find_in_parent_folders()
}
terraform {
source = "git::https://dev.azure.com/teknohut/kloudvin/_git/terraform-modules//terraform-module-aws-kinesis?ref=v1.0.0"
}
inputs = {
name = "..."
}
3. Deploy one environment, or roll out all modules together:
(cd live/prod/kinesis && terragrunt apply) # this module only
(cd live/prod && terragrunt run-all apply) # every module under live/prod
Why Terragrunt here: the backend and provider live in one place instead of being copy-pasted into every module; inputs is overridden per environment (dev / stage / prod) without forking the module; and run-all orchestrates dependencies across modules. Reach for it once you have more than one environment or more than a handful of modules — for a single stack, the plain Quickstart above is enough.
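To make the per-environment override concrete, here is a sketch of what a dev counterpart to the prod config might look like. The path live/dev/kinesis/terragrunt.hcl, the stream name, and the chosen input values are all hypothetical.

```hcl
# live/dev/kinesis/terragrunt.hcl (hypothetical dev environment)
include "root" {
  path = find_in_parent_folders()
}

terraform {
  source = "git::https://dev.azure.com/teknohut/kloudvin/_git/terraform-modules//terraform-module-aws-kinesis?ref=v1.0.0"
}

inputs = {
  name = "dev-clickstream-events" # hypothetical stream name

  # Dev keeps costs down: default retention and no enhanced shard metrics.
  retention_period    = 24
  shard_level_metrics = []

  tags = {
    Environment = "dev"
  }
}
```

Only the inputs block differs between environments; the source line and the root include stay identical.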
Inputs
| Name | Type | Default | Required | Description |
|---|---|---|---|---|
| name | string | n/a | yes | Name of the Kinesis data stream; unique per account per region (1-128 chars). |
| stream_mode | string | "ON_DEMAND" | no | Capacity mode: ON_DEMAND (auto-scaling) or PROVISIONED (fixed shards). |
| shard_count | number | 1 | no | Number of shards in PROVISIONED mode (1-10000); ignored in ON_DEMAND. |
| retention_period | number | 24 | no | Data retention in hours (24-8760); above 24 incurs extended-retention cost. |
| encryption_type | string | "KMS" | no | Server-side encryption: KMS or NONE. |
| kms_key_id | string | "alias/aws/kinesis" | no | KMS key GUID/ARN/alias used when encryption_type = "KMS". |
| shard_level_metrics | list(string) | ["IncomingBytes","OutgoingBytes","IteratorAgeMilliseconds"] | no | Enhanced shard-level CloudWatch metrics to enable; empty list disables them. |
| enforce_consumer_deletion | bool | true | no | Auto-delete enhanced fan-out consumers on stream destroy. |
| tags | map(string) | {} | no | Additional tags merged over the module’s base tags. |
Outputs
| Name | Description |
|---|---|
| id | The unique stream ID (identical to the stream name). |
| name | The name of the Kinesis data stream. |
| arn | The stream ARN, for IAM policies and event source mappings. |
| stream_mode | The effective capacity mode (ON_DEMAND or PROVISIONED). |
| shard_count | Configured shard count (null in ON_DEMAND mode). |
| encryption_type | The server-side encryption type applied to the stream. |
Enterprise scenario
A retail analytics platform ingests checkout and browse events from 40+ regional storefronts into a single prod-clickstream-events stream running in ON_DEMAND mode, so Black Friday traffic spikes scale automatically without a capacity-planning fire drill. Records are encrypted at rest with a team-owned KMS CMK to satisfy the company’s PCI data-classification policy, retention is set to 7 days so the data-science team can replay a full week to backfill a new feature, and the IteratorAgeMilliseconds shard metric feeds a CloudWatch alarm that pages on-call when a downstream Flink consumer falls behind. Reusing the module meant the storefront, fraud, and recommendations teams each got the same encrypted, monitored baseline by supplying nothing more than a stream name and tags.
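The iterator-age alarm in this scenario could be sketched as follows. The SNS topic variable, threshold, and evaluation window are illustrative assumptions. For brevity the sketch uses the stream-level GetRecords.IteratorAgeMilliseconds metric, which needs only a StreamName dimension; alarming on the shard-level IteratorAgeMilliseconds metric would also require a ShardId dimension per shard.

```hcl
# Hypothetical input: an SNS topic that pages the on-call engineer.
variable "oncall_sns_topic_arn" {
  description = "ARN of the SNS topic used for paging (assumed to exist)."
  type        = string
}

resource "aws_cloudwatch_metric_alarm" "iterator_age" {
  alarm_name        = "${module.kinesis_data_stream.name}-iterator-age"
  alarm_description = "Consumers are falling behind on the stream."

  namespace   = "AWS/Kinesis"
  metric_name = "GetRecords.IteratorAgeMilliseconds"
  statistic   = "Maximum"

  dimensions = {
    StreamName = module.kinesis_data_stream.name
  }

  # Illustrative policy: oldest unread record is over 5 minutes old
  # for 5 consecutive 1-minute periods.
  period              = 60
  evaluation_periods  = 5
  comparison_operator = "GreaterThanThreshold"
  threshold           = 300000 # milliseconds
  treat_missing_data  = "notBreaching"

  alarm_actions = [var.oncall_sns_topic_arn]
}
```

The threshold should sit well below the stream's retention period, since records older than the retention window are gone before a lagging consumer can read them.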
Best practices
- Encrypt with a customer-managed key, not the default. The module defaults to alias/aws/kinesis, but for regulated data pass your own CMK ARN with key rotation enabled so you control the key policy, grants, and audit trail independently of AWS.
- Match capacity mode to traffic shape. Use ON_DEMAND for spiky or unpredictable workloads to avoid throttling and over-provisioning; switch a stream to PROVISIONED once throughput is steady and high enough that fixed shards are materially cheaper than per-GB on-demand pricing.
- Right-size retention deliberately. Every hour beyond the 24-hour default is billed; only extend retention_period to the actual replay window your consumers need (e.g. 168h for a week of reprocessing) rather than reflexively maxing it out.
- Always enable IteratorAgeMilliseconds and alarm on it. A rising iterator age is the single clearest signal that consumers are falling behind producers; pair it with WriteProvisionedThroughputExceeded (PROVISIONED) or on-demand throttle metrics to catch back-pressure early.
- Name streams with environment and domain prefixes. A convention like prod-clickstream-events keeps IAM resource ARNs least-privilege-scoped (arn:aws:kinesis:*:*:stream/prod-*) and makes cost allocation by tag and stream name unambiguous.
- Set enforce_consumer_deletion = true for streams with enhanced fan-out. Otherwise a terraform destroy fails when registered consumers still exist, leaving orphaned resources and a half-torn-down environment.
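The naming convention above pays off when writing IAM policies. As a sketch, a producer policy scoped to every prod-prefixed stream might look like the following; the data source label is hypothetical, and the policy still has to be attached to a role of your own.

```hcl
# Sketch: write access limited to streams whose names start with "prod-".
data "aws_iam_policy_document" "prod_producers" {
  statement {
    sid       = "WriteProdStreams"
    actions   = ["kinesis:PutRecord", "kinesis:PutRecords"]
    resources = ["arn:aws:kinesis:*:*:stream/prod-*"]
  }
}
```

Producers writing to a KMS-encrypted stream also need kms:GenerateDataKey on the stream's key, which would go in a second statement scoped to that key's ARN.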