Quick take — Provision encrypted, VPC-isolated AWS OpenSearch domains with multi-AZ data nodes, dedicated masters, fine-grained access control, EBS storage, and CloudWatch slow logs using one reusable Terraform module. New here? Jump to the Quickstart below to deploy it in minutes; read on for how it works and when to reach for it.
Quickstart (copy-paste)
Minimal, runnable configuration — drop this in a .tf file and fill in the "..." placeholders (each required input is commented):
provider "aws" {
region = "us-east-1"
}
module "opensearch" {
source = "git::https://dev.azure.com/teknohut/kloudvin/_git/terraform-modules//terraform-module-aws-opensearch?ref=v1.0.0"
domain_name = "..." # Domain name; 3-28 chars, lowercase, starts with a lette…
}
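Then run terraform init && terraform apply. Every other input has a default; note that leaving subnet_ids empty creates a public-endpoint domain, so pass subnet_ids and security_group_ids when you need VPC isolation. See Inputs below to override behaviour.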
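One default combination deserves a second look before the first apply: advanced_security_enabled is true, internal_user_database_enabled is false, and master_user_arn is null, so no FGAC master user is defined. It is an assumption here, worth confirming in a sandbox account, that the service will not accept FGAC without a master user. A minimal sketch that sidesteps the question using only the module's own inputs (the domain name is a made-up example):

```hcl
module "opensearch" {
  source = "git::https://dev.azure.com/teknohut/kloudvin/_git/terraform-modules//terraform-module-aws-opensearch?ref=v1.0.0"

  domain_name = "kv-logs-dev" # hypothetical example name

  # Use the internal user database so the module generates the master
  # password, exposed through the sensitive generated_master_password output.
  internal_user_database_enabled = true
  master_user_name               = "os-admin" # the module default, shown for clarity
}
```

Passing master_user_arn with an IAM role ARN instead is the other route, and is the one the fuller example in How to use it takes.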
What this module is
Amazon OpenSearch Service is the AWS-managed successor to the Elasticsearch service: a distributed search and analytics engine you point logs, traces, metrics, and full-text documents at, then query with the OpenSearch REST API or explore through OpenSearch Dashboards. AWS runs the cluster orchestration, patching, and node replacement for you, but you still own a long list of consequential decisions — instance types and counts, dedicated master nodes, EBS sizing, Multi-AZ placement, encryption keys, the access policy, fine-grained access control (FGAC), TLS enforcement, and which slow/error logs ship to CloudWatch.
The aws_opensearch_domain resource exposes every one of those knobs as deeply nested blocks (cluster_config, ebs_options, encrypt_at_rest, node_to_node_encryption, domain_endpoint_options, advanced_security_options, vpc_options, log_publishing_options). Hand-writing that for each domain is how teams end up with one cluster on gp2 with no dedicated master, another reachable over plaintext HTTP, and a third with zone awareness silently off. This module wraps aws_opensearch_domain so a domain is described by a handful of variables (engine version, node sizing, AZ count, KMS key, subnets). Encryption at rest, node-to-node encryption, and HTTPS enforcement are hard-coded on, while the TLS 1.2 PFS policy and FGAC are defaults that take an explicit override to weaken.
When to use it
- You run centralized log/observability analytics (application logs, VPC Flow Logs, WAF logs) and need a repeatable domain per environment (dev/stage/prod) with identical security posture.
- You need VPC-isolated OpenSearch reachable only from private subnets, never a public endpoint.
- You want fine-grained access control with an internal master user (or IAM ARN master) and per-index/role authorization instead of a coarse IP/IAM resource policy alone.
- You are standardizing on Multi-AZ with dedicated master nodes for production reliability and want zone awareness wired correctly to node counts.
- You manage many domains across accounts and want encryption, TLS policy, and slow-log publishing enforced centrally rather than reviewed PR by PR.
If you only need ad-hoc, throwaway search for a quick experiment, OpenSearch Serverless (aws_opensearchserverless_collection) may fit better — this module targets provisioned domains where you control capacity and cost.
Module structure
terraform-module-aws-opensearch/
├── versions.tf
├── main.tf
├── variables.tf
└── outputs.tf
# versions.tf
terraform {
required_version = ">= 1.5.0"
required_providers {
aws = {
source = "hashicorp/aws"
version = "~> 5.0"
}
random = {
source = "hashicorp/random"
version = "~> 3.5"
}
}
}
# main.tf
# A strong, random internal master password when FGAC uses an internal user
# database and the caller does not supply one explicitly.
resource "random_password" "master" {
count = (
var.advanced_security_enabled &&
var.internal_user_database_enabled &&
var.master_user_password == null
) ? 1 : 0
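length = 24
special = true
# The OpenSearch master password must contain at least one character of each class.
min_upper = 2
min_lower = 2
min_numeric = 2
min_special = 2
override_special = "!#$%^&*()-_=+"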
}
locals {
domain_name = var.domain_name
# Zone awareness must be on whenever we span more than one AZ.
zone_awareness_enabled = var.availability_zone_count > 1
# Resolve the internal master password from the caller or the generated one.
resolved_master_password = (
var.advanced_security_enabled && var.internal_user_database_enabled
? coalesce(var.master_user_password, try(random_password.master[0].result, null))
: null
)
# Default open-but-FGAC-gated access policy when the caller passes none.
# Authorization is still enforced by fine-grained access control.
default_access_policies = jsonencode({
Version = "2012-10-17"
Statement = [
{
Effect = "Allow"
Principal = { AWS = "*" }
Action = "es:*"
Resource = "arn:${data.aws_partition.current.partition}:es:${data.aws_region.current.name}:${data.aws_caller_identity.current.account_id}:domain/${local.domain_name}/*"
}
]
})
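log_types = {
index_slow_logs = "INDEX_SLOW_LOGS"
search_slow_logs = "SEARCH_SLOW_LOGS"
es_application_logs = "ES_APPLICATION_LOGS"
# Accepted by var.published_log_types; audit logs require FGAC to be enabled.
audit_logs = "AUDIT_LOGS"
}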
enabled_log_types = { for k, v in local.log_types : k => v if contains(var.published_log_types, v) }
}
data "aws_partition" "current" {}
data "aws_region" "current" {}
data "aws_caller_identity" "current" {}
# One CloudWatch log group per published log type.
resource "aws_cloudwatch_log_group" "this" {
for_each = local.enabled_log_types
name = "/aws/opensearch/${local.domain_name}/${each.key}"
retention_in_days = var.log_retention_in_days
kms_key_id = var.log_kms_key_arn
tags = var.tags
}
# Resource policy allowing the OpenSearch service to write to those log groups.
resource "aws_cloudwatch_log_resource_policy" "this" {
count = length(local.enabled_log_types) > 0 ? 1 : 0
policy_name = "${local.domain_name}-opensearch-logs"
policy_document = jsonencode({
Version = "2012-10-17"
Statement = [
{
Effect = "Allow"
Principal = {
Service = "es.amazonaws.com"
}
Action = [
"logs:PutLogEvents",
"logs:CreateLogStream",
]
Resource = [for lg in aws_cloudwatch_log_group.this : "${lg.arn}:*"]
Condition = {
StringEquals = {
"aws:SourceAccount" = data.aws_caller_identity.current.account_id
}
ArnLike = {
"aws:SourceArn" = "arn:${data.aws_partition.current.partition}:es:${data.aws_region.current.name}:${data.aws_caller_identity.current.account_id}:domain/${local.domain_name}"
}
}
}
]
})
}
resource "aws_opensearch_domain" "this" {
domain_name = local.domain_name
engine_version = var.engine_version
cluster_config {
instance_type = var.instance_type
instance_count = var.instance_count
zone_awareness_enabled = local.zone_awareness_enabled
dynamic "zone_awareness_config" {
for_each = local.zone_awareness_enabled ? [1] : []
content {
availability_zone_count = var.availability_zone_count
}
}
dedicated_master_enabled = var.dedicated_master_enabled
dedicated_master_type = var.dedicated_master_enabled ? var.dedicated_master_type : null
dedicated_master_count = var.dedicated_master_enabled ? var.dedicated_master_count : null
warm_enabled = var.warm_enabled
warm_type = var.warm_enabled ? var.warm_type : null
warm_count = var.warm_enabled ? var.warm_count : null
}
ebs_options {
ebs_enabled = true
volume_type = var.ebs_volume_type
volume_size = var.ebs_volume_size
throughput = var.ebs_volume_type == "gp3" ? var.ebs_throughput : null
iops = contains(["gp3", "io1"], var.ebs_volume_type) ? var.ebs_iops : null
}
encrypt_at_rest {
enabled = true
kms_key_id = var.kms_key_arn
}
node_to_node_encryption {
enabled = true
}
domain_endpoint_options {
enforce_https = true
tls_security_policy = var.tls_security_policy
}
advanced_security_options {
enabled = var.advanced_security_enabled
internal_user_database_enabled = var.advanced_security_enabled ? var.internal_user_database_enabled : null
dynamic "master_user_options" {
for_each = var.advanced_security_enabled ? [1] : []
content {
master_user_arn = var.internal_user_database_enabled ? null : var.master_user_arn
master_user_name = var.internal_user_database_enabled ? var.master_user_name : null
master_user_password = var.internal_user_database_enabled ? local.resolved_master_password : null
}
}
}
dynamic "vpc_options" {
for_each = length(var.subnet_ids) > 0 ? [1] : []
content {
subnet_ids = var.subnet_ids
security_group_ids = var.security_group_ids
}
}
dynamic "log_publishing_options" {
for_each = local.enabled_log_types
content {
log_type = log_publishing_options.value
cloudwatch_log_group_arn = aws_cloudwatch_log_group.this[log_publishing_options.key].arn
enabled = true
}
}
advanced_options = var.advanced_options
access_policies = coalesce(var.access_policies, local.default_access_policies)
auto_tune_options {
desired_state = var.auto_tune_enabled ? "ENABLED" : "DISABLED"
rollback_on_disable = "DEFAULT_ROLLBACK"
}
tags = merge(var.tags, { Name = local.domain_name })
depends_on = [aws_cloudwatch_log_resource_policy.this]
}
# variables.tf
variable "domain_name" {
description = "Name of the OpenSearch domain. Lowercase, 3-28 chars, must start with a letter."
type = string
validation {
condition = can(regex("^[a-z][a-z0-9-]{2,27}$", var.domain_name))
error_message = "domain_name must be 3-28 chars, lowercase letters/numbers/hyphens, and start with a letter."
}
}
variable "engine_version" {
description = "OpenSearch engine version, e.g. 'OpenSearch_2.13' or 'Elasticsearch_7.10'."
type = string
default = "OpenSearch_2.13"
validation {
condition = can(regex("^(OpenSearch|Elasticsearch)_[0-9]+\\.[0-9]+$", var.engine_version))
error_message = "engine_version must look like 'OpenSearch_2.13' or 'Elasticsearch_7.10'."
}
}
variable "instance_type" {
description = "Instance type for the data nodes, e.g. 'r6g.large.search' or 'm6g.large.search'."
type = string
default = "r6g.large.search"
validation {
condition = can(regex("\\.search$", var.instance_type))
error_message = "instance_type must be an OpenSearch instance type ending in '.search'."
}
}
variable "instance_count" {
description = "Number of data nodes. Use a multiple of availability_zone_count for even shard placement."
type = number
default = 3
validation {
condition = var.instance_count >= 1 && var.instance_count <= 80
error_message = "instance_count must be between 1 and 80."
}
}
variable "availability_zone_count" {
description = "Number of AZs to spread data nodes across (1, 2, or 3). >1 enables zone awareness."
type = number
default = 3
validation {
condition = contains([1, 2, 3], var.availability_zone_count)
error_message = "availability_zone_count must be 1, 2, or 3."
}
}
variable "dedicated_master_enabled" {
description = "Enable dedicated master nodes (strongly recommended for production)."
type = bool
default = true
}
variable "dedicated_master_type" {
description = "Instance type for dedicated master nodes."
type = string
default = "m6g.large.search"
}
variable "dedicated_master_count" {
description = "Number of dedicated master nodes. Use 3 for quorum in production."
type = number
default = 3
validation {
condition = contains([3, 5], var.dedicated_master_count)
error_message = "dedicated_master_count must be 3 or 5 to maintain a stable quorum."
}
}
variable "warm_enabled" {
description = "Enable UltraWarm nodes for cheaper warm-tier storage of older indices."
type = bool
default = false
}
variable "warm_type" {
description = "UltraWarm node instance type, e.g. 'ultrawarm1.medium.search'."
type = string
default = "ultrawarm1.medium.search"
}
variable "warm_count" {
description = "Number of UltraWarm nodes (2-150) when warm_enabled is true."
type = number
default = 2
}
variable "ebs_volume_type" {
description = "EBS volume type for data nodes: gp3, gp2, or io1."
type = string
default = "gp3"
validation {
condition = contains(["gp3", "gp2", "io1"], var.ebs_volume_type)
error_message = "ebs_volume_type must be one of gp3, gp2, or io1."
}
}
variable "ebs_volume_size" {
description = "EBS volume size per data node in GiB."
type = number
default = 100
validation {
condition = var.ebs_volume_size >= 10 && var.ebs_volume_size <= 3584
error_message = "ebs_volume_size must be between 10 and 3584 GiB (instance-type dependent)."
}
}
variable "ebs_throughput" {
description = "Provisioned throughput (MiB/s) for gp3 volumes."
type = number
default = 250
}
variable "ebs_iops" {
description = "Provisioned IOPS for gp3 or io1 volumes."
type = number
default = 3000
}
variable "kms_key_arn" {
description = "KMS key ARN for encryption at rest. Null uses the AWS-managed aws/es key."
type = string
default = null
}
variable "tls_security_policy" {
description = "TLS policy for the HTTPS endpoint."
type = string
default = "Policy-Min-TLS-1-2-PFS-2023-10"
validation {
condition = contains([
"Policy-Min-TLS-1-0-2019-07",
"Policy-Min-TLS-1-2-2019-07",
"Policy-Min-TLS-1-2-PFS-2023-10",
], var.tls_security_policy)
error_message = "tls_security_policy must be a supported OpenSearch TLS policy name."
}
}
variable "advanced_security_enabled" {
description = "Enable fine-grained access control (FGAC). Requires encryption, node-to-node encryption, and HTTPS (all enforced by this module)."
type = bool
default = true
}
variable "internal_user_database_enabled" {
description = "Use the internal user database for the FGAC master user instead of an IAM ARN."
type = bool
default = false
}
variable "master_user_arn" {
description = "IAM ARN of the FGAC master user (used when internal_user_database_enabled is false)."
type = string
default = null
}
variable "master_user_name" {
description = "Master username for the internal user database (used when internal_user_database_enabled is true)."
type = string
default = "os-admin"
}
variable "master_user_password" {
description = "Master password for the internal user database. If null and the internal DB is enabled, a strong password is generated."
type = string
default = null
sensitive = true
}
variable "subnet_ids" {
description = "Private subnet IDs for VPC deployment. Provide one per AZ. Empty list creates a public-endpoint domain."
type = list(string)
default = []
}
variable "security_group_ids" {
description = "Security group IDs attached to the domain ENIs (VPC mode only)."
type = list(string)
default = []
}
variable "published_log_types" {
description = "OpenSearch log types to publish to CloudWatch."
type = list(string)
default = ["INDEX_SLOW_LOGS", "SEARCH_SLOW_LOGS", "ES_APPLICATION_LOGS"]
validation {
condition = alltrue([
for t in var.published_log_types :
contains(["INDEX_SLOW_LOGS", "SEARCH_SLOW_LOGS", "ES_APPLICATION_LOGS", "AUDIT_LOGS"], t)
])
error_message = "published_log_types entries must be INDEX_SLOW_LOGS, SEARCH_SLOW_LOGS, ES_APPLICATION_LOGS, or AUDIT_LOGS."
}
}
variable "log_retention_in_days" {
description = "Retention for the OpenSearch CloudWatch log groups."
type = number
default = 30
}
variable "log_kms_key_arn" {
description = "Optional KMS key ARN to encrypt the CloudWatch log groups."
type = string
default = null
}
variable "auto_tune_enabled" {
description = "Enable Auto-Tune for automatic JVM and queue tuning."
type = bool
default = true
}
variable "advanced_options" {
description = "Map of advanced OpenSearch options, e.g. override_main_response_version or rest.action.multi.allow_explicit_index."
type = map(string)
default = {
"rest.action.multi.allow_explicit_index" = "true"
}
}
variable "access_policies" {
description = "JSON IAM access policy document for the domain. Null uses a permissive policy that relies on FGAC for authorization."
type = string
default = null
}
variable "tags" {
description = "Tags applied to the domain and its log groups."
type = map(string)
default = {}
}
# outputs.tf
output "domain_id" {
description = "Unique resource ID of the OpenSearch domain."
value = aws_opensearch_domain.this.domain_id
}
output "domain_name" {
description = "Name of the OpenSearch domain."
value = aws_opensearch_domain.this.domain_name
}
output "arn" {
description = "ARN of the OpenSearch domain."
value = aws_opensearch_domain.this.arn
}
output "endpoint" {
description = "Domain-specific HTTPS endpoint for the OpenSearch REST API."
value = aws_opensearch_domain.this.endpoint
}
output "dashboard_endpoint" {
description = "Domain-specific endpoint for OpenSearch Dashboards."
value = aws_opensearch_domain.this.dashboard_endpoint
}
output "kibana_endpoint" {
description = "Legacy Kibana/Dashboards endpoint (compatibility)."
value = aws_opensearch_domain.this.kibana_endpoint
}
output "generated_master_password" {
description = "Auto-generated internal master password, if one was created by this module."
value = try(random_password.master[0].result, null)
sensitive = true
}
output "log_group_arns" {
description = "Map of published log type to its CloudWatch log group ARN."
value = { for k, lg in aws_cloudwatch_log_group.this : k => lg.arn }
}
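The variable descriptions ask for instance_count to be a multiple of availability_zone_count, and for master_user_arn to be set when the internal user database is off. On Terraform versions before 1.9 a validation block can only refer to its own variable, so neither rule is enforced by the listing above. One way to surface such mismatches is a pair of check blocks, available from Terraform 1.5 (the module's required_version). This is a sketch, not part of the published module; the block names are hypothetical, and a failed check reports a warning rather than stopping the apply.

```hcl
# Hypothetical addition to main.tf: cross-variable sanity checks.
check "even_shard_placement" {
  assert {
    condition     = var.instance_count % var.availability_zone_count == 0
    error_message = "instance_count should be a multiple of availability_zone_count so shards spread evenly across AZs."
  }
}

check "fgac_master_user_defined" {
  assert {
    condition = (
      !var.advanced_security_enabled ||
      var.internal_user_database_enabled ||
      var.master_user_arn != null
    )
    error_message = "Set master_user_arn, or enable the internal user database, when advanced_security_enabled is true."
  }
}
```

If a hard failure is preferred, the same conditions could move into a lifecycle precondition on aws_opensearch_domain.this.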
How to use it
module "opensearch" {
source = "git::https://dev.azure.com/teknohut/kloudvin/_git/terraform-modules//terraform-module-aws-opensearch?ref=v1.0.0"
domain_name = "kv-logs-prod"
engine_version = "OpenSearch_2.13"
# Three r6g data nodes across three AZs, with a 3-node dedicated master quorum.
instance_type = "r6g.large.search"
instance_count = 3
availability_zone_count = 3
dedicated_master_enabled = true
dedicated_master_type = "m6g.large.search"
dedicated_master_count = 3
# gp3 storage with explicit throughput/IOPS for predictable ingest.
ebs_volume_type = "gp3"
ebs_volume_size = 200
ebs_throughput = 500
ebs_iops = 5000
# Customer-managed CMK for encryption at rest.
kms_key_arn = aws_kms_key.opensearch.arn
# FGAC with an IAM master role; HTTPS + node-to-node encryption are enforced.
advanced_security_enabled = true
internal_user_database_enabled = false
master_user_arn = aws_iam_role.opensearch_admin.arn
# Private VPC deployment, one subnet per AZ.
subnet_ids = module.vpc.private_subnet_ids
security_group_ids = [aws_security_group.opensearch.id]
published_log_types = ["INDEX_SLOW_LOGS", "SEARCH_SLOW_LOGS", "ES_APPLICATION_LOGS", "AUDIT_LOGS"]
log_retention_in_days = 90
tags = {
Environment = "prod"
Team = "observability"
CostCenter = "platform-eng"
}
}
# Downstream: subscribe an application log group to a Lambda indexer (defined
# elsewhere) that writes into the domain, and publish the endpoint through SSM.
resource "aws_cloudwatch_log_subscription_filter" "to_opensearch" {
name = "app-logs-to-opensearch"
log_group_name = "/kv/app/api"
filter_pattern = ""
destination_arn = aws_lambda_function.os_indexer.arn
}
resource "aws_ssm_parameter" "opensearch_endpoint" {
name = "/kv/observability/opensearch/endpoint"
type = "String"
value = "https://${module.opensearch.endpoint}"
}
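The example refers to aws_kms_key.opensearch, aws_iam_role.opensearch_admin, and aws_security_group.opensearch without defining them. A minimal sketch of the key and the security group follows. The module.vpc.vpc_id and module.vpc.vpc_cidr_block outputs are assumptions about whatever VPC module is in use, and the VPC-wide ingress rule is a starting point to tighten, not a recommendation.

```hcl
# Customer-managed key for encryption at rest, with automatic rotation.
resource "aws_kms_key" "opensearch" {
  description             = "CMK for the kv-logs-prod OpenSearch domain"
  enable_key_rotation     = true
  deletion_window_in_days = 30
}

# Security group for the domain ENIs: HTTPS in from inside the VPC only.
resource "aws_security_group" "opensearch" {
  name_prefix = "kv-logs-prod-opensearch-"
  description = "HTTPS access to the OpenSearch domain"
  vpc_id      = module.vpc.vpc_id # assumed output name

  ingress {
    description = "HTTPS from within the VPC"
    from_port   = 443
    to_port     = 443
    protocol    = "tcp"
    cidr_blocks = [module.vpc.vpc_cidr_block] # assumed output name
  }
}
```

Narrowing the ingress to the security groups of the actual clients (indexers, dashboards proxies) is usually the next step once those exist.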
With Terragrunt
Terragrunt keeps this module DRY across environments — define the backend and provider once in a root config, then a thin terragrunt.hcl per environment supplies only the inputs that differ.
1. Root config — live/terragrunt.hcl (inherited by every module):
remote_state {
backend = "s3"
generate = { path = "backend.tf", if_exists = "overwrite" }
config = {
# ...s3 state bucket/container + key per path...
}
}
2. Module config — live/prod/opensearch/terragrunt.hcl:
include "root" {
path = find_in_parent_folders()
}
terraform {
source = "git::https://dev.azure.com/teknohut/kloudvin/_git/terraform-modules//terraform-module-aws-opensearch?ref=v1.0.0"
}
inputs = {
domain_name = "..."
}
3. Deploy one environment, or roll out all modules together:
cd live/prod/opensearch && terragrunt apply # this module only
cd live/prod && terragrunt run-all apply # every module under live/prod
Why Terragrunt here: the backend and provider live in one place instead of being copy-pasted into every module; inputs is overridden per environment (dev / stage / prod) without forking the module; and run-all orchestrates dependencies across modules. Reach for it once you have more than one environment or more than a handful of modules — for a single stack, the plain Quickstart above is enough.
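The module config shown earlier only sets domain_name. A fuller per-environment sketch could pull the subnet IDs from a sibling VPC stack through a dependency block, which is also what lets run-all order the two stacks. The ../vpc path, the private_subnet_ids output name, and the input values are all assumptions for illustration.

```hcl
# live/prod/opensearch/terragrunt.hcl (sketch)
include "root" {
  path = find_in_parent_folders()
}

terraform {
  source = "git::https://dev.azure.com/teknohut/kloudvin/_git/terraform-modules//terraform-module-aws-opensearch?ref=v1.0.0"
}

# Assumed sibling stack that exposes a private_subnet_ids output.
dependency "vpc" {
  config_path = "../vpc"
}

inputs = {
  domain_name             = "kv-logs-prod"
  instance_count          = 3
  availability_zone_count = 3
  subnet_ids              = dependency.vpc.outputs.private_subnet_ids
  log_retention_in_days   = 90
}
```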
Inputs
| Name | Type | Default | Required | Description |
|---|---|---|---|---|
| domain_name | string | n/a | yes | Domain name; 3-28 chars, lowercase, starts with a letter. |
| engine_version | string | "OpenSearch_2.13" | no | OpenSearch/Elasticsearch engine version. |
| instance_type | string | "r6g.large.search" | no | Data node instance type (must end in .search). |
| instance_count | number | 3 | no | Number of data nodes (1-80). |
| availability_zone_count | number | 3 | no | AZ spread (1, 2, or 3); >1 enables zone awareness. |
| dedicated_master_enabled | bool | true | no | Provision dedicated master nodes. |
| dedicated_master_type | string | "m6g.large.search" | no | Dedicated master instance type. |
| dedicated_master_count | number | 3 | no | Dedicated master count (3 or 5). |
| warm_enabled | bool | false | no | Enable UltraWarm tier. |
| warm_type | string | "ultrawarm1.medium.search" | no | UltraWarm instance type. |
| warm_count | number | 2 | no | UltraWarm node count. |
| ebs_volume_type | string | "gp3" | no | EBS type: gp3, gp2, or io1. |
| ebs_volume_size | number | 100 | no | EBS size per node in GiB (10-3584). |
| ebs_throughput | number | 250 | no | gp3 throughput in MiB/s. |
| ebs_iops | number | 3000 | no | Provisioned IOPS for gp3/io1. |
| kms_key_arn | string | null | no | CMK ARN for encryption at rest (null = aws/es key). |
| tls_security_policy | string | "Policy-Min-TLS-1-2-PFS-2023-10" | no | TLS policy for the HTTPS endpoint. |
| advanced_security_enabled | bool | true | no | Enable fine-grained access control. |
| internal_user_database_enabled | bool | false | no | Use internal user DB instead of an IAM master ARN. |
| master_user_arn | string | null | no | IAM ARN of the FGAC master user. |
| master_user_name | string | "os-admin" | no | Internal DB master username. |
| master_user_password | string | null | no | Internal DB master password (generated if null). |
| subnet_ids | list(string) | [] | no | Private subnet IDs (empty = public endpoint). |
| security_group_ids | list(string) | [] | no | Security groups for the domain ENIs (VPC mode). |
| published_log_types | list(string) | ["INDEX_SLOW_LOGS","SEARCH_SLOW_LOGS","ES_APPLICATION_LOGS"] | no | Log types shipped to CloudWatch. |
| log_retention_in_days | number | 30 | no | CloudWatch log group retention. |
| log_kms_key_arn | string | null | no | KMS key for the log groups. |
| auto_tune_enabled | bool | true | no | Enable Auto-Tune. |
| advanced_options | map(string) | { "rest.action.multi.allow_explicit_index" = "true" } | no | Advanced OpenSearch options. |
| access_policies | string | null | no | JSON access policy (null = FGAC-gated permissive policy). |
| tags | map(string) | {} | no | Tags for the domain and log groups. |
Outputs
| Name | Description |
|---|---|
| domain_id | Unique resource ID of the OpenSearch domain. |
| domain_name | Name of the OpenSearch domain. |
| arn | ARN of the OpenSearch domain. |
| endpoint | HTTPS endpoint for the OpenSearch REST API. |
| dashboard_endpoint | OpenSearch Dashboards endpoint. |
| kibana_endpoint | Legacy Kibana/Dashboards endpoint. |
| generated_master_password | Auto-generated internal master password, if created (sensitive). |
| log_group_arns | Map of log type to its CloudWatch log group ARN. |
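When the internal user database is enabled and no password is supplied, generated_master_password carries the generated credential. One way to hand it to operators is to store it in Secrets Manager. This sketch assumes the module call is named opensearch and uses a hypothetical secret name; the value also remains in the Terraform state, so the state backend needs to be encrypted and access-controlled.

```hcl
resource "aws_secretsmanager_secret" "opensearch_master" {
  name = "/kv/observability/opensearch/master" # hypothetical name
}

resource "aws_secretsmanager_secret_version" "opensearch_master" {
  secret_id = aws_secretsmanager_secret.opensearch_master.id

  # jsonencode keeps the sensitive marking of the password output.
  secret_string = jsonencode({
    username = "os-admin" # matches the master_user_name default
    password = module.opensearch.generated_master_password
    endpoint = module.opensearch.endpoint
  })
}
```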
Enterprise scenario
A fintech platform team centralizes audit and application logs from 40+ microservices into a single observability domain per environment. They call this module with kv-logs-prod, three r6g.large.search data nodes across three AZs, a 3-node m6g master quorum, a customer-managed KMS CMK (so the security team controls key rotation and can revoke access), and FGAC bound to an IAM admin role. The domain lives entirely in private subnets reached through the VPC, AUDIT_LOGS ship to a 90-day CloudWatch group for compliance, and UltraWarm is toggled on in the prod tfvars to push indices older than 14 days to the cheaper warm tier — keeping hot-node EBS small while retaining a year of searchable history.
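The inputs behind this scenario could be captured in a prod tfvars file along these lines. It is a sketch: it assumes the root module declares matching variables and passes them through to the opensearch module, and the warm_count value is illustrative. The 14-day migration itself would come from an Index State Management policy defined inside OpenSearch, which this module does not create.

```hcl
# prod.tfvars (sketch)
domain_name             = "kv-logs-prod"
instance_type           = "r6g.large.search"
instance_count          = 3
availability_zone_count = 3

dedicated_master_enabled = true
dedicated_master_type    = "m6g.large.search"
dedicated_master_count   = 3

# Warm tier for older indices; node count is an assumed starting point.
warm_enabled = true
warm_type    = "ultrawarm1.medium.search"
warm_count   = 2

published_log_types   = ["INDEX_SLOW_LOGS", "SEARCH_SLOW_LOGS", "ES_APPLICATION_LOGS", "AUDIT_LOGS"]
log_retention_in_days = 90
```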
Best practices
- Lock down access on two layers. Keep advanced_security_enabled = true and pair it with a tight resource access_policies document; FGAC role mappings handle per-index authorization while the resource policy bounds who can even reach the domain. Never expose a public endpoint for sensitive data; pass subnet_ids to force VPC-only access.
- Encrypt with a customer-managed CMK. Provide kms_key_arn rather than relying on the default aws/es key so you control rotation and can audit kms:Decrypt in CloudTrail. Encryption at rest and node-to-node encryption are always on in this module, and both are prerequisites for FGAC anyway.
- Size for quorum and even sharding. Use dedicated_master_count = 3 (or 5) so a single master loss never breaks cluster state, and keep instance_count a multiple of availability_zone_count so shards distribute evenly across AZs. Three masters plus three data nodes is the smallest sane production shape.
- Control cost with gp3 and UltraWarm. Default to gp3 (cheaper and decoupled IOPS/throughput vs gp2) and right-size ebs_volume_size to actual hot-tier needs; move aging time-series indices to UltraWarm (warm_enabled = true) instead of over-provisioning expensive hot-node storage. Always set log_retention_in_days so slow/audit logs don’t accrue cost indefinitely.
- Enforce modern TLS and publish slow logs. Keep tls_security_policy at its default PFS TLS 1.2 policy (the variable also accepts the older TLS 1.0 and TLS 1.2 policies) and rely on enforce_https, which is hard-coded on, so clients can’t fall back to weak ciphers. Ship INDEX_SLOW_LOGS / SEARCH_SLOW_LOGS to CloudWatch to catch hot-shard and mapping problems before they page you.
- Name and tag for fleet management. Use an environment-scoped domain_name (e.g. kv-logs-prod) and consistent tags (Environment, Team, CostCenter) so domains are attributable across accounts, and apply changes through versioned module refs (?ref=v1.0.0) so engine and capacity upgrades are reviewable, not ad hoc.
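The first practice calls for a tight access_policies document alongside FGAC. A minimal sketch using the aws_iam_policy_document data source follows. The admin role reference, account ID, region, and domain name are placeholders, and the es:ESHttp* action set is one reasonable scoping rather than the only one.

```hcl
data "aws_iam_policy_document" "opensearch_access" {
  statement {
    sid     = "AllowAdminRoleHttp"
    effect  = "Allow"
    actions = ["es:ESHttp*"]

    principals {
      type        = "AWS"
      identifiers = [aws_iam_role.opensearch_admin.arn] # assumed to exist
    }

    # Placeholder account ID, region, and domain name.
    resources = ["arn:aws:es:us-east-1:123456789012:domain/kv-logs-prod/*"]
  }
}

module "opensearch" {
  source = "git::https://dev.azure.com/teknohut/kloudvin/_git/terraform-modules//terraform-module-aws-opensearch?ref=v1.0.0"

  domain_name     = "kv-logs-prod"
  master_user_arn = aws_iam_role.opensearch_admin.arn
  access_policies = data.aws_iam_policy_document.opensearch_access.json
}
```

A principal-scoped policy like this suits IAM-signed requests. Clients that authenticate with internal-database basic auth are anonymous to IAM and would be denied, so that setup needs a different resource policy.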