Terraform Module: AWS Neptune — a hardened graph database cluster you can drop into any VPC

Quick take — A reusable Terraform module for AWS Neptune: provision an encrypted, IAM-authenticated graph cluster with a subnet group, cluster parameter group, instances, and CloudWatch log exports in one call. New here? Jump to the Quickstart below to deploy it in minutes; read on for how it works and when to reach for it.

Quickstart (copy-paste)

Minimal, runnable configuration — drop this in a .tf file and fill in the "..." placeholders (each required input is commented):

provider "aws" {
  region = "us-east-1"
}

module "neptune" {
  source = "git::https://dev.azure.com/teknohut/kloudvin/_git/terraform-modules//terraform-module-aws-neptune?ref=v1.0.0"

  name                   = "..."           # Base name/prefix for the cluster and dependent resource…
  subnet_ids             = ["...", "..."]  # Private subnet IDs across >= 2 AZs for the subnet group.
  vpc_security_group_ids = ["...", "..."]  # Security groups attached to the cluster.
  kms_key_arn            = "..."           # Customer-managed KMS key ARN for encryption at rest.
}

Then terraform init && terraform apply. Every other input has a sensible default — see Inputs below to override behaviour.

What this module is

Amazon Neptune is a fully managed graph database that speaks Gremlin and openCypher (property graph) as well as SPARQL (RDF). It is purpose-built for highly connected data (fraud rings, identity graphs, knowledge graphs, recommendation engines, network and IT topology) where the value is in the edges, not just the rows. Like Aurora, Neptune separates compute from storage: an aws_neptune_cluster is the storage-and-endpoint layer, and one or more aws_neptune_cluster_instance resources are the compute that you read and write through.

That split is exactly why Neptune is awkward to stamp out by hand. A correct, production-ready cluster is never a single resource: it is a cluster with IAM database authentication switched on, a DB subnet group pinned to private subnets, a cluster parameter group that enables audit logging, at least one writer instance plus optional readers, and CloudWatch log exports. Miss the subnet group and the cluster falls back to the default VPC; forget storage_encrypted and you cannot turn it on later without a snapshot-restore dance; leave iam_database_authentication_enabled unset and IAM auth is silently off. This module wires all of that together behind a handful of variables so every Neptune cluster in your estate is encrypted, private, IAM-authenticated, and observable by default.

When to use it

Reach for this module when you need a managed property-graph or RDF store and you want it to look identical across dev, staging, and prod. Typical triggers:

You are building on Gremlin, openCypher, or SPARQL and don’t want to operate JanusGraph/Cassandra or a self-hosted triplestore.
You need fraud detection, identity resolution, recommendations, or topology/knowledge graphs with millisecond traversals over billions of relationships.
You want IAM database authentication (no long-lived DB passwords) and encryption at rest with a customer-managed KMS key as non-negotiable defaults.
You run multi-AZ reads and want to add reader instances by changing one number.

Skip it (or pick a different tool) when your data is genuinely relational and your queries are joins and aggregates; that is Aurora/RDS territory. Neptune is also not a document or key-value store; it is optimized for traversals, not point lookups at DynamoDB scale. On scope: this module covers provisioned clusters and, optionally, Neptune Serverless scaling on those clusters. Neptune Analytics is a separate service and is outside its scope.

Module structure

terraform-module-aws-neptune/
├── versions.tf
├── main.tf
├── variables.tf
└── outputs.tf

versions.tf

terraform {
  required_version = ">= 1.5.0"

  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5.0"
    }
  }
}

main.tf

locals {
  # Neptune cluster identifiers must be lowercase and <= 63 chars.
  cluster_identifier = "${var.name}-neptune"

  # Build the parameter list once so audit logging can be toggled via vars and extras appended.
  cluster_parameters = concat(
    [
      {
        name  = "neptune_enable_audit_log"
        value = var.enable_audit_log ? "1" : "0"
      }
    ],
    var.extra_cluster_parameters
  )

  tags = merge(
    {
      Name      = local.cluster_identifier
      ManagedBy = "terraform"
      Module    = "terraform-module-aws-neptune"
    },
    var.tags
  )
}

# --- Networking: pin the cluster to private subnets -------------------------
resource "aws_neptune_subnet_group" "this" {
  name        = "${var.name}-neptune"
  description = "Subnet group for the ${var.name} Neptune cluster"
  subnet_ids  = var.subnet_ids

  tags = local.tags
}

# --- Cluster parameter group: audit logging + extra parameters --------------
resource "aws_neptune_cluster_parameter_group" "this" {
  name        = "${var.name}-neptune-cpg"
  family      = var.parameter_group_family
  description = "Cluster parameter group for the ${var.name} Neptune cluster"

  dynamic "parameter" {
    for_each = local.cluster_parameters
    content {
      name  = parameter.value.name
      value = parameter.value.value
    }
  }

  tags = local.tags
}

# --- The cluster (storage + endpoints) --------------------------------------
resource "aws_neptune_cluster" "this" {
  cluster_identifier = local.cluster_identifier
  engine             = "neptune"
  engine_version     = var.engine_version

  neptune_subnet_group_name            = aws_neptune_subnet_group.this.name
  neptune_cluster_parameter_group_name = aws_neptune_cluster_parameter_group.this.name
  vpc_security_group_ids               = var.vpc_security_group_ids

  port = var.port

  # Security: encryption at rest is enforced and immutable after creation.
  storage_encrypted = true
  kms_key_arn       = var.kms_key_arn

  # Neptune has no master user/password; with IAM auth on, requests must be SigV4-signed.
  iam_database_authentication_enabled = var.iam_database_authentication_enabled
  iam_roles                           = var.iam_roles

  # Stream the audit log to CloudWatch Logs (the slowquery log is not exported here).
  enable_cloudwatch_logs_exports = var.enable_audit_log ? ["audit"] : []

  backup_retention_period      = var.backup_retention_period
  preferred_backup_window      = var.preferred_backup_window
  preferred_maintenance_window = var.preferred_maintenance_window

  deletion_protection       = var.deletion_protection
  apply_immediately         = var.apply_immediately
  skip_final_snapshot       = var.skip_final_snapshot
  final_snapshot_identifier = var.skip_final_snapshot ? null : "${local.cluster_identifier}-final"

  # Optional serverless v2 scaling (omit the block to stay fully provisioned).
  dynamic "serverless_v2_scaling_configuration" {
    for_each = var.serverless_min_capacity == null ? [] : [1]
    content {
      min_capacity = var.serverless_min_capacity
      max_capacity = var.serverless_max_capacity
    }
  }

  tags = local.tags

  lifecycle {
    # final_snapshot_identifier toggles with skip_final_snapshot; don't fight it.
    ignore_changes = [final_snapshot_identifier]
  }
}

# --- Compute: 1 writer + N readers ------------------------------------------
resource "aws_neptune_cluster_instance" "this" {
  count = var.instance_count

  identifier         = "${local.cluster_identifier}-${count.index}"
  cluster_identifier = aws_neptune_cluster.this.id
  engine             = "neptune"
  engine_version     = var.engine_version

  # db.serverless when serverless v2 is enabled, otherwise the provisioned class.
  instance_class = var.serverless_min_capacity == null ? var.instance_class : "db.serverless"

  neptune_parameter_group_name = aws_neptune_parameter_group.this.name

  apply_immediately            = var.apply_immediately
  auto_minor_version_upgrade   = var.auto_minor_version_upgrade
  preferred_maintenance_window = var.preferred_maintenance_window

  # Promote readers in a deterministic order on failover.
  promotion_tier = count.index

  tags = local.tags
}

# --- DB (instance-level) parameter group ------------------------------------
resource "aws_neptune_parameter_group" "this" {
  name        = "${var.name}-neptune-pg"
  family      = var.parameter_group_family
  description = "DB parameter group for the ${var.name} Neptune instances"

  dynamic "parameter" {
    for_each = var.extra_db_parameters
    content {
      name  = parameter.value.name
      value = parameter.value.value
    }
  }

  tags = local.tags
}

variables.tf

variable "name" {
  description = "Base name for the cluster and all dependent resources (lowercase, used as a prefix)."
  type        = string

  validation {
    condition     = can(regex("^[a-z][a-z0-9-]{1,40}$", var.name))
    error_message = "name must be lowercase alphanumeric/hyphen, start with a letter, and be 2-41 chars."
  }
}

variable "subnet_ids" {
  description = "Private subnet IDs (>= 2 AZs) for the Neptune subnet group."
  type        = list(string)

  validation {
    condition     = length(var.subnet_ids) >= 2
    error_message = "Provide at least two subnet IDs across different AZs for high availability."
  }
}

variable "vpc_security_group_ids" {
  description = "Security group IDs to attach to the cluster (must allow inbound on the Neptune port)."
  type        = list(string)
}

variable "kms_key_arn" {
  description = "ARN of the customer-managed KMS key used to encrypt storage at rest."
  type        = string

  validation {
    condition     = can(regex("^arn:aws[a-z-]*:kms:", var.kms_key_arn))
    error_message = "kms_key_arn must be a valid KMS key ARN."
  }
}

variable "engine_version" {
  description = "Neptune engine version (e.g. 1.3.2.0)."
  type        = string
  default     = "1.3.2.0"
}

variable "parameter_group_family" {
  description = "Parameter group family, matching the major engine line (e.g. neptune1.3)."
  type        = string
  default     = "neptune1.3"
}

variable "port" {
  description = "TCP port the cluster listens on."
  type        = number
  default     = 8182
}

variable "instance_count" {
  description = "Number of instances (1 writer + the rest as readers). Still applies with serverless v2; only instance_class is overridden."
  type        = number
  default     = 2

  validation {
    condition     = var.instance_count >= 1 && var.instance_count <= 16
    error_message = "instance_count must be between 1 and 16."
  }
}

variable "instance_class" {
  description = "Provisioned instance class. Ignored when serverless_min_capacity is set."
  type        = string
  default     = "db.r6g.large"
}

variable "serverless_min_capacity" {
  description = "Minimum Neptune Capacity Units (NCU) for serverless v2. Set to null to stay fully provisioned."
  type        = number
  default     = null

  validation {
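    # Conditional form avoids comparing against null on Terraform versions whose || does not short-circuit.
    condition     = var.serverless_min_capacity == null ? true : (var.serverless_min_capacity >= 1 && var.serverless_min_capacity <= 128)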
    error_message = "serverless_min_capacity must be between 1 and 128 NCU (or null)."
  }
}

variable "serverless_max_capacity" {
  description = "Maximum Neptune Capacity Units (NCU) for serverless v2."
  type        = number
  default     = 8

  validation {
    condition     = var.serverless_max_capacity >= 2.5 && var.serverless_max_capacity <= 128
    error_message = "serverless_max_capacity must be between 2.5 and 128 NCU."
  }
}

variable "iam_database_authentication_enabled" {
  description = "Require IAM database authentication (SigV4-signed requests) for connections to the cluster."
  type        = bool
  default     = true
}

variable "iam_roles" {
  description = "IAM role ARNs to associate with the cluster (e.g. for S3 bulk load / Neptune ML)."
  type        = list(string)
  default     = []
}

variable "enable_audit_log" {
  description = "Enable Neptune audit logging and export the 'audit' log stream to CloudWatch Logs."
  type        = bool
  default     = true
}

variable "backup_retention_period" {
  description = "Number of days to retain automated backups (1-35)."
  type        = number
  default     = 7

  validation {
    condition     = var.backup_retention_period >= 1 && var.backup_retention_period <= 35
    error_message = "backup_retention_period must be between 1 and 35 days."
  }
}

variable "preferred_backup_window" {
  description = "Daily backup window in UTC (hh24:mi-hh24:mi)."
  type        = string
  default     = "03:00-04:00"
}

variable "preferred_maintenance_window" {
  description = "Weekly maintenance window in UTC (ddd:hh24:mi-ddd:hh24:mi)."
  type        = string
  default     = "sun:05:00-sun:06:00"
}

variable "deletion_protection" {
  description = "Prevent accidental cluster deletion."
  type        = bool
  default     = true
}

variable "skip_final_snapshot" {
  description = "Skip the final snapshot on destroy (true only for ephemeral/dev clusters)."
  type        = bool
  default     = false
}

variable "apply_immediately" {
  description = "Apply modifications immediately instead of during the maintenance window."
  type        = bool
  default     = false
}

variable "auto_minor_version_upgrade" {
  description = "Allow automatic minor engine version upgrades on instances."
  type        = bool
  default     = true
}

variable "extra_cluster_parameters" {
  description = "Additional cluster parameter group parameters: list of { name, value }."
  type = list(object({
    name  = string
    value = string
  }))
  default = []
}

variable "extra_db_parameters" {
  description = "Additional DB (instance) parameter group parameters: list of { name, value }."
  type = list(object({
    name  = string
    value = string
  }))
  default = []
}

variable "tags" {
  description = "Additional tags applied to every resource."
  type        = map(string)
  default     = {}
}
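The two serverless bounds above are validated independently, so nothing stops a minimum larger than the maximum from reaching the AWS API. One way to catch that at plan time is a precondition on the cluster. The fragment below is a sketch, assuming it is merged into the existing lifecycle block of aws_neptune_cluster.this in main.tf; the error message wording is illustrative.

```hcl
resource "aws_neptune_cluster" "this" {
  # ...existing arguments from main.tf stay unchanged...

  lifecycle {
    ignore_changes = [final_snapshot_identifier]

    # Sketch: reject min > max during plan. The conditional form avoids a
    # null-comparison error when serverless is off (min capacity is null).
    precondition {
      condition     = var.serverless_min_capacity == null ? true : var.serverless_min_capacity <= var.serverless_max_capacity
      error_message = "serverless_min_capacity must not exceed serverless_max_capacity."
    }
  }
}
```

A precondition is used here rather than a variable validation block because the check spans two variables, which validation blocks cannot reference on the oldest Terraform version this module allows.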

outputs.tf

output "cluster_id" {
  description = "The Neptune cluster identifier."
  value       = aws_neptune_cluster.this.id
}

output "cluster_arn" {
  description = "ARN of the Neptune cluster (for management-plane IAM policies; data-plane neptune-db:* policies use cluster_resource_id instead)."
  value       = aws_neptune_cluster.this.arn
}

output "cluster_resource_id" {
  description = "Stable cluster resource ID — use this in IAM auth policy ARNs (arn:aws:neptune-db:...:<resource_id>/*)."
  value       = aws_neptune_cluster.this.cluster_resource_id
}

output "endpoint" {
  description = "Cluster (writer) endpoint hostname."
  value       = aws_neptune_cluster.this.endpoint
}

output "reader_endpoint" {
  description = "Load-balanced reader endpoint hostname for read-only traversals."
  value       = aws_neptune_cluster.this.reader_endpoint
}

output "port" {
  description = "Port the cluster listens on."
  value       = aws_neptune_cluster.this.port
}

output "instance_endpoints" {
  description = "Per-instance endpoints for the writer and each reader."
  value       = aws_neptune_cluster_instance.this[*].endpoint
}

output "subnet_group_name" {
  description = "Name of the created Neptune subnet group."
  value       = aws_neptune_subnet_group.this.name
}

How to use it

module "neptune" {
  source = "git::https://dev.azure.com/teknohut/kloudvin/_git/terraform-modules//terraform-module-aws-neptune?ref=v1.0.0"

  name                   = "fraud-graph"
  subnet_ids             = module.vpc.private_subnet_ids
  vpc_security_group_ids = [aws_security_group.neptune.id]
  kms_key_arn            = aws_kms_key.neptune.arn

  engine_version         = "1.3.2.0"
  parameter_group_family = "neptune1.3"

  # One writer + two readers for HA read scaling.
  instance_count = 3
  instance_class = "db.r6g.xlarge"

  iam_database_authentication_enabled = true
  iam_roles                           = [aws_iam_role.neptune_bulk_load.arn]

  enable_audit_log        = true
  backup_retention_period = 14
  deletion_protection     = true

  tags = {
    Environment = "prod"
    Team        = "risk-platform"
  }
}

# Security group rule: allow the application tier to reach the writer endpoint.
resource "aws_security_group_rule" "app_to_neptune" {
  type                     = "ingress"
  from_port                = module.neptune.port
  to_port                  = module.neptune.port
  protocol                 = "tcp"
  security_group_id        = aws_security_group.neptune.id
  source_security_group_id = aws_security_group.app.id
}

# Region and account ID used to build the neptune-db ARN below.
data "aws_region" "current" {}
data "aws_caller_identity" "current" {}

# Downstream: grant the app role IAM-auth access scoped to this exact cluster.
data "aws_iam_policy_document" "neptune_connect" {
  statement {
    effect    = "Allow"
    actions   = ["neptune-db:*"]
    resources = ["arn:aws:neptune-db:${data.aws_region.current.name}:${data.aws_caller_identity.current.account_id}:${module.neptune.cluster_resource_id}/*"]
  }
}

# Hand the endpoint to the workload (e.g. an ECS task) via environment.
resource "aws_ssm_parameter" "neptune_endpoint" {
  name  = "/fraud-graph/neptune/writer-endpoint"
  type  = "String"
  value = "https://${module.neptune.endpoint}:${module.neptune.port}"
}
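The example above leaves extra_cluster_parameters at its default. As a sketch of how that input could be used, the same module call is shown below with one extra cluster parameter. neptune_query_timeout is a Neptune cluster parameter expressed in milliseconds; the value here is illustrative, not a recommendation.

```hcl
module "neptune" {
  source = "git::https://dev.azure.com/teknohut/kloudvin/_git/terraform-modules//terraform-module-aws-neptune?ref=v1.0.0"

  name                   = "fraud-graph"
  subnet_ids             = module.vpc.private_subnet_ids
  vpc_security_group_ids = [aws_security_group.neptune.id]
  kms_key_arn            = aws_kms_key.neptune.arn

  # Appended after the module's own neptune_enable_audit_log entry.
  extra_cluster_parameters = [
    {
      name  = "neptune_query_timeout"
      value = "60000" # milliseconds; illustrative value
    }
  ]
}
```

Values are passed as strings because the variable type is list(object({ name = string, value = string })).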

With Terragrunt

Terragrunt keeps this module DRY across environments — define the backend and provider once in a root config, then a thin terragrunt.hcl per environment supplies only the inputs that differ.

1. Root config — live/terragrunt.hcl (inherited by every module):

remote_state {
  backend = "s3"
  generate = { path = "backend.tf", if_exists = "overwrite" }
  config = {
    # ...s3 state bucket/container + key per path...
  }
}

2. Module config — live/prod/neptune/terragrunt.hcl:

include "root" {
  path = find_in_parent_folders()
}

terraform {
  source = "git::https://dev.azure.com/teknohut/kloudvin/_git/terraform-modules//terraform-module-aws-neptune?ref=v1.0.0"
}

inputs = {
  name = "..."
  subnet_ids = ["...", "..."]
  vpc_security_group_ids = ["...", "..."]
  kms_key_arn = "..."
}

3. Deploy one environment, or roll out all modules together:

(cd live/prod/neptune && terragrunt apply)   # this module only
(cd live/prod && terragrunt run-all apply)   # every module under live/prod

Why Terragrunt here: the backend and provider live in one place instead of being copy-pasted into every module; inputs is overridden per environment (dev / stage / prod) without forking the module; and run-all orchestrates dependencies across modules. Reach for it once you have more than one environment or more than a handful of modules — for a single stack, the plain Quickstart above is enough.
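To make the per-environment override concrete, here is a sketch of what a dev counterpart to the prod config could look like. The path live/dev/neptune/terragrunt.hcl and the chosen values are assumptions for illustration; the placeholders are left for you to fill in as in the Quickstart.

```hcl
# live/dev/neptune/terragrunt.hcl (hypothetical path)
include "root" {
  path = find_in_parent_folders()
}

terraform {
  source = "git::https://dev.azure.com/teknohut/kloudvin/_git/terraform-modules//terraform-module-aws-neptune?ref=v1.0.0"
}

inputs = {
  name                   = "..."
  subnet_ids             = ["...", "..."]
  vpc_security_group_ids = ["...", "..."]
  kms_key_arn            = "..."

  # Dev-only overrides; everything else keeps the module defaults.
  instance_count          = 1
  backup_retention_period = 1
  deletion_protection     = false
  skip_final_snapshot     = true
}
```

Only the four override lines differ from prod, which is the point: the module source and its defaults stay identical across environments.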

Inputs

Name	Type	Default	Required	Description
`name`	`string`	—	Yes	Base name/prefix for the cluster and dependent resources (lowercase).
`subnet_ids`	`list(string)`	—	Yes	Private subnet IDs across >= 2 AZs for the subnet group.
`vpc_security_group_ids`	`list(string)`	—	Yes	Security groups attached to the cluster.
`kms_key_arn`	`string`	—	Yes	Customer-managed KMS key ARN for encryption at rest.
`engine_version`	`string`	`"1.3.2.0"`	No	Neptune engine version.
`parameter_group_family`	`string`	`"neptune1.3"`	No	Parameter group family matching the engine major line.
`port`	`number`	`8182`	No	TCP port the cluster listens on.
`instance_count`	`number`	`2`	No	Number of instances (1 writer + readers); 1-16.
`instance_class`	`string`	`"db.r6g.large"`	No	Provisioned instance class (ignored when serverless).
`serverless_min_capacity`	`number`	`null`	No	Min NCU for serverless v2; `null` keeps it provisioned.
`serverless_max_capacity`	`number`	`8`	No	Max NCU for serverless v2.
`iam_database_authentication_enabled`	`bool`	`true`	No	Require IAM database auth (SigV4-signed requests).
`iam_roles`	`list(string)`	`[]`	No	IAM role ARNs for S3 bulk load / Neptune ML.
`enable_audit_log`	`bool`	`true`	No	Enable audit logging and export `audit` logs to CloudWatch.
`backup_retention_period`	`number`	`7`	No	Automated backup retention in days (1-35).
`preferred_backup_window`	`string`	`"03:00-04:00"`	No	Daily backup window (UTC).
`preferred_maintenance_window`	`string`	`"sun:05:00-sun:06:00"`	No	Weekly maintenance window (UTC).
`deletion_protection`	`bool`	`true`	No	Prevent accidental cluster deletion.
`skip_final_snapshot`	`bool`	`false`	No	Skip final snapshot on destroy (dev only).
`apply_immediately`	`bool`	`false`	No	Apply changes immediately vs. in the maintenance window.
`auto_minor_version_upgrade`	`bool`	`true`	No	Allow automatic minor version upgrades on instances.
`extra_cluster_parameters`	`list(object({name,value}))`	`[]`	No	Extra cluster parameter group parameters.
`extra_db_parameters`	`list(object({name,value}))`	`[]`	No	Extra DB (instance) parameter group parameters.
`tags`	`map(string)`	`{}`	No	Additional tags applied to every resource.

Outputs

Name	Description
`cluster_id`	The Neptune cluster identifier.
`cluster_arn`	ARN of the cluster (management-plane IAM policies; `neptune-db:*` policies use `cluster_resource_id`).
`cluster_resource_id`	Stable resource ID used in IAM-auth policy ARNs.
`endpoint`	Cluster (writer) endpoint hostname.
`reader_endpoint`	Load-balanced reader endpoint hostname.
`port`	Port the cluster listens on.
`instance_endpoints`	Per-instance endpoints (writer + readers).
`subnet_group_name`	Name of the created Neptune subnet group.

Enterprise scenario

A payments company runs real-time fraud detection on a Neptune property graph linking accounts, devices, IPs, and merchants. They instantiate this module once per region with instance_count = 3 (a writer plus two readers behind the reader endpoint), iam_database_authentication_enabled = true, and an iam_roles entry pointing at the S3 bulk-load role that hydrates the graph nightly from their lakehouse. Their scoring service connects over the writer endpoint using IAM auth, so there are no database passwords to store or rotate, and analysts hit the reader endpoint for ad-hoc Gremlin ring-detection queries, so heavy investigative traversals never contend with the transactional write path.
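The analyst access in this scenario could be expressed as a read-only IAM policy. The sketch below assumes the module is instantiated as module.neptune; the policy and resource names are hypothetical, and neptune-db:ReadDataViaQuery is used as the read-only data-access action.

```hcl
# Skip these two data sources if your configuration already declares them.
data "aws_region" "current" {}
data "aws_caller_identity" "current" {}

# Hypothetical analyst policy: read-only queries against this cluster only.
data "aws_iam_policy_document" "neptune_analyst_read" {
  statement {
    effect  = "Allow"
    actions = ["neptune-db:ReadDataViaQuery"]
    resources = [
      "arn:aws:neptune-db:${data.aws_region.current.name}:${data.aws_caller_identity.current.account_id}:${module.neptune.cluster_resource_id}/*"
    ]
  }
}

resource "aws_iam_policy" "neptune_analyst_read" {
  name   = "fraud-graph-neptune-analyst-read" # illustrative name
  policy = data.aws_iam_policy_document.neptune_analyst_read.json
}
```

Pairing this policy with the reader endpoint keeps analysts off the write path both at the network level and at the authorization level.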

Best practices

Never expose Neptune publicly. Keep it in private subnets, lock the security group to the application tier on the Neptune port only (8182 by default), and reach it from outside the VPC via VPN/PrivateLink. The module places the cluster in the subnets and security groups you pass in, so locking those down remains your responsibility.
Use IAM database authentication. Neptune has no master password to fall back on: without IAM auth, anything that can reach the port can query the graph. With iam_database_authentication_enabled = true, clients issue SigV4-signed requests and you scope access with neptune-db:* policies bound to the cluster_resource_id output, which is far easier to rotate and audit than a shared secret.
Encrypt with a customer-managed KMS key and turn it on from day one. storage_encrypted cannot be enabled in place; if you skip kms_key_arn at creation, your only path to encryption later is snapshot, copy-with-encryption, and restore. The module enforces encryption so this never bites you.
Right-size with serverless v2 for spiky graphs, provisioned for steady ones. Set serverless_min_capacity/serverless_max_capacity for workloads that idle then burst (dev, batch ML); stick with provisioned db.r6g classes for predictable high-throughput traversals to keep costs predictable and avoid scale-up latency.
Scale reads with the reader endpoint and promotion_tier. Add readers by bumping instance_count, send read-only traversals to reader_endpoint, and let the deterministic promotion tiers control which reader is promoted on failover.
Keep deletion protection and final snapshots on for anything that matters, and name consistently. Leave deletion_protection = true and skip_final_snapshot = false in prod, and rely on the ${name}-neptune naming convention so clusters, subnet groups, and parameter groups stay traceable to their owning module instance.
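As an illustration of the serverless guidance above, here is a sketch of a bursty dev cluster. The module label, name, and capacity values are assumptions; the referenced VPC, security group, and KMS key are the same hypothetical resources used in the usage example.

```hcl
module "neptune_dev" {
  source = "git::https://dev.azure.com/teknohut/kloudvin/_git/terraform-modules//terraform-module-aws-neptune?ref=v1.0.0"

  name                   = "fraud-graph-dev"
  subnet_ids             = module.vpc.private_subnet_ids
  vpc_security_group_ids = [aws_security_group.neptune.id]
  kms_key_arn            = aws_kms_key.neptune.arn

  # Setting a minimum switches the instances to db.serverless;
  # instance_class is then ignored by the module.
  serverless_min_capacity = 1
  serverless_max_capacity = 16 # illustrative ceiling, in NCU

  # Ephemeral dev settings: single instance, no destroy guards.
  instance_count      = 1
  deletion_protection = false
  skip_final_snapshot = true
}
```

Leaving serverless_min_capacity at null instead keeps the cluster fully provisioned on instance_class.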