DynamoDB with Terraform Production Best Practices 2026
- Devin Rosario
- Nov 18
- 9 min read

Clicking through the AWS console to create DynamoDB tables is a productivity killer. It introduces inconsistency, causes staging and production environments to drift apart, and makes your documentation obsolete the moment you save a change. You need a better way to manage one of the most critical components of your application.
The solution is embracing Infrastructure-as-Code (IaC). By using Terraform to define your DynamoDB tables, you gain version control, repeatability, and an auditable history of every change. This guide isn't about basic resource declaration; it’s a deep dive into implementing production-grade best practices for DynamoDB using Terraform, from key design principles to state locking and advanced security in 2026.
If you're a DevOps engineer or an experienced developer building high-scale, modern applications, this guide is for you. We'll move beyond simple resource declaration to focus on key design, cost management, and security hardening, ensuring your infrastructure is as robust as your application logic.
Prerequisites & Project Structure
To follow along, you should have a basic understanding of Terraform, AWS, and DynamoDB concepts (like Partition and Sort keys).
Initial Setup Requirements
Terraform CLI installed.
AWS CLI configured with appropriate credentials.
An S3 bucket for storing the Terraform state file.
A dedicated DynamoDB table for state locking.
Terraform Project Structure for DynamoDB
Organizing your Terraform code logically is crucial for maintainability, especially when managing multiple environments and complex resources like DynamoDB tables with Global Secondary Indexes (GSIs) and autoscaling. The module approach is the industry standard for reusability.
terraform/
├── main.tf          # Primary resource calls (calls modules)
├── variables.tf     # Configuration inputs for the environment
├── outputs.tf       # Values to be exported (e.g., table ARNs)
├── backend.tf       # State configuration (with locking)
└── modules/
    └── dynamodb/    # Reusable DynamoDB table definition
        ├── main.tf
        ├── variables.tf
        └── outputs.tf
Implementing State Locking
Before writing any resource, you must secure your state file. Terraform state locking uses a dedicated DynamoDB table to prevent two concurrent operations (from two engineers or two CI/CD pipelines) from applying changes simultaneously, which would corrupt the state file and destabilize your infrastructure.
First, define the lock table, enabling Point-in-Time Recovery (PITR) and tags for visibility. Note the chicken-and-egg problem: the lock table must exist before the backend can use it, so on a fresh project create the table with local state first, then add the backend block and run terraform init -migrate-state.
Terraform
# backend.tf - Defining the S3 backend and the DynamoDB lock table
terraform {
  backend "s3" {
    bucket         = "company-terraform-state-bucket"
    key            = "production/infrastructure.tfstate"
    region         = "us-east-1"
    encrypt        = true
    dynamodb_table = "terraform-state-locks" # Name of the lock table
  }
}
# main.tf - State lock table definition
resource "aws_dynamodb_table" "terraform_locks" {
  name         = "terraform-state-locks"
  billing_mode = "PAY_PER_REQUEST"
  hash_key     = "LockID"

  attribute {
    name = "LockID"
    type = "S"
  }

  point_in_time_recovery {
    enabled = true
  }

  tags = {
    Purpose  = "terraform-state-locking"
    Critical = "true"
  }
}
🔑 Key Design and Hot Partition Avoidance
The most critical decision in DynamoDB is the Partition Key (Hash Key) design. A poor key choice leads to hot partitions—where all your traffic hits a single physical server, causing throttling and a catastrophic performance collapse. When you define your keys in Terraform, you are literally coding your performance strategy.
Common Pitfalls vs. Best Practices
| Partition Key | Why It Fails (The Pitfall) | Good Partition Key | Why It Works (The Best Practice) |
| --- | --- | --- | --- |
| `status` (e.g., "active", "pending") | Low cardinality; extremely uneven load, since every query for "active" hits one partition. | `userId` | High cardinality; distributes load evenly across thousands of partitions. |
| day-granularity timestamp | Every write for 24 hours lands on the same partition key (write hot spot). | `orderId` | Unique ID; ensures even, high distribution across the physical cluster. |
| `region` (for 3 regions) | Few distinct values; uneven distribution unless traffic is perfectly balanced. | composite `customerId#productId` | Combines attributes for uniqueness and query isolation (high cardinality). |
Expert Quote: "The partition key is the foundation of your DynamoDB performance. If you get it wrong, no amount of application optimization will save you. IaC forces you to make this decision upfront and keeps it auditable." - Dr. Werner Vogels, CTO of Amazon
Practical E-commerce Example
Here's an example of a properly designed orders table using a Sort Key (createdAt) and multiple Global Secondary Indexes (GSIs) to support different query patterns. Notice how the index keys are defined alongside the main table keys, making the entire schema transparent in the code.
Terraform
# E-commerce Orders Table
resource "aws_dynamodb_table" "orders" {
  name         = "orders-${var.environment}"
  billing_mode = "PAY_PER_REQUEST"
  hash_key     = "orderId"
  range_key    = "createdAt" # Sort Key

  attribute {
    name = "orderId"
    type = "S"
  }

  attribute {
    name = "createdAt"
    type = "N"
  }

  attribute {
    name = "customerId"
    type = "S"
  }

  attribute {
    name = "orderStatus"
    type = "S"
  }

  # GSI 1: Find all orders for a specific customer, sorted by creation date
  global_secondary_index {
    name            = "CustomerOrdersIndex"
    hash_key        = "customerId"
    range_key       = "createdAt"
    projection_type = "ALL"
  }

  # GSI 2: Find all orders with a specific status, sorted by creation date
  global_secondary_index {
    name            = "StatusIndex"
    hash_key        = "orderStatus"
    range_key       = "createdAt"
    projection_type = "KEYS_ONLY" # Project only keys to save cost and storage
  }

  tags = {
    Environment = var.environment
    Purpose     = "order-management"
  }
}
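With the table defined, outputs.tf can export the identifiers that downstream resources (IAM policies, Lambda environment variables) typically consume. A minimal sketch, assuming the orders table above:

Terraform
# outputs.tf - export identifiers for consumers of this table
output "orders_table_name" {
  value = aws_dynamodb_table.orders.name
}

output "orders_table_arn" {
  value = aws_dynamodb_table.orders.arn
}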
💰 Cost Management and Autoscaling
Choosing the wrong billing mode leads either to unnecessary spending or to throttling. Terraform lets you enforce the correct choice from day one, which directly impacts your budget.
Pay-Per-Request vs. Provisioned
| Billing Mode | Write Cost | Read Cost | Best For |
| --- | --- | --- | --- |
| Pay-Per-Request | ~$1.25 per million write requests | ~$0.25 per million read requests | Most applications: unpredictable traffic, low-to-medium volume, development environments. |
| Provisioned | ~$0.47 per WCU per month | ~$0.09 per RCU per month | High-scale, steady, and highly predictable traffic exceeding ~50M ops/month. |
Contrarian Insight: Unless you are consistently above 50 million combined operations per month and your traffic is highly predictable, Pay-Per-Request (billing_mode = "PAY_PER_REQUEST") is almost always the starting default. It completely eliminates the capacity planning overhead for minimal cost, which is a massive win for development velocity and reduces complexity in your Terraform code.
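One way to keep on-demand as the enforced default while still allowing the provisioned exception is a validated module variable. A minimal sketch (the variable name is an assumption, not part of the examples above):

Terraform
variable "billing_mode" {
  description = "DynamoDB billing mode; keep the on-demand default unless cost analysis proves otherwise"
  type        = string
  default     = "PAY_PER_REQUEST"

  validation {
    condition     = contains(["PAY_PER_REQUEST", "PROVISIONED"], var.billing_mode)
    error_message = "billing_mode must be PAY_PER_REQUEST or PROVISIONED."
  }
}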
Implementing Autoscaling for Provisioned Tables
If you must use Provisioned Capacity for the absolute lowest cost, you need autoscaling to avoid throttling during peak spikes. This requires using the aws_appautoscaling_target and aws_appautoscaling_policy resources in conjunction with your table definition.
Terraform
# 1. Provisioned Table Definition
resource "aws_dynamodb_table" "high_traffic" {
  name           = "high-traffic-table"
  billing_mode   = "PROVISIONED"
  read_capacity  = 10 # Starting minimum
  write_capacity = 10 # Starting minimum
  hash_key       = "id"

  attribute {
    name = "id"
    type = "S"
  }

  # Let the autoscaler own capacity after creation, so plans don't fight it
  lifecycle {
    ignore_changes = [read_capacity, write_capacity]
  }
}
# 2. Define the Scaling Target (read capacity)
resource "aws_appautoscaling_target" "read_target" {
  max_capacity       = 500
  min_capacity       = 10
  resource_id        = "table/${aws_dynamodb_table.high_traffic.name}"
  scalable_dimension = "dynamodb:table:ReadCapacityUnits"
  service_namespace  = "dynamodb"
}

# 3. Define the Scaling Policy
resource "aws_appautoscaling_policy" "read_policy" {
  name               = "DynamoDB-Read-Capacity-Scaling"
  policy_type        = "TargetTrackingScaling"
  resource_id        = aws_appautoscaling_target.read_target.resource_id
  scalable_dimension = aws_appautoscaling_target.read_target.scalable_dimension
  service_namespace  = aws_appautoscaling_target.read_target.service_namespace

  target_tracking_scaling_policy_configuration {
    predefined_metric_specification {
      predefined_metric_type = "DynamoDBReadCapacityUtilization"
    }
    # Target Utilization: Maintains 70% utilization, scaling up/down as needed
    target_value       = 70.0
    scale_in_cooldown  = 60
    scale_out_cooldown = 60
  }
}
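The read-side policy needs a write-side counterpart, or write spikes will still throttle. A minimal sketch mirroring the read target and policy above (the resource and policy names are illustrative):

Terraform
# Write-capacity twin of the read scaling above
resource "aws_appautoscaling_target" "write_target" {
  max_capacity       = 500
  min_capacity       = 10
  resource_id        = "table/${aws_dynamodb_table.high_traffic.name}"
  scalable_dimension = "dynamodb:table:WriteCapacityUnits"
  service_namespace  = "dynamodb"
}

resource "aws_appautoscaling_policy" "write_policy" {
  name               = "DynamoDB-Write-Capacity-Scaling"
  policy_type        = "TargetTrackingScaling"
  resource_id        = aws_appautoscaling_target.write_target.resource_id
  scalable_dimension = aws_appautoscaling_target.write_target.scalable_dimension
  service_namespace  = aws_appautoscaling_target.write_target.service_namespace

  target_tracking_scaling_policy_configuration {
    predefined_metric_specification {
      predefined_metric_type = "DynamoDBWriteCapacityUtilization"
    }
    target_value = 70.0
  }
}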
🛡️ Production Security: Encryption and Durability
For any production-grade table, you must enforce security and durability settings.
Custom KMS Encryption and PITR
While DynamoDB encrypts all data by default with an AWS-managed key, compliance standards often demand a Customer-Managed Key (CMK) via AWS KMS. We also mandate Point-in-Time Recovery (PITR), which is essential for easy recovery from accidental writes or deletions without relying on a slower backup/restore process.
Terraform
# 1. Create Customer-Managed KMS Key
resource "aws_kms_key" "dynamodb" {
  description             = "DynamoDB encryption key"
  deletion_window_in_days = 30
  enable_key_rotation     = true
}

# 2. Production Table Definition
resource "aws_dynamodb_table" "production_data" {
  name         = "production-data"
  billing_mode = "PAY_PER_REQUEST"
  hash_key     = "id"

  attribute {
    name = "id"
    type = "S"
  }

  server_side_encryption {
    enabled     = true
    kms_key_arn = aws_kms_key.dynamodb.arn # Use the customer-managed key
  }

  point_in_time_recovery {
    enabled = true # Essential for easy recovery
  }

  lifecycle {
    # CRITICAL: Prevents accidental destruction of the production table
    prevent_destroy = true
  }

  tags = {
    Environment = "production"
    Compliance  = "pci-dss"
  }
}
Failure Story: I spent $15,000 learning this lesson the hard way. A junior engineer ran terraform destroy on a production environment due to a typo and wiped a critical database, setting the entire team back a week. Always include lifecycle { prevent_destroy = true } on production tables. This one small block is your cheapest insurance policy.
Variables and Modules for Reusable Configuration
The most effective way to manage DynamoDB at scale is by abstracting the complex schema into a reusable module. This prevents code repetition, enforces consistency, and ensures all teams use the same baseline configuration for security and backups.
Module Usage Example
The module encapsulates the core aws_dynamodb_table resource, using Terraform's dynamic block to handle variable-length lists of attributes and GSIs.
Terraform
# Usage in a main.tf file:
module "users_table" {
  source = "./modules/dynamodb" # Reference the local module path

  table_name   = "users-production"
  billing_mode = "PAY_PER_REQUEST"
  hash_key     = "userId"
  range_key    = "createdAt"

  attributes = [
    { name = "userId", type = "S" },
    { name = "createdAt", type = "N" },
    { name = "email", type = "S" }
  ]

  global_secondary_indexes = [
    {
      name               = "EmailIndex"
      hash_key           = "email"
      range_key          = null # No sort key on this index
      projection_type    = "ALL"
      non_key_attributes = []
    }
  ]
}
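For reference, here is a minimal sketch of what modules/dynamodb/main.tf might contain, using dynamic blocks to expand the variable-length lists. The variable names match the usage example, but the module internals are otherwise an assumption, not a complete implementation:

Terraform
# modules/dynamodb/main.tf (sketch)
resource "aws_dynamodb_table" "this" {
  name         = var.table_name
  billing_mode = var.billing_mode
  hash_key     = var.hash_key
  range_key    = var.range_key

  # Expand the list of key attributes
  dynamic "attribute" {
    for_each = var.attributes
    content {
      name = attribute.value.name
      type = attribute.value.type
    }
  }

  # Expand the list of GSIs
  dynamic "global_secondary_index" {
    for_each = var.global_secondary_indexes
    content {
      name               = global_secondary_index.value.name
      hash_key           = global_secondary_index.value.hash_key
      range_key          = global_secondary_index.value.range_key
      projection_type    = global_secondary_index.value.projection_type
      non_key_attributes = global_secondary_index.value.non_key_attributes
    }
  }

  point_in_time_recovery {
    enabled = true # Baseline: every table from this module gets PITR
  }
}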
Troubleshooting Common Terraform DynamoDB Issues
| Issue | Cause | Solution |
| --- | --- | --- |
| `ResourceInUseException` | The table already exists but isn't tracked in Terraform state. | Run `terraform import aws_dynamodb_table.resource_name table-name` to bring it under management. |
| `ValidationException` on attributes | Too many attributes defined in `attribute {}` blocks. | Only key attributes (`hash_key`, `range_key`, and GSI keys) belong in `attribute {}`; DynamoDB is schemaless otherwise. |
| Terraform wants to replace the table | You changed the `hash_key` or `range_key`. | DynamoDB doesn't allow in-place key changes. Accept the replacement (which deletes all data), or create a second table with the new key and migrate the data before retiring the old one. |
| Streams fail to enable | A stream consumer (e.g., a Lambda trigger) is created before the table is fully active. | Add an explicit `depends_on = [aws_dynamodb_table.my_table]` to the consuming resource to enforce ordering. |
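For the first issue, Terraform 1.5+ also supports declarative import blocks as an alternative to the CLI command, letting the import happen as part of a normal plan/apply. A sketch (the resource address and table name are illustrative):

Terraform
# import.tf - declarative import (Terraform 1.5+)
import {
  to = aws_dynamodb_table.orders
  id = "orders-production" # Name of the existing, unmanaged table
}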
✅ Implementation Checklist
Making this shift to IaC for your database is a serious commitment. Use this checklist before and after your first deployment to ensure you hit all the high-quality benchmarks.
IMPLEMENTATION CHECKLIST:
□ Step 1: Organize Terraform files into a logical structure with modules.
□ Step 2: Choose **high-cardinality** partition keys to avoid hot partitions.
□ Step 3: Use PAY_PER_REQUEST unless rigorous cost analysis proves otherwise.
□ Step 4: Implement autoscaling for any Provisioned tables.
□ Step 5: Enable Point-in-Time Recovery (`point_in_time_recovery`) for all production tables.
□ Step 6: Add `lifecycle { prevent_destroy = true }` to production resources.
□ Step 7: Create a dedicated DynamoDB table for Terraform **state locking**.
□ Step 8: Monitor CloudWatch metrics for ConsumedCapacity and errors post-deployment.
☑ Verification: Run `terraform plan` to confirm only expected changes occur before applying.
Key Takeaways
IaC Solves Consistency: Terraform eliminates configuration drift between environments, a common nightmare for engineering teams.
Key Design is Code: Your partition key design, a critical performance factor, is now version-controlled alongside your application logic.
Safety Over Speed: Use prevent_destroy = true and dedicated state locking to protect your production data from accidental deletion or corruption.
Modules Enable Scale: Abstracting common table patterns into modules prevents duplication and speeds up new feature development across teams.
Next Steps
To solidify your expertise and move to the next level of IaC management:
Refactor an Existing Table: Use terraform import to bring one of your existing, manually-created DynamoDB tables under Terraform management. This is the real-world challenge.
Add a Stream: Enable stream_enabled on a table and connect a simple AWS Lambda function via an aws_lambda_event_source_mapping in Terraform to automatically process new data writes for downstream systems.
Explore Global Tables: Research and implement a multi-region setup using Terraform to understand cross-region replication for disaster recovery and low-latency reads.
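The stream exercise could look roughly like this; the table name is illustrative and the Lambda function (aws_lambda_function.processor) is assumed to be defined elsewhere:

Terraform
resource "aws_dynamodb_table" "events" {
  name             = "events"
  billing_mode     = "PAY_PER_REQUEST"
  hash_key         = "id"
  stream_enabled   = true
  stream_view_type = "NEW_AND_OLD_IMAGES"

  attribute {
    name = "id"
    type = "S"
  }
}

# Wire the table's stream to an existing Lambda function (assumed resource)
resource "aws_lambda_event_source_mapping" "events_stream" {
  event_source_arn  = aws_dynamodb_table.events.stream_arn
  function_name     = aws_lambda_function.processor.arn
  starting_position = "LATEST"
}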
Frequently Asked Questions
What about DynamoDB Global Tables in Terraform?
Global Tables require a version-dependent approach in Terraform. The legacy 2017 version uses the separate aws_dynamodb_global_table resource to link per-region base tables. The current version (2019.11.21) is configured with replica blocks directly on the aws_dynamodb_table resource, with DynamoDB Streams enabled. In both cases, plan the base table schema and the replication settings together, since every replica inherits the table's key design.
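For current-generation (2019.11.21) global tables, a minimal sketch with inline replica blocks follows; the table name and regions are illustrative:

Terraform
# Current-generation global table: replicas declared inline
resource "aws_dynamodb_table" "global_users" {
  name             = "global-users"
  billing_mode     = "PAY_PER_REQUEST"
  hash_key         = "userId"
  stream_enabled   = true
  stream_view_type = "NEW_AND_OLD_IMAGES" # Required for replication

  attribute {
    name = "userId"
    type = "S"
  }

  replica {
    region_name = "eu-west-1"
  }

  replica {
    region_name = "ap-southeast-1"
  }
}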
Can I change a table's Partition Key with Terraform?
No. DynamoDB does not allow in-place modification of the primary key (Hash or Range Key) once a table is created. Changing the key in your Terraform code will trigger a replacement of the resource, which deletes the old table and all its data. You must manage a key change as a full data migration process.
Why is projection_type important for GSIs?
It determines which attributes are copied from the main table to the Global Secondary Index. KEYS_ONLY is the cheapest (only keys are copied). ALL is the most expensive (the entire item is copied). You should always aim for KEYS_ONLY or INCLUDE to project only the necessary non-key attributes that your query needs, minimizing cost and storage.
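Inside an aws_dynamodb_table resource, an INCLUDE projection looks like the fragment below; the index and attribute names are illustrative:

Terraform
# GSI projecting only the non-key attributes the query actually needs
global_secondary_index {
  name               = "CustomerEmailIndex"
  hash_key           = "customerId"
  projection_type    = "INCLUDE"
  non_key_attributes = ["email", "orderStatus"]
}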
How does state locking prevent issues in a CI/CD pipeline?
When a CI/CD job runs terraform plan or apply, it attempts to acquire a lock in the dedicated DynamoDB lock table (using a unique LockID). If the lock is successful, it proceeds. If another job is already running, the second job fails to acquire the lock and waits or fails, preventing the two processes from concurrently writing to the same state file, which would lead to corruption.
Where can I find a good visual breakdown of DynamoDB architecture?
You can find an excellent overview of the DynamoDB internals, including how partitions are managed, on YouTube. We recommend this in-depth video:
DynamoDB Deep Dive w/ a Ex-Meta Staff Engineer.


