Tags: MongoDB, Terraform, AWS, Production, DevOps

MongoDB Production Cluster with Terraform on AWS

How I built a fully automated, self-bootstrapping 3-node MongoDB replica set across availability zones using Terraform, user_data scripts, and Vault-sourced secrets

December 15, 2024 · 15 min read

Infrastructure as Code: Terraform-Managed MongoDB on AWS

Running MongoDB in production means treating your database infrastructure like any other code: version-controlled, reproducible, and automated. In our setup, every MongoDB node is provisioned through Terraform with zero manual intervention.

The architecture uses dedicated EC2 instances (ARM-based t4g.medium for cost efficiency) spread across three availability zones for fault tolerance. Each instance gets a dedicated encrypted EBS volume (gp3 at 3000 IOPS) for MongoDB data, separate from the root volume. This separation is critical: it lets you resize storage, snapshot data, or replace an instance without losing data.

Key infrastructure decisions:
- Dedicated network interfaces, created before the instances, so each node's private IP is known up front and gives stable replica set addressing
- Encrypted gp3 EBS volumes with 3000 IOPS (the gp3 baseline, set explicitly) for consistent write performance
- ARM-based instances (Graviton) for ~20% better price-performance over x86
- Each node in a separate AZ subnet for true high availability
- Security groups restricting port 27017 to VPC CIDR and VPN only, with no public access

# Terraform: Production MongoDB Cluster Infrastructure
# 3-node replica set across 3 availability zones

resource "random_string" "mongoKey" {
  length = 768  # Shared keyfile for replica set authentication
}

resource "aws_network_interface" "mongo" {
  count           = 3
  subnet_id       = aws_subnet.resources[count.index].id
  security_groups = [aws_security_group.mongo.id]
}

# Dedicated encrypted EBS volumes per node
resource "aws_ebs_volume" "mongoPrimary" {
  availability_zone = "us-east-1a"
  iops              = 3000
  size              = 50
  encrypted         = true
  type              = "gp3"
  tags = { Name = "MongoDB-Primary", managedBy = "terraform" }
}

resource "aws_ebs_volume" "mongoSecondary" {
  availability_zone = "us-east-1b"
  iops              = 3000
  size              = 50
  encrypted         = true
  type              = "gp3"
  tags = { Name = "MongoDB-Secondary-0", managedBy = "terraform" }
}

resource "aws_ebs_volume" "mongoSecondSecondary" {
  availability_zone = "us-east-1c"
  iops              = 3000
  size              = 50
  encrypted         = true
  type              = "gp3"
  tags = { Name = "MongoDB-Secondary-1", managedBy = "terraform" }
}
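
One plausible reason for the 768 figure: the instance definitions later in the post pass the random string through base64encode() before handing it to the nodes, and base64 turns every 3 bytes into 4 characters. 768 characters therefore become 1024, which is the longest keyfile MongoDB accepts. The snippet below is only a quick way to check that arithmetic in a shell; the helper name encoded_length is made up for this illustration.

```shell
#!/bin/bash
# Print the length of the base64 encoding of an N-character ASCII string.
encoded_length() {
  # printf pads an empty string to N spaces; any single-byte characters
  # would produce the same encoded length.
  printf '%*s' "$1" '' | base64 | tr -d '\n' | wc -c | tr -d ' '
}

# Compare the result against MongoDB's 1024-character keyfile limit.
encoded_length 768
```

A longer random string would push the encoded keyfile past the limit, so the length is worth leaving alone unless the encoding step changes too.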

EC2 Instance Provisioning: Primary & Secondaries

Each MongoDB node is an EC2 instance bootstrapped entirely through user_data scripts. Terraform's templatefile() function injects secrets and configuration at launch time.

The primary node's user_data receives the most variables: all three private IPs (for replica set initiation), the shared keyfile, and Vault-sourced admin credentials. The secondary nodes are simpler. They only need the keyfile, since the primary handles adding them to the replica set.

Notice the difference in user_data between primary and secondaries: the primary uses base64-encoded user_data (user_data_base64) while the secondaries use plain text. Both approaches work, because Terraform base64-encodes plain user_data before calling the EC2 API. Using user_data_base64 mainly makes that encoding explicit for the one script that embeds Vault secrets.

Important design choices:
- XFS filesystem for EBS volumes: MongoDB recommends XFS over ext4 for WiredTiger's performance characteristics
- fstab entries using UUID for reliable remounting after reboots
- Separate directories for data, logs, and keyfile under /mongodb on the dedicated volume
- AMI changes ignored in lifecycle blocks, so a newly published AMI does not force instance replacement
- Each instance gets a pre-created network interface for stable private IPs

# Primary instance: receives all IPs + Vault credentials
resource "aws_instance" "mongoPrimary" {
  ami           = data.aws_ami.ubuntu.id
  instance_type = "t4g.medium"
  key_name      = var.keyName
  monitoring    = true

  user_data_base64 = base64encode(templatefile(
    "${path.module}/mongo/user_data_primary.sh",
    {
      keyfileContent           = base64encode(random_string.mongoKey.result)
      primaryPrivateIp         = aws_network_interface.mongo[0].private_ip
      secondaryPrivateIp       = aws_network_interface.mongo[1].private_ip
      secondsecondaryPrivateIp = aws_network_interface.mongo[2].private_ip
      mongodbAdminUser         = module.vaultSecrets.secretData["mongoAdminUser"]
      mongodbAdminPassword     = module.vaultSecrets.secretData["mongoAdminPassword"]
    }
  ))

  network_interface {
    device_index         = 0
    network_interface_id = aws_network_interface.mongo[0].id
  }

  root_block_device {
    volume_type           = "gp3"
    volume_size           = 50
    encrypted             = true
    delete_on_termination = true
  }

  lifecycle { ignore_changes = [ami] }
}

resource "aws_volume_attachment" "mongoPrimary" {
  device_name = "/dev/xvdb"
  instance_id = aws_instance.mongoPrimary.id
  volume_id   = aws_ebs_volume.mongoPrimary.id
}

# Secondary instance (AZ-b): only needs the keyfile
resource "aws_instance" "mongoSecondary" {
  ami           = data.aws_ami.ubuntu.id
  instance_type = "t4g.medium"
  key_name      = var.keyName
  monitoring    = true

  user_data = templatefile(
    "${path.module}/mongo/user_data_secondary.sh",
    { keyfileContent = base64encode(random_string.mongoKey.result) }
  )

  network_interface {
    device_index         = 0
    network_interface_id = aws_network_interface.mongo[1].id
  }

  root_block_device {
    volume_type           = "gp3"
    volume_size           = 50
    encrypted             = true
    delete_on_termination = true
  }

  lifecycle { ignore_changes = [ami] }
}

resource "aws_volume_attachment" "mongoSecondary" {
  device_name = "/dev/xvdb"
  instance_id = aws_instance.mongoSecondary.id
  volume_id   = aws_ebs_volume.mongoSecondary.id
}

# Second secondary instance (AZ-c): same simple bootstrap
resource "aws_instance" "mongoSecondSecondary" {
  ami           = data.aws_ami.ubuntu.id
  instance_type = "t4g.medium"
  key_name      = var.keyName
  monitoring    = true

  user_data = templatefile(
    "${path.module}/mongo/user_data_second_secondary.sh",
    { keyfileContent = base64encode(random_string.mongoKey.result) }
  )

  network_interface {
    device_index         = 0
    network_interface_id = aws_network_interface.mongo[2].id
  }

  root_block_device {
    volume_type           = "gp3"
    volume_size           = 50
    encrypted             = true
    delete_on_termination = true
  }

  lifecycle { ignore_changes = [ami] }
}

resource "aws_volume_attachment" "mongoSecondSecondary" {
  device_name = "/dev/xvdb"
  instance_id = aws_instance.mongoSecondSecondary.id
  volume_id   = aws_ebs_volume.mongoSecondSecondary.id
}

User Data Scripts: Primary Node Bootstrap

The user_data scripts are the heart of the automation. On first boot, each node installs MongoDB 7.0, mounts its EBS data volume, writes the shared keyfile, and starts the mongod service, all without any manual SSH access.

The primary node script handles the most work: installing MongoDB 7.0 for ARM64, formatting and mounting the EBS data volume with XFS, writing the Terraform-injected keyfile, and configuring mongod.conf with the replica set name and keyfile path.

Key details in the primary script:
- EBS volume mount with fallback: try mount first, format with XFS only if it fails (so a re-run does not wipe an existing filesystem)
- Keyfile permissions set to 400 (on Linux, mongod refuses to start if the keyfile is readable by group or others)
- fstab entry using the blkid UUID ensures the volume remounts correctly after reboot
- mongod.conf starts with keyFile set but no explicit authorization line. Setting keyFile already turns on access control, so the bootstrap phase relies on MongoDB's localhost exception, which lets a local connection initiate the replica set and create the first user

#!/bin/bash
# user_data_primary.sh: primary node bootstrap
set -o xtrace
set -e

sudo apt-get update && sudo apt-get upgrade -y
sudo apt-get install -y wget apt-transport-https curl gnupg2

# Install MongoDB 7.0 (ARM64)
curl -fsSL https://pgp.mongodb.com/server-7.0.asc | \
  sudo gpg --dearmor -o /usr/share/keyrings/mongodb-server-7.0.gpg

echo "deb [ arch=arm64 signed-by=/usr/share/keyrings/mongodb-server-7.0.gpg ] \
  https://repo.mongodb.org/apt/ubuntu focal/mongodb-org/7.0 multiverse" | \
  sudo tee /etc/apt/sources.list.d/mongodb-org-7.0.list

sudo apt-get update && sudo apt-get install -y mongodb-org mongodb-mongosh

# Mount dedicated EBS volume with XFS
sudo mkdir -p /mongodb
sudo mount /dev/nvme1n1 /mongodb || \
  (sudo mkfs -t xfs /dev/nvme1n1 && sudo mount /dev/nvme1n1 /mongodb)
sudo mkdir -p /mongodb/data /mongodb/logs /mongodb/keyfile /var/run/mongodb

# Write the shared keyfile (injected by Terraform templatefile).
# The base64 text is the key itself: it is written as-is, exactly as the
# secondary scripts do, so all three nodes hold identical content.
# (Decoding it here would leave the primary with a different key.)
echo "${keyfileContent}" > /tmp/keyfile
sudo mv /tmp/keyfile /mongodb/keyfile/mongo.key
sudo chown -R mongodb:mongodb /mongodb /var/run/mongodb
sudo chmod 400 /mongodb/keyfile/mongo.key

# Persist mount across reboots via fstab
mongodb_data_uuid=$(blkid -s UUID -o value /dev/nvme1n1)
echo "UUID=$mongodb_data_uuid  /mongodb  xfs  defaults,nofail  0  2" | \
  sudo tee -a /etc/fstab
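
The mount step has two soft spots. The aws_volume_attachment resources are created after the instance, so the data device may not exist yet when user_data starts, and the fstab append adds a duplicate line on every re-run. A minimal sketch of a more defensive variant follows. It is not taken from the original scripts: the helper names wait_for_device, is_blank_device, and ensure_line are hypothetical, and it assumes the script runs as root, as user_data does.

```shell
#!/bin/bash
# Wait up to max_wait seconds for a block device to appear.
wait_for_device() {
  local dev="$1" max_wait="$2" waited=0
  until [ -b "$dev" ]; do
    if [ "$waited" -ge "$max_wait" ]; then
      echo "timed out waiting for $dev" >&2
      return 1
    fi
    sleep 1
    waited=$((waited + 1))
  done
}

# Succeed only if blkid reports no filesystem type on the device.
# Run as root; call it after wait_for_device so the device exists.
is_blank_device() {
  [ -z "$(blkid -s TYPE -o value "$1" 2>/dev/null)" ]
}

# Append a line to a file only if that exact line is not already there.
ensure_line() {
  local line="$1" file="$2"
  grep -qxF -- "$line" "$file" 2>/dev/null || echo "$line" >> "$file"
}

# Possible use in the bootstrap script (device name as in the post):
#   wait_for_device /dev/nvme1n1 300
#   if is_blank_device /dev/nvme1n1; then mkfs -t xfs /dev/nvme1n1; fi
#   mount /dev/nvme1n1 /mongodb
#   ensure_line "UUID=$mongodb_data_uuid  /mongodb  xfs  defaults,nofail  0  2" /etc/fstab
```

Checking for a filesystem signature rather than a failed mount means a mount that fails for an unrelated reason never leads to a reformat of a volume that already holds data.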

User Data Scripts: Secondary Nodes Bootstrap

The secondary node scripts are intentionally simpler than the primary. They perform the same base setup (install MongoDB 7.0, mount the EBS volume, write the shared keyfile) but they don't initiate the replica set or create users. That's the primary's job.

Both secondary scripts (user_data_secondary.sh and user_data_second_secondary.sh) are nearly identical. They configure the same replica set name (rs0) and keyfile path so MongoDB knows which cluster to join. Once started, they sit idle until the primary's rs.initiate() call adds them.

The secondary scripts also:
- Disable mongosh telemetry on first start
- Add the explicit authorization setting right after starting (unlike the primary, which does it after creating the admin user)
- Restart mongod afterwards to lock down the node

This split between primary and secondary bootstrap is what makes the cluster self-assembling. Terraform launches all three instances in parallel, the secondaries come up and wait, and when the primary runs rs.initiate() after a 60-second delay, it connects to the secondaries by their pre-assigned private IPs and forms the cluster automatically.

#!/bin/bash
# user_data_secondary.sh: secondary node bootstrap
# (user_data_second_secondary.sh is identical)
set -o xtrace
set -e

sudo apt-get update && sudo apt-get upgrade -y
sudo apt-get install -y wget apt-transport-https curl gnupg2

# Install MongoDB 7.0 (ARM64), same as primary
curl -fsSL https://pgp.mongodb.com/server-7.0.asc | \
  sudo gpg --dearmor -o /usr/share/keyrings/mongodb-server-7.0.gpg
echo "deb [ arch=arm64 signed-by=/usr/share/keyrings/mongodb-server-7.0.gpg ] \
  https://repo.mongodb.org/apt/ubuntu focal/mongodb-org/7.0 multiverse" | \
  sudo tee /etc/apt/sources.list.d/mongodb-org-7.0.list
sudo apt-get update && sudo apt-get install -y mongodb-org mongodb-mongosh

# Mount EBS volume, same XFS + fstab setup as primary
sudo mkdir -p /mongodb
sudo mount /dev/nvme1n1 /mongodb || \
  (sudo mkfs -t xfs /dev/nvme1n1 && sudo mount /dev/nvme1n1 /mongodb)
sudo mkdir -p /mongodb/data /mongodb/logs /mongodb/keyfile /var/run/mongodb

# Write the SAME shared keyfile (all replica set nodes must match)
cat <<KEYEOF > /tmp/keyfile
${keyfileContent}
KEYEOF
sudo mv /tmp/keyfile /mongodb/keyfile/mongo.key
sudo chown -R mongodb:mongodb /mongodb /var/run/mongodb
sudo chmod 400 /mongodb/keyfile/mongo.key

# Persist mount across reboots
mongodb_data_uuid=$(blkid -s UUID -o value /dev/nvme1n1)
echo "UUID=$mongodb_data_uuid  /mongodb  xfs  defaults,nofail  0  2" | \
  sudo tee -a /etc/fstab

# Write mongod.conf with the same replSetName + keyFile as primary
cat <<CONF | sudo tee /etc/mongod.conf
systemLog:
  destination: file
  path: /mongodb/logs/mongod.log
storage:
  dbPath: /mongodb/data
processManagement:
  fork: true
  pidFilePath: /var/run/mongodb/mongod.pid
  timeZoneInfo: /usr/share/zoneinfo
net:
  port: 27017
  bindIp: 0.0.0.0
replication:
  replSetName: rs0
security:
  keyFile: /mongodb/keyfile/mongo.key
CONF

# Start mongod, then wait for primary to add us via rs.initiate()
sudo systemctl start mongod
sudo systemctl enable mongod

sleep 5

# Disable telemetry
mongosh <<EOF
disableTelemetry();
EOF

# Enable authorization and restart
echo "  authorization: enabled" | sudo tee -a /etc/mongod.conf
sudo systemctl restart mongod
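
Since every member has to hold the same key material, a cheap sanity check before starting mongod can save a confusing authentication failure later. The sketch below is not part of the original scripts. valid_keyfile and keyfile_fingerprint are hypothetical helpers, and the checks assume MongoDB's documented keyfile rules: 6 to 1024 characters from the base64 set, with whitespace ignored. If this were pasted into a file rendered by templatefile(), `${#content}` would need to be written as `$${#content}` so Terraform leaves it alone.

```shell
#!/bin/bash
# Return 0 if the file looks like a usable MongoDB keyfile:
# only base64 characters (whitespace ignored) and 6 to 1024 of them.
valid_keyfile() {
  local content
  content=$(tr -d '[:space:]' < "$1") || return 1
  case "$content" in
    *[!A-Za-z0-9+/=]*) return 1 ;;   # found a character outside the base64 set
  esac
  [ "${#content}" -ge 6 ] && [ "${#content}" -le 1024 ]
}

# Print a SHA-256 of the whitespace-stripped content, so the value can be
# compared across nodes without exposing the key itself.
keyfile_fingerprint() {
  tr -d '[:space:]' < "$1" | sha256sum | cut -d' ' -f1
}

# Possible use just before starting mongod:
#   valid_keyfile /mongodb/keyfile/mongo.key || exit 1
#   keyfile_fingerprint /mongodb/keyfile/mongo.key
```

Logging the fingerprint on each node gives a quick way to confirm that all three rendered scripts produced the same key.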

Replica Set Initialization & Security Bootstrap

The replica set initialization is the most critical part of the bootstrap. The primary node's user_data script handles this in a specific sequence so that both high availability and security are established correctly.

After MongoDB starts on all three nodes, the primary initiates the replica set with explicit member priorities: the primary node gets priority 10 (strongly preferred for election), while both secondaries get priority 5. This gives predictable failover behavior: if the primary goes down, either secondary can take over, and when the preferred primary recovers and catches up, it is re-elected.

The admin user is created with Vault-sourced credentials (never hardcoded) and given the root role on the admin database. Only after the user is created does the script write authorization: enabled and restart, which is the usual MongoDB bootstrap order.

For production hardening:
- Security group limits 27017 access to VPC CIDR + VPN CIDR only
- SSH access (port 22) restricted to VPN
- Internal Route53 DNS records provide stable hostnames for each node
- No public IPs assigned, so access is only possible through VPN or VPC peering

# Continue from primary user_data script...

# Write mongod.conf with keyFile auth and replica set
cat <<CONF | sudo tee /etc/mongod.conf
systemLog:
  destination: file
  path: /mongodb/logs/mongod.log
storage:
  dbPath: /mongodb/data
processManagement:
  fork: true
  pidFilePath: /var/run/mongodb/mongod.pid
  timeZoneInfo: /usr/share/zoneinfo
net:
  port: 27017
  bindIp: 0.0.0.0
replication:
  replSetName: rs0
security:
  keyFile: /mongodb/keyfile/mongo.key
CONF

sudo systemctl restart mongod && sudo systemctl enable mongod

# Wait for secondaries to come online
sleep 60

# Phase 1: Initiate replica set with priority-based election
mongosh << EOF
disableTelemetry();
rs.initiate({
  _id: "rs0",
  members: [
    { _id: 0, host: "${primaryPrivateIp}:27017",         priority: 10 },
    { _id: 1, host: "${secondaryPrivateIp}:27017",       priority: 5  },
    { _id: 2, host: "${secondsecondaryPrivateIp}:27017", priority: 5  }
  ]
});
EOF

sleep 30

# Phase 2: Create admin user with Vault-sourced credentials.
# The heredoc delimiter is quoted so the shell does not expand any $ or
# backticks in the rendered password; Terraform has already filled in
# the placeholders by the time this runs.
mongosh << 'EOF'
use admin;
db.createUser({
  user: "${mongodbAdminUser}",
  pwd:  "${mongodbAdminPassword}",
  roles: [{ role: "root", db: "admin" }]
});
EOF

# Phase 3: Enable authorization and restart
echo "  authorization: enabled" | sudo tee -a /etc/mongod.conf
sudo systemctl restart mongod

echo "MongoDB replica set initialized successfully"
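
The fixed 60-second sleep assumes the secondaries finish their package installs in time. One alternative is to poll each secondary's port and continue as soon as both answer. The following is a minimal sketch of that idea, not the original script: wait_for_port is a hypothetical helper, it relies on bash's /dev/tcp feature and the coreutils timeout command, and an open port is only a rough signal that mongod is ready.

```shell
#!/bin/bash
# Poll host:port until a TCP connection succeeds or max_wait seconds pass.
wait_for_port() {
  local host="$1" port="$2" max_wait="$3" waited=0
  # Each attempt runs in a child bash and is capped at 2 seconds.
  until timeout 2 bash -c "exec 3<>/dev/tcp/$host/$port" 2>/dev/null; do
    if [ "$waited" -ge "$max_wait" ]; then
      echo "gave up waiting for $host:$port" >&2
      return 1
    fi
    sleep 2
    waited=$((waited + 2))
  done
}

# Possible use in place of "sleep 60" (placeholders as in the script above):
#   wait_for_port "${secondaryPrivateIp}" 27017 600
#   wait_for_port "${secondsecondaryPrivateIp}" 27017 600
```

With set -e active, a secondary that never comes up stops the bootstrap with a clear message instead of letting rs.initiate() run against a missing member.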

Network Security & Service Discovery

In production, MongoDB should never be publicly accessible. Our setup uses multiple layers of network isolation managed entirely through Terraform.

Each MongoDB node sits in a private subnet with a dedicated network interface attached to a security group that allows port 27017 only from specific CIDR ranges: the VPC CIDR (so EKS pods can connect), the VPN CIDR (for admin access), and peered VPC CIDRs (for cross-service communication). SSH is similarly restricted to VPN-only access.

For service discovery, we use Route53 private hosted zones instead of hardcoding IPs. Each node gets a DNS A record (e.g., mongodb-primary.internal, mongodb-secondary-0.internal) pointing to its private IP. Application connection strings use these DNS names, so if a node is replaced, updating the DNS record is all that's needed, with no application config changes.

This architecture gives you:
- Zero public exposure: all traffic flows through private subnets
- VPN-gated admin access for debugging and maintenance
- DNS-based discovery that survives instance replacement
- VPC peering support for cross-account or cross-region access
- Full auditability through VPC Flow Logs and CloudTrail

# Security Group: Restrict MongoDB access to VPC + VPN only
resource "aws_security_group" "mongo" {
  name        = "production-mongo"
  description = "Security group for production MongoDB cluster"
  vpc_id      = aws_vpc.cluster.id

  egress {
    cidr_blocks = ["0.0.0.0/0"]
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
  }

  ingress {
    description = "MongoDB from VPC, VPN, and peered networks"
    from_port   = 27017
    to_port     = 27017
    protocol    = "tcp"
    cidr_blocks = [
      aws_vpc.cluster.cidr_block,        # EKS pods
      var.vpnCidr,                       # Admin VPN
      data.aws_vpc.peered.cidr_block,    # Peered VPC
    ]
  }

  ingress {
    description = "SSH from VPN only"
    from_port   = 22
    to_port     = 22
    protocol    = "tcp"
    cidr_blocks = [var.vpnCidr]
  }
}

# Route53 internal DNS for service discovery
resource "aws_route53_record" "mongodb_primary" {
  name    = "mongodb-primary"
  type    = "A"
  zone_id = data.aws_route53_zone.internal.id
  ttl     = 300
  records = [aws_instance.mongoPrimary.private_ip]
}

resource "aws_route53_record" "mongodb_secondary_0" {
  name    = "mongodb-secondary-0"
  type    = "A"
  zone_id = data.aws_route53_zone.internal.id
  ttl     = 300
  records = [aws_instance.mongoSecondary.private_ip]
}

resource "aws_route53_record" "mongodb_secondary_1" {
  name    = "mongodb-secondary-1"
  type    = "A"
  zone_id = data.aws_route53_zone.internal.id
  ttl     = 300
  records = [aws_instance.mongoSecondSecondary.private_ip]
}
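
To make the DNS-based connection string concrete, here is a small sketch that assembles a replica set URI from the record names. The helper build_mongo_uri is hypothetical, the .internal zone suffix is taken from the examples in the text, and credentials and other options are left out. One thing to keep in mind: after the initial connection, drivers generally switch to the member addresses reported by the replica set itself, which in this setup are the private IPs passed to rs.initiate().

```shell
#!/bin/bash
# Build a replica set URI from a set name and a list of host names.
build_mongo_uri() {
  local rs_name="$1" hosts="" h
  shift
  for h in "$@"; do
    hosts="$hosts,$h:27017"
  done
  # Drop the leading comma left by the loop before printing.
  echo "mongodb://${hosts#,}/?replicaSet=$rs_name"
}

build_mongo_uri rs0 \
  mongodb-primary.internal \
  mongodb-secondary-0.internal \
  mongodb-secondary-1.internal
```

Listing all three members as seeds lets a client connect even when the node behind the first name is down.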