The AWS Decommissioning Checklist I Wish I Had

Everyone writes about spinning up infrastructure. Nobody talks about tearing it down. I recently had to decommission an entire AWS project, and it turned out to be way more nuanced than “just delete everything.” Here’s the checklist I built along the way, so you don’t have to learn these lessons the hard way.

Before You Touch Anything: Archive and Snapshot

This is the most important step, and it’s the one you’ll skip if you’re in a hurry. Don’t skip it.

Before deleting a single resource, take snapshots and archives of everything that matters:

RDS: Create final snapshots of all databases
S3: Sync important buckets to a local backup or another account with aws s3 sync
EBS Volumes: Create snapshots
CloudFormation/Terraform State: Back up your state files somewhere safe
CloudWatch Logs: Export log groups you might need later
Route53: Export your hosted zone records to a JSON file with aws route53 list-resource-record-sets

You think you won’t need any of this data again. You’re wrong. Future you will thank present you for spending 30 minutes on this.
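
The list above maps onto a handful of CLI calls. Here’s a minimal sketch, assuming one database, one volume, one bucket, and one hosted zone; the function name and every identifier you pass in are placeholders, and CloudWatch log exports are left out to keep it short.

```shell
# Sketch: take a final archive of one project's data before deleting anything.
# All identifiers passed in are placeholders for your own resources.
archive_project() {
  local db_id="$1" volume_id="$2" bucket="$3" zone_id="$4" out_dir="$5"
  local stamp
  stamp="$(date +%Y%m%d)"
  mkdir -p "$out_dir"

  # RDS: a manual snapshot is kept after the instance is deleted.
  aws rds create-db-snapshot \
    --db-instance-identifier "$db_id" \
    --db-snapshot-identifier "${db_id}-final-${stamp}" || return 1

  # EBS: snapshot the volume.
  aws ec2 create-snapshot \
    --volume-id "$volume_id" \
    --description "Final snapshot before decommission ${stamp}" || return 1

  # S3: copy the bucket contents to local disk.
  aws s3 sync "s3://${bucket}" "${out_dir}/s3/${bucket}" || return 1

  # Route53: dump every record in the zone to a JSON file.
  aws route53 list-resource-record-sets \
    --hosted-zone-id "$zone_id" --output json \
    > "${out_dir}/route53-${zone_id}.json" || return 1
}
```

Something like archive_project my-db vol-0123456789abcdef0 my-bucket Z1234567890 ./archive would run all four steps and stop at the first one that fails.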

The Checklist

1. Notify Your Team First

Before you start deleting things, communicate. Post in the relevant Slack channels, send emails, whatever your team uses. I learned this one the hard way: giving people a heads-up avoids the “hey, where did that staging environment go?” conversation after you’ve already nuked it.

2. Route53: Empty Before You Delete

Route53 hosted zones won’t let you delete them if they still contain records. You need to remove all records (except the default NS and SOA records) before you can delete the hosted zone itself.

Also worth knowing: Route53 DNS records can’t be renamed. If you need to change a record name, you have to delete it and recreate it. This matters during decommissioning because you might want to temporarily point records elsewhere before removing them entirely.

# List all records in a hosted zone
aws route53 list-resource-record-sets --hosted-zone-id Z1234567890

# Delete records using a change batch JSON file
aws route53 change-resource-record-sets \
  --hosted-zone-id Z1234567890 \
  --change-batch file://delete-records.json
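
The command above reads delete-records.json, which isn’t shown. Here’s a sketch of what that file might look like for a single plain record; the helper name, record name, TTL, and IP address are made up for illustration. A DELETE only succeeds when the name, type, TTL, and values match the existing record exactly, so copy them from the list-resource-record-sets output.

```shell
# Sketch: print a change batch that deletes one non-alias record.
# Arguments are examples only; they must match the existing record exactly.
write_delete_batch() {
  local name="$1" type="$2" ttl="$3" value="$4"
  cat <<EOF
{
  "Comment": "Decommission: remove ${name}",
  "Changes": [
    {
      "Action": "DELETE",
      "ResourceRecordSet": {
        "Name": "${name}",
        "Type": "${type}",
        "TTL": ${ttl},
        "ResourceRecords": [{ "Value": "${value}" }]
      }
    }
  ]
}
EOF
}

# Usage (not run here):
#   write_delete_batch staging.example.com. A 300 203.0.113.10 > delete-records.json
#   aws route53 change-resource-record-sets --hosted-zone-id Z1234567890 \
#     --change-batch file://delete-records.json
#   aws route53 delete-hosted-zone --id Z1234567890   # once only NS and SOA remain
```

Alias records carry an AliasTarget block instead of TTL and ResourceRecords, so they need a slightly different entry.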

3. S3: Don’t Just Hit Delete

This one surprised me. Deleting an S3 bucket with a lot of objects is painfully slow through the console. If you have millions of objects, you’ll be sitting there for hours.

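The better approach: create a lifecycle rule that expires all objects after 1 day, wait for AWS to clean it up (lifecycle rules run asynchronously, so allow a day or two), then delete the empty bucket. One catch: on a versioned bucket, expiring the current version only adds a delete marker, so the rule also has to expire noncurrent versions. It’s worth aborting incomplete multipart uploads too, since they’re billed but don’t show up as objects.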

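# Apply a lifecycle rule to expire all objects.
# NoncurrentVersionExpiration only matters on versioned buckets;
# AbortIncompleteMultipartUpload clears stalled uploads that still cost money.
aws s3api put-bucket-lifecycle-configuration \
  --bucket my-bucket \
  --lifecycle-configuration '{
    "Rules": [
      {
        "ID": "ExpireAll",
        "Status": "Enabled",
        "Filter": { "Prefix": "" },
        "Expiration": { "Days": 1 },
        "NoncurrentVersionExpiration": { "NoncurrentDays": 1 },
        "AbortIncompleteMultipartUpload": { "DaysAfterInitiation": 1 }
      }
    ]
  }'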
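
Once the rule has had time to run, the last step is deleting the bucket itself. A small sketch, assuming the same my-bucket placeholder; the function name is mine. It leans on the fact that delete-bucket refuses to remove a bucket that still holds objects or object versions.

```shell
# Sketch: try to delete a bucket once the lifecycle rule has emptied it.
# delete-bucket fails while anything remains, so a failure here usually
# means "not empty yet, check back later".
delete_bucket_when_empty() {
  local bucket="$1"
  if aws s3api delete-bucket --bucket "$bucket" 2>/dev/null; then
    echo "deleted ${bucket}"
  else
    echo "${bucket} not deleted (still has objects, or access was denied)"
    return 1
  fi
}
```

Running delete_bucket_when_empty my-bucket once a day until it succeeds is usually enough; there’s no need to poll aggressively.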

Here’s the thing about S3 though: it’s really cheap. S3 Standard runs roughly $0.023 per GB-month in us-east-1, so a few hundred gigabytes costs less than $10/month. If you’re unsure whether to delete a bucket, just leave it. The cost of keeping it around “just in case” is almost nothing compared to the cost of losing data you actually needed.

4. Terraform Destroy: Use It, But Carefully

If your infrastructure was provisioned with Terraform, terraform destroy is your friend — mostly.

# Always plan first
terraform plan -destroy -out=destroy.tfplan

# Review what will be destroyed
terraform show destroy.tfplan

# Then apply
terraform apply destroy.tfplan

A few caveats:

Run plan -destroy first. Review exactly what will be removed. Terraform might try to delete things in an order that causes dependency failures.
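Some resources resist destruction. S3 buckets that still contain objects and Route53 hosted zones that still contain records will fail unless force_destroy = true is set on the resource (and applied before you run destroy). An RDS instance with skip_final_snapshot = false, which is the default, fails unless final_snapshot_identifier is also set.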
Stateful resources need extra attention. If you have prevent_destroy lifecycle rules in your Terraform code, you’ll need to remove those first.
Consider targeted destroys for complex stacks: terraform destroy -target=module.app lets you tear things down in stages instead of all at once.
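
Two of those caveats are easy to script. This is a rough sketch, not a drop-in tool: the function names are mine, and the module addresses you pass to staged_destroy (module.app, module.network, and so on) are hypothetical.

```shell
# Print every prevent_destroy line under a Terraform directory so you know
# which lifecycle blocks to edit before destroying.
find_prevent_destroy() {
  find "${1:-.}" -name '*.tf' -exec grep -Hn 'prevent_destroy' {} +
}

# Plan, show, and apply a targeted destroy for each address, in the order
# given. Stops at the first failure or the first answer other than "y".
staged_destroy() {
  local target plan answer i=0
  for target in "$@"; do
    i=$((i + 1))
    plan="destroy-stage-${i}.tfplan"
    terraform plan -destroy -target="$target" -out="$plan" || return 1
    terraform show "$plan"
    printf 'Apply %s for %s? [y/N] ' "$plan" "$target"
    read -r answer
    [ "$answer" = "y" ] || return 1
    terraform apply "$plan" || return 1
  done
}
```

For example, staged_destroy module.app module.network tears down the app layer first and only moves on to networking once that apply succeeds.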

5. IAM Cleanup

Don’t forget IAM resources. Roles, policies, and users created for this project are easy to overlook. They won’t cost you money, but they’re a security liability sitting around with permissions to resources that no longer exist.
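
A quick way to find the leftovers, assuming the project’s IAM resources share a naming prefix such as myproject- (a hypothetical convention; adjust the filters if yours are named differently). This sketch only lists things and deletes nothing.

```shell
# Sketch: list IAM roles, users, and customer-managed policies whose names
# start with a project prefix. The prefix is an assumption about naming.
list_project_iam() {
  local prefix="$1"
  echo "== Roles =="
  aws iam list-roles \
    --query "Roles[?starts_with(RoleName, '${prefix}')].RoleName" --output text
  echo "== Users =="
  aws iam list-users \
    --query "Users[?starts_with(UserName, '${prefix}')].UserName" --output text
  echo "== Customer-managed policies =="
  aws iam list-policies --scope Local \
    --query "Policies[?starts_with(PolicyName, '${prefix}')].PolicyName" --output text
}
```

Actual deletion takes a few more steps per resource: a role, for instance, has to have its managed policies detached and its inline policies removed before delete-role will accept it.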

6. CloudWatch and Logs

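CloudWatch log groups never expire by default. Set a retention period or delete them. Alarms need the same treatment: they’ll keep evaluating (and potentially alerting) long after the resources they monitor are gone. Custom metrics can’t be deleted by hand; they age out on their own, after up to 15 months, once nothing publishes to them.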
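
Both chores can be done in bulk. A sketch, assuming the project’s log groups and alarms share a name prefix (hypothetical) and that alarm names contain no spaces, since the second function relies on word splitting.

```shell
# Sketch: cap retention on every log group under a name prefix.
# retention-in-days only accepts specific values (1, 3, 5, 7, 14, 30, ...).
set_log_retention() {
  local prefix="$1" days="$2" group
  for group in $(aws logs describe-log-groups \
      --log-group-name-prefix "$prefix" \
      --query 'logGroups[].logGroupName' --output text); do
    aws logs put-retention-policy \
      --log-group-name "$group" --retention-in-days "$days" || return 1
    echo "retention ${days}d: ${group}"
  done
}

# Sketch: delete every alarm whose name starts with a prefix.
delete_project_alarms() {
  local prefix="$1" names
  names="$(aws cloudwatch describe-alarms --alarm-name-prefix "$prefix" \
    --query 'MetricAlarms[].AlarmName' --output text)"
  [ -n "$names" ] || { echo "no alarms match ${prefix}"; return 0; }
  # Unquoted on purpose: one argument per alarm name.
  aws cloudwatch delete-alarms --alarm-names $names
}
```

To drop the log groups entirely instead of capping retention, swap the put-retention-policy call for aws logs delete-log-group --log-group-name "$group".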

7. Cost Explorer: Verify the Cleanup

After everything is torn down, check Cost Explorer daily for a week. Lingering costs from resources you missed will show up here. I’ve caught forgotten NAT Gateways and orphaned Elastic IPs this way.
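
The same check works from the command line. A sketch with hypothetical function names: the first pulls daily cost per service from the Cost Explorer API, and the second looks for the two stragglers mentioned above. Dates are passed in as YYYY-MM-DD rather than computed, to sidestep differences between GNU and BSD date.

```shell
# Sketch: daily unblended cost per service between two dates.
# The End date is exclusive.
daily_costs() {
  local start="$1" end="$2"
  aws ce get-cost-and-usage \
    --time-period "Start=${start},End=${end}" \
    --granularity DAILY \
    --metrics UnblendedCost \
    --group-by Type=DIMENSION,Key=SERVICE
}

# Sketch: list NAT Gateways that are still up and Elastic IPs attached to
# nothing. Both calls cover one region at a time.
find_stragglers() {
  echo "== NAT Gateways still available =="
  aws ec2 describe-nat-gateways \
    --filter Name=state,Values=available \
    --query 'NatGateways[].NatGatewayId' --output text
  echo "== Elastic IPs not associated with anything =="
  aws ec2 describe-addresses \
    --query 'Addresses[?AssociationId==`null`].PublicIp' --output text
}
```

For example, daily_costs 2024-06-01 2024-06-08 covers the first seven days of June. Run find_stragglers once per region the project used, for instance by setting AWS_REGION before each call.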

The Short Version

Archive and snapshot everything
Notify your team
Empty Route53 hosted zones before deleting them
Use S3 lifecycle rules for large bucket cleanup
Run terraform plan -destroy and review before applying
Clean up IAM, CloudWatch, and other “invisible” resources
Monitor Cost Explorer for a week after

Decommissioning isn’t glamorous, but doing it well saves you from the “why is this account still costing us $200/month?” conversation three months later. Take your time, follow the checklist, and always snapshot first.