0x55aa
Back to Blog

Multi-Region Deployments: You Probably Don't Need It 🌍 (But Here's When You Do)

|
6 min read

Multi-region deployments are the architectural equivalent of a sports car. Impressive to talk about at parties, ruinously expensive to maintain, and most of the time you're just sitting in traffic anyway.

Yet somehow, every engineering team eventually ends up in the conversation: "Should we go multi-region?" And the answer — the honest one — is probably no. But the nuance of when the answer flips to yes is worth understanding deeply, because getting it wrong in either direction is painful.

Why Everyone Wants Multi-Region

The pitch writes itself. Netflix is multi-region. Stripe is multi-region. We should be multi-region.

It promises two things: disaster recovery (if us-east-1 catches fire, us-west-2 keeps serving traffic) and latency improvements (users in Tokyo don't have to wait for packets to bounce across the Pacific).

Both are real benefits. The problem is that they come with a cost profile that most teams dramatically underestimate — and there's almost always a cheaper path to the same outcome.

What Multi-Region Actually Costs You

Let me put numbers on this before the architecture diagrams start appearing.

Data transfer: AWS charges ~$0.02/GB for inter-region traffic. If your primary database is replicating 500 GB/day to a standby region, that's $3,000/month just for replication bandwidth before you've written a single line of application code.

Operational complexity multiplier: Every runbook doubles. Every database migration now needs to be safe for a lagged replica. Every deploy needs to reason about split-brain scenarios. Your on-call rotation just got significantly more interesting.

Data residency conflicts: You wanted one region for latency, but now GDPR wants to know exactly where EU user data lives. Congratulations, your latency optimization just became a compliance project.

At Cubet, we had a client who insisted on multi-region "for resilience" — their SLA requirement was 99.9% uptime. That's 8.7 hours of allowable downtime per year. A single-region setup with proper auto-scaling groups, cross-AZ deployments, and an RTO-tested failover procedure would have comfortably hit that target. Multi-region bought them approximately $180K/year in additional AWS costs to solve a problem they didn't have.

When You Actually Need It

That said, there are legitimate cases. Here's the honest list:

1. Regulatory data residency requirements If Australian law says Australian user data must stay in Australia, and you also have EU customers with GDPR obligations — congratulations, you're multi-region whether you like it or not. This isn't optional.

2. Sub-100ms latency for interactive workloads globally If you're building a real-time collaboration tool (think Figma-level responsiveness), a gaming backend, or financial trading infrastructure, physics wins. The speed of light from Sydney to Virginia is ~180ms round-trip. No amount of optimization changes that.

3. Your SLA requires it and your customer knows it does If you're selling 99.99% uptime to enterprise customers and they're paying for it, and your failure scenarios span single-region outages (which do happen — us-east-1 has had notable incidents), then multi-region active-passive is part of the product.

4. True blast radius isolation Some teams use multi-region not for latency but to isolate customer data — EU customers land in eu-west-1, US customers in us-east-1, never crossing. This is an entirely valid compliance and risk isolation strategy, even if each region is independently redundant.

The Middle Path Most Teams Miss

Before you go full active-active, there's a spectrum of options that solve 80% of the problems for 20% of the cost:

# Active-passive: promoted to active only during failover
resource "aws_rds_cluster" "primary" {
  cluster_identifier = "app-primary"
  engine             = "aurora-postgresql"
  availability_zones = ["us-east-1a", "us-east-1b", "us-east-1c"]
  # ...
}

resource "aws_rds_cluster" "replica" {
  cluster_identifier        = "app-replica"
  engine                    = "aurora-postgresql"
  replication_source_identifier = aws_rds_cluster.primary.arn
  # Replica is in the same account, different region
  provider = aws.us_west_2
}

An Aurora Global Database active-passive setup gets you an RTO under 60 seconds on region failure, with replication lag typically under 1 second. For most "we need DR" conversations, this is the answer — not active-active traffic splitting.

For latency, CloudFront (or any CDN) with edge caching eliminates the "my API is slow for users in Singapore" problem for 95% of read-heavy workloads without running a second database.

When Active-Active Genuinely Makes Sense

If you've worked through the above and still need active-active — where multiple regions simultaneously serve writes — here's what you're committing to:

# Example: Route 53 latency routing (NOT failover — both regions active)
resource "aws_route53_record" "api" {
  for_each = {
    "us-east-1" = aws_lb.us_east.dns_name
    "eu-west-1"  = aws_lb.eu_west.dns_name
  }

  zone_id = var.zone_id
  name    = "api.example.com"
  type    = "A"

  latency_routing_policy {
    region = each.key
  }

  alias {
    name                   = each.value
    zone_id                = var.lb_zone_id
    evaluate_target_health = true
  }

  set_identifier = each.key
}

Now your application must be designed for eventual consistency or CRDTs — two users updating the same record in different regions simultaneously is a real problem your data layer must handle. Most OLTP applications are not designed for this and retrofitting it is a significant engineering project.

The Decision Framework

Run through these questions in order:

  1. Is data residency regulated? If yes, you're already going multi-region for compliance — optimize for that.
  2. What's your actual uptime SLA? 99.9% = single region fine. 99.99% = evaluate. 99.999% = you need multi-region and a serious budget.
  3. Where are your users? Use Cloudflare Analytics or your CDN logs. If 90% are in one geography, serve them from the nearest region and cache aggressively.
  4. What's your RTO/RPO? For most SaaS companies, an RTO of 15 minutes is perfectly acceptable. Aurora Global Database gives you that for a fraction of active-active complexity.
  5. What's the bill? Model it in AWS Cost Explorer before committing. Multi-region often costs 2-3x single-region once you factor in data transfer, duplicate compute, and the engineering hours to maintain it.

The Honest Takeaway

Multi-region is a genuine engineering achievement when it's warranted. It's also one of the most effective ways to double your AWS bill and triple your incident complexity if you jump to it prematurely.

The teams that do it well start with a clear business requirement — compliance, latency SLA, or a contractual uptime guarantee — and architect backward from there. The teams that do it poorly start with "we should be more resilient" and architect forward into a distributed systems nightmare.

Nine times out of ten, cross-AZ redundancy, a CDN, an Aurora read replica, and a tested failover runbook will get you where you need to go. Save multi-region for when the requirements genuinely demand it.

Your future on-call engineer will thank you.


Are you running multi-region or evaluating whether you need it? I'd love to hear what drove the decision — reach out on X/Twitter or drop it in an issue on GitHub.

Thanks for reading!

Back to all posts