AWS Outage Will Reshape the Cloud Industry
On October 20, 2025, AWS faced elevated error rates and latencies in US-EAST-1, starting at 12:11 AM PDT, impacting 105 services like DynamoDB, EC2, and Lambda. The issue stemmed from an internal subsystem affecting network load balancer health, causing connectivity and API errors. Mitigations began at 2:22 AM, with recovery by 2:27 AM. A DNS issue for DynamoDB was resolved by 3:35 AM, but EC2 launch errors persisted. AWS throttled EC2 launches and adjusted Lambda SQS polling, achieving significant recovery by 12:15 PM PDT, though some Lambda errors lingered. Thirty-four services, including CloudTrail, were fully resolved. AWS advised retrying requests and avoiding specific Availability Zones for EC2 launches.
Today AWS is experiencing a major outage. Outages are common and well known in this industry; availability is even quantified in "nines" so that downtime stays predictable. But today something catastrophic was experienced by millions of people, maybe even billions indirectly.
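To put the "nines" in perspective, here is a minimal sketch of the arithmetic. The function name and the choice of a 365-day year are my own assumptions for illustration, not figures from any AWS SLA.

```python
def allowed_downtime_minutes(nines: int, period_hours: float = 365 * 24) -> float:
    """Minutes of downtime allowed per period at an availability of N nines.

    Assumption: "N nines" means an availability of 1 - 10**-N,
    e.g. 3 nines = 99.9%. The default period is a 365-day year.
    """
    unavailability = 10 ** (-nines)
    return period_hours * 60 * unavailability


if __name__ == "__main__":
    for n in (2, 3, 4, 5):
        print(f"{n} nines: {allowed_downtime_minutes(n):.1f} minutes per year")
```

Under these assumptions, three nines leaves a budget of roughly 526 minutes a year, so a single outage lasting most of a day can use up more than a year's budget.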
Most system designers spread even a boring service across multiple data centers (Availability Zones, in AWS jargon) if users would be unhappy to see it go offline. If you run a system that has to be dependable, like the one I am responsible for (it needs to be up 24/7 as far as possible), you go with a multi-region deployment within the same cloud provider. That means groups of data centers that are far apart, usually not even in the same city, let alone on the same electricity or fresh water supply.
Generally, multi-region deployments are enough. They look sufficient for an outage of today's size too, since only US-EAST-1 is affected, even though all of its AZs are down. But if you are like me, you will start considering multi-cloud deployments for critical systems that need to be up 24/7 as far as possible.
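The reasoning behind adding regions, and then clouds, can be sketched with some simple arithmetic. This is an idealized model with strong assumptions of my own: every copy fails independently of the others, and failover between them is instant and perfect. Today's outage, where all AZs in one region went down together, shows how easily the independence assumption breaks, which is exactly why the copies need to be as unrelated as possible. The function name and the 99.9% figure are illustrative, not taken from any provider's SLA.

```python
def combined_availability(single: float, copies: int) -> float:
    """Availability of a service that is up if at least one copy is up.

    Assumptions: each copy (AZ, region, or cloud) has the same
    availability `single`, failures are independent, and traffic
    fails over instantly. Real deployments fall short of this.
    """
    if not 0.0 <= single <= 1.0:
        raise ValueError("single must be between 0 and 1")
    if copies < 1:
        raise ValueError("copies must be at least 1")
    # The service is down only when every copy is down at the same time.
    return 1 - (1 - single) ** copies


if __name__ == "__main__":
    for n in (1, 2, 3):
        print(f"{n} copies at 99.9%: {combined_availability(0.999, n):.9f}")
```

In this model each extra independent copy multiplies the unavailability by the same small factor, so the gain depends entirely on how independent the copies really are. A second region inside the same provider may still share internal subsystems, while a second cloud shares far less.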
In the end, money will make the decision. Still, this preference and the market interest that follows will turn multi-region deployments into a commodity in the near future, which leaves more room for people like me to promote multi-cloud infrastructures.