Ptechhub

Amazon’s Outage Root Cause, $581M Loss Potential And ‘Apology:’ 5 Key AWS Outage Takeaways

By CRN
October 27, 2025


From the root cause of Amazon’s outage to its potential $581 million cost, CRN breaks down the five important results and findings from AWS’ new post-mortem report. ‘We will do everything we can to learn from this event and use it to improve our availability even further,’ AWS says.

Amazon’s outage that affected thousands of companies and millions of people was caused by two automated systems updating the same data simultaneously, leading to a DNS (Domain Name System) issue that brought down AWS’ DynamoDB database.

Cyber risk analytics firm CyberCube just released a preliminary insured loss estimate for AWS’ outage, projecting a loss of up to $581 million.

“We apologize for the impact this event caused our customers,” said AWS in its post-mortem report of the outage results and root cause.

[Related: AWS’ 15-Hour Outage: 5 Big AI, DNS, EC2 And Data Center Keys To Know]

“We know this event impacted many customers in significant ways,” AWS said. “We will do everything we can to learn from this event and use it to improve our availability even further.”

CRN breaks down the five biggest things that every Amazon customer, partner and user needs to know about the AWS outage.

No. 1: CyberCube Estimates Up To $581 Million In Losses

CyberCube has released a preliminary insured loss estimate for the AWS outage, projecting a range of between $38 million and $581 million.

Cybersecurity risk analytics provider CyberCube said the outage impacted more than 2,000 large organizations and around 70,000 organizations overall.

AWS is expected to reimburse affected companies for downtime, which may limit insured losses and discourage litigation, according to the security analytics firm.

CyberCube said that because the outage lasted less than a day, many customers might choose not to file claims, a factor contributing to its lower-end loss projection. The company expects the outage to have a low to moderate impact on cyber insurers, with the majority of losses likely to fall in the lower end of CyberCube’s range.

Read through for the four other big things to know about AWS’ outage, including the root cause and changes AWS plans to make.


No. 2: The Root Cause Bug That Caused DynamoDB To Go Down

At around 2:48 a.m. ET on Oct. 20, a critical fault in DynamoDB’s DNS management system cascaded into a roughly 15-hour outage that eventually disrupted millions of people.

The root cause of the outage stemmed from two automated systems that were updating the same data simultaneously.

AWS said the issue was with two programs competing to write the same DNS entry at the same time, which resulted in an empty DNS record for the service’s regional endpoint.

The elevated error rates were triggered by “a latent defect” within the service’s automated DNS management system, which controls how user requests are routed to servers, Amazon said.

This led to the accidental deletion of all IP addresses for the database service’s regional endpoint.

The DNS issue brought down AWS’ DynamoDB database, which then created a cascading effect that impacted many AWS services such as EC2 and its Network Load Balancer.


No. 3: Digging Into DynamoDB’s DNS Root Cause

Amazon said the outage was caused by a race condition in DynamoDB’s automated DNS management system that left an empty DNS record for the service’s regional endpoint.

Amazon’s DNS management system is made up of two separate components: a DNS Planner that monitors load balancer health and builds DNS plans, and a DNS Enactor that applies changes via Amazon Route 53.

The race condition occurred when one DNS Enactor experienced “unusually high delays” while the DNS Planner continued generating new plans, according to Amazon.

A second DNS Enactor then began applying the newer plans and executed a clean-up process just as the first Enactor completed its delayed run.

This “clean-up” deleted the older plan, which immediately removed all IP addresses for the regional endpoint and left the system in an inconsistent state that prevented any DNS Enactor from applying further automated updates.

Until engineers intervened manually, systems connecting to DynamoDB, including customer traffic and internal AWS services, experienced DNS failures, which impaired EC2 instance launches and network configuration, Amazon said.
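The sequence Amazon describes can be illustrated with a small, deterministic Python sketch. It is not AWS code: the `DnsState` class, the plan generations, the IP addresses and the `guarded` check are all hypothetical, chosen only to show how a delayed writer plus a clean-up step can leave an endpoint with no addresses, and how rejecting stale plans would avoid it in this toy model.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class DnsState:
    """Hypothetical stand-in for the endpoint's DNS record and stored plans."""
    plans: dict = field(default_factory=dict)  # plan generation -> list of IPs
    active_generation: Optional[int] = None

    def record(self) -> list:
        # The endpoint resolves to whatever the active plan still holds.
        return self.plans.get(self.active_generation, [])


def apply_plan(state, generation, guarded=False):
    """Enactor step: point the endpoint at a plan.

    With guarded=True (an assumed safeguard, not AWS's stated fix),
    a plan older than the active one is rejected.
    """
    stale = state.active_generation is not None and generation < state.active_generation
    if guarded and stale:
        return False
    state.active_generation = generation
    return True


def clean_up(state, newest_applied):
    """Enactor step: delete plans older than the newest one this enactor applied."""
    for generation in [g for g in state.plans if g < newest_applied]:
        del state.plans[generation]


def simulate(guarded):
    state = DnsState(plans={1: ["10.0.0.1"], 2: ["10.0.0.2"], 3: ["10.0.0.3"]})
    apply_plan(state, 3, guarded)      # second Enactor applies the newest plan
    apply_plan(state, 1, guarded)      # delayed first Enactor finishes with a stale plan
    clean_up(state, newest_applied=3)  # second Enactor's clean-up deletes plans 1 and 2
    return state.record()


unguarded_record = simulate(guarded=False)
guarded_record = simulate(guarded=True)
print("unguarded:", unguarded_record)
print("guarded:  ", guarded_record)
```

In the unguarded run the endpoint ends up pointing at a plan that no longer exists, so it resolves to an empty list. In the guarded run the stale plan is refused and the newest plan stays in place.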

Network Load Balancer Issue

Following the DNS issue, AWS’ Network Manager began propagating a large backlog of delayed network configurations, causing newly launched EC2 instances to experience network configuration delays.

These network delays affected AWS’ Network Load Balancer (NLB) service.

NLB’s health checking subsystem removed new EC2 instances from service when they failed health checks because of the network delays, only to restore them when subsequent checks succeeded.
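The remove-then-restore churn described above can be sketched in a few lines of Python. This is an illustration under assumptions, not NLB's implementation: `track_membership` and its `failures_to_remove` threshold are hypothetical names, and the threshold is shown only as one common way to damp flapping.

```python
def track_membership(check_results, failures_to_remove=1):
    """Return (in_service_history, transition_count) for one target.

    check_results: sequence of booleans, True meaning the check passed.
    failures_to_remove: consecutive failures required before removal
    (a hypothetical damping knob, not a setting named in AWS' report).
    """
    in_service = True
    consecutive_failures = 0
    transitions = 0
    history = []
    for passed in check_results:
        if passed:
            consecutive_failures = 0
            if not in_service:
                in_service = True  # restored after a passing check
                transitions += 1
        else:
            consecutive_failures += 1
            if in_service and consecutive_failures >= failures_to_remove:
                in_service = False  # removed after enough failures
                transitions += 1
        history.append(in_service)
    return history, transitions


# Alternating results, as when network configuration lags behind instance launch.
results = [False, True, False, True, False, True]
_, flapping = track_membership(results, failures_to_remove=1)
_, damped = track_membership(results, failures_to_remove=3)
print(flapping, damped)
```

With removal on the first failure, the target changes state on every check. Requiring three consecutive failures leaves it in service throughout this sequence, at the cost of reacting more slowly to a genuine failure.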

With EC2 instance launches impaired, dependent AWS services including Lambda, Elastic Container Service (ECS) and Elastic Kubernetes Service (EKS) all experienced issues.

“The root cause of this issue was a latent race condition in the DynamoDB DNS management system that resulted in an incorrect empty DNS record for the service’s regional endpoint (dynamodb.us-east-1.amazonaws.com) that the automation failed to repair,” Amazon said. “All systems needing to connect to the DynamoDB service in the [AWS North Virginia US-East-1 data center] Region via the public endpoint immediately began experiencing DNS failures and failed to connect to DynamoDB. This included customer traffic as well as traffic from internal AWS services that rely on DynamoDB.”


No. 4: Amazon Making Changes To Prevent Similar Outages

Amazon said it is making several changes to its systems following the outage, including fixing the “race condition scenario” that caused the two automated systems to overwrite each other’s work.

AWS has disabled the DynamoDB DNS Planner and DNS Enactor automation globally until safeguards can be put in place to prevent the race condition from recurring.

Amazon also said it will build an additional test suite to help detect similar bugs in the future and improve throttling mechanisms.

“As we continue to work through the details of this event across all AWS services, we will look for additional ways to avoid impact from a similar event in the future, and how to further reduce time to recovery,” said Amazon.

“We know this event impacted many customers in significant ways,” Amazon said. “We will do everything we can to learn from this event and use it to improve our availability even further.”


No. 5: Former AWS Executive Says It Was ‘Inevitable’

Former AWS top executive Debanjan Saha told CRN that the AWS outage was “inevitable.”

“Given their massive global scale and the complexity of these distributed systems, it’s actually remarkable that large-scale disruptions like this are as rare as they are,” Saha said in an email to CRN.

AWS’ outage was “inevitable over a long enough horizon,” he said, as both public cloud and private cloud providers will eventually experience an outage.

“The question is not if, but when,” Saha added.

Saha was vice president and general manager of AWS’ database business from 2014 until 2019, when he jumped to competitor Google Cloud as vice president and general manager for Google’s data analytics business. He became CEO of DataRobot in 2022.

“Every business that relies on cloud infrastructure should have a clear strategy for resiliency,” Saha said. “That means thinking beyond a single data center or region, and ideally beyond a single provider, building for multi-region—and where possible—multi-cloud or hybrid environments.”
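As a rough illustration of the multi-region idea Saha describes, the Python sketch below tries an operation against an ordered list of regions and returns the first success. Everything in it is hypothetical: `call_with_failover` and `fake_read` are made-up names, the region list is an example, and the outage is simulated rather than driven by a real client library.

```python
from typing import Callable, Sequence


def call_with_failover(regions: Sequence[str], operation: Callable[[str], str]) -> str:
    """Try the operation in each region in order; return the first success."""
    errors = []
    for region in regions:
        try:
            return operation(region)
        except ConnectionError as exc:
            errors.append(f"{region}: {exc}")
    raise ConnectionError("all regions failed: " + "; ".join(errors))


def fake_read(region: str) -> str:
    # Hypothetical stand-in for a real client call; us-east-1 is simulated as down.
    if region == "us-east-1":
        raise ConnectionError("DNS resolution failed")
    return f"served from {region}"


result = call_with_failover(["us-east-1", "us-west-2"], fake_read)
print(result)
```

A fallback like this only helps if the data and services needed are already replicated to the second region, which is the harder part of any resiliency strategy.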

Amazon is set to report its third-quarter 2025 financial results on Oct. 30.


