Ptechhub

Google Cloud, Cloudflare Apologize For Massive Outage

CRN by CRN
June 16, 2025


‘We deeply apologize for the impact this outage has had,’ Google Cloud says.

Google Cloud and Cloudflare have apologized for the cloud outage that also appeared to take down multiple popular websites and applications, including Spotify and Discord.

The Mountain View, Calif.-based cloud and artificial intelligence giant said that it will work on addressing a Google Cloud customer monitoring infrastructure failure that happened during the outage and left users without incident signals or an understanding of how the outage affected their business.

“We deeply apologize for the impact this outage has had,” according to the Google incident report. “Google Cloud customers and their users trust their businesses to Google, and we will do better. We apologize for the impact this has had not only on our customers’ businesses and their users but also on the trust of our systems. We are committed to making improvements to help avoid outages like this moving forward.”


Google Cloud Outage

Dawn Sizer, CEO of Mechanicsburg, Pa.-based solution provider 3rd Element Consulting, said that her clients noticed some commercial websites weren’t working but did not experience any interference with their business.

“A blip,” Sizer said. “That’s what it boiled down to.”

Thomas Kurian, CEO of Google Cloud, posted on social media network X that “we regret the disruption this caused our customers.”

Cloudflare also experienced broad service outages on June 12 and published a report on what happened. Although a Cloudflare spokesperson pointed multiple media outlets to Google Cloud as the source of the failure during the outage, the official Cloudflare report says only that the infrastructure that failed is backed in part “by a third-party cloud provider” that experienced an outage the same day.

Cloudflare confirmed that the outage was not due to an attack or security event. No data was lost because of the outage. Its Magic Transit and Magic WAN, DNS, Cache, proxy, WAF and related services didn’t experience direct impacts.

“We’re deeply sorry for this outage: this was a failure on our part,” according to Cloudflare’s report. “While the proximate cause (or trigger) for this outage was a third-party vendor failure, we are ultimately responsible for our chosen dependencies and how we choose to architect around them. … This was a serious outage, and we understand that organizations and institutions that are large and small depend on us to protect and/or run their websites, applications, zero trust and network infrastructure. Again we are deeply sorry for the impact and are working diligently to improve our service resiliency.”

Google Cloud Crash Loop

Google’s report on the incident puts the start time at 10:51 a.m. Pacific June 12 and end time at 6:18 p.m. the same day.

The issue dates back to a new feature added to Service Control, the core binary in the check system that makes sure application programming interface (API) requests are authorized and carry the appropriate policies to reach their endpoints.

On June 12, a policy change containing unintended blank fields was inserted into the regional Spanner tables that Service Control uses for policies, and the blank fields sent the binary into a crash loop. Google site reliability engineers (SREs) found the root cause within 10 minutes and turned off the policy serving path. The rollout that disabled the path completed within 40 minutes of the incident start time, and Google started seeing recovery across its regions.

In larger regions, such as us-central1, which includes Iowa, Service Control task restarts overloaded the infrastructure. Service Control didn’t have the right randomized exponential backoff to avoid the problem. The us-central1 region took almost three hours to fully resolve, and Google throttled task creation to minimize infrastructure impact. Because Cloud Service Health infrastructure also failed during the outage, the first public incident report did not appear on Google’s status page for about an hour.
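Randomized exponential backoff is the standard way to keep many restarting tasks from hitting shared infrastructure at the same moment. The following is a minimal Python sketch of the general technique, not Google's implementation; the function names, the "full jitter" variant and the default parameters are all assumptions chosen for illustration.

```python
import random
import time


def backoff_delays(retries, base=1.0, cap=60.0, rng=None):
    """Yield one randomized delay in seconds for each retry."""
    rng = rng or random.Random()
    for attempt in range(retries):
        # Exponential ceiling: base, 2*base, 4*base, ... limited by cap.
        ceiling = min(cap, base * (2 ** attempt))
        # Full jitter: pick uniformly in [0, ceiling] so that tasks which
        # failed together do not all retry together.
        yield rng.uniform(0.0, ceiling)


def call_with_backoff(operation, attempts=5, base=1.0, cap=60.0,
                      sleep=time.sleep, rng=None):
    """Call operation(), retrying with randomized exponential backoff."""
    delays = list(backoff_delays(attempts - 1, base, cap, rng))
    for attempt in range(attempts):
        try:
            return operation()
        except Exception:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            sleep(delays[attempt])
```

Without the random component, every task that failed at the same time would also retry at the same time, which recreates the overload on each round.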

Moving forward, Google plans to modularize Service Control’s architecture to isolate the functionality and “fail open”–that is, default to an accessible state if a future failure happens.
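To make the “fail open” idea concrete, here is a small Python sketch under stated assumptions: the policy shape, the `PolicyError` type and the function names are hypothetical and are not taken from Service Control. It shows a check that still denies requests a valid policy forbids, but allows requests when the policy data itself is unusable, instead of crashing.

```python
class PolicyError(Exception):
    """Raised when policy data is malformed, for example a blank field."""


def evaluate_policy(policy, request):
    # Hypothetical policy shape: {"allowed_methods": ["GET", ...]}.
    methods = policy.get("allowed_methods") if policy else None
    if methods is None:
        raise PolicyError("policy is missing or has a blank allowed_methods field")
    return request["method"] in methods


def check_request(policy, request, fail_open=True):
    """Return True if the request may proceed."""
    try:
        return evaluate_policy(policy, request)
    except PolicyError:
        # The check itself is broken. Failing open lets traffic through;
        # failing closed would reject every request until the data is fixed.
        return fail_open
```

Failing open trades strictness for availability, so it suits checks where a brief lapse in enforcement is less harmful than a global outage.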

According to the vendor, Google will also audit all systems that consume globally replicated data; require that all changes to critical binaries be protected by feature flags and disabled by default; improve static analysis and testing practices so that errors are handled correctly and systems fail open if needed; ensure systems use randomized exponential backoff; improve automated and human external communications to keep users informed; and make sure monitoring and communication infrastructure remains operational even when Google Cloud and primary monitoring products are down.
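Feature flag protection that is disabled by default means new code ships dark and only runs where someone has explicitly switched it on. The Python sketch below illustrates that pattern in general terms; the `FeatureFlags` class, the flag name and the region names are hypothetical and do not describe Google's tooling.

```python
class FeatureFlags:
    """In-memory flag store; any flag that was never enabled reads as off."""

    def __init__(self):
        self._enabled = set()  # (flag name, region) pairs that are switched on

    def enable(self, name, region):
        self._enabled.add((name, region))

    def disable_everywhere(self, name):
        # Kill switch: turn the flag off in every region at once.
        self._enabled = {pair for pair in self._enabled if pair[0] != name}

    def is_enabled(self, name, region):
        return (name, region) in self._enabled


def handle_request(request, flags, region):
    # The new code path runs only where the flag was explicitly enabled.
    if flags.is_enabled("new_policy_check", region):
        return "new:" + request
    return "legacy:" + request
```

Enabling the flag one region at a time limits how far a faulty code path can spread, and the kill switch gives operators a fast way back to the legacy path.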

Cloudflare Works To Improve Resiliency

Cloudflare, meanwhile, said in its own report that its Workers KV key-value data store saw the failure of underlying storage infrastructure that is backed in part by Google Cloud. Workers KV “is a critical dependency for many Cloudflare products and relied upon for configuration, authentication and asset delivery across the affected services.”

Cloudflare put its outage at about two-and-a-half hours. Workers KV saw about 90 percent of requests fail, although data stored in Workers KV was not affected.

The incident started June 12 at 17:52 Coordinated Universal Time (UTC). The impact ended the same day at 20:28. Moving forward, Cloudflare plans to prevent singular dependencies on third-party storage infrastructure to improve recovery.
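The two reports use different time zones, so the timelines are easier to compare after conversion. The short Python sketch below works through the arithmetic using only the times quoted in this article; the one assumption is that “Pacific” means Pacific Daylight Time (UTC-7), which is in effect in June.

```python
from datetime import datetime, timedelta, timezone

# Assumption: "Pacific" in June is Pacific Daylight Time, UTC-7.
PDT = timezone(timedelta(hours=-7), "PDT")

google_start = datetime(2025, 6, 12, 10, 51, tzinfo=PDT)
google_end = datetime(2025, 6, 12, 18, 18, tzinfo=PDT)
cloudflare_start = datetime(2025, 6, 12, 17, 52, tzinfo=timezone.utc)
cloudflare_end = datetime(2025, 6, 12, 20, 28, tzinfo=timezone.utc)

google_duration = google_end - google_start              # 7 hours 27 minutes
cloudflare_duration = cloudflare_end - cloudflare_start  # 2 hours 36 minutes

# Cloudflare's start time expressed in Pacific time: 10:52 a.m.
cloudflare_start_pacific = cloudflare_start.astimezone(PDT)

# Gap between the two reported start times: 1 minute.
start_gap = cloudflare_start - google_start
```

On these figures, Cloudflare's incident began one minute after the start time in Google's report, and its 2 hours 36 minutes of impact is consistent with the company's “about two-and-a-half hours.”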

Cloudflare will improve Workers KV storage infrastructure redundancy, implement “short-term blast radius remediations” to make each product resilient to loss of service caused by a single point of failure, and implement tooling to progressively re-enable namespaces during storage infrastructure incidents.

“This list is not exhaustive,” Cloudflare said in the report. “Our teams continue to revisit design decisions and assess the infrastructure changes we need to make in both the near (immediate) term and long term to mitigate the incidents like this going forward.”


