An ongoing heatwave in the United Kingdom has led to Google Cloud and Oracle Cloud outages after cooling systems failed at the companies' data centers.

For the past week, the United Kingdom has suffered a record-breaking heatwave, with stifling temperatures throughout the region.

Today, with temperatures reaching a record-breaking 40.2 degrees Celsius (104.4 degrees Fahrenheit), cooling systems at data centers used by Google and Oracle to host their cloud infrastructure have begun to fail.

To prevent permanent damage to hardware components, which would cause a prolonged outage, both Google and Oracle have shut down equipment, leading to outages in their cloud services.

Oracle was the first to be affected, with the company reporting a cooling failure at approximately 11:30 AM EST today, causing "non-critical hardware" to be powered down.

"As a result of unseasonal temperatures in the region, a subset of cooling infrastructure within the UK South (London) Data Centre experienced an issue. This led to a subset of our service infrastructure needed to be powered down to prevent uncontrolled hardware failures," reads an Oracle Cloud status message that appears to have been first spotted by The Register.

"This step has been taken with the intention of limiting the potential for any long term impact to our customers."

However, even with only non-critical hardware powered off, Oracle states that customers in this zone may be unable to access their Oracle Cloud Infrastructure resources.

Almost two hours later, Google also reported cooling failures in one of their buildings hosting the europe-west2-a zone for region europe-west2.

"There has been a cooling related failure in one of our buildings that hosts zone europe-west2-a for region europe-west2. This caused a partial failure of capacity in that zone, leading to VM terminations and a loss of machines for a small set of our customers," reads the Google Cloud incident report.

"We're working hard to get the cooling back online and create capacity in that zone. We do not anticipate further impact in zone europe-west2-a and currently running VMs should not be impacted. A small percentage of replicated Persistent Disk devices are running in single redundant mode."

"In order to prevent damage to machines and an extended outage, we have powered down part of the zone and are limiting GCE preemptible launches. We are working to restore redundancy for any remaining impacted replicated Persistent Disk devices."

As with Oracle, this cooling failure is disrupting Google Cloud customers, with virtual machines being terminated, machines becoming unreachable, and Persistent Disk devices running in single redundant mode.

Both companies report that they do not expect any further impact as they work to bring cooling systems back online.

Cooling systems restored

Both Google and Oracle have resolved the cooling issues in their data centers, with service restored for Google on Tuesday and Oracle on Wednesday.

Google restored their services Tuesday night at 11:45 PM EST, with the following final status update.

"There was a cooling related failure in one of our buildings that hosts a portion of capacity for zone europe-west2-a for region europe-west2 that is now resolved. GCE, Persistent Disk and Autoscaling impacts have been addressed. Customers can launch VMs in all zones of europe-west2. A small number of HDD backed Persistent Disk volumes are still experiencing impact and will exhibit IO errors. If you are continuing to experience issues with these services, please contact Google Cloud Product Support and reference this message."

Oracle took a little longer to restore cooling, with services restored Wednesday at 7:00 AM EST. The company gave the following explanation of the failure.

"Following unseasonably high temperatures in the UK South (London) region, two cooler units in the data center experienced a failure when they were required to operate above their design limits. As a result, temperatures in the data center began to climb causing a subset of Compute infrastructure to go into protective shut down."

Update 7/20/22 4:55 PM EST: Added updates on cooling system issues.
