Flammable clouds — part deux: What happened at OVH?

Mike Puchol
Mar 17, 2021
Credit: Sapeurs-Pompiers du Bas-Rhin

After my recent article about the fire at the OVH data center in Strasbourg, which destroyed an entire building and thousands of servers and seriously damaged a second building, I went looking for more information about the design of data centers (DCs) that could explain how the entire structure of SBG2 became fully involved in a matter of minutes.

Before I begin, I would like to commend OVH on being very open and transparent with information about the event, and ongoing efforts, as they get themselves and their customers back into service — stay strong!

Designed for green

In a remarkable and necessary effort by most large-scale DC operators, facilities are becoming more sustainable, particularly from an energy consumption point of view. Servers stacked in a DC consume vast amounts of power: OVH’s SBG2, at a maximum capacity of 30,000 servers consuming 100 W each, clocks in at 3 megawatts just to keep the machines up. Baxtel lists the site at 2.0 MW, which would indicate the DC was not running at full capacity, but that is still enough to power some 1,400 average homes.
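The arithmetic behind these figures can be checked with a short sketch. It only uses the numbers quoted above; the function names are made up for illustration, and the "load fraction" assumes the Baxtel figure and the 100 W per server estimate describe the same kind of load, which may not be the case.

```python
# Back-of-the-envelope check of the power figures quoted in the text.
SERVERS_MAX = 30_000       # maximum server capacity cited for SBG2
WATTS_PER_SERVER = 100     # per-server draw assumed in the article
REPORTED_MW = 2.0          # site power figure attributed to Baxtel
HOMES_POWERED = 1_400      # "average homes" figure from the article


def full_capacity_mw(servers: int, watts_each: float) -> float:
    """Total draw in megawatts if every server slot is occupied."""
    return servers * watts_each / 1_000_000


def implied_load_fraction(reported_mw: float, capacity_mw: float) -> float:
    """Share of the full-capacity draw that the reported figure represents."""
    return reported_mw / capacity_mw


def implied_kw_per_home(reported_mw: float, homes: int) -> float:
    """Average household draw implied by the 'homes powered' comparison."""
    return reported_mw * 1_000 / homes


capacity_mw = full_capacity_mw(SERVERS_MAX, WATTS_PER_SERVER)
print(f"Full capacity: {capacity_mw:.1f} MW")
print(f"Implied load fraction: {implied_load_fraction(REPORTED_MW, capacity_mw):.0%}")
print(f"Implied draw per home: {implied_kw_per_home(REPORTED_MW, HOMES_POWERED):.2f} kW")
```

In other words, 2.0 MW is roughly two thirds of the 3 MW full-capacity estimate, and the homes comparison works out to a little over 1.4 kW of continuous draw per household.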

A by-product of each server operating is heat, which must be carried away to avoid hardware malfunctions. Traditionally, this is achieved with massive air-conditioning units that pump cold air into the equipment rooms, and HVAC systems that extract the heated air from the facility.

Modern DCs employ alternative methods of cooling; in OVH’s case, these are water-based systems that draw heat away from each server into a heat exchanger. In addition, the DC building is designed with natural ventilation characteristics, as is explained in one of their videos:

Credit: OVH YouTube channel

Of particular interest is how the DC is designed to let air in via side vents, pass it through the infrastructure, and let it exit, once heated, through a vertical central column:

OVH design for natural air cooling. Credit: OVH YouTube channel

We can see these vents very clearly in photos of SBG2 before the fire:

OVH SBG2 before the fire. Source: Google Street View.
3D view of SBG2, showing the hollow central column. Source: Google Maps 3D view.

We can see how the natural air cooling worked, by allowing outside air into the structure via the colored vents, and out, once heated, through the two central openings.

Does this remind you of anything?

Yes, a fireplace. It takes an ignition source (a match), fuel (wood), and oxygen, and gives off that awesome glow and warmth we all love on a cold winter night.

The hot air and combustion products rise up the chimney, causing a drop in pressure. This pressure drop, in turn, causes the higher pressure air in the room to be drawn into the inlet of the fireplace, providing more oxygen to the combustion process, and increasing the heat release rate.
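The pressure difference that drives this draft (the "stack effect") can be estimated from the density difference between cold outside air and hot gases inside a vertical shaft. The sketch below is a simplified, idealized calculation: it treats the hot gases as dry air at atmospheric pressure, and the height and temperatures are illustrative assumptions, not measurements from SBG2.

```python
G = 9.81            # gravitational acceleration, m/s^2
R_AIR = 287.05      # specific gas constant of dry air, J/(kg*K)
P_ATM = 101_325.0   # standard atmospheric pressure, Pa


def air_density(temp_k: float) -> float:
    """Ideal-gas density of dry air at atmospheric pressure, in kg/m^3."""
    return P_ATM / (R_AIR * temp_k)


def stack_pressure_pa(height_m: float, outside_k: float, inside_k: float) -> float:
    """Draft pressure across a column of the given height, in pascals."""
    return G * height_m * (air_density(outside_k) - air_density(inside_k))


# Hypothetical 10 m shaft with 10 °C outside air (283 K):
for inside_k in (313.0, 573.0, 873.0):   # warm exhaust air vs. fire gases
    draft = stack_pressure_pa(10.0, 283.0, inside_k)
    print(f"inside {inside_k:.0f} K -> draft of about {draft:.0f} Pa")
```

The hotter the gases and the taller the column, the stronger the draft, which is why a shaft that moves warm exhaust air gently in normal operation pulls far harder once it is filled with combustion gases.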

Heat release rate is a very important concept in modern firefighting, which we will see played a crucial role in the OVH fire.

An amateur fire investigator’s view of the OVH fire

The natural instinct of people looking at the aftermath of a fire is to focus on what got burned. However, it is just as important to take into consideration what did not burn.

Let’s start with this photo taken by a drone during the most active phase of the fire:

What is remarkable is how the gases are being vented from the top of the building, with almost no smoke or flames coming out of the sides. The thin side panels have been compromised and melted, creating larger holes that helped ventilate the fire.

A still from OVH’s video shows the interior of one of their sites being constructed:

We can see a metal frame with wooden structural elements and thin corrugated metal sheeting, combined with the air intake vents, which don’t appear to have any form of mechanical control that would allow the DC operators to shut them. As we all know, wood is more flammable than metal sheets, so unless it is well protected, it adds to the fuel load available to a fire once it starts.

Let’s take a look at post-fire photos of SBG2, and what we can infer from them:

Can you spot the things that did not burn? The air intake vents, and the fire escape staircase exit doors.

Starting at the vents, we can see that there is soot and discoloration on the top 3–4 blades, but the rest of the vent is almost intact. The thin side panels have, however, disintegrated, indicating very high temperatures.

The vents could have been made from a different, more heat-resistant material, but in addition, they became the source of oxygen for the combustion process inside the building. Some smoke and combustion gases exited from the top of each vent, but most of the vent area had cool air flowing in.

This video shows a high-speed replay of a single room going from ignition to flashover. At the 30-second mark you can clearly see the neutral plane, with gases venting out of the top of the door and fresh air being fed in through the bottom. As the heat release rate increases, the neutral plane lowers, and eventually the whole compartment becomes engulfed.

The doors that lead to the fire escape ladder were not damaged, except for some soot from smoke venting around the edges. The primary reason for this is that fire escape exit doors must have a fire resistance rating. Owen Roats provides a very good insight into different types of doors and their ratings. In essence, these doors must be at least FD30 rated, meaning they must resist fire for at least 30 minutes. As the fire progressed up the structure, the heat release was enough to compromise the thin side panels, but not the fire exit doors, or the air intake vents, which were self-cooling for some time.

Heat release rate and fuel load

I previously mentioned the concept of heat release rate. In order to get familiar with the basic concepts of heat release, flow paths, ventilation, and combustion, I recommend this training video by P.J. Norwood and Sean Gray, where they use a wooden dollhouse to explain the basics of fire behavior. It’s 15 minutes, but well worth the time if you want to learn more:

The second important concept to consider is the fuel load. A large warehouse holding a small amount of material with a high stored energy content can be more dangerous than a small room packed with barely flammable materials. Think a can of gasoline vs. ten pallets of plaster wall panels.

Thus, for a given fuel load, environmental conditions, and heat release rate, there is a minimum suppression capacity required in order to contain the fire. If you could get plaster wall boards to ignite, you could put them out with a toy water pistol, whereas you wouldn’t stand a chance against a pool of gasoline.

In traditional DC design, one would take into account the potential fuel load in a given compartment (rooms could be made smaller to reduce it) and the expected heat release rate given the environmental conditions and fuel load (the servers, cables, etc., plus a ventilated atmosphere which adds oxygen to a fire). One would then design the detection capabilities required to make sure a fire gets caught while it can still be contained, and the suppression system that could contain said fire, of course with ample safety margin to allow for eventualities.
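To make the link between heat release rate and detection time more concrete, here is a minimal sketch using the "t-squared" fire growth idealization that is common in fire engineering, where heat release grows as Q = α·t². This is an illustration added for this discussion, not an analysis of the SBG2 fire: the growth classes are the generic ones (defined by the time taken to reach about 1,055 kW), and the 5 MW suppression limit is a purely hypothetical number.

```python
import math

REFERENCE_HRR_KW = 1055.0   # reference heat release rate that defines the classes
GROWTH_TIME_S = {           # seconds needed to reach the reference rate
    "slow": 600,
    "medium": 300,
    "fast": 150,
    "ultra-fast": 75,
}


def growth_coefficient(growth_class: str) -> float:
    """Growth coefficient alpha in kW/s^2 for a named growth class."""
    return REFERENCE_HRR_KW / GROWTH_TIME_S[growth_class] ** 2


def hrr_kw(growth_class: str, seconds: float) -> float:
    """Heat release rate Q = alpha * t^2, in kW, after the given time."""
    return growth_coefficient(growth_class) * seconds ** 2


def seconds_to_reach(growth_class: str, target_kw: float) -> float:
    """Time for an idealized fire to grow to the target heat release rate."""
    return math.sqrt(target_kw / growth_coefficient(growth_class))


SUPPRESSION_LIMIT_KW = 5_000.0   # hypothetical capacity of a suppression system
for name in GROWTH_TIME_S:
    window = seconds_to_reach(name, SUPPRESSION_LIMIT_KW)
    print(f"{name:>10}: about {window / 60:.1f} min before the fire outgrows the system")
```

Because the growth is quadratic, doubling the growth speed halves the window: anything that pushes a fire towards the faster classes, such as more fuel or better ventilation, directly shrinks the time available to detect and suppress it.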

In a DC that is constructed out of wood and sheet metal, has natural ventilation openings all around, a large vertical shaft that acts as a chimney, and thousands of servers, cabling and supporting power infrastructure, all consuming some 2 MW of power, both your detection and suppression capabilities need to be sized far more conservatively, as any fire that starts will have more flow paths to grow along, more fuel to attack, and as a result, a significantly higher heat release rate.

The bottom line for OVH

It seems the current design of OVH’s DCs follows a similar approach, where cost and energy efficiency are paramount. Octave Klaba, OVH’s founder, has posted several photos on his Twitter profile which support the conclusion that OVH builds its data centers with the planet in mind, but also with very tight cost controls, which in turn allow it to offer services at very competitive price points:

We can see no red pipes that could carry water or inert gases, and no “technical” raised floors that could carry suppression systems in them.

These are all valid tradeoffs, if you are aware that as a business, you run the risk of loss to fire at some stage. I cannot possibly believe that the Strasbourg fire was a freak accident, in which a faulty UPS triggered the combustion of a large building and all its contents in a matter of minutes. There have been calculated decisions that balanced CAPEX, OPEX and margins, which led to compromises, which in the case of SBG2, led to its destruction.

OVH could do a lot more without spending a ton of cash, in order to make their DCs less vulnerable to similar events:

  • Run a full audit of fire safety on all their DCs, concentrating specifically on fuel load, fire spread parameters, and alarm/suppression capabilities.
  • Increase sensitivity and detection points of the fire alarm system.
  • Section the DCs into smaller compartments, with FR doors and walls between them.
  • Seal all gaps that fire can use to expand or draw oxygen from other areas, such as gaps to pass cable trays through walls.
  • Equip staff with SCBA and PPE and train them in their use, which would increase their safe exposure time in unfavourable environments and let them perform containment activities for longer.
  • Implement ventilation control mechanisms — even simple manually-activated trap doors that seal the inlet vents. A couple of staff members could have likely closed all the vents, moving from the ground floor upwards through to the top floor, and exited safely via the fire escape staircase.

The bottom line for OVH’s customers

If you are a customer of OVH, you need to think hard about your backup strategy, disaster recovery, and business continuity plans, if you are running any form of mildly business-critical or mission-critical systems.

This could happen tomorrow at another DC, if the fire load and ventilation configuration are similar, combined with similar fire alerting and suppression systems.

Thus, take into account that the convenience of not having to purchase your own metal, contract space at a DC, and install and operate everything yourself comes at a cost. You are paying low monthly fees for “metal as a service” only because someone else has figured out how to charge them while keeping their own business viable, and which corners they had to cut in order to do so.
