Cooling 400+ W boards as Deep Packet Inspection demands (and others) balloon
In the latest article in this series, David looks at the wider context of cooling 400+ W boards. How does it affect everyone in the AdvancedTCA ecosystem?
AdvancedTCA has two technology pushes: increased throughput and increased processing needs. Shelf vendors are anticipating 40 Gbps boards. Faster throughput in itself requires an increase in processor performance, which is met by using multi-threaded, multicore devices. An increasing demand for Deep Packet Inspection (DPI) requires even more processing power. For processors, higher power means increased heat dissipation and the need for increased cooling.
In this technology push it may well be that the limiting technology is not the backplane or the processor boards but the power distribution and cooling subsystems.
Boards in context
The design techniques needed to cool 400 W boards have been discussed in earlier columns in this series (see www.compactpci-systems.com and search “David Wright”), and published tools and techniques are available to ensure good results. Consider, however, that these techniques also affect the design and maintenance of existing legacy boards. Legacy boards designed to work in a 200 W environment, where the airflow operating pressures are low, may well cause cooling system problems when fitted alongside high-power processor boards. The relationship between airflow and pressure (using the Communications Platforms Trade Association [CP-TA] B4 example) is quadratic: the pressure drop rises with the square of the airflow. See Figure 1. The result is that a 200 W board running alongside a 400 W board would take 74 CFM at a pressure of nearly 1". That is far more air than the 200 W board needs, in a world where cooling air is a valuable resource.
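The quadratic relationship can be sketched in a few lines. This is a minimal illustration, not the CP-TA B4 curve itself: it assumes a pure square-law impedance (pressure = k × flow²), fits k to the single operating point quoted above (taking "nearly 1"" as exactly 1.0), and uses hypothetical function names.

```python
import math

def impedance_coefficient(flow_cfm, pressure_in):
    """Fit k in pressure = k * flow**2 to one operating point (assumed square law)."""
    return pressure_in / flow_cfm ** 2

def flow_at_pressure(k, pressure_in):
    """Airflow in CFM that a board with coefficient k passes at a given pressure."""
    return math.sqrt(pressure_in / k)

# Operating point from the text: 74 CFM at roughly 1" (taken as 1.0 here).
k_board = impedance_coefficient(74.0, 1.0)

# The same board at lower shelf pressures passes much less air.
for pressure in (0.15, 0.5, 1.0):
    print(f'{pressure:.2f}" -> {flow_at_pressure(k_board, pressure):.1f} CFM')
```

Under this assumption, quadrupling the shelf pressure only doubles the airflow through a slot, which is why a low-power board in a high-pressure shelf takes far more air than it needs unless its impedance is raised.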
The baffle requirements of PICMG 3.0 REQ 5.10 to REQ 5.15 need to be strengthened and upgraded to allow interoperability between low-dissipation boards needing low airflow and high-dissipation boards needing high airflow.
Shelf in context
Board cooling is a shelf requirement, but it is not the only area where 400 W boards impact shelf design. Power needs to be supplied to the boards and to the Rear Transition Module (RTM), if fitted. At 40 V, the current needed by a double-width board can be as high as 23 A. This is within the capability of a qualified Zone 1 connector, but it does require the backplane to have sufficient current-carrying capacity.
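The current figures follow from treating each board as a constant-power load, so the lowest supply voltage is the design case. The sketch below is illustrative only: the 48 V nominal value is an assumption added for comparison, and the function name is hypothetical.

```python
def supply_current_a(power_w, voltage_v):
    """Current in amperes drawn by a constant-power load at a given voltage."""
    if voltage_v <= 0:
        raise ValueError("voltage must be positive")
    return power_w / voltage_v

LOW_LINE_V = 40.0   # low-voltage condition used in the text
NOMINAL_V = 48.0    # assumed nominal supply, for comparison only

# A single 400 W board at low line and at the assumed nominal voltage.
print(supply_current_a(400.0, LOW_LINE_V))   # 10.0
print(supply_current_a(400.0, NOMINAL_V))

# Power implied by the 23 A figure quoted for a double-width board at 40 V.
print(23.0 * LOW_LINE_V)                     # 920.0
```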
There is a second power requirement around the fans themselves. Our consultancy’s first AdvancedTCA shelf design, providing 30 CFM at 0.15", required 120 W of fan power. Our current design, which provides 100 CFM at 1.2", requires a maximum fan power of 1200 W. To enable the fans to run at full speed during a supply drop to 40 V, the shelf must deliver 30 A to the fan trays. The supply current path to the fan tray assemblies requires good power design practice.
With efficient fans most of this energy is used to move the air, but even with 65 percent efficiency there is still 420 W of additional heating we must herd out of the shelf. This additional heat source needs to be included in the shelf thermal design process.
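The fan heating figure is simple arithmetic on the numbers above. As a rough sketch, assuming that everything not converted to air movement appears as heat inside the shelf (the function name and the alternative efficiencies are illustrative only):

```python
def fan_waste_heat_w(fan_power_w, efficiency):
    """Fan input power that is not converted to air movement, in watts."""
    if not 0.0 <= efficiency <= 1.0:
        raise ValueError("efficiency must be between 0 and 1")
    return fan_power_w * (1.0 - efficiency)

MAX_FAN_POWER_W = 1200.0  # maximum fan power quoted in the text

# 65 percent is the efficiency used in the text; the others are for comparison.
for efficiency in (0.50, 0.65, 0.80):
    heat = fan_waste_heat_w(MAX_FAN_POWER_W, efficiency)
    print(f"{efficiency:.0%} efficient fans: {heat:.0f} W of extra heat")
```

At 65 percent efficiency this reproduces the 420 W figure, roughly the dissipation of one more high-power board.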
High fan power levels bring EMC problems that call for innovative solutions. Fans are commutated devices, so soft-commutating fans should be selected to reduce EMI. Fan trays require effective common-mode and differential-mode EMC filtering. Note, though, that EMC threats are best dealt with at source, and spread-spectrum techniques in the fan power supplies can be used to reduce fan EMI levels.
Airflow specifications for both boards and shelves may well have been written in some cases by the marketing department, not by engineers or test technicians. The information is often inadequate and misleading and, surprisingly, is sometimes treated as a trade secret. PICMG, SCOPE Alliance, and CP-TA have endeavored to write realistic specifications for what is becoming a moving target. The AdvancedTCA community needs to revisit this area and rewrite the requirement and test specifications to avoid the specmanship and secrecy that seem to exist around the board and shelf pressure-flow curves.
System integration of 400 W boards in context
Telecoms and network equipment providers add value at the integration phase. This is where a supplier’s intellectual property generates revenue and profit. The AdvancedTCA ecosystem comprises ready-developed products that form the basic product platform. The need to integrate these diverse products into the required platform is an unnecessary cost and time-to-market impact for the integrator. The OpenSAF and Service Availability Forum (SA Forum) organizations advocate the use of Hardware Platform Interface (HPI) tools to aid in managing shelf resources. HPI provides for cooling management above that of the Shelf Manager, and hooks can be added for the Application Interface Specification (AIS) to signal that a heavier processor load is coming that could need increased cooling, without waiting for thermal events.
Integration prior to loading applications is less well served where boards with different cooling needs are involved.
Table 1 shows the relationship between board power and the cooling airflow and pressure. The majority of shelf products with adequate plena only work at a single pressure. There are always small differences in the actual slot airflows, but in general, if a single board is running at 400 W and all the boards have B4 impedances, then all boards will receive 60 CFM whether they need it or not. This is not a problem for the board vendor or even the shelf supplier, but with airflow as a limited engineering resource, it makes life difficult for the systems integrators. As part of a specification review, there is a need to allow the system integrator to set board airflow impedances to an optimum value.
Cool application-ready platforms
Where should this pre-application integration of both cooling and hardware take place? The answer until now has been at the integrator’s site. Is there a need to outsource the application-ready process, not just for the Tier 2 and 3 providers but also for Tier 1 companies? With the time and monetary costs of revalidating cooling being high, a tendency to remain tied to a limited set of approved suppliers is understandable. However, this approach undermines the economic benefits of a robust ecosystem.
We would suggest that there is a need for application-ready shelves that cover every aspect of pre-integration. By this we mean the cooling task itself, IPMI and electronic keying, HPI access and structure, and the ongoing management of cooling airflow balance and capability.
Improving the cooling infrastructure
The following are a few thoughts on improving the cooling infrastructure. Some are practical and some fanciful (or maybe not).
The present systems management paradigm, through the shelf manager and the system’s HPI and AIS, is focused on maintaining the level of system performance. Current algorithms could introduce a decision tree that looks at the environmental issues before increasing system and processing power and hence enlarging the cooling demand.
Machine efficiency is defined as Useful Power Output/Total Power Input. Applying this rigorous definition to AdvancedTCA systems, and taking the useful power output as the optical transducer power (a few watts at most) against our 400 W per board system, shows efficiencies around 0.1 percent to 0.2 percent. Our definition of useful work probably cannot be quantified. A more useful analysis of Cooling System Efficiency would be to look at the ratio of power dissipated in the payload boards to total system power (or the ratio of fan power dissipation to total system power). Even so, the efficiencies will be seen to be low. The heat produced by a shelf can be defined as low-grade heat. This heat needs to be moved out of the local environment, which further degrades the system efficiency when the power input needed to run the external cooling systems is factored in.
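The ratios described above can be made concrete with a small calculation. Every input below except the 400 W board power and the 1200 W maximum fan power is a hypothetical value chosen for illustration: a 14-slot shelf and 7 W of optical output are assumptions, not figures from the text.

```python
def ratio_percent(part_w, total_w):
    """Return part/total as a percentage."""
    if total_w <= 0:
        raise ValueError("total power must be positive")
    return 100.0 * part_w / total_w

BOARD_POWER_W = 400.0   # per-board power from the text
FAN_POWER_W = 1200.0    # maximum fan power from the text
SLOTS = 14              # hypothetical slot count
OPTICAL_OUT_W = 7.0     # hypothetical "few watts" of optical output

payload_w = SLOTS * BOARD_POWER_W
total_w = payload_w + FAN_POWER_W

print(f"machine efficiency: {ratio_percent(OPTICAL_OUT_W, total_w):.2f} %")
print(f"payload share:      {ratio_percent(payload_w, total_w):.1f} %")
print(f"fan share:          {ratio_percent(FAN_POWER_W, total_w):.1f} %")
```

With these assumed inputs the machine efficiency lands near the low end of the range quoted above, and the fans account for a noticeable fraction of total shelf power before any external cooling plant is counted.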
Much development effort worldwide has been exerted on extracting useful energy from low-grade heat. Efficiencies are very low when considering the useful electrical output. However, one side effect of these heat engines is cooling. If we rearrange the definitions and compare against the 10 kW of electrical power not expended on cooling plants, the cooling efficiencies can be seen to be very high.
At present we (seem to) have three alternative electronic cooling mechanisms: air, liquid, and solids (conduction). There is a wide range of choices for conduction cooling, a range that includes many metals as well as some ceramics. Liquid cooling also provides choices, but when it comes to gas convection cooling, natural or forced, we use air. The most obvious reason is that it exists all around us and costs nothing.
Equation 1 shows a commonly used formula for deciding the airflow needed for cooling.
This formula originates from air conditioning design, where the focus is on maintaining proper air temperature. In electronics cooling, we are concerned with maintaining proper component temperature. The formula is based on definitions of standard air (which differ between ASHRAE and PICMG). Unfortunately, standard air only exists in a few places as a statistical event.
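One widely quoted form of such a formula, which may differ in detail from Equation 1, is airflow (CFM) ≈ 1.76 × power (W) / air temperature rise (°C). It follows from the energy balance Q = P / (ρ × cp × ΔT). The sketch below assumes air properties close to standard air (density 1.2 kg/m³, specific heat 1005 J/(kg·K)); these values and the function name are assumptions for illustration, and the density argument shows how the answer shifts when the air is not standard.

```python
M3_PER_S_TO_CFM = 2118.88   # cubic metres per second to cubic feet per minute
CP_AIR_J_PER_KG_K = 1005.0  # assumed specific heat of air
RHO_STANDARD_KG_M3 = 1.2    # assumed density of standard air

def required_airflow_cfm(power_w, delta_t_c, density_kg_m3=RHO_STANDARD_KG_M3):
    """Airflow needed to carry power_w away with an air temperature rise of delta_t_c."""
    if delta_t_c <= 0 or density_kg_m3 <= 0:
        raise ValueError("temperature rise and density must be positive")
    flow_m3_per_s = power_w / (density_kg_m3 * CP_AIR_J_PER_KG_K * delta_t_c)
    return flow_m3_per_s * M3_PER_S_TO_CFM

# Coefficient for 1 W and a 1 degree C rise: close to the familiar 1.76.
print(round(required_airflow_cfm(1.0, 1.0), 2))

# A 400 W board with a 10 degree C air temperature rise, in standard air
# and in thinner air (a hypothetical density of 1.0 kg/m^3).
print(round(required_airflow_cfm(400.0, 10.0), 1))
print(round(required_airflow_cfm(400.0, 10.0, density_kg_m3=1.0), 1))
```

Note that this gives the airflow needed to hold the air temperature rise, not the component temperature, which is the limitation the text points out.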
Time to mix it up?
Air is a gas that cools by providing a heat transfer mechanism to sensitive components based on its mass, volume (density), thermal conductivity, and thermal capacity. Other gases can cool much more efficiently. For instance, helium gas has nearly the same heat capacity per unit mass as water, although it is much more difficult to handle. Some of the refrigerant gases can double cooling capacity. Even pressurizing regular air can increase its heat capacity for cooling by 20 to 30 percent. There may be mileage in an investigation of convection cooling using mixed gases.
David Wright has been involved in AdvancedTCA since 2001, and in practical electronics and mathematical modeling for much longer. He worked for 35 years for multinational organizations including MoD UK, GEC Ltd., MEL (a division of Philips NV), CTS Corp., and Hybricon, Inc. David operated as a consultant for nine years before co-founding Advanced Platforms Ltd. in Israel (previously Wickenby Ltd.). David functions as the AdvancedTCA Systems Infrastructure Architect. The co-founder, Kevin Gyllenberg, provides embedded software and systems integration expertise. Advanced Platforms Ltd. provides consultancy, design services, and third-generation AdvancedTCA products.
 Including Advanced Platforms Ltd. www.advanced-platforms.com/40plusGnow
 One advantage of this airflow capacity is that the fans run very quietly under normal conditions.
 SAAM can help here. See www.advanced-platforms.com/saam.php