Delivering breakthrough performance in AdvancedTCA thermal design
Doug reviews the thermal management issues facing AdvancedTCA designers and then describes how a 14-slot AdvancedTCA chassis was designed to exceed the challenging thermal management standards set by the Communications Platforms Trade Association (CP-TA) Interoperability Compliance Document (ICD) Class B.4.
Most industry analysts agree that the growth in AdvancedTCA adoption has been slower than previously predicted. Two technical reasons for this have been cited: interoperability and scalability. Bandwidth scalability is being addressed with the emergence of 40 GbE enabled AdvancedTCA form factor backplanes, and one PICMG subcommittee is developing interconnect channel performance masks to ensure backwards compatibility to the PICMG 3.0 standard. However, thermal performance also needs to be able to scale to accommodate the power dissipation needs of higher speed backplane transceivers.
Achieving interoperability between shelves and blades from different vendors has also proven to be a challenge. The PICMG 3.0 Specification stipulates a 200 W per slot thermal dissipation requirement. When developing AdvancedTCA-compliant components, shelf suppliers have found achieving airflow uniformity across all slots to be a far from trivial problem, while blade suppliers have produced blades that, although AdvancedTCA-compliant, have widely varying airflow impedance characteristics. The emergence of 300 W blades for some applications has further complicated thermal management.
This interoperability problem has resulted from the fact that the PICMG AdvancedTCA specification stipulates requirements at a very high level, which leaves scope for some ambiguity in interpretation. The Communications Platforms Trade Association (CP-TA) was founded to address this issue.
The CP-TA's charter is to solve interoperability problems by defining hard limits to which AdvancedTCA components must adhere in order to be considered CP-TA compliant and therefore interoperable. These limits are described in the CP-TA's Interoperability Compliance Document (ICD). Together with a set of associated test procedures and a suite of prescribed test equipment, the ICD gives AdvancedTCA component suppliers a blueprint for developing interoperable blades and chassis.
In the thermal domain, the ICD ensures shelf and blade interoperability by defining maximum blade air flow impedance and classifying shelf performance into four categories based on volumetric airflow capacity, pressure drop, and uniformity per slot.
Framing the challenge
The commercial appeal of AdvancedTCA is based on multiple suppliers of shelves and blades competing and selling their products to third-party integrators who can easily configure, test, and ship systems using industry standard building blocks. The promise is that any blade should fit in any slot in any platform and operate satisfactorily.
The compelling combination of commercial competition and standard components is expected to drive prices down and volumes up.
This promise has yet to be realized. Interoperability between AdvancedTCA building blocks has proven to be a significant issue, and one primary reason for this lies in thermal performance characteristics of different blades and shelves from around the ecosystem. Blades have varying thermal impedances, and many shelves deliver nonuniform airflow distribution.
As a result, some slots in a shelf can cool a given blade type while other slots in the same shelf cannot, and shelves from one supplier might provide sufficient air while shelves from another do not. Certain configurations then end up with thermal hotspots, which ultimately lead to long-term reliability problems and make interoperability nothing more than a visionary notion.
Engineering a solution starts with understanding what is causing these problems. The issues are summarized in Table 1.
Airflow capacity is a function of the fan selected for the task. By understanding how much power will be dissipated per slot and across the system, it is straightforward to calculate how much bulk airflow is needed to hold a given temperature rise from inlet to outlet. This determines the ideal fan performance required.
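The bulk airflow calculation mentioned above can be sketched with the standard sensible-heat relation, flow = power / (density x specific heat x temperature rise). This is a minimal illustration, not the method used for the chassis described here: the air properties (roughly sea level, about 20 °C) and the 10 °C rise are assumptions, and the function name is hypothetical. Only the 200 W per slot figure comes from the text.

```python
# Assumed air properties (not from the article): roughly sea level, about 20 °C.
AIR_DENSITY_KG_M3 = 1.2
AIR_SPECIFIC_HEAT_J_KG_K = 1005.0
M3_PER_S_TO_CFM = 2118.88  # 1 m^3/s is about 2118.88 ft^3/min


def required_airflow_cfm(power_w: float, delta_t_c: float) -> float:
    """Bulk airflow (CFM) needed to remove power_w with an inlet-to-outlet rise of delta_t_c."""
    if delta_t_c <= 0:
        raise ValueError("temperature rise must be positive")
    flow_m3_s = power_w / (AIR_DENSITY_KG_M3 * AIR_SPECIFIC_HEAT_J_KG_K * delta_t_c)
    return flow_m3_s * M3_PER_S_TO_CFM


if __name__ == "__main__":
    # 200 W per slot is the PICMG 3.0 figure; the 10 °C rise is illustrative only.
    per_slot = required_airflow_cfm(200.0, 10.0)
    print(f"per slot: {per_slot:.1f} CFM, 14 slots: {14 * per_slot:.0f} CFM")
```

Under these assumptions a 200 W slot needs roughly 35 CFM for a 10 °C rise. Halving the allowed rise doubles the required airflow, which is why higher-power blades drive fan requirements up so quickly.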
Fans use rotating impeller blades to suck air in and push it out at high pressure. Unfortunately the rotating impellers cause the air to rotate, and if not controlled effectively, the rotating air from a bank of fans forms eddies of negative pressure, which causes peaks and troughs in airflow across the system. This phenomenon is known as fan swirl and is illustrated in Figure 1.
Airflow impedance of shelf elements
The mechanical design of the shelf (inlets, filters, card guides, exhausts, backplane, and the like) introduces impedances for the airflow, resulting in airflow pressure drop. The mechanical design of the shelf must be optimized to minimize these impedances.
Without careful management these elements can combine to create a nonuniform airflow distribution. A typical industry example is shown in Figure 2.
Blade airflow thermal impedance characteristics
The ability to cool any blade is not only determined by the volume of air available from the chassis for cooling. It also depends upon the airflow impedance of that blade. When designing an AdvancedTCA blade, ideally the end profile of the blade should be uniform, and power hungry components requiring heatsinks should be carefully distributed across the card to maximize exposure to the airflow.
However, even when a blade's thermal impedance is optimized, given the slot-to-slot airflow variance illustrated in Figure 2, it can be seen how its thermal operating characteristics could be compromised if it were located in slot 9 versus slot 7.
For a COTS ecosystem to flourish, integrators must have the flexibility to configure platforms with any blade in any slot. This is only possible if every AdvancedTCA chassis is able to deliver uniform airflow per slot, and every blade conforms to maximum slot airflow impedance criteria. The Communications Platforms Trade Association is spearheading the industry-wide coordination to make this a reality.
The CP-TA has set the terms of engagement for AdvancedTCA component interoperability through its Interoperability Compliance Document and Test Procedure Manual (TPM). The TPM defines test procedures that enable subsystem suppliers to determine whether their products comply with the ICD. The ICD and TPM provide the blueprints for interoperability compliance for modular communications platform building blocks. Using the TPM, vendors are able to test their products for CP-TA compliance as a first step towards full CP-TA certification.
For thermal interoperability, the ICD has defined the worst permissible slot thermal impedance for a blade in a worst case ambient operating environment. This is defined as the pressure drop over a slot divided by the square of the volumetric airflow across that slot, and is 52 Metric Flow Impedance (MFI) units for a front board and 1243 MFI for a Rear Transition Module (RTM) board.
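The impedance definition above (pressure drop divided by the square of volumetric airflow) can be written as a short sketch. The function names are hypothetical, and the units are deliberately left abstract because the unit system behind MFI is defined in the ICD rather than in this article; the numbers below are illustrative, not ICD values.

```python
import math


def flow_impedance(pressure_drop: float, volumetric_flow: float) -> float:
    """Impedance as described in the text: pressure drop divided by flow squared."""
    if volumetric_flow <= 0:
        raise ValueError("flow must be positive")
    return pressure_drop / volumetric_flow ** 2


def flow_at_pressure(pressure_drop: float, impedance: float) -> float:
    """Invert the same relation: the flow a given pressure drop drives through an impedance."""
    if impedance <= 0:
        raise ValueError("impedance must be positive")
    return math.sqrt(pressure_drop / impedance)


if __name__ == "__main__":
    k = flow_impedance(pressure_drop=4.0, volumetric_flow=2.0)  # illustrative units
    # With a fixed impedance, doubling the flow requires four times the pressure drop.
    print(k, k * 4.0 ** 2)
    # At a fixed pressure drop, a blade with four times the impedance sees half the flow.
    print(flow_at_pressure(4.0, k), flow_at_pressure(4.0, 4 * k))
```

The quadratic form is the reason a cap on blade impedance matters: at the pressure a shelf can sustain, a more restrictive blade receives disproportionately less air.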
Figure 3 shows four levels of thermal management performance based on a chassis's ability to deliver volumetric airflow and pressure across that impedance under a worst case environmental ambient condition of 55 °C.
The ICD also specifies that airflow must be balanced within each slot from front to back across four zones. Ideally each zone should deliver 25 percent of the airflow for the slot, but for minimum acceptable performance each zone must provide at least 20 percent of that slot's airflow.
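The zone balance rule can be expressed as a simple check. This is a sketch under the assumption that per-zone airflow readings for one slot are available as four numbers; the function names are hypothetical and the readings are illustrative, not measured data.

```python
def zone_shares(zone_flows_cfm):
    """Return each zone's fraction of the total slot airflow."""
    total = sum(zone_flows_cfm)
    if total <= 0:
        raise ValueError("total slot airflow must be positive")
    return [flow / total for flow in zone_flows_cfm]


def zones_balanced(zone_flows_cfm, min_share=0.20):
    """True if all four zones deliver at least min_share of the slot airflow."""
    if len(zone_flows_cfm) != 4:
        raise ValueError("expected readings for exactly four zones")
    return all(share >= min_share for share in zone_shares(zone_flows_cfm))


if __name__ == "__main__":
    print(zones_balanced([9.0, 10.0, 10.0, 11.0]))  # every zone is above 20 percent
    print(zones_balanced([7.0, 11.0, 11.0, 11.0]))  # first zone is only 17.5 percent
```

For a slot delivering exactly 40 CFM, the 20 percent floor corresponds to 8 CFM per zone.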
Thermal management blueprint
In setting out to design the TurboFabric Scalable ATCA Platform, we used the CP-TA ICD document as the blueprint for development of the thermal management system.
The TurboFabric 14-slot is designed to offer scalable bandwidth and features a 40 GbE enabled backplane. The product can be populated with 10 GbE blades and deployed today then upgraded in the field later once 40 GbE cards become commercially available.
To make the product truly scalable the thermal management system must be capable of scaling upwards to accommodate the anticipated higher power consumption of IEEE 802.3ap 10GBASE-KR-compliant backplane transceivers.
We used the ICD Classification B.4 (40 CFM/slot front and 5 CFM/slot RTM) as the design target for thermal management system performance of the TurboFabric entry level product, a push-only system. To provide scalable headroom in thermal performance, the design specification stipulated that a field upgradeable unit comprising a set of upper fan trays should be accommodated. The target for the resulting push-pull system was to deliver 50 CFM per slot for the front and 5 CFM/slot for the rear.
This level of performance had to be achieved while meeting the NEBS and ETSI acoustic performance standards and while delivering acceptable cooling performance under fan failure conditions as specified in the CP-TA ICD document.
Thermal design process
In essence, optimizing thermal performance requires minimizing airflow impedances and their associated pressure drops and adopting a fan solution that can deliver the required air pressure while meeting the necessary acoustic performance parameters. Minimizing pressure drop across the system ultimately enables fans to run more slowly and helps meet noise requirements.
We began with the conception of a PICMG 3.0 compliant modular chassis (Figure 4).
The system comprises:
- Core 9.6U card frame and lower fan bay with built in EMC shielding
- Air inlet module
- Air filter module
- Air exhaust module
- Optional upper fan bay
Our development team simulated a push airflow system, with the fans located below the cards, a pull system with the fans located above the cards, and a push-pull system, with half the fans above and half below. FloTherm 3D thermal simulation software made extensive simulation possible.
Eventually we optimized the airflow impedance for all scenarios by distributing the available space to plenums at the inlets and outlets and carefully managing the relative positioning of air filters and grilles.
The design methodology adopted featured an iterative process of modeling, simulation, and validation as illustrated in Figure 5.
With airflow impedance optimization complete, attention then turned to the design of the fans and fan impeller blades.
The starting point was to determine the gross airflow requirements to meet Class B.4. Based on 40 CFM Front plus 5 CFM RTM per slot, for a push-only system a total of 14 x 45 CFM, or 630 CFM, is required.
The chosen architecture featured three fan trays each housing two fans. Therefore, for the entry level push-only system, each fan had to be capable of delivering 105 CFM.
The secondary consideration was the targeted 50+5 CFM/slot push-pull system. The total bulk airflow needed for this scenario was 14 x 55 CFM, or 770 CFM, requiring an output of about 64 CFM per fan across a total of 12 fans.
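The fan sizing arithmetic for both configurations can be reproduced in a few lines. This sketch follows the article's own accounting, in which the total airflow is divided evenly across all fans in the system; the function name is hypothetical.

```python
SLOTS = 14  # the chassis described here has 14 slots


def fan_sizing(front_cfm_per_slot, rtm_cfm_per_slot, fan_count, slots=SLOTS):
    """Return (total bulk airflow, airflow per fan) in CFM for a given per-slot target."""
    if fan_count <= 0:
        raise ValueError("fan_count must be positive")
    total = slots * (front_cfm_per_slot + rtm_cfm_per_slot)
    return total, total / fan_count


if __name__ == "__main__":
    # Push-only: three fan trays with two fans each, Class B.4 target of 40 + 5 CFM/slot.
    print(fan_sizing(40, 5, fan_count=6))
    # Push-pull: upper fan trays added for 12 fans in total, target of 50 + 5 CFM/slot.
    print(fan_sizing(50, 5, fan_count=12))
```

The push-only case gives 630 CFM in total and 105 CFM per fan; the push-pull case gives 770 CFM in total and a little over 64 CFM per fan.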
The optimization of pressure drops inside the chassis enabled a customized fan to be designed that would provide the best air pressure and acoustic performance. The fan curve in Figure 6 shows the output of the fan in CFM relative to the static pressure it sees.
Centrifugal fan curves are usually characterized by a smooth slope starting at zero static pressure with flow increasing as static pressure is reduced. At some point this curve peaks, and the static pressure is reduced at the start of the surge or stall region of the curve. Except in rare special cases the fan should not be operated to the left of this peak, often called the knee of the fan curve, because this area is very unstable, and small changes in static pressure may lead to large changes in flow. The ideal operating point is usually just to the right of the knee. We designed the fan to operate very efficiently at the static pressure generated by the optimized chassis design.
The final part of the design jigsaw relates to managing fan swirl. Fan swirl does not affect the total airflow through the fan but does affect the distribution of airflow, particularly the regions of high and low velocity immediately downstream of the fan, which in turn can have a major effect on the amount of air entering each card slot.
The modeling process for this included a detailed analysis of the fan blade design (Figure 7), to understand how it created swirl, and how that swirl could be regulated with customized airflow straighteners to be positioned immediately above the fans inside the fan trays.
As part of the quest for AdvancedTCA interoperability, CP-TA has mandated the use of Accusense Chassis Scan equipment from Degree Controls, Inc., and this was acquired to test the TurboFabric chassis.
The Chassis Scan system pictured in Figure 8 consists of four types of modules that fit into AdvancedTCA Chassis front and rear slots. Two of these modules, the Front Flow Measurement Board (FFMB) and the Rear Flow Measurement Board (RFMB), have airflow sensors in them. MS Windows-based software takes airflow data readings from the FFMB and RFMB and generates a CP-TA ICD and TPM compatible test report in MS Excel format.
Testing demonstrated that the design objectives were met.
As shown in Figure 9 the push-only system achieved in excess of 40 CFM per slot for the front boards with none of the zones delivering less than 8 CFM per slot. The push-only RTM delivered more than 5 CFM per slot. Therefore the push system can be declared to be compliant to CP-TA Class B.4 performance.
The push-pull system delivered in excess of 50 CFM per slot at the front, and once again, the zone to zone distribution per slot was around 25 percent per zone. The push-pull RTM test results show airflow capacity in excess of 5 CFM per slot. This not only meets but in fact exceeds CP-TA Class B.4 performance to the point that a further classification of CP-TA Class B.5 may be appropriate.
Given that the push-pull performance can be achieved by simply plugging in three upper fan-bay pull fan trays to supplement the lower push fan trays, it can be seen how TurboFabric enables field scalability in thermal performance to be achieved.
The Airflow Velocity Chart shown in Figure 10 compares TurboFabric performance with the "typical" shelf used as an example in Figure 2. The improvements in airflow uniformity and volume per slot are readily apparent.
Fan failure conditions
As shown in Figure 11, the volume of air generated by the system is such that even when one fan fails, the system still delivers in excess of 40 CFM to each slot impacted by the failure.
That means that the TurboFabric System delivers leading edge cooling capacity (as it is defined by CP-TA Class B.4) even when operating under a fan failure condition.
A major challenge in thermal management system design is achieving the required airflow performance while meeting the environmental noise constraints laid down by NEBS and ETSI.
The NEBS noise limit for a single cabinet at 27 °C is 7.8 Bel, while the ETSI noise limit for a single cabinet at 23 °C is 7.2 Bel. Based on these specifications, the CP-TA ICD derives a value of 6.7 Bel to be used for shelf performance comparisons.
To determine TurboFabric compliance, a series of tests was performed. Figure 12 shows the noise power against the front card average airflow.
The Chassis Scan equipment was used to measure airflows for the maximum sound power output specified by NEBS and ETSI. The results are tabulated in Table 2.
The airflow results obtained demonstrate that the push-pull build fully meets B.4 classification requirements.
Interestingly the push-pull build is quieter than the push-only for any given airflow despite having double the number of fans. Typically the push-pull fan speed is significantly lower than the push-only fan speed to achieve the same airflow, which will have a significant impact on fan life and long term system reliability.
Achieving uniform slot-to-slot cooling performance is a prerequisite for AdvancedTCA interoperability. The CP-TA has laid down the guidelines for this to become a reality. However, to ensure maximum ROI for service providers, the ability to scale cooling capacity upwards as thermal demands increase is becoming increasingly important.
The TurboFabric Platform demonstrates how applying state of the art modeling and simulation techniques to analyze chassis and fan designs makes it possible to not only meet, but exceed CP-TA Class B.4 performance. TurboFabric broadens the AdvancedTCA application space and will enable even more robust applications to be deployed within AdvancedTCA shelves as a result of better thermal management.