Adding advanced server-management features to ATCA IPMCs

With substantial convergent forces pushing data center and network center functionality together, AdvancedTCA (ATCA) server blades can further strengthen their management credentials by adding advanced server-management facilities at the hardware platform management (HPM) level.

ATCA intelligent platform management controllers (IPMCs) are the rough equivalent of enterprise server baseboard management controllers (BMCs), which often include advanced features such as redirection of keyboard, video, and mouse (KVM) functions to remote management consoles. Such redirection can enhance the ability of a small group of system administrators to manage a large number of ATCA blade servers while minimizing the need to physically visit them, thereby maximizing management cost-effectiveness.

ATCA systems in converged network and data centers are under increasing cost pressure, so adding advanced redirection features is practical only if it can be done cost-effectively. With highly integrated IPMC silicon, the board-level incremental cost to add this functionality can be minimized as well.

Scaled-out server complexes that are not based on ATCA typically equip each server with a baseboard management controller (BMC), as defined by the widely used Intelligent Platform Management Interface (IPMI) specification, which also serves as a foundation of the ATCA HPM layer. Such BMCs typically support remote access over Ethernet.

Every ATCA board, including both server blades and complementary boards, has an IPMC. That IPMC is likely connected to one or more inside-the-shelf networks, leveraging well-established PICMG-defined mechanisms such as PICMG HPM.2, the LAN-attached IPMC specification, and related specifications.

Figure 1 shows a typical approach to a “LAN-attached” ATCA IPMC: sharing the Ethernet access for the IPMC with Ethernet access for payload CPU(s) on the board. It is usually highly preferable to share the Ethernet connection(s) needed by the main CPU(s) with IPMI-based remote management traffic to avoid the cost and logistical challenges of maintaining a separate physical network for management. With recent revisions of the relevant PICMG specifications, management access via version 6 of the Internet Protocol (IPv6) is now defined.

Figure 1: Key communication interfaces of an IPMC and the payload CPU(s) it manages, along with the sideband interface that allows the IPMC to share the Ethernet connection(s) that primarily serve the main CPU(s). Serial console interfaces of the main CPU(s) can be routed to the IPMC, enabling remote access for them as well.

The most widely used sideband interface between an IPMC and the network controller (NC) is the Network Controller Sideband Interface (NC-SI), an open standard developed by the Distributed Management Task Force (DMTF). Most Ethernet NCs targeting server markets implement an NC-SI port and internal switching to allow IPMI traffic to share the NC with main CPU traffic.
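To make the sideband exchange more concrete, the following minimal sketch builds an NC-SI control packet of the kind an IPMC would send to the NC. It assumes the header layout and checksum rule of the DMTF NC-SI specification (DSP0222) and omits the Ethernet header and minimum-frame padding; the function names and the chosen instance, package, and channel numbers are illustrative only.

```python
import struct

NCSI_ETHERTYPE = 0x88F8          # EtherType used for NC-SI control traffic
NCSI_HEADER_REVISION = 0x01
CMD_CLEAR_INITIAL_STATE = 0x00   # command that takes a channel out of its initial state


def ncsi_checksum(data: bytes) -> int:
    """32-bit two's complement of the sum of the 16-bit big-endian words."""
    total = sum(struct.unpack(f">{len(data) // 2}H", data))
    return (-total) & 0xFFFFFFFF


def build_ncsi_command(iid: int, cmd_type: int, package: int, channel: int,
                       payload: bytes = b"") -> bytes:
    """Return an NC-SI control packet (Ethernet header not included)."""
    # Channel ID byte: package ID in bits 7..5, internal channel ID in bits 4..0.
    channel_id = ((package & 0x7) << 5) | (channel & 0x1F)
    header = struct.pack(
        ">BBBBBBH8x",            # 16-byte header, last 8 bytes reserved (zero)
        0x00,                    # management controller ID
        NCSI_HEADER_REVISION,
        0x00,                    # reserved
        iid & 0xFF,              # instance ID, used to match responses to commands
        cmd_type & 0x7F,         # control packet type
        channel_id,
        len(payload) & 0x0FFF,   # 12-bit payload length
    )
    padded = payload + b"\x00" * (-len(payload) % 4)   # pad payload to 32 bits
    body = header + padded
    return body + struct.pack(">I", ncsi_checksum(body))


packet = build_ncsi_command(iid=1, cmd_type=CMD_CLEAR_INITIAL_STATE,
                            package=0, channel=0)
```

In a real design the IPMC firmware or the Linux network stack handles this framing; the sketch only illustrates that the sideband carries small, well-defined control packets alongside the pass-through management traffic.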

A key IPMC benefit is management access when the main CPU(s) of a server are down, potentially allowing much quicker diagnosis of a failure. In the shared Ethernet architecture shown in Figure 1, such access is only possible if the NC(s) and the IPMC are powered from management power that is independent of payload power. The extended management power domain in the figure highlights this aspect of the architecture. The hardware design for such a subsystem needs to be done carefully to ensure that the extended management power domain can be powered while the connected payload CPUs are not.

Another key topic in many IPMC applications is remote access to the serial ports on the payload CPU(s). Especially in scaled-out configurations with massive server counts, but even in smaller configurations, it can be highly preferable to avoid connecting one or more physical serial cables to each server or each CPU within each server.

IPMI defines a serial over LAN (SoL) architecture that is useful in this context. Payload CPU serial ports can be connected to the IPMC and a remote network-connected client of the IPMC can interact with those serial ports without needing any physical serial port connections. Such serial port access can be crucial, for instance, in diagnosing a malfunctioning server remotely. The remote client can see all serial traffic as the payload CPU(s) boots, starting from the very first character, since the SoL session(s) with the IPMC can be established before the payload CPU(s) is even powered on.
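As an illustration of what a remote client session might look like, the sketch below assembles a command line for the open-source ipmitool utility, which can open an SoL session over the IPMI 2.0 "lanplus" interface. Whether a particular IPMC accepts such a session depends on its implementation and configuration; the host address and user name are placeholders, and the helper function is hypothetical.

```python
from typing import List


def sol_activate_command(host: str, user: str) -> List[str]:
    """Build an ipmitool invocation that opens a serial over LAN session.

    The -E option tells ipmitool to read the password from the
    IPMI_PASSWORD environment variable, which keeps it off the command line.
    """
    return [
        "ipmitool",
        "-I", "lanplus",   # IPMI 2.0 (RMCP+) LAN interface, required for SoL
        "-H", host,        # address of the management controller (placeholder)
        "-U", user,        # management user name (placeholder)
        "-E",
        "sol", "activate",
    ]


# Hypothetical address and user; pass the list to subprocess.run() to start
# an interactive session from a management workstation.
command = sol_activate_command("192.0.2.10", "admin")
```

Because the session is with the management controller rather than the payload, it can be opened before payload power is applied, which is what makes the boot output visible from the first character.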

Additional advanced IPMC features

For some server architectures, key subsystems may rely on a richer human interface than simple serial ports. Remote access to these subsystems may require the ability to virtually attach a remote KVM to an arbitrary server in a large configuration. The remote system administrator in this case needs to be able to use the remote KVM facilities as if they were physically attached to that server and then be able to transfer that virtual attachment instantly to some other server, perhaps in a completely different physical location.

Figure 2 shows how an advanced IPMC can address these needs by redirecting KVM-related connections over the network to a remote console. Media redirection, a related feature also shown in the figure, enables a remote installation image to function as if it were a drive physically attached to a server. The server can boot from the remote drive image as part of a diagnosis or recovery operation.

Figure 2: Key communication interfaces supporting KVM and media redirection in an example IPMC based on ASPEED’s AST2500. Two USB ports and a PCI Express (PCIe) port on a payload CPU interact with the IPMC as if the remote keyboard/mouse, installation drive and PCIe-accessed video were directly attached. In reality, all those devices are attached to a remote console that communicates with the IPMC via Ethernet.

The key idea in the redirection facilities shown in Figure 2 is that the payload CPU interacts with the redirected devices exactly as if they were physically attached to it, even though the corresponding physical devices are actually attached to the remote console. This capability is independent of the operating system running on the payload CPU.

Of course, the considerable compute power and good network performance in an AST2500-based IPMC are necessary to make this practical. Furthermore, specialized video capture and compression hardware is critical to an effective implementation of video redirection. In the Figure 2 model, the payload CPU would be configured to treat that hardware as if it were the system video card.
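A rough calculation shows why compression matters for video redirection. The sketch below computes the uncompressed bit rate of a display stream and the compression ratio needed to fit it into a given share of a network link. The resolution, color depth, frame rate, and bandwidth budget are assumptions chosen for illustration, not figures from the AST2500 documentation.

```python
def raw_video_bits_per_second(width: int, height: int,
                              bits_per_pixel: int, frames_per_second: int) -> int:
    """Bit rate of an uncompressed video stream."""
    return width * height * bits_per_pixel * frames_per_second


def required_compression_ratio(raw_bps: int, budget_bps: int) -> float:
    """How much the stream must shrink to fit the bandwidth budget."""
    return raw_bps / budget_bps


# Assumed example: 1920x1200 display, 32 bits per pixel, 60 frames per second.
raw_bps = raw_video_bits_per_second(1920, 1200, 32, 60)

# Assumed budget: 100 Mb/s of a shared Ethernet link reserved for KVM traffic.
ratio = required_compression_ratio(raw_bps, 100_000_000)
```

Under these assumptions the raw stream is about 4.4 Gb/s, so it would need to be reduced by a factor of roughly 44 to stay within the budget. Frame capture, change detection, and compression at that scale are much easier to sustain in dedicated hardware than in software on the management processor.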

The powerful features of Linux running on the BMC (such as rich I/O and communications subsystems) are valuable in integrating these redirection facilities. This is especially true for media redirection, where Linux already includes support for applicable USB and mass storage protocols. Furthermore, a Linux foundation potentially allows other applications, such as a web interface or other services, to coexist with the main IPMC application on the BMC CPU. Normal Linux interprocess protections help to avoid interapplication interference. Of course, total resource requirements of the various co-resident applications must be planned carefully so there is always enough capacity for the critical applications.
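One way Linux can present a disk image to a USB host is the kernel's USB gadget framework, configured through configfs. The sketch below only generates the list of filesystem operations such a setup would involve, so it can be inspected without touching a real system. It is an assumption that an IPMC implementation would use this particular mechanism; the gadget name, image path, and USB device controller name are hypothetical, and the image path is assumed to be a local file or block device through which the remote image has already been made available.

```python
from pathlib import PurePosixPath
from typing import List, Tuple

CONFIGFS_GADGETS = PurePosixPath("/sys/kernel/config/usb_gadget")

Step = Tuple[str, str, str]   # (operation, path, value or link name)


def mass_storage_gadget_plan(name: str, image_path: str, udc_name: str,
                             cdrom: bool = True) -> List[Step]:
    """List the configfs steps that expose image_path as a USB drive."""
    gadget = CONFIGFS_GADGETS / name
    function = gadget / "functions" / "mass_storage.usb0"
    config = gadget / "configs" / "c.1"
    lun = function / "lun.0"          # created by the kernel with the function
    return [
        ("mkdir", str(gadget), ""),
        ("mkdir", str(function), ""),
        ("mkdir", str(config), ""),
        # Media type and write protection are set before the backing file.
        ("write", str(lun / "cdrom"), "1" if cdrom else "0"),
        ("write", str(lun / "ro"), "1"),
        ("write", str(lun / "file"), image_path),
        # Linking the function into the configuration makes it visible.
        ("symlink", str(function), str(config / "mass_storage.usb0")),
        # Binding to a USB device controller attaches the gadget to the host.
        ("write", str(gadget / "UDC"), udc_name),
    ]


plan = mass_storage_gadget_plan("vmedia", "/tmp/install.iso", "example-udc")
```

A production setup would also set descriptor fields such as vendor and product IDs and would need root privileges to apply the steps; those details are omitted here.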

Delivering IPMC facilities cost-effectively

One key to achieving cost-effectiveness is combining hardware support for key management features with traditional system-on-chip (SoC) facilities in a single device. The AST2500, a server management processor from ASPEED Technologies, is one example of such management-optimized SoCs. Figure 3 shows a possible IPMC reference design based on the AST2500. The design takes advantage of specialized hardware features, including the video controller and multiple USB ports for redirected KVM and media. Also included are a 16-input analog-to-digital converter (ADC) for monitoring electrical parameters and 14 independent I2C controllers (with only a fraction of either of these resources used in the base reference design here).
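Raw ADC counts become meaningful only after conversion to engineering units. IPMI describes linear sensors with conversion factors stored in each sensor's data record, using the formula y = (M * x + B * 10^Bexp) * 10^Rexp. The sketch below applies that formula and checks the result against a threshold; the factor values, the 12 V rail, and the threshold are made-up examples, not values from the reference design.

```python
def ipmi_linear_reading(raw: int, m: int, b: int, b_exp: int, r_exp: int) -> float:
    """Convert a raw sensor count using the IPMI linear conversion formula."""
    return (m * raw + b * 10 ** b_exp) * 10 ** r_exp


def below_lower_critical(value: float, lower_critical: float) -> bool:
    """True when the converted reading has dropped under the threshold."""
    return value < lower_critical


# Hypothetical 12 V rail sensor: each raw count represents 0.06 V.
volts = ipmi_linear_reading(raw=200, m=6, b=0, b_exp=0, r_exp=-2)   # about 12.0
alarm = below_lower_critical(volts, lower_critical=10.8)
```

Keeping the factors in data rather than in code is what lets management software interpret sensors on boards it has never seen before.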

Another key to cost-effectiveness is using the power of Linux (running on an ARM11 processor at 800 MHz on these AST devices) and its rich open-source ecosystem of development tools and device support as the foundation layer for management applications.

Figure 3: An example reference design for an IPMC with advanced redirection support, in this case using the AST2500 or (without the redirection features) its variant the AST2520. The need for additional chips beyond the AST device is minimized by the high integration level of the SoC.

Getting started

One offering that follows this approach is the Schroff Pigeon Point BMR-AST-ATCA reference design from Pentair. The IPMC subsystem in this solution is compliant with the most recent IPMI, ATCA, and HPM.2 specification revisions, and the firmware has been field-validated over the last decade in ATCA boards and complementary modules used in telecommunications and other applications around the world.

The Pigeon Point BMR-AST-ATCA Starter Kit comes with schematics and a bench-top reference implementation that allows design engineers to become immediately familiar with IPMI-based management and the IPMC features. While the configuration architecture enables many board-specific adaptations to be implemented without extensive programming, full source code of the BMR management application is included as well. The source code may simply be used as a supplementary educational resource or, more aggressively, as the basis for extensive customizations if needed.

An equivalent BMR subsystem is included with the Pigeon Point BMR-AST-BMC Starter Kit, which also supports advanced redirection facilities, but in the context of IPMI-managed servers not based on ATCA.

An IPMC with advanced redirection facilities can strengthen the competitiveness of an ATCA blade server, especially for use in the context of the converged network and data center, which are often scaled out with hundreds or thousands of servers. Servers not based on ATCA, especially in such scaled-out configurations, can benefit from these facilities as well.

Mark Overgaard is Architect, System Management for Pentair Electronics Protection. Mark was the founder and formerly the CTO of Pigeon Point Systems, which was acquired by Pentair in July 2015 and integrated into Pentair’s Electronics Protection platform, under the Schroff brand.

Pentair Electronics Protection