AdvancedTCA in the data center: Intelligent networking designs for 40 Gbps and beyond
Jarrod examines major market changes and their effect on system design requirements, focusing on virtualized data center servers and appliances.
Widely used in telecom central offices, AdvancedTCA systems are gaining a foothold in data centers. Two major networking changes are behind this. One is the ongoing evolution to a new, intelligent network driven by secure, content- and application-aware processing at increasingly higher speeds. The other is the shift to next-generation data centers based on highly virtualized servers. Accelerating this shift is the drive to offer enterprise and public services using cloud computing. This combination of increasing network throughput with L2-L7 intelligent networking in highly virtualized servers introduces new CPU and network I/O system design requirements.
As the architecture of network infrastructure equipment, data center servers, and network and security appliances evolves to meet these demands, designers of networking equipment are considering AdvancedTCA systems. Of particular interest to these designers are the attributes that helped AdvancedTCA succeed in the central office.
The intelligent network evolution
Not long ago the network infrastructure was merely a super highway between clients and their network-based content. Increasing bandwidth and network throughput was the primary objective. With the advent of cloud computing, Web 2.0 applications, and other outsourced managed network services, more information is moving throughout public networks. Fast, dumb pipes are giving way to end-to-end intelligent and secure communications paths.
There are five distinguishing attributes of the new intelligent networks:
Demands for a more intelligent network exist alongside the need for speed. Networks have rapidly begun moving from multigigabit to 10 Gbps. And 40-100 Gbps looms. Intelligent networks must keep pace with network performance requirements. Gaining intelligence at the expense of throughput and latency is not acceptable.
Threats to networks are increasing, and networks are taking steps to become more secure at nearly every point along the communication path. Threats can come from the inside in the form of accidental and intentional data loss, or from the outside in the form of spam, botnets, and other forms of malware. Intelligent networks must offer integrated security functions such as intrusion prevention and detection, cryptography, data loss prevention, and firewalling. These computationally intense applications must be supported without degrading performance.
Application and content awareness
Intelligent networks make forwarding decisions for policing and shaping as well as QoS and service level agreement guarantees, and they enforce acceptable use policies. These tasks require line-rate deep packet inspection at L2-L7 for all traffic.
With many network and security applications being delivered on high-performance multicore IA/x86 systems, intelligent networks must provide equivalent I/O virtualization. This network virtualization must tightly couple with the multicore IA/x86 processors to guarantee the applications have adequate performance, including throughput and latency.
Meeting higher performance and resiliency demands
Network applications and security applications increasingly require “intelligence” in the form of deep packet and content inspection. Network applications include test and measurement, service assurance, Session Border Controllers (SBCs), and Deep Packet Inspection (DPI) systems. Security-related applications include:
· Intrusion Detection and Prevention Systems (IDS/IPS)
· Unified Threat Management (UTM)
· Data Loss Prevention (DLP)
· Lawful intercept (CALEA)
· High-speed packet capture
The network and security applications listed above have historically been offered in x86-based appliances. Increasingly, these devices are deployed in locations that require better performance and resiliency than a typical appliance can support. This is where modular, chassis-based AdvancedTCA systems, as shown in Figure 2, come in. As network I/O begins to scale, and as these devices anticipate deployment in larger networking locations, the use of COTS hardware platforms for application hosting is gaining ground.
Common attributes of intelligent networking devices
Intelligent network and security devices share several common attributes. In all cases, they require very high speed packet capture with zero packet loss. By the nature of their applications, they must see 100 percent of network traffic. In many cases, these devices are also involved in packet forwarding, requiring line-rate network I/O for 1 Gbps, 10 Gbps, 40 Gbps, and eventually 100 Gbps interfaces. They also require very low latency, equating to a maximum of 250 μs of delay in 1 Gbps networks.
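To put the line-rate requirement in numbers, the short Python sketch below works out the worst-case packet rate and the per-packet time budget at each interface speed. It assumes minimum-size 64-byte Ethernet frames plus the standard 20 bytes of preamble, start-of-frame delimiter, and inter-frame gap. The function names are illustrative and do not come from any particular product.

```python
# Worked arithmetic: worst-case packet rates at Ethernet line rate.
# Assumption: minimum-size frames, the hardest case for a device that
# must inspect every packet.

MIN_FRAME_BYTES = 64   # minimum Ethernet frame, including the FCS
OVERHEAD_BYTES = 20    # 7-byte preamble + 1-byte SFD + 12-byte inter-frame gap


def max_packets_per_second(link_bps, frame_bytes=MIN_FRAME_BYTES):
    """Return the maximum number of frames per second a link can carry."""
    bits_on_wire = (frame_bytes + OVERHEAD_BYTES) * 8
    return link_bps / bits_on_wire


def per_packet_budget_ns(link_bps, frame_bytes=MIN_FRAME_BYTES):
    """Return the time available to handle one frame, in nanoseconds."""
    return 1e9 / max_packets_per_second(link_bps, frame_bytes)


if __name__ == "__main__":
    for gbps in (1, 10, 40, 100):
        bps = gbps * 1e9
        mpps = max_packets_per_second(bps) / 1e6
        budget = per_packet_budget_ns(bps)
        print(f"{gbps:>3} Gbps: {mpps:7.2f} Mpps, {budget:6.2f} ns per packet")
```

At 10 Gbps this comes to roughly 14.88 million packets per second, or about 67 ns per packet. That gives a sense of why L4-L7 processing of every packet is so demanding for a general-purpose CPU working alone.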
Where the devices are involved in packet forwarding, the decisions are often far more sophisticated than basic L2-L3 decisions. These advanced L4-L7 decisions, based on security and on application and content processing, are computationally intense. To keep pace with increasing performance requirements, the majority of these applications have been written for IA/x86 platforms. With the advent of multicore IA/x86 and CPU virtualization, designers of these systems are also able to combine network and security functions into single systems.
These devices are deployed in networks where the protocol and application topography is changing rapidly. Keeping pace requires a high degree of programmability across application, control, and network data planes without degrading performance.
A scalable product architecture based on COTS systems makes possible a range of price-performance hardware platforms and allows for rapid, yet independent control plane or data plane component enhancements as innovations occur and market requirements expand.
Network developers are responding to the scenarios just noted by offering a wide array of flexible interface options spanning 1, 10, 40 and 100 Gigabit Ethernet. And devices must be deployable both as passive systems that are out of band and as active inline network elements. The latter network configuration presents additional requirements for resiliency, such as the inclusion of highly redundant components. Redundant control plane processing and integrated bypass technologies such as fail-to-open, fail-to-close, and other fail-to-wire mechanisms are common.
Ramping up product performance by separately enhancing network processing
The AdvancedTCA form factor supports architectures that separate the control/application plane and data plane processing. Developers using AdvancedTCA can increase product performance by separately enhancing network processing for the I/O and the general-purpose computing for the application and control plane.
AdvancedTCA modularity means designers can follow the x86 roadmap for application-processing blades. As a result, designers can improve the performance of AdvancedTCA-based network and security applications. Upgrades take place for application-processing blades without requiring major changes to other system components.
While the x86 architecture may be the application processor of choice, it is not designed to provide the highest levels of network processing at up to 100 Gbps. Therefore, dedicated line cards that are optimized for network and flow processing form a logical complement to the x86. By supporting a model that separates application and data plane processing, an AdvancedTCA solution can also significantly improve performance, and it can address evolving network performance and functionality requirements via dedicated network interface cards. Just as developers can quickly enhance the application-processing blades with improved general-purpose CPUs, they can upgrade networking cards to support new network interface requirements. When designed with programmable network flow processors, the networking cards can be powerful, field-configurable interface modules for inline or look-aside acceleration that can be programmed toward specific functions such as DPI, security processing, or algorithmic acceleration (Figure 3).
Separating the application/control plane processing from the data plane guarantees the best application performance and best networking throughput. This unique, heterogeneous processing architecture, when used in an AdvancedTCA form factor, will allow equipment providers to rapidly deliver programmable products that satisfy the new requirements of intelligent networking.
The role of I/O virtualization
The important role of network I/O virtualization requires special mention because it is critical to the solution. Networks have been virtualized with many familiar technologies like VLANs and VPNs for years. With the increasing usage of multicore CPUs that might also be virtualized, there is a missing link in the architecture. The mechanism to get data from the virtual network interfaces to the x86 CPUs is typically a single shared pipe. A better approach would be an underlying I/O subsystem that is also virtualized.
Data plane line cards for AdvancedTCA systems call for special design considerations. The network processors used in these line cards must be purpose-built to be aware of the virtualization that exists among the application plane processors. This I/O virtualization ensures that the network line cards efficiently direct traffic to the appropriate cores with dedicated resources and, in some designs, offload specific network processing from the applications directly to the network line cards.
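One common way to direct traffic to the appropriate cores is to hash each packet's flow identifier and use the result to pick a per-core (or per-virtual-machine) receive queue, so that every packet of a flow lands on the same core. The Python sketch below illustrates that idea only. The names `FiveTuple`, `flow_key`, and `select_queue` are hypothetical, and CRC32 stands in for whatever hash a real network processor would implement in hardware.

```python
import zlib
from collections import namedtuple

# Hypothetical packet descriptor: the classic 5-tuple that identifies a flow.
FiveTuple = namedtuple("FiveTuple", "src_ip dst_ip src_port dst_port proto")


def flow_key(pkt):
    """Build a direction-independent key so both halves of a
    conversation map to the same queue."""
    a = (pkt.src_ip, pkt.src_port)
    b = (pkt.dst_ip, pkt.dst_port)
    lo, hi = sorted((a, b))  # order the endpoints canonically
    return f"{lo[0]}:{lo[1]}|{hi[0]}:{hi[1]}|{pkt.proto}".encode()


def select_queue(pkt, num_queues):
    """Pick the receive queue (one per core or VM) for this packet."""
    return zlib.crc32(flow_key(pkt)) % num_queues


if __name__ == "__main__":
    request = FiveTuple("10.0.0.1", "192.168.1.5", 49152, 443, "tcp")
    reply = FiveTuple("192.168.1.5", "10.0.0.1", 443, 49152, "tcp")
    # Both directions of the flow are steered to the same queue.
    print(select_queue(request, 8), select_queue(reply, 8))
```

Keeping a flow on one core avoids cross-core locking and preserves packet order within the flow. A production design would also need to handle fragments, tunnels, and uneven flow sizes, all of which this sketch ignores.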
Designers of AdvancedTCA-based systems have options for implementing a heterogeneous architecture. One option includes creating separate general-purpose processing line cards for the network and security applications, while offering a wide array of networking line cards for I/O flexibility and L2-L7 processing. Or the architecture can combine control plane and data plane processing in a single hardware design. In order to preserve the benefits of independently increasing performance among different processing elements, designers could provide the I/O and networking processing in AdvancedMC form factors while tightly coupling with the IA/x86 application processors on an AdvancedTCA carrier.
AdvancedTCA designs are proven and successful in carrier central office applications. Many of the same requirements now apply to data center applications. When designers take advantage of a heterogeneous processing architecture for network and security application processing and L2-L7 network flow processing, AdvancedTCA systems can help scale the most demanding data center applications to 40 Gbps and beyond.
Jarrod Siket is senior VP of sales and marketing for Netronome Systems, Inc., based in Pennsylvania. He has 18 years of experience in the data and telecommunications industry, including roles at Tollgrade Communications, FORE Systems, and three terms (2000-2005) as the vice chairman of the IP/MPLS Forum Technical Committee. Jarrod holds a BS in Information and Decision Systems from Carnegie Mellon University and an MBA from the Joseph M. Katz Graduate School of Business at the University of Pittsburgh.
Netronome Systems, Inc.