AdvancedTCA: ready to conquer (mobile) space mountain

Carriers are contending with growing mobile broadband, the need to support both TDM and IP traffic, and pressure to implement 4G.

The industry has a steep mountain to climb in the mobile space, but this task is not limited to telecom; other applications outside the central office – including military, aerospace, and network data center applications – are facing their own mountains. PICMG’s AdvancedTCA extensions initiative will provide the tools to tackle these challenges head on, and over the next couple of years will enable powerful new features offering additional performance from what is fast becoming a common AdvancedTCA applications platform.

One of the primary goals for the Advanced Telecommunications Computing Architecture was to establish itself as “the technology platform” for telecom applications. While some argue that this goal has already been achieved, it has become clear that the platform’s power and cooling limitations are somewhat handicapping it against in-house designs. To counter this issue and expand the form factor into new markets, PICMG last year announced an initiative to extend the capabilities of AdvancedTCA beyond telecom. But what is driving this performance push and which of these enhancements can we actually use in the telecom world?

The perfect storm

Thanks to the emergence of popular “smart” devices like the iPad, iPhone, and BlackBerry, demand for mobile data services has increased steadily. In fact, according to Cisco Systems’ Visual Networking Index, video content is projected to drive the bulk of increased traffic on the network over the next five years (see Figure 1).

Figure 1: Video’s Projected Share of Network Traffic

One need only visit the local mobile service sales outlet to see that economic conditions are driving carriers to offer aggressive unlimited flat-rate Internet access, the all-too-familiar wireline strategy. Mobile carriers’ tactic of offering flat-rate mobile-data packages is rewarding these companies with growth rates of approximately 50 percent in emerging markets and 20 percent in mature markets.

Nothing comes for free, however, as carriers are running the risk of heavily congested networks along with exploding costs with these types of offers. In fact, operators are already selling mobile broadband below cost in some areas, sacrificing profitability to increase market share. The industry has been keen to see mobile data finally take off, years after it spent billions on licenses to build third-generation mobile networks with high data speeds.

To add fuel to the fire, with LTE becoming the future dominant force in 4G, today’s base station capacities of tens of megabits per second will ultimately need to reach hundreds of megabits per second. Consequently, backhaul networks will need to scale to support new multi-gigabit wireless backhaul solutions. Because data services are the primary driver of capacity growth, operators are seeking to transition from circuit- to packet-based architectures to adapt more efficiently to the new data-centric world.

In sum, carriers are facing a perfect storm: growing mobile broadband traffic; the need to support both Time Division Multiplexing (TDM) and IP traffic; the pressures and costs of implementing 4G; and an economy that has yet to rebound (Figure 2). This pressured environment means that operators need a flexible product strategy addressing capacity and scalability issues for their networks today, while driving their 4G initiatives for tomorrow – all with a low total cost of ownership.

Figure 2: Traffic versus Revenue – Traffic is Skyrocketing While Revenue is Flatlining


What the numbers say

Mobile IP data has been growing at an average of 131 percent annually. Extrapolated over five years, this growth means the number of packets per second to be processed increases more than 60-fold. No matter how you look at it, this is an extraordinary increase in traffic, considering that 20 percent yearly revenue growth will garner only 2.5 times the revenue in that same period.
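The compounding behind these figures can be checked with a short calculation. The sketch below assumes that the 131 percent and 20 percent annual rates hold constant for all five years, which is a simplification; the function name is illustrative only.

```python
def compound_multiplier(annual_growth_pct: float, years: int) -> float:
    """Return the total multiplier after compounding a yearly growth rate."""
    return (1 + annual_growth_pct / 100) ** years


# Growth rates taken from the text; the five-year horizon is also from the text.
traffic = compound_multiplier(131, 5)
revenue = compound_multiplier(20, 5)

print(f"traffic multiplier: {traffic:.1f}x")   # traffic multiplier: 65.8x
print(f"revenue multiplier: {revenue:.2f}x")   # revenue multiplier: 2.49x
```

The traffic multiplier comes out slightly above the rounded 60X figure, and the revenue multiplier matches the quoted 2.5X.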

In addition to considering the traffic-versus-revenue dilemma, we must be aware that not only is traffic increasing at a rapid pace, but processing per packet is also increasing exponentially, landing a “one-two punch.”

In the early 2G/3G network days, processing was simple mobile packet routing. Today, however, operators need to do more with each packet. Table 1 highlights the huge increases in processing that individual applications demand.

Table 1: Application Processing Complexity


Such high data rates make software design difficult. Furthermore, when we combine resource-hungry applications with the dramatic increase in traffic, we end up with a multiplier effect. For example, if we conservatively assume that each packet needs 20X the processing power that was previously the norm, then coupling this with the 60X traffic multiplier (Figure 3) yields a total multiplier of 1200X – a very significant increase in what is required in just the next few years.

Figure 3: Processing Power Needed



Moore’s Law, commonly paraphrased as processing power doubling every two years for a given amount of silicon real estate, would predict that our industry is on a trajectory to deliver only about a 6X improvement over the next five years, as Figure 4 illustrates. Since a 1200X improvement is required to keep pace with projected demand, the multiplier gap of 200X (Figure 5) is the true “Mobile Processing challenge.”

Figure 4: The True Processing Power Needed Is Enormous

Figure 5: Mobile Packet Processing Gap
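The multiplier arithmetic above can be reproduced in a few lines. This is a rough sketch that takes the article's inputs at face value (20X per packet, 60X traffic, a doubling every two years); the variable names are illustrative.

```python
PER_PACKET_MULTIPLIER = 20   # conservative per-packet processing increase (from the text)
TRAFFIC_MULTIPLIER = 60      # five-year traffic increase, rounded (from the text)
YEARS = 5
DOUBLING_PERIOD_YEARS = 2    # doubling period as stated in the text

# Demand: more packets, each needing more processing.
demand = PER_PACKET_MULTIPLIER * TRAFFIC_MULTIPLIER

# Supply: one doubling per two-year period, compounded over five years.
supply = 2 ** (YEARS / DOUBLING_PERIOD_YEARS)

gap_rounded = demand / round(supply)   # uses the rounded 6X figure
gap_exact = demand / supply            # uses the unrounded supply figure

print(f"demand: {demand}x, supply: {supply:.2f}x")
print(f"gap (rounded supply): {gap_rounded:.0f}x, gap (exact supply): {gap_exact:.0f}x")
```

With supply rounded to 6X the gap is the 200X quoted above; without rounding it is slightly larger, at roughly 212X.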

What are the options?

With this disproportionate relationship between demand and projected supply, resource-hungry applications require the industry to “double down” by squeezing the most out of silicon and by being smarter about how processing capabilities are packaged on AdvancedTCA real estate.

The size of this challenge demands multiple techniques in order to conquer it:

·    Scalability

·    Memory architecture

·    Integrated peripheral offload

·    Hardware expansion

Regarding scalability, it’s becoming clear that bladed systems provide the best approach, with linear scaling from 10 Gbps up to hundreds of Gbps per shelf. One of the key scalable implementations consists of integrated load balancing to allow traffic to be distributed over a bladed architecture.

The best-known approaches to load balancing, or flow distribution (Figure 6), fall into two categories:

Figure 6: Typical Load-Balanced system


·    Switch-Based Load Balancing

o    Performs statistical load balancing at line rates of 80 Gbps per blade

o    Zero-slot approach using an existing switch blade

·    DPI-Based Load Balancing

o    Uses multicore packet processing blade enabling flexibility to balance on any L2-L7 field

o    Can support millions of flows with stateful load balancing

o    Operates at 10-40 Gbps per blade
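The difference between the two categories can be illustrated with a minimal software sketch. It assumes flows are identified by a 5-tuple and uses a CRC32 hash for the statistical case; real switch blades compute their hash in hardware, and a DPI blade could key on any L2-L7 field. All names here are hypothetical and do not come from any product API.

```python
import zlib
from typing import Dict, List, Tuple

# Assumed flow key: source IP, destination IP, source port, destination port, protocol.
FlowKey = Tuple[str, str, int, int, int]


def statistical_blade(flow: FlowKey, num_blades: int) -> int:
    """Stateless distribution: hash the flow key, then take it modulo the blade count."""
    digest = zlib.crc32(repr(flow).encode())
    return digest % num_blades


class StatefulBalancer:
    """Stateful distribution: remember each flow and place new flows on the least-loaded blade."""

    def __init__(self, num_blades: int) -> None:
        self.flow_table: Dict[FlowKey, int] = {}
        self.flows_per_blade: List[int] = [0] * num_blades

    def blade_for(self, flow: FlowKey) -> int:
        if flow not in self.flow_table:
            # New flow: pick the blade currently carrying the fewest flows.
            blade = self.flows_per_blade.index(min(self.flows_per_blade))
            self.flow_table[flow] = blade
            self.flows_per_blade[blade] += 1
        return self.flow_table[flow]


flow_a: FlowKey = ("10.0.0.1", "10.0.1.1", 40000, 80, 6)
flow_b: FlowKey = ("10.0.0.2", "10.0.1.1", 40001, 80, 6)

balancer = StatefulBalancer(num_blades=4)
print(statistical_blade(flow_a, 4), balancer.blade_for(flow_a), balancer.blade_for(flow_b))
```

In both cases every packet of a given flow lands on the same blade. The stateless version needs no per-flow memory but spreads load only on average, while the stateful version pays for a flow table in exchange for control over where each flow is placed.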

Memory is typically the biggest bottleneck for packet processing. For example, DRAM typically delivers only 30-70 percent of its theoretical bandwidth, a huge hit to throughput and performance. Likewise, at 10 GbE full duplex, a simple flow classification algorithm handling 30 M packets per second requires 90 M memory accesses per second. One approach to this problem is to minimize memory accesses by pre-fetching data into cache whenever feasible. Integrated peripherals on multicore processors are another key to high-performing systems: security engines, for example, allow wirespeed encryption and decryption without using CPU cores.
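The packet-rate and memory-access figures quoted for 10 GbE can be reproduced from Ethernet framing. The sketch below assumes worst-case minimum-size 64-byte frames plus the standard 20 bytes of preamble and inter-frame gap; the figure of three memory accesses per packet is an assumption inferred from the 90 M to 30 M ratio in the text, not a measured value.

```python
LINK_RATE_BPS = 10_000_000_000   # 10 GbE, one direction
MIN_FRAME_BYTES = 64             # minimum Ethernet frame size
WIRE_OVERHEAD_BYTES = 20         # 8-byte preamble/SFD + 12-byte inter-frame gap
ACCESSES_PER_PACKET = 3          # assumption implied by the 90 M / 30 M ratio

bits_per_packet = (MIN_FRAME_BYTES + WIRE_OVERHEAD_BYTES) * 8
pps_one_direction = LINK_RATE_BPS / bits_per_packet
pps_full_duplex = 2 * pps_one_direction
accesses_per_second = pps_full_duplex * ACCESSES_PER_PACKET

# Average time budget available for each memory access.
ns_per_access = 1e9 / accesses_per_second

print(f"packets/s (full duplex): {pps_full_duplex / 1e6:.1f} M")
print(f"memory accesses/s:       {accesses_per_second / 1e6:.1f} M")
print(f"budget per access:       {ns_per_access:.1f} ns")
```

Under these assumptions the budget works out to roughly 11 ns per access, which helps explain why pre-fetching into cache matters so much at these rates.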

The last area of focus is hardware expansion, which boils down to maximizing usage of the available real estate – and highlights the need for those recent PICMG initiatives to expand the capabilities of AdvancedTCA referenced earlier. These initiatives include driving new features such as double-wide boards, enhancements to power and cooling, optimizations for non-NEBS environments, and allowances for double-sided shelves and other modifications while maintaining forward and backward compatibility with existing AdvancedTCA products.

Climbing gear is in place now

Expanded boundaries for AdvancedTCA are enabling it to meet the mobile space mountain and other challenges head on. For packet processing, today’s hardware expansion comes in the form of:

·    TCAM mezzanine that accelerates searches and RegEx matching

o    Single cycle search for IPv6 longest prefix match

o    Allows thousands of firewall ACL rules to be evaluated in a single cycle

·    FPGA expansion that allows custom functions

o    Implement new crypto standards

o    Move part of software algorithm to FPGA to improve packet rate
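To make the TCAM item concrete, the following software model shows what a ternary lookup computes for IPv6 longest prefix match. It is a sketch only: a real TCAM compares the key against every entry in parallel, whereas this loop checks entries one at a time, and the function names and example routes are hypothetical.

```python
import ipaddress
from typing import List, Optional, Tuple

# Each entry is (value, mask, result). A mask bit of 1 means "must match";
# a mask bit of 0 means "don't care", which is what makes the match ternary.
Entry = Tuple[int, int, str]


def prefix_entry(cidr: str, result: str) -> Entry:
    """Convert an IPv6 prefix into a value/mask pair."""
    net = ipaddress.ip_network(cidr)
    return int(net.network_address), int(net.netmask), result


def build_table(routes: List[Tuple[str, str]]) -> List[Entry]:
    """Order entries longest prefix first so the first hit is the longest match."""
    entries = [prefix_entry(cidr, result) for cidr, result in routes]
    return sorted(entries, key=lambda e: bin(e[1]).count("1"), reverse=True)


def lookup(table: List[Entry], address: str) -> Optional[str]:
    """Return the result of the highest-priority matching entry, if any."""
    key = int(ipaddress.ip_address(address))
    for value, mask, result in table:   # hardware evaluates all entries at once
        if (key & mask) == value:
            return result
    return None


table = build_table([
    ("::/0", "default"),
    ("2001:db8::/32", "blade-1"),
    ("2001:db8:abcd::/48", "blade-2"),
])

print(lookup(table, "2001:db8:abcd::1"))   # blade-2
print(lookup(table, "2001:db8:1::1"))      # blade-1
print(lookup(table, "2001:dead::1"))       # default
```

Firewall ACL rules fit the same value/mask model, with priority set by rule order rather than prefix length.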

Continuous Computing’s FlexPacket ATCA-PP50, for instance, uses two discrete multicore RMI XLR MIPS64 packet processors. Each processor provides eight multi-threaded cores and contains a built-in security co-processor capable of handling up to 10 Gbps of bulk encryption / decryption (20 Gbps per blade). The PP50 supports each processor with up to 8 GB of memory (16 GB per blade) as well as access to a TCAM and content-based processors via mezzanines.

Figure 7: Continuous Computing’s PP50 Packet Processing Blade

On the control plane front, Continuous Computing has already released its FlexCompute ATCA-XE60 dual Intel 5500 Series “Nehalem” compute blade. This product comes in varying configurations, with support not only for the standard AdvancedTCA Nehalem processor (L5518), which keeps power below the usual 200 W limit, but also for higher-power commercial Nehalem chips (E5540) that push the envelope closer to 300 W while driving performance skyward.

Figure 8: Continuous Computing’s XE60 Dual Nehalem CPU Blade

As mentioned earlier, memory architecture needs to be designed to make use of this increase in CPU performance, so the XE60 supports an increased cache size coupled with 64 GB of memory and four storage drives with over one terabyte of storage, thus enabling higher performance and a much wider range of applications.

Jason Byrne brings more than 15 years of telecom product marketing and engineering experience to Continuous Computing. Currently he manages the company’s AdvancedTCA switching and compute product lines.

Prior to Continuous Computing, Jason held system engineering positions with Nuera Communications (acquired by AudioCodes) and designed a range of VoIP media gateway products on CompactPCI. Previously, he held various technical leadership positions spanning both hardware and software while in the Wireless Infrastructure division at Lucent Technologies and at Boston Technologies (acquired by Comverse). Jason has an MBA from University of San Diego, received his Honors Diploma in electrical / electronic engineering from Dublin Institute of Technology, and has a BSEE from Trinity College, Dublin.

Continuous Computing