Data Distribution Service becoming a natural fit for AdvancedTCA switched-fabric backplanes

Historically, data-critical applications such as industrial control data collection systems and air traffic control systems have been characterized by the need to gather and distribute data within predictable time constraints. We are used to thinking of air traffic control, industrial automation, and command and control systems as closed network environments. But now these and other applications such as financial transaction processing, network monitoring, and network-centric data collection systems are evolving to use open networks. In this month’s column, we will look at the Object Management Group’s (OMG’s) Data Distribution Service (DDS) for real-time systems standard, its capabilities, and its applicability to the switched-fabric environment within AdvancedTCA systems.

Data Distribution Service (DDS)

DDS is an OMG standard that defines communications middleware in a data-centric manner. Specifically, when the system architecture is defined, data objects are identified along with their availability and access characteristics. Once these are defined, publishers and subscribers are created for each data object. The publishers must make the data available in accordance with the defined characteristics of the data object. Likewise, subscribers must access data object information from the publishers within those same boundaries.
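
To make this concrete, the sketch below shows the shape of a DDS application using the OMG DDS C++ API. The SensorData type, its fields, the domain number, and the topic name are illustrative assumptions; in a real system the type would be defined in IDL and the matching C++ class generated by the DDS vendor's tooling.

    #include <dds/dds.hpp>  // OMG DDS ISO C++ API umbrella header

    // "SensorData" is a hypothetical data object; in practice it is
    // defined in IDL (for example: long sensor_id; double value;) and
    // generated by the vendor's IDL compiler.

    int main()
    {
        // A DomainParticipant scopes all communication to one DDS domain.
        dds::domain::DomainParticipant participant(0);

        // The Topic names the data object and binds it to its type.
        dds::topic::Topic<SensorData> topic(participant, "SensorData");

        // Publisher side: a DataWriter makes the data object available.
        dds::pub::Publisher publisher(participant);
        dds::pub::DataWriter<SensorData> writer(publisher, topic);
        writer.write(SensorData(42, 98.6));

        // Subscriber side: a DataReader consumes the same data object,
        // regardless of where the writer runs or which transport is used.
        dds::sub::Subscriber subscriber(participant);
        dds::sub::DataReader<SensorData> reader(subscriber, topic);
        for (const auto& sample : reader.take()) {
            if (sample.info().valid()) {
                // process sample.data() ...
            }
        }
        return 0;
    }

The data-centric model is visible here: neither side names the other. Both refer only to the topic and its declared characteristics.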

The end result of DDS is a cleaner system architecture where the distribution or co-location of publishers and subscribers is independent of operating system, programming language, and network topology or protocol. The application can simply view the middleware layer as consisting of a set of data objects where data can be published or consumed according to the defined data characteristics.

AdvancedTCA switched-fabric bus architecture

The AdvancedTCA form factor allows a number of bus and fabric configurations across the backplane among blades in an AdvancedTCA chassis. The AdvancedTCA bus architecture acknowledges that COTS systems may need star or bus configurations between blades as well as IP, InfiniBand, or other connectivity protocol capabilities. The AdvancedTCA bus architecture is specified to allow the gathering and distribution of large amounts of data among any number of blades in the system. Further, a given blade in the system may change function over time.

AdvancedTCA systems are not necessarily hard real time, but predictability requirements are an integral part of the functions these systems perform. Some systems may gather data from a variety of sources and distribute the data to a variety of users, such as databases, display devices, or control algorithms.

Designers of complex, data-critical distributed systems are turning to AdvancedTCA for a number of reasons. AdvancedTCA offers lower cost and easier maintainability than the traditional closed systems that have historically implemented these applications.

A switched-fabric bus is unique in that it allows all AdvancedTCA blades on the bus to logically interconnect with all other blades on the bus. Each blade is physically connected to one or more AdvancedTCA switch slots. This topology results in a redundant network, or fabric, in which there may be one or more redundant physical paths between any two blades. A blade may be logically connected to any other blade via the switch card. A logical path is temporary and can be reconfigured or switched among the available physical connections. The switched fabric can be used to provide fault tolerance and scalability without unpredictable degradation of performance, among other features.

It turns out the switched-fabric environment within AdvancedTCA is highly complementary to DDS and the applications it services.

Convergence between AdvancedTCA and DDS

I had the opportunity to talk with Gordon Hunt, Principal Applications Engineer with Real-Time Innovations, Incorporated, a leading supplier of DDS middleware technology, about the convergence between switch fabrics that are implemented in AdvancedTCA and DDS.

Gordon mentioned that the publish/subscribe philosophy behind DDS is highly complementary to the design goals and architecture of AdvancedTCA systems. People are rolling out highly distributed systems with COTS hardware, Linux, and/or real-time embedded OSs, and Gigabit Ethernet (GbE) topologies are becoming prevalent. All of these characteristics of networked systems can also be found in AdvancedTCA deployments. Gordon also mentioned that companies are moving toward open architectures and away from their historical closed solutions.

Data rate and latency requirements often demand transports that companies may not be able to implement cost-effectively in their closed systems. Gordon noted that transport options such as switched Ethernet and InfiniBand are currently receiving a great deal of attention. The benefit of DDS within environments such as this is that once the distributed application is implemented using a DDS middleware layer, companies do not need to change the code/data model. Different data transports can be swapped in and out without major changes to the application.

Historically, design began by choosing the hardware platform, operating system, and network topology as a tightly coupled entity. Once these decisions were made, the rest of the design decisions followed. Gordon has seen these design cycles evolve toward defining a distribution bus with a heterogeneous switched fabric, because applications have wide-ranging data publishing and subscription requirements. The right transport technology therefore becomes critical to the success of the application, and DDS is a natural way to define data objects such that their characteristics fit within the capabilities of the network technologies chosen.

Gordon sees customers migrating to DDS while at the same time migrating to open architectures such as AdvancedTCA. He explained a typical development starts by building out a 10-to-15 node system with a standard COTS platform. Then data definitions and data rate requirements within the system are benchmarked. If the topology does not satisfy the data publish/subscribe requirements of the application, the topology can be easily swapped out without requiring changes to the data model.

Gordon has seen DDS implemented using network topologies and protocols such as IP over InfiniBand and raw InfiniBand. Some even use DDS over a backplane using DMA technology; in that case the middleware publishes directly into the subscriber's application layer, and the hardware performs the distribution via DMA. Finally, as these 10-to-15 node systems mature and fulfill the requirements of the application, more nodes can be added using other topologies for data access. As applications grow and scale, DDS enables powerful multi-protocol, multi-network systems without changes to the application software.

Hot swap easily implemented with AdvancedTCA and DDS

Gordon mentioned that, in keeping with the system capabilities of AdvancedTCA, RTI's DDS implementation also automatically handles hot swapping to a redundant publisher if the primary fails. Subscribers always get the sample with the highest priority whose data is still valid (that is, whose publisher-specified validity period has not expired). The middleware automatically switches back to the primary when it recovers.
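
In standard DDS terms, this behavior maps onto the OWNERSHIP, OWNERSHIP_STRENGTH, and LIFESPAN QoS policies: EXCLUSIVE ownership lets only the strongest live writer deliver data, and lifespan bounds each sample's validity. A minimal sketch, assuming the participant, topic, and publisher entities from the earlier example (the strength values and validity period are illustrative):

    #include <dds/dds.hpp>

    // Two redundant writers for the same topic. With EXCLUSIVE ownership
    // (which matching readers must also request), readers accept samples
    // only from the live writer with the highest strength. If the primary
    // fails, or its samples' lifespan expires, readers fall back to the
    // backup; when the primary recovers, its higher strength wins again.
    dds::pub::qos::DataWriterQos primary_qos;
    primary_qos << dds::core::policy::Ownership::Exclusive()
                << dds::core::policy::OwnershipStrength(100)  // primary outranks backup
                << dds::core::policy::Lifespan(
                       dds::core::Duration::from_millisecs(500));  // validity period

    dds::pub::qos::DataWriterQos backup_qos;
    backup_qos << dds::core::policy::Ownership::Exclusive()
               << dds::core::policy::OwnershipStrength(50)
               << dds::core::policy::Lifespan(
                      dds::core::Duration::from_millisecs(500));

    dds::pub::DataWriter<SensorData> primary_writer(publisher, topic, primary_qos);
    dds::pub::DataWriter<SensorData> backup_writer(publisher, topic, backup_qos);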

DDS and network migration

Evolution to wireless and satellite applications is another area where DDS can be of great value. It is hard to take a system application currently using wireline Ethernet and move it to a wireless topology; there are many Layer 2 reliability issues. For example, packing data properly for a wireless topology is very different from the historical Ethernet application and can compromise the data availability of the application. In addition, moving from a reliable protocol layer like a TCP/point-to-point solution on wired legacy systems to a lossy wireless environment makes the transition even more difficult.

Some applications have used a historical network-centric model, where the application was responsible for sending to one object; now that application must send to groups of nodes. DDS eases the transition between networks by specifying the Quality of Service (QoS) across all topologies used by the application, so that QoS differences and operations are compartmentalized within the topology layer. DDS defines dozens of QoS policies, each with its own parameters. Because the QoS remains compartmentalized when topologies change, the application does not have to be as sensitive to those changes.

Moving legacy systems to DDS

Gordon sees a definite shift toward an open architecture mandate. RTI works with customers to make this transition as easy as possible. The first challenge is usually the awkward evolution of point-to-point systems: at some tipping point there are too many connected elements, and it becomes too expensive to open the system up and change topologies.

Usually the legacy software will have an abstraction to the I/O devices, but the reliability logic is almost always encoded in the application itself. The migration steps might take place as follows:

1. We take a look at what the application is sending on the wire and use DDS to implement the data objects and messaging. For example, we would identify the data being sent along with any heartbeat, handshake, and response messages, and their timing constraints for proper operation.

2. The application layer takes control of these message semantics involving access and production of the data.

3. We look at what happens on failure/resend. This reliability layer can be defined in the QoS of the data object and therefore pushed down to the middleware (see the sketch following this list).

4. Specifying the QoS typically means specifying interactions that have traditionally been part of the application. Once these are defined in the DDS middleware, the application simply accesses the APIs and no longer carries this responsibility.

5. The application still decides what to do with the data, but getting the information from the data objects is abstracted through the middleware.
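
As a sketch of step 3, the acknowledgment and resend machinery a legacy application hand-coded can instead be declared through the standard DDS RELIABILITY and HISTORY QoS policies (the history depth shown is an illustrative assumption):

    #include <dds/dds.hpp>

    // The middleware now owns acknowledgments and resends: a RELIABLE
    // writer repairs lost samples from its history cache, so the
    // application no longer implements its own ack/retry protocol.
    dds::pub::qos::DataWriterQos wqos;
    wqos << dds::core::policy::Reliability::Reliable()
         << dds::core::policy::History::KeepLast(32);  // depth available for repair

    dds::sub::qos::DataReaderQos rqos;
    rqos << dds::core::policy::Reliability::Reliable()
         << dds::core::policy::History::KeepLast(32);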

Application transition to COTS hardware and a DDS middleware layer almost always results in smaller, cleaner applications that implement the core application algorithm and leave the QoS, reliability, and fail-over mechanics to the DDS middleware.

Fail-over scenarios and operation

DDS QoS definitions carry rich semantics about what it means to fail. Failure of data delivery is one dimension: DDS provides hooks to tell the application when the defined data rules are not being met, and when a source of information goes away, the application can be notified. Traditionally this was done with application handshakes. Another complication is adding applications to an existing system: the existing applications have to know how to deal with the newcomer. Middleware abstracts this, makes sure the QoS is being enforced, and notifies the application.
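
These notifications surface through standard DDS listener callbacks; for example, a reader can be told when a writer misses its promised deadline, or when a source of data appears or disappears. A minimal sketch in the OMG DDS C++ API style, using the hypothetical SensorData type from earlier:

    #include <dds/dds.hpp>

    // NoOpDataReaderListener provides empty default implementations, so
    // only the callbacks of interest need to be overridden.
    class SensorListener : public dds::sub::NoOpDataReaderListener<SensorData> {
        // Fired when a publisher misses the deadline it promised for
        // this data object -- no application-level handshake required.
        void on_requested_deadline_missed(
            dds::sub::DataReader<SensorData>& reader,
            const dds::core::status::RequestedDeadlineMissedStatus& status) override
        {
            // e.g., mark the source stale, raise an alarm ...
        }

        // Fired when a source of this data appears or goes away.
        void on_liveliness_changed(
            dds::sub::DataReader<SensorData>& reader,
            const dds::core::status::LivelinessChangedStatus& status) override
        {
            // status.alive_count() reports how many publishers remain.
        }
    };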

Data validity and volatility are other important concepts in DDS. When we send data, what happens to it? Does the data have state or persistence? For example, when a user roams into a new mobile area, legacy applications had to include logic to synchronize the new user with the current mode of the roaming-enabled application. With DDS, this system-mode data is persistent: after it is sent, new objects joining the system receive it. The DDS middleware between the sites can then share this state information, allowing applications to come and go naturally.

There is also a concept of data persistence relative to the application. As long as the producer of the data is running, that data is available; when the producer goes away, the data goes away. Data can also be defined to remain persistent past the producer’s lifetime, which ties into the high availability capabilities of DDS. The designer can specify persistence down to the level of an individual piece of data.
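
In standard DDS this is the DURABILITY QoS policy, set per topic or per writer. A hedged sketch of declaring a persistent "system mode" piece of data (the policy choice and history depth are illustrative):

    #include <dds/dds.hpp>

    // Durability levels in standard DDS:
    //   Volatile             - only readers present at write time see the data
    //   TransientLocal       - late joiners get the data while the writer lives
    //   Transient/Persistent - data outlives the writer (via a durability service)
    dds::pub::qos::DataWriterQos mode_qos;
    mode_qos << dds::core::policy::Durability::TransientLocal()
             << dds::core::policy::History::KeepLast(1);  // keep only the latest mode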

Thus each piece of data can be built out with proper:

  • Quality of Service

  • Event detection

  • Heartbeat

  • Delivery effort (how hard do I work to make sure you get the data)

  • Data durability (persistence)

  • Number of sources of data

  • Fail-over from primary to secondary

Coupling these DDS capabilities over various transports results in a powerful definition of the system. Some topologies are physically reliable; some are lossy. In any event the system designer can leverage the DDS QoS and high availability capabilities across topologies.

More about RTI

RTI has recognized that historically, technology has driven the code. The new paradigm is to pick the application data objects first, then, as the pieces are built out, pick the right technology to manipulate the data efficiently according to its requirements. If the application is doing complex queries and joins, maybe the right technology is a database; in other cases, the data might be sourced over a network. RTI has products that allow users to manipulate their data once they have chosen the data model.

RTI also has a Complex Event Processing (CEP) capability. This capability parallels nicely with a data-first architecture. For example, say there are streams of data from many different sources and a need to do multi-stream correlation such as pattern recognition. This could be coded into an application by using DDS to get the raw feeds and analyze the data. With CEP, you write that kind of algorithm in an almost SQL-like fashion, for example, "if any of the things you are tracking looks like this or correlates in this way, call this method in the application." CEP is not limited to DDS either and can be used on other items such as e-mail, database entries, and SNMP traps. CEP is a powerful way of building out complex logic in a highly distributed environment.
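
RTI's CEP query language itself is not reproduced here, but standard DDS gives a flavor of the SQL-like style through content-filtered topics, where a reader subscribes only to samples matching a filter expression. A sketch, assuming the hypothetical SensorData type and the topic and subscriber entities from the earlier examples:

    #include <dds/dds.hpp>
    #include <string>
    #include <vector>

    // Standard DDS content filtering (a simpler cousin of CEP): the
    // middleware delivers only samples whose fields match the SQL-like
    // expression, so the application never sees the rest.
    std::vector<std::string> params = { "40.0" };
    dds::topic::Filter filter("value > %0", params.begin(), params.end());
    dds::topic::ContentFilteredTopic<SensorData> hot_only(topic, "HotSensors", filter);
    dds::sub::DataReader<SensorData> reader(subscriber, hot_only);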

RTI also holds to a distributed connect concept: data is data. It does not matter if this is obtained in a stream or it is coming from/to a database; we can view the world as data and do not have to worry about the technology being used.

Conclusion

AdvancedTCA and DDS were conceptualized and developed to solve similar issues, one on the hardware plane and the other on the software plane. New and evolving applications requiring multi-network, hot swap, and fail-over capabilities can be satisfied efficiently through the use of an AdvancedTCA platform and its switched-fabric backplane architecture with DDS as the middleware layer.

For more information, contact Curt at [email protected].