Adoption of Internet-based voice and video brings promise...and threat

By now we are all familiar with Internet threats and attacks on corporate LANs and home computers. But as service providers continue to offer new voice and video services over the Internet, new security and denial-of-service threats begin to take shape. Voice and video services require real-time delivery for proper operation. Unlike traditional data services, which tolerate the delay added by security processing far better, real-time applications can't afford the processing latencies introduced by traditional security products such as intrusion detection systems, firewalls, and virus scanners. If threat-detection processing introduces too much latency and/or jitter, the real-time application becomes useless.

In this month's column, we will take a look at the taxonomy of emerging threats to real-time applications and security solutions that address these new threats.

Taxonomy of VoIP threats

Recently introduced voice and multimedia services require new network components. With the deployment of these network components come new attack points where protocols and applications can be compromised.

One example of a new network architecture supporting voice and multimedia is the Session Initiation Protocol (SIP) network architecture. The SIP network architecture defines clients and servers. A client is a user endpoint that sends SIP requests. A server is an entity that responds to client requests according to the SIP protocol. Clients can notify SIP servers about where they can be reached and/or set up profiles, based on date and time, that specify where to contact them. SIP servers come in two flavors. Proxy servers take user connection requests and make the connection through the proxy server to the requested client. Redirect servers respond to "where is" requests from clients by providing the address where the requested client can be reached.
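
The difference between the two server flavors can be sketched in a few lines of Python. This is a toy model under stated assumptions, not a SIP implementation: the `LocationService` class, the function names, and the example addresses are all hypothetical, and only the response codes (200 OK, 302 Moved Temporarily, 404 Not Found) come from the SIP protocol itself.

```python
from dataclasses import dataclass, field


@dataclass
class LocationService:
    """Toy registrar (hypothetical): maps a user's address to a current contact."""
    bindings: dict = field(default_factory=dict)

    def register(self, user: str, contact: str) -> None:
        # A client notifies the server where it can currently be reached.
        self.bindings[user] = contact

    def lookup(self, user: str):
        return self.bindings.get(user)


def redirect_server(location: LocationService, callee: str) -> dict:
    """Answer a 'where is' request: hand the caller the callee's contact address."""
    contact = location.lookup(callee)
    if contact is None:
        return {"status": 404, "reason": "Not Found"}
    return {"status": 302, "reason": "Moved Temporarily", "contact": contact}


def proxy_server(location: LocationService, callee: str, deliver) -> dict:
    """Make the connection on the caller's behalf and relay the callee's answer."""
    contact = location.lookup(callee)
    if contact is None:
        return {"status": 404, "reason": "Not Found"}
    return deliver(contact)


location = LocationService()
location.register("sip:alice@example.com", "sip:alice@192.0.2.10")

redirect_reply = redirect_server(location, "sip:alice@example.com")
proxy_reply = proxy_server(
    location,
    "sip:alice@example.com",
    # Stand-in for the callee's endpoint answering the forwarded request.
    lambda contact: {"status": 200, "reason": "OK", "via": contact},
)
unknown_reply = redirect_server(location, "sip:bob@example.com")
```

Note that the redirect server discloses the callee's contact address to whoever asks, which is what makes it a natural target for the reconnaissance attacks described later in this column.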

Beyond the SIP client/server architecture, a call control protocol supports point-to-point, unicast, broadcast, and conferencing capabilities. So, SIP clients, servers, and the SIP call control protocol itself are all new attack points within this new multimedia environment.

Intrusion detection systems and firewalls do a good job of recognizing attacks based on the IP protocol and on TCP/UDP ports and protocols. They typically prevent attacks by allowing or disallowing access to the remote machine from specific IP addresses, IP address subnets, and/or Layer 4 (TCP, UDP, or other) service ports. If the provider offers multimedia services, it must also allow traffic to flow through the multimedia service ports. Once that decision is made, a whole new world of VoIP attacks can occur through these open service ports:

-   SIP infrastructure attacks

      - Fuzzing: Attacks in which a variety of messages, either compliant with or deviating from the protocol layer being attacked, are sent in an attempt to cause system exceptions, buffer overflows, or unexpected results from the equipment being attacked

      - Reconnaissance: Attacks that send a variety of SIP service requests to SIP redirect servers in an attempt to get client identity information from the network

      - Floods/distributed floods: Attacks that send high volumes of SIP registry, client address, or call requests to SIP servers in an attempt to deny service of legitimate traffic

-   SIP signaling attacks

      - Misuse/spoofing: Attempting to communicate with a SIP client under a different client identity, potentially acquired from a SIP infrastructure reconnaissance attack

      - Session anomalies: Attacks that cause SIP sessions to be prematurely dropped by using the signaling termination messages in the protocol from clients not participating in the session

      - Stealth: Stealth attacks are those in which one or more specific end-points are deliberately attacked from one (DoS) or more (DDoS) sources, although at a much lower call volume than is characteristic of flood-type attacks

      - Spam: Illegal acquisition of SIP client information for purposes of sending unsolicited information to the client

-   The SIP media content itself

      - Fuzzing: The content of the SIP call may be a G.7xx coder/decoder audio session, MPEG, or other voice/video encoding technology. These attacks send diverse encoding methods to the endpoint in an attempt to cause hardware exceptions, buffer overflows, or unexpected results from SIP clients

      - Floods: Attacks that involve sending a large volume of traffic (calls) to a SIP client, preventing that client from operating on legitimate traffic

      - Misuse/spoofing: Attacks that contact SIP clients under a different client address and/or use the session to deliver unsolicited (ads, etc.) or illegal content to the client

Simply opening up a few signaling and SIP content ports on a firewall creates a list of potential network attacks that is quite daunting indeed.
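
The gap between a port-based firewall and the attacks listed above can be illustrated with a minimal sketch. The rule set, the `FloodDetector` class, and the thresholds below are hypothetical choices made for illustration; port 5060 is the default SIP signaling port. The firewall happily passes every request on the open SIP port, so a flood is only visible to something that counts requests per source over time.

```python
import ipaddress
from collections import defaultdict, deque

SIP_PORT = 5060  # default SIP signaling port

# Hypothetical Layer 3/4 rule set: (allowed source network, allowed destination port).
ALLOW_RULES = [(ipaddress.ip_network("0.0.0.0/0"), SIP_PORT)]


def firewall_allows(src_ip: str, dst_port: int) -> bool:
    """Classic allow/deny decision using only the source address and service port."""
    addr = ipaddress.ip_address(src_ip)
    return any(addr in net and dst_port == port for net, port in ALLOW_RULES)


class FloodDetector:
    """Sliding-window request counter per source (illustrative thresholds)."""

    def __init__(self, max_requests: int, window_seconds: float):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self.history = defaultdict(deque)

    def is_flood(self, src_ip: str, now: float) -> bool:
        times = self.history[src_ip]
        times.append(now)
        # Forget requests that have aged out of the window.
        while times and now - times[0] > self.window_seconds:
            times.popleft()
        return len(times) > self.max_requests


detector = FloodDetector(max_requests=5, window_seconds=1.0)

# Eight requests in under a second from one source: all pass the firewall,
# but the sixth and later ones exceed the per-source budget.
passed_firewall = [firewall_allows("198.51.100.7", SIP_PORT) for _ in range(8)]
verdicts = [detector.is_flood("198.51.100.7", now=0.1 * i) for i in range(8)]
```

A per-source counter like this catches only the flood category. Distributed floods and the low-volume stealth attacks described above would stay under any single-source threshold, which is part of why the list is so daunting.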

Impact of false positives on real-time applications

Latency further complicates the attacks on voice and video systems. Voice and video are real-time data streams. That is, if packets are held up too long, they are useless to the system, resulting in poor video or incomprehensible audio. So, even in a case where no threats are occurring, the security equipment searching for these threats must not introduce latencies that cause poor quality video or audio.
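
As a rough illustration of this constraint, the sketch below treats each packet as having a network delay plus a security inspection delay, and discards any packet whose total exceeds a playout budget. The function name, the sample delays, and the 150 ms budget are assumptions chosen for the example, not figures from any particular product.

```python
PLAYOUT_BUDGET_MS = 150  # assumed end-to-end delay budget, for illustration only


def usable_packets(packets, playout_budget_ms=PLAYOUT_BUDGET_MS):
    """Split packets into those that arrive in time to play and those that do not.

    Each packet is a tuple: (sequence number, network delay ms, inspection delay ms).
    """
    on_time, late = [], []
    for seq, network_ms, inspection_ms in packets:
        total_ms = network_ms + inspection_ms
        if total_ms <= playout_budget_ms:
            on_time.append(seq)
        else:
            late.append(seq)  # arrives after its playout slot, so it is useless
    return on_time, late


# No attack is under way here; the only variable is how long inspection takes.
packets = [(1, 60, 5), (2, 60, 120), (3, 140, 10), (4, 100, 60)]
on_time, late = usable_packets(packets)
```

Packets 2 and 4 are lost to the listener purely because of inspection time, even though nothing malicious was on the wire.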

I mention this because one approach using traditional security devices might be to treat everything coming in on the new voice/video/signaling ports as a possible attack. When packets come in, security devices inspect the packets at the content level, then pass or drop the packets. Unfortunately, the latency this approach introduces makes the service itself unusable. Further, the dynamic nature of a voice and video environment (for example, I might order a bunch of movies over the Internet on my day off or when I'm sick, but most other days I never get a movie) versus a more predictable corporate LAN environment (for example, a data synchronization job happens every weekday from 2:00 p.m. to 3:00 p.m., so the firewall opens that port during that time and closes it at all other times) makes traditional peep-holing methods, where ports and/or services are only allowed at specific times, unusable. For instance, I do not want my emergency 911 call at 3:00 a.m. causing a false positive at my VoIP service provider firewall so that the call is disallowed because it is outside my calling profile.
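
The false positive is easy to reproduce in a sketch of a static peep-hole rule. The calling window and function name below are hypothetical; the point is only that a fixed schedule cannot tell an unusual legitimate call from an attack.

```python
from datetime import time

# Hypothetical per-subscriber calling profile: calls are expected 08:00 to 22:00.
CALLING_WINDOW = (time(8, 0), time(22, 0))


def peephole_allows(call_time: time, window=CALLING_WINDOW) -> bool:
    """Static rule: permit traffic only inside the configured time window."""
    start, end = window
    return start <= call_time < end


evening_call = peephole_allows(time(19, 30))       # inside the profile
emergency_call_3am = peephole_allows(time(3, 0))   # outside the profile
```

Here `emergency_call_3am` evaluates to `False`: the rule blocks exactly the call that must never be blocked.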

Flexible solutions to new threats

Sipera Systems (Richardson, Texas) is addressing these new voice and multimedia service threats. Figure 1 shows Sipera's view on these new threats as they extend beyond traditional intrusion detection and firewall systems. Thanks go out to Brendan Ziolo, director of marketing, and Krishna Kurapati, chief technology officer of Sipera Systems, for their contributions to this discussion.

Figure 1

Krishna mentions three key characteristics for handling new threats effectively within multimedia networks. These include real-time performance, not being a point of failure, and behavior learning and verification.

Developing threat detection and mitigation systems that consider real-time performance issues is more easily said than done. The appliance typically needs specialized hardware in order to ensure processing latencies are within tolerance of the real-time network. Further, encryption is also a consideration. The appliance may need to securely store and manage encryption keys in order to decrypt packets at wire speed and within real-time specifications of the traffic flows.

The goal of all network system components is to eliminate the possibility of the component being a point of failure within the network. However, many of these new multimedia services such as VoIP are mission critical applications. Laws governing service availability for telecommunications systems now apply to the Internet. So, high availability and fail-safe bypass features are important to have in these real-time security systems.

Perhaps the most intriguing of these new characteristics is behavior learning and verification capabilities. The idea behind this is to start with some user-definable multimedia profile, then augment this baseline with the ability for the equipment to learn call patterns and endpoint fingerprints in order to analyze and counter any possible threat. This kind of dynamic learning within the equipment is key to identifying VoIP spam or detecting any number of the attacks described previously.
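
One way to picture the idea is a profile that starts from an administrator-supplied baseline and then drifts toward observed behavior. The sketch below uses an exponential moving average and a fixed tolerance multiplier; both are assumptions chosen for illustration and are not a description of how Sipera's equipment learns. The class and parameter names are hypothetical.

```python
class CallProfile:
    """Per-subscriber call-rate profile that learns from observed traffic."""

    def __init__(self, baseline_calls_per_hour: float,
                 alpha: float = 0.2, tolerance: float = 3.0):
        self.expected = baseline_calls_per_hour  # user-definable starting point
        self.alpha = alpha                       # how quickly the profile adapts
        self.tolerance = tolerance               # multiple of expected rate allowed

    def is_anomalous(self, observed_calls_per_hour: float) -> bool:
        return observed_calls_per_hour > self.expected * self.tolerance

    def learn(self, observed_calls_per_hour: float) -> None:
        # Exponential moving average: blend the new observation into the baseline.
        self.expected = ((1 - self.alpha) * self.expected
                         + self.alpha * observed_calls_per_hour)


profile = CallProfile(baseline_calls_per_hour=4.0)
for observed in (5, 6, 5):          # ordinary hours refine the baseline
    if not profile.is_anomalous(observed):
        profile.learn(observed)

busy_hour_is_anomaly = profile.is_anomalous(6)    # close to learned behavior
burst_is_anomaly = profile.is_anomalous(40)       # far outside learned behavior
```

Learning only from observations that were not flagged keeps an attack from teaching the profile that attack-level traffic is normal.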

Threat detection and mitigation solutions

Sipera Systems' IPCS product line is a threat detection and mitigation system targeted at the real-time issues of multimedia services on the Internet. The product line ranges from a 10 Mb appliance for up to 200 users through a 2 Gb appliance handling 100,000 subscribers. Figure 2 illustrates how the IPCS works.

Figure 2

The idea behind the Sipera Systems product line is for the network provider to have one product with a comprehensive suite that looks for all the attacks described earlier. The IPCS is deployed hand-in-hand with the VoIP network to detect anomalies and learn behaviors to create profiles of the subscribers being protected.

IPCS is in-line but transparent within the network. High availability and fault tolerance are built into IPCS such that failures revert to pass-all, so IPCS will not become a point of failure. IPCS can be configured to learn automatically or to stick to a fixed, configurable profile. In a typical application, Krishna mentioned, the system administrator might set some policies through a web-based interface, and the rest is done by automatic detection.

An anomaly may be detected by IPCS when a call occurs outside of the call profile for a given user. When an anomaly occurs, the IPCS will challenge the SIP endpoint to determine if it is a regular endpoint or a spoofed endpoint. In the case of VoIP spam, IPCS will run what Sipera calls a "VoIP Turing test." This test is used to determine if the caller is a person or a machine. The IPCS will play a message that asks a question of the caller. If the caller answers correctly, the call is allowed. If no answer is returned, the call is blocked.

Special care is taken in IPCS to ensure that blocking the call is appropriate. In this new, VoIP-enabled environment, it is unacceptable to block even one legitimate call. So, a series of escalating tests is performed on potentially blocked calls to weed out false positives. If these escalating tests still result in a blocked call, the IT person is notified of the anomaly and can either allow the call in the future or add a policy to block the call immediately.
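
The escalation described above might be organized along the following lines. This is a sketch under stated assumptions rather than Sipera's actual logic: the function names and call fields are hypothetical, and it assumes that passing any one stage clears the call as a false positive, while failing every stage blocks the call and alerts the administrator.

```python
def screen_call(call, in_profile, tests, notify_admin):
    """Return 'allow' or 'block', escalating through tests when a call is anomalous."""
    if in_profile(call):
        return "allow"
    for test in tests:            # ordered from least to most intrusive
        if test(call):
            return "allow"        # cleared: the anomaly was a false positive
    notify_admin(call)            # every test failed, so flag it for review
    return "block"


def endpoint_challenge(call) -> bool:
    """Stage 1 (hypothetical): does the endpoint respond correctly to a challenge?"""
    return call.get("answers_challenge", False)


def turing_test(call) -> bool:
    """Stage 2 (hypothetical): did the caller answer the spoken question correctly?"""
    expected = call.get("expected_answer")
    return expected is not None and call.get("spoken_answer") == expected


def in_profile(call) -> bool:
    # Illustrative profile: calls are expected between 08:00 and 22:00.
    return 8 <= call["hour"] < 22


alerts = []
tests = [endpoint_challenge, turing_test]

late_human = {"hour": 3, "answers_challenge": False,
              "spoken_answer": "seven", "expected_answer": "seven"}
robocall = {"hour": 3, "answers_challenge": False,
            "spoken_answer": None, "expected_answer": "seven"}

results = [screen_call(c, in_profile, tests, alerts.append)
           for c in (late_human, robocall)]
```

The late-night human caller is anomalous but answers the question and is let through; the machine returns no answer, is blocked, and lands in `alerts` for the administrator to review.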

IPCS includes a number of logging capabilities as well. Instant messages and SIP call control messages can all be logged in an attempt to determine how policies should be set for different users on the VoIP network.

The IPCS can be managed using a centralized web-based management system. Policies and parameters are pre-programmed, so the device is useful without any kind of configuration. However, call profiles and user parameters are tunable by the administrator through a web-based interface. The administrator can view the rule set and enable or disable rules from the list. Another configuration page provides the ability to set tunable parameters. Higher-level rule sets or profiles called policies can be selected. Policies cover a particular domain such as a group of users. As policies are selected for different groups of users, the system can treat each domain differently to maximize flexibility.
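
The relationship between rules, tunable parameters, policies, and domains can be modeled roughly as follows. Every name here (the rule names, the `Policy` class, the domain labels, the default values) is invented for illustration and does not reflect the actual IPCS configuration schema.

```python
from dataclasses import dataclass, field

# Hypothetical pre-programmed rule set, all enabled out of the box.
DEFAULT_RULES = {
    "flood_detection": True,
    "spoof_challenge": True,
    "spam_turing_test": True,
}


@dataclass
class Policy:
    """A named, higher-level rule set with tunable parameters."""
    name: str
    rules: dict = field(default_factory=lambda: dict(DEFAULT_RULES))
    max_calls_per_hour: int = 20   # example of a tunable parameter

    def set_rule(self, rule: str, enabled: bool) -> None:
        if rule not in self.rules:
            raise KeyError(f"unknown rule: {rule}")
        self.rules[rule] = enabled


policies = {
    "default": Policy("default"),
    "call_center": Policy("call_center", max_calls_per_hour=500),
}
# The administrator disables one rule for a group that legitimately calls in bulk.
policies["call_center"].set_rule("spam_turing_test", False)

# Each domain (group of users) is mapped to a policy; others fall back to the default.
domain_policy = {"sales-team": "call_center"}


def policy_for(domain: str) -> Policy:
    return policies[domain_policy.get(domain, "default")]
```

Because the default policy is usable as shipped, an unconfigured domain still gets full protection, while a configured one gets its own thresholds and rule selection.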

Taking a look under the hood reveals that the IPCS uses a hardened version of Linux for configuration and management functions and a proprietary real-time embedded operating system for the real-time signaling and media content processing.

Separate software tools are also available from Sipera. Called LAVA tools, they can run on a VoIP network and perform threat auditing to determine that network's readiness to resist attacks.


It is important to stop and think about threat and security issues when deploying multimedia voice and video services. The mission critical nature of voice service over the Internet demands the ability to deter threats without compromising services.

For more information, contact Curt at [email protected].