Deep packet inspection (DPI) platforms have evolved from static rules-based filtering engines into sophisticated enforcement points that allow packet and protocol classification, prioritization and shaping. Ubiquitous in enterprise and telco networks, they are the jack-of-all-trades of traffic management, supporting use cases as diverse as policy enforcement, adult content filtering, lawful interception, QoS management, and peer-to-peer throttling or interdiction.
DPIs rely first on a robust classification engine. It snoops through data traffic and classifies each packet based on port, protocol, interface, origin, destination, etc. The more sophisticated engines go beyond layer 3 and recognize classes of traffic from their headers. This classification engine is sufficient for inspecting most traffic types, from web browsing to email, from VoIP to video conferencing or peer-to-peer sharing.
The premise here is that if you can recognize, classify and tag traffic accurately, then you can apply rules governing its delivery, ranging from interdiction to authorization, with many variants of shaping in between.
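To make the classify-then-enforce premise concrete, here is a minimal sketch. It is not how any particular DPI product works: the port table, the class names, the policy table and the `Packet` type are all assumptions invented for illustration, and a real engine inspects far more than the destination port.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Packet:
    """Hypothetical, heavily simplified view of a packet's header fields."""
    src_ip: str
    dst_ip: str
    dst_port: int
    protocol: str  # "tcp" or "udp"


# Assumed mapping from (protocol, destination port) to a traffic class.
PORT_CLASSES = {
    ("tcp", 80): "web",
    ("tcp", 25): "email",
    ("udp", 5060): "voip",
    ("tcp", 554): "rtsp_video",
    ("tcp", 6881): "p2p",
}

# Assumed policy table: (action, rate cap in kbps or None).
POLICIES = {
    "web": ("allow", None),
    "email": ("allow", None),
    "voip": ("allow", None),
    "rtsp_video": ("shape", 1500),
    "p2p": ("block", None),
}


def classify(packet: Packet) -> str:
    """Tag a packet with a traffic class; anything unrecognized is 'unknown'."""
    return PORT_CLASSES.get((packet.protocol, packet.dst_port), "unknown")


def policy_for(packet: Packet) -> Tuple[str, Optional[int]]:
    """Look up the delivery rule for a packet; unknown traffic is allowed as-is."""
    return POLICIES.get(classify(packet), ("allow", None))


rtsp = Packet("10.0.0.7", "203.0.113.20", 554, "tcp")
torrent = Packet("10.0.0.7", "198.51.100.9", 6881, "tcp")
print(classify(rtsp), policy_for(rtsp))
print(classify(torrent), policy_for(torrent))
```

The three outcomes in the policy table (allow, shape, block) mirror the range described above, from authorization to interdiction with shaping in between.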
DPI falls short in many cases when it comes to video streaming. Until 2008 or so, most video streaming relied on specialized protocols such as RTSP. Classification was easy, as the videos were all encapsulated in a specific protocol, which made instantiating and enforcing rules pretty straightforward. The emergence and predominance of HTTP-based streaming video (progressive download, adaptive streaming and variants) has complicated the task for DPIs. The transport protocol is the same as for general web traffic, but the behaviour is quite different. As we have seen many times in this blog, video traffic must be measured differently from generic data traffic if policy enforcement is to be implemented. All packets are not created equal.
- The first challenge is to recognise that a packet is video. DPIs generally infer the nature of an HTTP packet from its origin or destination. For instance, if they see that the traffic originates from YouTube, they assume that it is video. This is insufficient: not all YouTube traffic is video streaming (browsing between pages, reading or posting comments, uploading a video, liking or disliking...). Applying video rules to browsing traffic, or vice versa, can have adverse consequences on the user experience.
- The second challenge is policy enforcement. The main tool in the DPI arsenal for traffic shaping is setting the delivery bit rate for a specific class of traffic. As we have seen, videos come in many definitions (4K, HD, SD, QCIF...), many containers and many formats, resulting in a wide variety of encoding bit rates. If you want to shape your video traffic, it is crucial that you know all these elements, and the encoding bit rate in particular, because if traffic is throttled below the encoding rate, the video stalls and buffers, or times out. It is not reasonable to have a one-size-fits-all policy for video (unless it is to forbid usage). In order to extract the video-specific attributes of a session, you need to decode it, which requires in-line transcoding capabilities, even if you do not intend to modify that video.
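The two challenges above can be illustrated with a small sketch. Everything in it is an assumption made for illustration: the host list, the function names and the headroom value are hypothetical, and the `Content-Type` check is only a cheap stand-in for the decoding that is actually needed to measure a video. The encoding bit rate is simply passed in, as if it had already been extracted.

```python
# Hypothetical list of hosts a DPI might treat as "video" by inference alone.
VIDEO_HOSTS = {"youtube.com"}


def infer_video_by_host(host: str) -> bool:
    """Origin-based inference: every response from a listed host counts as video."""
    return any(host == h or host.endswith("." + h) for h in VIDEO_HOSTS)


def is_video_response(content_type: str) -> bool:
    """Look at the HTTP Content-Type header instead of the origin (simplified)."""
    media_type = content_type.split(";")[0].strip().lower()
    return media_type.startswith("video/")


def will_stall(shaped_kbps: int, encoding_kbps: int) -> bool:
    """A stream delivered below its encoding bit rate cannot keep its buffer full."""
    return shaped_kbps < encoding_kbps


def safe_shaping_rate(target_kbps: int, encoding_kbps: int,
                      headroom_percent: int = 10) -> int:
    """Never shape below the encoding rate plus an assumed headroom margin."""
    floor_kbps = encoding_kbps * (100 + headroom_percent) // 100
    return max(target_kbps, floor_kbps)


# A comment page served by a video site: the host says video, the payload does not.
print(infer_video_by_host("www.youtube.com"), is_video_response("text/html; charset=utf-8"))

# A blanket 500 kbps cap applied to a 1000 kbps stream, then a per-video floor.
print(will_stall(500, 1000), safe_shaping_rate(500, 1000))
```

The point of `safe_shaping_rate` is that the cap depends on each video's own encoding rate, which is why a single rate for the whole "video" class does not work.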
Herein lies the difficulty. To implement intelligent, sophisticated traffic management rules today, you need to be able to handle video. To handle video, you need to recognize it (not infer or assume), and measure it. To recognize and measure it, you need to decode it. This is one of the reasons why Allot bought Ortiva Wireless in 2012, Procera partnered with Skyfire and, more recently, ByteMobile upgraded their video inspection to full-fledged DPI. We will see more generic traffic management vendors (PCRF, PCEF, DPI...) partner with or acquire video transcoding companies.