The technology is fundamental to help accommodate the growth of online video delivery over unmanaged (OTT) and wireless networks.
The implementation is as follow: a video file is encoded into different streams, at different bit rates. The player can "jump" from one stream to the other, as the condition of the transmission degrades or improves. A manifest document is exchanged between the server and the player at the establishment of the connection for the player to understand the list of versions and bit rates available for delivery.
Unfortunately, the main content delivery technology vendors then started to diverge from the standard implementation to differentiate and control better the user experience and the content provider community. We have reviewed some of these vendor strategies here. Below are the main implementations:
- Apple HTTP Adaptive (Live) streaming (HLS) for iPhone and iPad: This version is implemented over HTTP and MPEG2 TS. It uses a proprietary manifest called m3u8. Apple creates different versions of the same streams (2 to 6, usually) and breaks down the stream into little “chunks” to facilitate the client jumping from one stream to the other. This results in thousands of chunks for each stream, identified through timecode.Unfortunately, the content provider has to deal with the pain of managing thousands of fragments for each video stream. A costly implementation.
- Microsoft IIS Smooth Streaming (Silverlight Windows phone 7): Microsoft has implemented fragmented MP4 (fMP4), to enable a stream to be separated in discrete fragments, again, to allow the player to jump from one fragment to the other as conditions change. Microsoft uses AAC for audio and AVC/H264 for video compression. The implementation allows to group each video and audio stream, with all its fragments in a single file, providing a more cost effective solution than Apple's.
- Adobe HTTP Dynamic Streaming (HDS) for Flash: Adobe uses a proprietary format called F4F to allow delivery of flash videos over RTMP and HTTP. The Flash Media Server creates multiple streams, at different bit rate but also different quality levels. Streams are full lengths (duration of video).
None of the implementations above are inter-operable, from a manifest or from a file perspective, which means that a content provider with one 1080p HD video could see himself creating one version for each player, multiplied by the number of streams to accommodate the bandwidth variation, multiplied by the number of segments, chunks or file for each version... As illustrated above, a simple video can result in 18 versions and thousand of fragments to manage. This is the reason why only 4 to 6% of current videos are transmitted using ABR. The rest of the traffic uses good old progressive download, with no capacity to adapt to changes in bandwidth, which explains in turn why wireless network operators (over 60 of them) have elected to implement video optimization systems in their networks. We will look, in my next posts, at the pros and cons of ABR and the complementary and competing technologies to achieve the same goals.
Find part II of this post here.
Find part II of this post here.