
Tuesday, June 21, 2016

SDN / NFV: Enemy of the state

Extracted from my SDN and NFV in wireless workshop.

I want to talk today about an interesting subject I have seen popping up over the last six months or so, including in many presentations in the stream I chaired at the NFV World Congress a couple of months ago.

In NFV, and to a certain extent in SDN as well, service availability is achieved through a combination of function redundancy and fast failover routing whenever a failure is detected in the physical or virtual fabric. Availability is a generic term, though, and covers different expectations depending on whether you are a consumer, an operator or an enterprise. The telecom industry has heralded the mythical 99.999%, or five nines, availability as the target for telecoms equipment vendors to reach.

This goal has led to networks and appliances that are super redundant, at the silicon, server, rack and geographical levels, with complex routing, load balancing and clustering capabilities to guarantee that element failures do not catastrophically impact services. In today's cloud networks, one arrives at the conclusion that a single cloud, even tweaked, cannot perform beyond three nines of availability and that you need a multi-cloud strategy to attain five nines of service availability...
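
As a rough illustration of why a multi-cloud strategy gets you there, here is a back-of-the-envelope calculation, assuming (simplistically) that each cloud fails independently and that the service survives as long as at least one cloud is up:

# Back-of-the-envelope availability math, assuming independent cloud failures
# and an active-active setup where the service is up if at least one cloud is up.

def combined_availability(per_cloud_availability, clouds):
    """Probability that at least one of `clouds` independent clouds is up."""
    downtime_probability = (1.0 - per_cloud_availability) ** clouds
    return 1.0 - downtime_probability

single = 0.999  # a single cloud at "three nines"
for n in (1, 2, 3):
    a = combined_availability(single, n)
    minutes_down_per_year = (1.0 - a) * 365 * 24 * 60
    print(f"{n} cloud(s): {a:.6%} available, ~{minutes_down_per_year:.1f} min downtime/year")

# Under this (idealized) independence assumption, two three-nines clouds
# already exceed five nines, which is the point of the multi-cloud argument.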

Consumers, over the last ten years, have proven increasingly ready to accept a service that might not always be of the best quality if the price point is low enough. We all remember the early days of Skype, when we would complain about failed and dropped calls or voice distortions, but we all put up with it, mostly because it was free-ish. As the service quality improved, new features and subscription schemes were added, allowing for new revenues as consumers adopted new services.
One could think from that example that maybe it is time to relax the five nines edict for telecoms networks, but there are two data points that run counter to that assumption.


  1. The first and most prominent reason to keep a high level of availability is actually a regulatory mandate. Network operators run not only a commercial network but also a series of critical infrastructure services for emergency and government use. It is easy to think that 95 or 99% availability is sufficient until you have to deliver 911 calls, where that percentage difference means loss of life.
  2. The second reason is more innate to network operators themselves. Year after year, polls show that network operators believe that the way they will outcompete each other and OTTs in the future is quality of service, where service availability is first among the table stakes.


As I write this post, SDN and NFV in wireless have, over the last few years, worked their way from demonstrating basic load balancing and static traffic routing to function virtualization and auto-scaling. What is left to get commercial-grade (and telco-grade) offerings is resolving the orchestration bit (I'll write another post on the battles in this segment) and creating a service that is both scalable and portable.

The portable bit is important, as a large part of the value proposition is to be able to place functions and services closer to the user or the edge of the network. To do that, an orchestration system has to be able to detect what needs to be consumed where and to place and chain relevant functions there.
Many vendors can demonstrate that part. The difficulty arises when it becomes necessary to scale a function in or down, or when there is a failure.
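
A minimal sketch of what that placement decision can look like, with hypothetical site names, capacities and a made-up service chain, assuming the orchestrator simply picks the closest site that can host the whole chain:

# Hypothetical sketch of edge placement and function chaining.
# Site names, capacities and the service chain below are illustrative only.

sites = [
    {"name": "core-dc",     "distance_to_user": 500, "free_cpu": 64},
    {"name": "metro-pop",   "distance_to_user": 50,  "free_cpu": 16},
    {"name": "edge-site-1", "distance_to_user": 5,   "free_cpu": 4},
]

# An ordered service chain, e.g. for a video session.
service_chain = [
    {"function": "firewall",        "cpu": 1},
    {"function": "video-optimizer", "cpu": 2},
    {"function": "nat",             "cpu": 1},
]

def place_chain(sites, chain):
    """Pick the closest site with enough free CPU for the whole chain."""
    needed = sum(vnf["cpu"] for vnf in chain)
    for site in sorted(sites, key=lambda s: s["distance_to_user"]):
        if site["free_cpu"] >= needed:
            return site["name"]
    return None  # no single site can host the chain; a real orchestrator would split it

print(place_chain(sites, service_chain))  # -> "edge-site-1"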

Physical and virtual function failures are to be expected. When they arise in today's systems, there is a loss of service, at least for the users that were using these functions. In some cases, the loss is transient and a new request / call will be routed to another element the second time around; in other cases, it is permanent and the session / service cannot continue until another one is started.

In the case of scaling in or down, most vendors today will starve the virtual function and route all new requests to other VMs until the function can be shut down without impacting live traffic. It is not the fastest or the most efficient way to manage traffic: you essentially lose all the elasticity benefits on scale-down if you have to manage these moribund zombie-VNFs until they are ready to die.
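
In pseudo-code terms, that "starve and wait" approach looks roughly like the sketch below (the instance and load balancer objects, and the timing, are hypothetical); the zombie instance keeps consuming resources until its last session ends:

import time

# Illustrative sketch of the "drain then terminate" scale-in approach.
# The instance and load_balancer objects and their methods are hypothetical.

def scale_in(instance, load_balancer, poll_interval=5):
    """Stop feeding an instance and only terminate it once it is idle."""
    load_balancer.remove(instance)          # 1. no new requests or sessions
    while instance.active_sessions() > 0:   # 2. wait for live traffic to drain
        time.sleep(poll_interval)           #    the "zombie" still burns resources here
    instance.terminate()                    # 3. only now is capacity actually released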

Vendors and operators who have been looking at these issues have come to a conclusion: beyond the separation of control and data plane, it is necessary to further separate the state of each machine, function and service, and to centralize it, in order to achieve consistent availability, true elasticity and manageable disaster recovery scenarios.
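
A minimal sketch of what "separating state" means in practice: the function keeps no session data locally and reads and writes it to a central store, so any instance (or a replacement after a failure) can pick the session up. The plain dict below stands in for a distributed database or state controller:

# Sketch of a stateless VNF instance with externalized session state.
# The dict below stands in for a centralized / distributed state store.

central_state_store = {}

class StatelessVnfInstance:
    def __init__(self, instance_id, store):
        self.instance_id = instance_id
        self.store = store          # shared across all instances and clouds

    def handle_packet(self, session_id, payload):
        # Fetch the session from the central store, not from local memory.
        session = self.store.get(session_id, {"packets": 0})
        session["packets"] += 1
        session["last_instance"] = self.instance_id
        self.store[session_id] = session   # write back so any instance can resume
        return f"session {session_id} handled by {self.instance_id}"

# Any instance can serve the session; if vnf-a fails, vnf-b resumes it.
vnf_a = StatelessVnfInstance("vnf-a", central_state_store)
vnf_b = StatelessVnfInstance("vnf-b", central_state_store)
vnf_a.handle_packet("sess-1", b"...")
print(vnf_b.handle_packet("sess-1", b"..."))     # picks up where vnf-a left off
print(central_state_store["sess-1"]["packets"])  # -> 2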

In most cases, this is a complete redesign for vendors. Many of them have already struggled to port their product to software, then to a hypervisor, then to optimize it for performance... separating state from the execution environment is not going to be just another port. It is going to require redesign and re-architecting.

The cloud-native vendors who have designed their platform with microservices and modularity in mind have a better chance, but there is still a series of challenges to be addressed. Namely, collecting state information from every call in every function, centralizing it and then redistributing it is going to create a lot of signalling traffic. Some vendors are advocating inline signalling capabilities to convey the state information in a tokenized fashion; others are looking at more sophisticated approaches, including state controllers that will collect, transfer and synchronize the relevant state across clouds.
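
For the inline, tokenized approach, the idea is roughly the one sketched below: instead of querying a central store on every call, the relevant session state travels with the signalling itself as a compact token. This is a hand-rolled illustration, not any vendor's actual scheme:

import base64
import json

# Illustrative tokenized-state scheme: state rides along with each request
# instead of being fetched from a central store (not a real vendor format).

def encode_state_token(state: dict) -> str:
    """Serialize session state into a compact token carried in signalling."""
    return base64.urlsafe_b64encode(json.dumps(state).encode()).decode()

def decode_state_token(token: str) -> dict:
    return json.loads(base64.urlsafe_b64decode(token.encode()).decode())

# A function instance updates the state and hands the new token downstream.
token = encode_state_token({"session": "sess-1", "bearer": "video", "hops": 1})
state = decode_state_token(token)
state["hops"] += 1
next_token = encode_state_token(state)
print(decode_state_token(next_token))  # any instance can reconstruct the session

# A real scheme would also need integrity protection (signing) and size limits,
# which is part of the signalling overhead mentioned above.
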
In any case, it looks like there is still quite a lot of work to be done in creating truly elastic and highly available virtualized, software-defined networks.

Monday, October 19, 2015

SDN world 2015: unikernels, compromises and orchestrated obsolescence

Last week's Layer123 SDN and OpenFlow World Congress brought its usual slew of announcements and claims.

From my perspective, I came away from the show with contrasted impressions.

On one hand, it is clear that SDN has now transitioned from proof of concept to commercial trial, if not full commercial deployment, and operators are increasingly understanding the limits of open source initiatives such as OpenStack for carrier-grade deployments. The telling sign is the increasing number of companies specializing in high-performance, hardware-based switches for OpenFlow and other protocols.

It feels as though Open vSwitch has not yet hit its stride, notably in terms of performance, and operators are left with a choice: go open source, which is cost efficient but neither scalable nor performant, or compromise with best-of-breed, hardware-based, hardened switches that offer high performance and scalability but not yet the agility of a software-based implementation. What is new, however, is that operators seem ready to compromise for time to market, rather than wait for a possibly more open solution that may or may not deliver on its promises.

On the NFV front, I feel that many vendors have been forced to tone down their silly claims in terms of performance, agility and elasticity. It is quite clear that many of them have been called to prove themselves in operators' labs and have failed to deliver. In many cases, vendors are able to demonstrate agility through VM porting / positioning using either their VNFM or an orchestrator integration; they are even, in some cases, able to show some level of elasticity with auto-scaling powered by their own EMS; and many have put out press releases boasting Gbps or Tbps or millions of simultaneous sessions of capacity...
... but few are able to demonstrate all three at the same time, since their performance achievements have, in many cases, relied on SR-IOV to bypass the hypervisor layer, which ties the VM to the underlying hardware in a manner that makes agility and elasticity extremely difficult to achieve.
Operators, here again, seem bound to compromise between performance and agility if they want to accelerate their time to market.

Operators themselves came in droves to show their progress on the subject, but I felt a distinct change in tone in terms of their capacity to effectively get vendors to deliver on the promises of the successive NFV white papers. One issue lies squarely with the operators' own attitude: many MNOs display unrealistic and naive expectations. They say that they are investing in NFV as a means to attain vendor independence, but they are unwilling to perform any integration themselves. It is very unlikely that large telecom equipment manufacturers will willingly help deconstruct their own value proposition by offering commoditized, plug-and-play, open-interfaced virtualized functions.

SDN and NFV integration is still dirty work. Nothing really performs at line rate without optimization, and no agility, flexibility or scalability is really attained without fine-tuned integration. Operators won't realize the benefits of the technology if they don't get in on the integration work themselves.

Lastly, what is still missing from my perspective is a service creation strategy that would make use of a virtualized network. Most network operators still mention service agility and time to market as key drivers, but when asked what they would launch if their network were fully virtualized and elastic today, they quote disappointing early examples such as virtual (!?) VPN, security or broadband on demand... timid translations of existing "services" into a virtualized world. I am not sure most of the MNOs realize their competition is not each other but Google, Netflix, Uber, Facebook and others...
By the time those players launch free and unlimited voice, data and messaging services underpinned by advertising or sponsored models, it will be quite late to think of new services, even if the network is fully virtualized. It feels like MNOs are orchestrating their own obsolescence.

Finally, the latest buzzwords you must have in your presentation this quarter are:
The pets and cattle analogy,
SD-WAN,
5G

...and if you haven't yet formulated a strategy with respect to containers (Docker, etc.), don't bother, they're dead and the next big thing is unikernels. This and more in my latest report and workshop on "SDN NFV in wireless networks 2015 / 2016".

Wednesday, November 30, 2011

Mobixell update and EVO launch

Mobixell was founded in December 2000 to focus on mobile multimedia adaptation. Its first product, launched in 2002, was for MMS (Multimedia Messaging) adaptation and was sold through OEMs such as Huawei, Ericsson, NSN and others. It launched a mobile TV platform in 2008 and a mobile video optimization product in 2010. Along the way, Mobixell acquired Adamind in 2007 and 724 Solutions in 2010.


Mobixell has a 16% market share of the deployed base of video optimization engines. Nearly 18 months after the launch of the video optimization module in its Seamless Access product suite, Mobixell is launching EVO (for Evolved Optimization).


As a follow-up to my 360-degree review of the video optimization market and in anticipation of the release of my market report, I had a recent chat with Yehuda Elmaliach, CTO and co-founder at Mobixell, about their recent announcement introducing Mobixell EVO.


"We wanted to address the issue of scalability and large deployments in video optimization in a new manner. As traffic grows for Gbps to 10's and 100's of Gbps, we see optimization and particularly;y real-time transcoding as a very CPU intensive activity, which can require a lot of CAPEX. The traditional scaling model, of adding new blades, chassis, sites does not make sense economically if traffic grows according to projections."
Yehuda adds: "We wanted to move away from pure volume reduction, as a percentage saving of traffic across the line, to a more granular approach, focusing on congestion areas and peak hours."


Mobixell EVO is an evolution of Seamless Access video optimization that complements Mobixell's capabilities with cloud-based services and benefits. The current Seamless Access product sits on the Gi interface, after the GGSN, and performs traffic management, shaping and video optimization. The video optimization features at that level are real-time transcoding, dynamic bit rate adaptation, offline transcoding and caching. Mobixell EVO proposes to complement or replace this arrangement with a cloud-based implementation that will provide additional computational power and storage in an elastic and cost-effective manner for real-time transcoding and for a hierarchical caching system.


Yehuda adds: "We have launched this product based on customer feedback and demand. We do not see customers moving their infrastructure to the cloud only for the purpose of optimization, but for those who already have a cloud strategy, it fits nicely. EVO is built on the principles of virtualization, geometric and automatic scalability and self replication to take advantage of the cloud architecture. "


An interesting development for Mobixell. EVO has no commercial deployment yet and is planned to be generally available in Q2 2012, after current ongoing trials and proofs of concept. Mobixell sees this platform being deployed first within carriers' private clouds, then maybe using mixed private and public clouds. The idea is a waterfall implementation, where routine optimization is performed at the Gi level, then moves to private or public clouds as peaks and surges appear on the network. The idea has a certain elegance, particularly for operators that experience congestion in a very peaky, localized manner. In that case, a minimum investment can be made on the Gi side and complemented with cloud services as peaks reach certain thresholds, as sketched below. It will be interesting to see if Mobixell can live up to the promises of EVO, as security, bandwidth, latency and scalability can reduce the benefits of a mixed core / cloud implementation if not correctly addressed.
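
The waterfall idea can be sketched very simply: keep transcoding on the Gi-level appliances until a utilization threshold is hit, then spill over to the private cloud, then to the public cloud. The tiers, capacities and threshold below are made up for illustration, not Mobixell's actual design:

# Illustrative "waterfall" placement of transcoding jobs across tiers.
# Tier names, capacities and the 80% threshold are hypothetical.

tiers = [
    {"name": "gi-appliance",  "capacity": 100,    "load": 0},
    {"name": "private-cloud", "capacity": 400,    "load": 0},
    {"name": "public-cloud",  "capacity": 10_000, "load": 0},
]
THRESHOLD = 0.8  # spill over when a tier is 80% busy

def place_job(job_cost):
    """Place a transcoding job on the first tier still below the threshold."""
    for tier in tiers:
        if (tier["load"] + job_cost) / tier["capacity"] <= THRESHOLD:
            tier["load"] += job_cost
            return tier["name"]
    return "reject"  # everything is saturated

# Routine traffic stays on Gi; a surge overflows to the clouds.
print([place_job(10) for _ in range(12)])
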
Mobixell is the second vendor to launch cloud-based optimization, after Skyfire.

Tuesday, May 24, 2011

Cloud or vault?

My understanding of cloud computing is somewhat superficial. I am going to use the term in the next few posts to cover cloud computing, cloud services, software as a service (SaaS) and, no doubt, applications you would not immediately associate with "the cloud".


The cloud is a means to separate data from its processing and storage. Until recently, data and processing power had to be co-located. Word processors, for instance, are fat clients installed on a PC, used to create and edit documents that are destined to be stored on the same machine. Cloud computing allows these functions to be separated, so that, for instance, the storage and processing of the data are physically separated from its access and editing functions.
A browser, a thin client or an app presents content and data, while storage and computation happen in the cloud. This has been made possible by the increase in available fixed and wireless bandwidth and by two key concepts I develop below.


Elasticity
As applications and content require more and more processing power, and as the types of content and applications are in constant flux, we need a very flexible model for allocating, in near real time, capacity for the processing, delivery and management of content and applications.
This concept in cloud computing is called elasticity.
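
In practical terms, elasticity usually boils down to a control loop like the sketch below: measure demand, compute how many instances are needed, and scale up or down accordingly (the per-instance capacity and the demand figures are arbitrary):

import math

# Minimal illustration of an elasticity rule: instance count follows demand.
# The per-instance capacity and the demand values are arbitrary.

CAPACITY_PER_INSTANCE = 100  # e.g. requests per second one instance can serve
MIN_INSTANCES = 1

def instances_needed(demand):
    return max(MIN_INSTANCES, math.ceil(demand / CAPACITY_PER_INSTANCE))

for demand in (50, 250, 900, 120):   # demand fluctuating over time
    print(f"demand={demand:4d} -> {instances_needed(demand)} instance(s)")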



Fungibility
If you operate a server, for instance to stream video, it has finite capacity in terms of I/O, CPU, wattage, etc. When you reach the system's capacity, performance decreases and, in some cases, the application shuts down. Theoretically, in a cloud, you have a large number of servers that are not dedicated to one application in particular; as demand increases for one service, capacity can be captured from other resources. Of course, this means virtualization across applications and intelligent networks that can organically adapt to demand. Ideally, a large farm of servers or a collection of data centers presents a general pool of capacity, processing power, etc., that can be consumed in units, on demand, by the resident applications.

This concept is called fungibility. You might have a large server farm with several applications deployed concurrently in a virtualized environment. Ideally, the resources of the farm are dynamically allocated to each application as the demand for these resources varies over time.
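
Fungibility can be illustrated the same way: a fixed pool of resources is carved up between applications in proportion to their current demand, and the split changes as demand shifts (the pool size and application names are made up):

# Illustration of fungible resources: one shared pool, reallocated as demand shifts.
# Pool size and application names are made up.

POOL_CPUS = 1000

def allocate(pool, demands):
    """Split the pool between applications proportionally to their demand."""
    total = sum(demands.values())
    return {app: round(pool * d / total) for app, d in demands.items()}

print(allocate(POOL_CPUS, {"video-streaming": 600, "web": 300, "batch": 100}))
# -> {'video-streaming': 600, 'web': 300, 'batch': 100}

print(allocate(POOL_CPUS, {"video-streaming": 100, "web": 100, "batch": 800}))
# the same servers now mostly serve the batch application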

Cloud or vault?
Cloud computing is a great evolution. It enables us to use resources in a more efficient manner, reducing the fixed footprint for specific applications. What cloud computing is not, though, is free, unlimited resources. Cloud computing is still bound by the laws of physics, so if you need a lot of processing power or storage for an application, the fact that you are using cloud computing does not necessarily mean you are being more efficient. Cloud computing, in my mind, is particularly well suited for spiky, unpredictable, low-I/O, transactional content and apps.

I am not saying that the cloud is not ready for business-critical, high-bandwidth, high-I/O traffic, just that I am not. It is more a matter of mindset, maybe of generation, than of technology.



I feel more confident and more in control with a vault than a cloud. I would keep all my content, programs and apps in a vault that I can physically access myself, even if it is less efficient, more costly and ultimately less reliable than the cloud. That is, until the cloud is so prevalent, with so much redundancy and so many safety nets, that I could never lose one bit of data and the service could never be interrupted.

I am not ready to relinquish total control over my content and apps. I am less trained, equipped and capable than cloud service providers, but I will need to change my mindset to choose a cloud service over my vault.

I will provide a couple of examples of my best and worst experiences with cloud computing as a consumer in the next few posts.



In the meantime, please comment, are you cloud or vault?