The TM Forum has been instrumental in defining the journey towards automation and autonomous telco networks.
As telco revenues from consumers continue to decline and the 5G promise to create connectivity products that enterprises, governments and large organizations will be able to discover, program and consume remains elusive, telecom operators are under tremendous pressure to maintain profitability.
The network evolution started with Software Defined Networks, Network Functions Virtualization and more recently Cloud Native evolution aims to deliver network programmability for the creation of innovative on-demand connectivity services. Many of these services require deterministic connectivity parameters in terms of availability, bandwidth, latency, which necessitate end to end cloud native fabric and separation of control and data plane. A centralized control of the cloud native functions allow to abstract resource and allocate them on demand as topology and demand evolve.
A benefit of a cloud native network is that, as software becomes more open and standardized in a multi vendor environment, many tasks that were either manual or relied on proprietary interfaces can now be automated at scale. As layers of software expose interfaces and APIs that can be discovered and managed by sophisticated orchestration systems, the network can evolve from manual, to assisted, to automated, to autonomous functions.
TM Forum defines 5 evolution stages from full manual operation to full autonomous networks.
- Condition 0 - Manual operation and maintenance: The system delivers assisted monitoring capabilities, but all dynamic tasks must be 0 executed manually
- Step 1 - Assisted operations and maintenance: The system executes a specific, repetitive subtask based on pre-configuration, which can be recorded online and traced, in order to increase execution efficiency.
- Step 2: - Partial autonomous network: The system enables closed-loop operations and maintenance for specific units under certain external environments via statically configured rules.
- Step 3 - Conditional autonomous network: The system senses real-time environmental changes and in certain network domains will optimize and adjust itself to the external environment to enable, closed-loop management via dynamically programmable policies.
- Step 4 - Highly autonomous network: In a more complicated cross-domain environment, the system enables decision-making based on predictive analysis or active closed-loop management of service-driven and customer experience-driven networks via AI modeling and continuous learning.
- Step 5 - Fully autonomous network: The system has closed-loop automation capabilities across multiple services, multiple domains (including partners’ domains) and the entire lifecycle via cognitive self-adaptation.
The stated goals of level 4 are to enable the creation and roll out of new services within 1 week with deterministic SLAs and the delivery of Network as a service. Furthermore, this level should allow fewer personnel to manage the network (1000's of person-year) while reducing energy consumption and improving service availability.
These are certainly very ambitious objectives. The paper goes on to describe "high value scenarios" to guide level 4 development. This is where we start to see cognitive dissonance creeping in between the stated objectives and the methodology. After all, much of what is described here exists today in cloud and enterprise environments and I wonder whether Telco is once again reinventing the wheel in trying to adapt / modify existing concepts and technologies that are already successful in other environments.
First, the creation of deterministic connectivity is not (only) the product of automation. Telco networks, in particular mobile networks are composed of a daisy chain of network elements that see customer traffic, signaling, data repository, look up, authentication, authorization, accounting, policy management functions being coordinated. On the mobile front, the signal effectiveness varies over time, as weather, power, demand, interferences, devices... impact the effective transmission. Furthermore, the load on the base station, the backhaul, the core network and the internet peering point also vary over time and have an impact on its overall capacity. As you understand, creating a connectivity product with deterministic speed, latency capacity to enact Network as a Service requires a systemic approach. In a multi vendor environment, the RAN, the transport, the core must be virtualized, relying on solid fiber connectivity as much as possible to enable the capacity and speed. The low latency requires multiple computing points, all the way to the edge or on premise. The deterministic performance requires not only virtualization and orchestration of the RAN, but also the PON fiber and end to end slicing support and orchestration. This is something that I led at Telefonica with an open compute edge computing platform, a virtualized (XGS) PON on a ONF ONOS VOLTHA architecture with an open virtualized RAN. This was not automated yet, as most of these elements were advanced prototype at that stage, but the automation is the "easy" part once you have assembled the elements and operated them manually for enough time. The point here is that deterministic network performances is attainable but still a far objective for most operators and it is a necessary condition to enact NaaS, before even automation and autonomous networks.
Second, the high value scenarios described in the paper are all network-related. Ranging from network troubleshooting, to optimization and service assurance, these are all worthy objectives, but still do not feel "high value" in terms of creation of new services. While it is natural that automation first focuses on cost reduction for roll out, operation, maintenance, healing of network, one would have expected more ambitious "new services" description.
All in all, the vision is ambitious, but there is still much work to do in fleshing out the details and linking the promised benefits to concrete services beyond network optimization.