Thursday, July 31, 2025

The Orchestrator Conundrum strikes again: Open RAN vs AI-RAN

10 years ago (?!) I wrote about the overlaps and potential conflicts of the different orchestration efforts between SDN and NFV. Essentially, observing that, ideally, it is desirable to orchestrate network resources with awareness of services and that service and resource orchestration should have hierarchical and prioritized interactions, so that a service deployment and lifecycle is managed within resource capacity and when that capacity fluctuates, priorities can be enforced.

Service orchestrators have not really been able to be successfully deployed at scale for a variety a reasons, but primarily due to the fact that this control point was identified early on as a strategic effort for network operators and traditional network vendors. A few network operators attempted to create an open source orchestration model (Open Source MANO), while traditional telco equipment vendors developed their own versions and refused to integrate their network functions with the competition. In the end, most of the actual implementation focused on Virtual Infrastructure Management (VIM) and vertical VNF management, while orchestration remained fairly proprietary per vendor. Ultimately, Cloud Native Network Functions appeared and were deployed in Kubernetes inheriting its native resource management and orchestration capabilities.

In the last couple of years, Open RAN has attempted to collapse RAN Element Management Systems (EMS), Self Organizing Networks (SON) and Operation Support Systems (OSS) with the concept of Service Management and Orchestration (SMO). Its aim is to ostensibly provide a control platform for RAN infrastructure and services in a multivendor environment. The non real time RAN Intelligent Controller (RIC) is one of its main artefacts, allowing the deployment of rApps designed to visualize, troubleshoot, provision, manage, optimize and predict RAN resources, capacity and capabilities.

This time around, the concept of SMO has gained substantial ground, mainly due to the fact that the leading traditional telco equipment manufacturers were not OSS / SON leaders and that Orchestration was an easy target for non RAN vendors wanting to find a greenfield opportunity. 

As we have seen, whether for MANO or SMO, the barriers to adoption weren't really technical but rather economic-commercial as leading vendors were trying to protect their business while growing into adjacent areas.

Recently, AI-RAN as emerged as an interesting initiative, positing that RAN compute would evolve from specialized, proprietary and closed to generic, open and disaggregated. Specifically, RAN compute could see an evolution, from specialized silicon to GPU. GPUs are able to handle the complex calculations necessary to manage a RAN workload, with spare capacity. Their cost, however, greatly outweighs their utility if used exclusively for RAN. Since GPUs are used in all sorts of high compute environments to facilitate Machine Learning, Artificial Intelligence, Large and Small Language Models, Models Training and inference, the idea emerged that if RAN deploys open generic compute, it could be used both for RAN workloads (AI for RAN), as well as workloads to optimize the RAN (AI on RAN and ultimately AI/ML workloads completely unrelated to RAN (AI and RAN).

While this could theoretically solve the business case of deploying costly GPUs in hundreds of thousands of cell site, provided that the compute idle capacity could be resold as GPUaaS or AIaaS, this poses new challenges from a service / infrastructure orchestration standpoint. AI RAN alliance is faced with understanding orchestration challenges between resources and AI workloads

In an open RAN environment. Near real time and non real time RICs deploy x and r Apps. The orchestration of the apps, services and resources is managed by the SMO. While not all App could be categorized as "AI", it is likely that SMO will take responsibility for AI for and on RAN orchestration. If AI and RAN requires its own orchestration beyond K8, it is unlikely that it will be in isolation from the SMO.

From my perspective, I believe that the multiple orchestration, policy management and enforcement points will not allow a multi vendor environment for the control plane. Architecture and interfaces are still in flux, specialty vendors will have trouble imposing their perspective without control of the end to end architecture. As a result, it is likely that the same vendor will provide SMO, non real time RIC and AI RAN orchestration functions (you know my feelings about near real time RIC)

If you make the Venn diagram of vendors providing / investing in all three, you will have a good idea of the direction the implementation will take.

Wednesday, April 16, 2025

Is AI-RAN the future of telco?

 AI-RAN has emerged recently as an interesting evolution of telecoms networks. The Radio Access Network (RAN) has been undergoing a transformation over the last 10 years, from a vertical, proprietary highly concentrated market segment to a disaggregated, virtualized, cloud native ecosystem.

Product of the maturation of a number of technologies, including telco cloudification, RAN virtualization and open RAN and lately AI/ML, AI-RAN has been positioned as a means to disaggregate and open up further the RAN infrastructure.

This latest development has to be examined from an economic standpoint. RAN accounts roughly for 80% of a telco deployment (excluding licenses, real estate...) costs. 80% of these costs are roughly attributable to the radios themselves and their electronics. The market is dominated by few vendors and telecom operators are exposed to substantial supply chain risks and reduced purchasing power.

The AI RAN alliance was created in 2024 to accelerate its adoption. It is led by network operators (T-Mobile, Softbank, Boost Mobile, KT, LG Uplus, SK Telecom...) telecom and IT vendors (Nvidia, arm, Nokia, Ericsson Samsung, Microsoft, Amdocs, Mavenir, Pure Storage, Fujitsu, Dell, HPE, Kyocera, NEC, Qualcomm, Red Hat, Supermicro, Toyota...).

If you are familiar with this blog, you already know of the evolution from RAN to cloud RAN and Open RAN, and more recently the forays into RAN intelligence with the early implementations of near and non real time RAN Intelligence Controller (RIC)

AI-RAN goes one step further in proposing that the specialized electronics and software traditionally embedded in RAN radios be deployed on high compute, GPU based commercial off the shelf servers and that these GPUs manage the complex RAN computation (beamforming management, spectrum and power optimization, waveform management...) and double as a general high compute environment for AI/ML applications that would benefit from deployment in the RAN (video surveillance, scene, object, biometrics recognition, augmented / virtual reality, real time digital twins...). It is very similar to the edge computing early market space.

The potential success of AI-RAN relies on a number of techno / economic assumptions:

For Operators:

  • It is desirable to be able to deploy RAN management, analytics, optimization, prediction, automation algorithms in a multivendor environment that will provide deterministic, programmable results.
  • Network operators will be able and willing to actively configure, manage and tune RAN parameters.
  • Deployment of AI-RAN infrastructure will be profitable (combination of compute costs being offloaded by cost reduction by optimization and new services opportunities).
  • AI-RAN power consumption, density, capacity, performance will exceed traditional architectures in time.
  • Network Operator will be able to accurately predict demand and deploy infrastructure in time and in the right locations to capture it.
  • Network Operators will be able to budget the CAPEX / OPEX associated with this investment before revenue materialization.
  • An ecosystem of vendors will develop that will reduce supply chain risks

For vendors:

  • RAN vendors will open their infrastructure and permit third parties to deploy AI applications.
  • RAN vendors will let operators and third parties program the RAN infrastructure.
  • There is sufficient market traction to productize AI-RAN.
  • The rate of development of AI and GPU technologies will outpace traditional architecture.
  • The cost of roadmap disruption and increased competition will be outweighed by the new revenues or is the cost to survive.
  • AI-RAN represents an opportunity for new vendors to emerge and focus on very specific aspects of the market demand without having to develop full stack solutions.

For customers:

  • There will be a market and demand for AI as a Service whereas enterprises and verticals will want to use a telco infrastructure that will provide unique computing and connectivity benefits over on-premise or public cloud solutions.
  • There are AI/ML services that (will) necessitate high performance computing environments, with guaranteed, programmable connectivity with a cost profile that is better mutualized through a multi tenant environment
  • Telcom operators are the best positioned to understand and satisfy the needs of this market
  • Security, privacy, residency, performance, reliability will be at least equivalent to on premise or cloud with a cost / performance benefit. 
As the market develops, new assumptions are added every day. The AI-RAN alliance has defined three general groups to create the framework to validate them: 
  1. AI for RAN: AI to improve RAN performance. This group focuses on how to program and optimize the RAN with AI. The expectations is that this work will drastically reduce the cost of RAN, while allowing sophisticated spectrum, radio waves and traffic manipulations for specific use cases.
  2. AI and RAN: Architecture to run AI and RAN on the same infrastructure. This group must find the multitenant architecture allowing the system to develop into a platform able to host a variety of AI workloads concurrently with the RAN. 
  3. AI on RAN: AI applications to run on RAN infrastructure. This is the most ambitious and speculative group, defining the requirements on the RAN to support the AI workloads that will be defined
As for Telco Edge Computing, and RAN intelligence, while the technological challenges appear formidable, the commercial and strategic implications are likely to dictate whether AI RAN will succeed. Telecom operators are pushing for its implementation, to increase control over spending, and user experience of the RAN, while possibly developing new revenue with the diffusion of AIaaS. Traditional RAN vendors see the nascent technology as further threat to their capacity to sell programmable networks as black boxes, configured, sold and operated by them. New vendors see the opportunity to step into the RAN market and carve out market share at the expense of legacy vendors.