Thursday, May 21, 2026

Non-RT RIC, AI-RAN, and the AI Grid: Three Different Bets on the Future of the RAN


I have been asked a few times lately what the difference is between the Non-Real Time RIC and AI-RAN. The question itself tells you something. Both sit under the broad "AI in the RAN" umbrella, marketed aggressively by the same vendors, debated in the same conference sessions. But they are fundamentally different in architecture, ambition, and business model. And neither is quite the same as what NVIDIA formally branded the AI Grid at GTC 2026 — which is where the most important and most misread opportunity actually sits.

The Non-RT RIC: the pragmatic bet

The Non-RT RIC is an O-RAN-defined software function that sits in the Service Management and Orchestration (SMO) framework above the RAN, not inside it. Its control loops run at timescales of one second or longer. It hosts rApps for energy saving, traffic steering, slice assurance, and automated optimization. Think of it as the evolution of Self-Organizing Networks, re-platformed on open interfaces with a proper application model and a genuinely lower barrier to entry: cloud-native and OSS skills are sufficient. No RAN silicon expertise required.
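To make the "slow control loop" idea concrete, here is a minimal sketch of the decision logic an energy-saving rApp might run once per reporting period. Everything in it is an assumption for illustration: the CellKpi record, the thresholds, and the recommend function are hypothetical and are not taken from any O-RAN specification or vendor SDK. A real rApp would receive KPIs and push policies through the Non-RT RIC's own interfaces.

```python
from dataclasses import dataclass
from statistics import mean

# Hypothetical thresholds, chosen only for illustration.
SLEEP_BELOW = 0.10        # capacity cell is nearly idle below this utilization
UMBRELLA_HEADROOM = 0.50  # only sleep if the overlapping coverage cell is below this
WAKE_ABOVE = 0.70         # wake the capacity cell once the coverage cell exceeds this


@dataclass
class CellKpi:
    """Hypothetical per-cell KPI record collected over one slow-loop window."""
    cell_id: str
    is_capacity_layer: bool      # coverage cells are never put to sleep
    asleep: bool
    load: list[float]            # this cell's PRB utilization samples, 0..1
    umbrella_load: list[float]   # overlapping coverage cell's samples, 0..1


def recommend(cells: list[CellKpi]) -> dict[str, str]:
    """Return a sleep/wake recommendation per capacity cell, with hysteresis."""
    actions: dict[str, str] = {}
    for cell in cells:
        if not cell.is_capacity_layer or not cell.umbrella_load:
            continue
        umbrella = mean(cell.umbrella_load)
        if cell.asleep:
            # Wake only when the coverage layer is clearly congested.
            if umbrella > WAKE_ABOVE:
                actions[cell.cell_id] = "wake"
        elif cell.load and mean(cell.load) < SLEEP_BELOW and umbrella < UMBRELLA_HEADROOM:
            # Sleep only when this cell is idle and the coverage layer has headroom.
            actions[cell.cell_id] = "sleep"
    return actions


cells = [
    CellKpi("A", True, False, [0.05, 0.04], [0.20, 0.30]),
    CellKpi("B", True, True, [], [0.70, 0.80]),
    CellKpi("C", True, False, [0.40], [0.20]),
    CellKpi("D", False, False, [0.02], [0.02]),
]
print(recommend(cells))
```

The gap between the sleep and wake thresholds is deliberate: without hysteresis a loop like this would flap cells on and off. Nothing here touches the scheduler or the radio, which is why this kind of logic needs cloud and OSS skills rather than RAN silicon expertise.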

This is precisely why the early commercial traction is here, not in AI-RAN. AT&T is deploying Ericsson's SMO and Non-RT RIC to replace two legacy C-SON systems. TELUS has launched a RIC platform alongside its Open RAN rollout. Swisscom is deploying one for multi-technology network management. These are not trials. These are production decisions.

AI-RAN: real performance gains, speculative revenue

AI-RAN embeds AI natively into the RAN stack itself. The AI-RAN Alliance, founded in February 2024 and now at 109 member companies, defines it across three working groups: AI-for-RAN, AI-and-RAN, and AI-on-RAN.

AI-for-RAN is the most mature: using AI to optimize the RAN itself — the scheduler, link adaptation, beamforming, interference management. T-Mobile and Ericsson have been trialing an AI-driven scheduler and link adaptation engine on a live 5G Advanced network since Q2 2025, targeting commercial deployment in Q3 2026. Nokia and NVIDIA, backed by a $1 billion equity partnership, are testing GPU-accelerated AI-RAN with BT, Elisa, NTT DOCOMO, and Vodafone.

AI-and-RAN is where the narrative gets more ambitious — and more speculative. The idea is that RAN sites become shared compute infrastructure, running both network workloads and enterprise AI workloads on the same hardware. The tower becomes a distributed AI compute node. New revenue streams. Operators escape the utility trap.

AI-on-RAN is the monetization layer for the above. The commercial mechanisms are still being defined. That tells you where the maturity is.

The AI Grid: follow NVIDIA's sequencing, not its marketing

At GTC 2026, NVIDIA formally introduced the AI Grid as a reference design: geographically distributed AI infrastructure that uses the telco footprint to run inference workloads closer to users. The numbers are interesting: early Comcast benchmarks showed inference cost reductions of up to 76% versus centralized deployments. HPE, Spectro Cloud, and others have already announced implementations aligned to the reference architecture.

I have used this concept in my own work for years to describe the evolution from isolated MEC deployments into a coherent, programmable distributed inference fabric. Good to see NVIDIA put a formal architecture behind it. But the marketing obscures a critical sequencing question.

NVIDIA's own GTC announcements noted that many operators are starting by lighting up existing wired edge sites (central offices and mobile switching offices) as AI Grids they can monetize today. The cell site layer is a later phase. AT&T's CTO Igal Elbaz has been direct about questioning the value of pushing compute all the way to the far edge to save one or two milliseconds of latency. T-Mobile's SVP of network infrastructure described her AI edge strategy as what sits in a data center at a mobile switching office. Verizon's CTO has flagged the cost and complexity of far-edge GPU deployments.

These are the three largest US operators. They are not being conservative for the sake of it. The economics are straightforward: central offices and mobile switching offices already have power, cooling, connectivity, and physical security. They aggregate traffic from hundreds of cell sites. The sub-500ms latency threshold that NVIDIA's own reference design targets is achievable from a well-positioned CO. It does not require a GPU at the tower — not for the use cases that have a business case today.
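A back-of-the-envelope calculation shows why the tower adds so little here. The sketch below assumes light in optical fiber covers roughly 200,000 km per second (about 5 microseconds per kilometre) and uses hypothetical cell-site-to-central-office distances of 50 km and 100 km. It counts propagation delay only, ignoring switching, queuing, and the radio link, so treat the output as an order-of-magnitude illustration rather than a measurement.

```python
# Propagation delay in optical fiber: roughly 5 microseconds per kilometre
# (light travels at about 200,000 km/s in glass).
FIBER_US_PER_KM = 5.0


def round_trip_ms(distance_km: float) -> float:
    """Round-trip fiber propagation delay, in milliseconds."""
    return 2 * distance_km * FIBER_US_PER_KM / 1000.0


def share_of_budget(distance_km: float, budget_ms: float = 500.0) -> float:
    """Fraction of an end-to-end latency budget consumed by the fiber round trip."""
    return round_trip_ms(distance_km) / budget_ms


# Hypothetical backhaul distances from a cell site to its central office.
for km in (50, 100):
    rtt = round_trip_ms(km)
    pct = share_of_budget(km) * 100
    print(f"{km} km: {rtt:.1f} ms round trip, {pct:.1f}% of a 500 ms budget")
```

Under these assumptions, hauling traffic 50 km back to a central office costs about half a millisecond of round-trip propagation, a fraction of a percent of a 500 ms budget. That is the "one or two milliseconds" the operators quoted above are declining to pay for.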

I have seen this movie before with MEC. The industry led with its most ambitious architectural vision, ran the infrastructure investment ahead of the demand, and recovered slowly. The AI Grid does not have to repeat that pattern.

What to actually do

Start with the Non-RT RIC. The contracts are being signed, the ecosystem is opening, the business case is defensible.

On AI-RAN, wait for AI-for-RAN where your vendors have credible near-term roadmaps. Treat AI-and-RAN at the cell site as a long-term speculative option: worth tracking, too early to fund at scale.

On the AI Grid, follow NVIDIA's own sequencing rather than the brochure. Central offices and mobile switching offices first. Build the orchestration and service layer from there outward. Expand to the far edge when the use cases and economics justify it — not because a GPU manufacturer's demand forecast requires it.

The cell site AI Grid is a compelling long-term vision. The central office AI Grid is deployable today. In this industry, deployable usually wins.