AI, and more particularly generative AI, has been a big buzzword since the public launch of ChatGPT. The promise of AI to automate and operate complex tasks and systems is pervading every industry, and telecom is not impervious to it.
Wednesday, January 31, 2024
The AI-native telco network
Monday, January 15, 2024
Gen AI and LLM: Edging the Latency Bet
The growth of generative AI and Large Language Models has revived a fundamental question about the value of a millisecond of latency. When I was at Telefonica, and later, consulting at Bell Canada, one of the projects I looked after was the development, business case, deployment, use cases and operation of Edge Computing infrastructure in a telecom network.
Having developed and deployed Edge Computing platforms since 2016, I have had a head start in working through the fundamental questions surrounding the technology, the business case and the commercial strategy.
Where is the edge?
The first question one has to tackle is: where is the edge? It is an interesting question because the answer depends on your perspective. The edge is a different location if you are a hyperscaler, a telco network operator or a developer. It can also vary over time and geography. In any case, the edge is a place where one can position compute closer to the user than the current public or private cloud infrastructure, in order to derive additional benefits. It can range from a regional, to a metro, to a mini data center, all the way to on-premises or on-device compute capability. Each has its distinct costs, limitations and benefits.
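One way to picture this continuum is as a set of tiers, each with an indicative latency and cost. The figures and tier names below are illustrative assumptions, not measurements from any real network:

```python
# Illustrative tiers of the compute continuum, from centralized cloud to
# on-device. RTT and relative cost figures are rough assumptions for the
# sake of the example, not measurements.
EDGE_TIERS = [
    # (name, assumed round-trip latency in ms, relative cost per unit of compute)
    ("centralized cloud", 60.0, 1.0),
    ("regional data center", 30.0, 1.3),
    ("metro data center", 10.0, 1.8),
    ("mini / far-edge data center", 5.0, 2.5),
    ("on-premises", 2.0, 4.0),
    ("on-device", 0.5, 8.0),
]

def cheapest_tier(max_rtt_ms):
    """Return the cheapest tier whose assumed RTT meets the latency budget."""
    candidates = [t for t in EDGE_TIERS if t[1] <= max_rtt_ms]
    if not candidates:
        return None  # no tier can satisfy this budget
    return min(candidates, key=lambda t: t[2])

print(cheapest_tier(15.0))  # a 15 ms budget is met cheapest at the metro tier
```

The point of the sketch is the selection logic itself: for each workload's latency budget, pick the most centralized (cheapest) location that still meets it.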
What are the benefits of Edge Computing?
The second question, or maybe the first from a pragmatic and commercial standpoint, is: why do we need edge computing? What are the benefits?
While these will vary depending on who consumes the compute capability and where the compute function is located, we can identify general benefits indexed to the location. Among these: data sovereignty, increased privacy and security, reduced latency, the enablement of cheaper (dumber) devices, and the creation of new media types, business models and services.
What are the use cases of Edge Computing?
I have deployed and researched over 50 use cases of edge computing, from the banal (storage, caching and streaming at the edge) to the sophisticated (TV production) to the specialized (Open RAN, the telco User Plane Function, or machine vision for industrial and agricultural applications).
What is the value of 1ms?
Sooner or later, after testing various use cases, locations and architectures, the fundamental question emerges: what is the value of 1 ms? It is a question heavy with assumptions and correlations. In absolute terms, we would all like connectivity that is faster, more resilient, more power efficient, more economical and lower latency. Latency is conditioned by two factors: the number of hops or devices the connection has to traverse between the device and the origin point where the content or code is stored, transformed or computed, and the distance between the device and that compute point. To radically reduce latency, you have to reduce the number of hops or reduce the distance; Edge Computing achieves both. But obviously, there is a cost. Latency is proportional to distance, so the fundamental question becomes: what is the optimal placement of a compute resource, and for which use case? Computing is a continuum. Some applications and workloads are not latency, privacy or sovereignty sensitive and can run in an indiscriminate public cloud, while others require the compute to be in the same country, region or city, and others still require even closer proximity. The difference in investment between a handful of centralized data centers and several hundreds or thousands of micro data centers is staggering.
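The distance component can be put into numbers: light in optical fiber travels at roughly 200,000 km/s (about two-thirds of its speed in vacuum), which works out to about 1 ms of round-trip time per 100 km, before any switching or queuing delay is added. A back-of-the-envelope sketch using that approximation:

```python
# Signal speed in fiber, ~2/3 of the speed of light in vacuum (approximation).
FIBER_SPEED_KM_PER_MS = 200.0

def propagation_rtt_ms(distance_km):
    """Round-trip propagation delay over fiber, ignoring hops and queuing."""
    return 2 * distance_km / FIBER_SPEED_KM_PER_MS

for km in (1, 100, 1000, 5000):
    print(f"{km:>5} km -> {propagation_rtt_ms(km):6.2f} ms RTT")
```

This is why sub-millisecond budgets force compute within a few tens of kilometers of the device: no amount of protocol optimization beats the speed of light.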
What about AI and LLM?
Until now, these questions were somewhat theoretical and were answered organically by hyperscalers and operators based on their respective views of the market's evolution. Generative AI and its extraordinary appetite for compute is rapidly changing this market space. Not only does Gen AI account for a sizable and growing portion of all cloud compute capacity, but the question of latency is now coming to the fore. Gen AI relies on Large Language Models that require large amounts of storage and compute to be trained to recognize patterns. The larger the LLM and the more compute capacity, the better the pattern recognition. Pattern recognition leads to the generation of plausible results from incomplete prompts, questions or data sets; that is Gen AI. Where does latency come in? Part of the compute needed to generate a response happens at inference. While the training data set resides in a large, centralized cloud data center, inference can sit closer to the user, at the edge, where it parses the request and feeds the trained model with unlabeled input to receive a prediction based on the trained model. The faster the inference, the more responses the model can provide, which means that low latency is a competitive advantage for a Gen AI service.
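The trade-off above can be sketched as a simple sum of network round trip and inference time. All figures here are made-up placeholders, purely to show how the comparison works:

```python
def response_time_ms(network_rtt_ms, inference_ms):
    """End-to-end response time: network round trip plus model inference."""
    return network_rtt_ms + inference_ms

# Hypothetical figures: a distant centralized cloud vs a metro edge site,
# running the same model (identical inference time).
cloud = response_time_ms(network_rtt_ms=40.0, inference_ms=30.0)
edge = response_time_ms(network_rtt_ms=5.0, inference_ms=30.0)

print(f"cloud: {cloud} ms, edge: {edge} ms")
# With one request in flight per serving slot, lower latency also means
# more responses served per second from the same resource.
print(f"cloud: {1000 / cloud:.1f} resp/s, edge: {1000 / edge:.1f} resp/s")
```

Under these assumed numbers, moving inference to the edge halves the response time, and that latency advantage compounds into throughput and perceived responsiveness.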
As we have seen, there is a relatively small number of options to reduce latency, and they all involve large investments. The question then becomes: what is the value of a millisecond? Is 100 ms or 10 ms sufficient? When it comes to high-frequency trading, 1 ms is extremely valuable (billions of dollars). When it comes to online gaming, low latency is not as valuable as controlled, uniform latency across the players. When it comes to video streaming, latency is generally not an issue, but when it comes to machine vision for sorting fruit on a mechanical conveyor belt running at 10 km/h, it is very important.
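The conveyor-belt example translates directly into arithmetic: at 10 km/h, every millisecond of delay lets an item travel further past the sorting actuator. A quick sketch:

```python
def travel_cm(belt_speed_kmh, latency_ms):
    """Distance (in cm) an item on the belt moves during the given latency."""
    speed_m_per_s = belt_speed_kmh * 1000 / 3600  # km/h -> m/s
    return speed_m_per_s * (latency_ms / 1000) * 100  # m -> cm

for latency in (1, 10, 100):
    print(f"{latency:>3} ms -> item moved {travel_cm(10, latency):.2f} cm")
```

At 100 ms of end-to-end latency the fruit has moved almost 28 cm, far past the actuator; at 10 ms it has moved under 3 cm, which is why this use case pulls the compute to the edge.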
If you would like to know more, please get in touch.
Friday, November 3, 2023
Telco edge compute, RAN and AI
In recent years, the telecommunications industry has witnessed a profound transformation, driven by the rapid penetration of cloud technologies. Cloud Native Functions (CNFs) have become common in the packet core, OSS/BSS and transport, and are making their way into the access domain, both fixed and mobile. With CNFs, virtual infrastructure management and data centers have become an important part of network capex strategies.
Traditional centralized cloud infrastructure is being augmented with edge computing, effectively bringing computation and data storage closer to the point of data generation and consumption.
What are the benefits of edge computing for telecom networks?
- Low Latency: One of the key advantages of edge computing is its ability to minimize latency. This is of paramount importance in telecoms, especially in applications like autonomous vehicles, autonomous robots / manufacturing, and remote-controlled machinery.
- Bandwidth Efficiency: Edge computing reduces the need for transmitting massive volumes of data over long distances, which can strain network bandwidth. Instead, data processing and storage take place at the edge, significantly reducing the burden on core networks. This is particularly relevant for machine vision, video processing and AI use cases.
- Enhanced Security: Edge computing offers improved security by allowing sensitive data to be processed locally. This minimizes the exposure of critical information to potential threats in the cloud. Additionally, privacy, data sovereignty and residency concerns can be efficiently addressed by local storage / computing.
- Scalability: Edge computing enables telecom operators to scale resources as needed, making it easier to manage fluctuating workloads effectively.
- Simpler, cheaper devices: Edge computing allows devices to be cheaper and simpler while retaining sophisticated functionality, as storage and processing can be offloaded to a nearby edge compute facility.
Current Trends in Edge Computing for Telecoms
The adoption of edge computing in telecoms is rapidly evolving, with several trends driving the industry forward:
- 5G and Private Networks Integration: The deployment of 5G networks is closely intertwined with edge computing. 5G's high data transfer rates and low latency requirements demand edge infrastructure to deliver on its promises effectively. Cloud RAN and service-based architecture packet core functions drive demand for edge computing to colocate UPF and CU/DU functions, particularly for private networks.
- Network Slicing: Network operators are increasingly using network slicing to create virtualized network segments, allowing them to allocate resources and customize services for different applications and use cases.
- Ecosystem Partnerships: Telcos are forging partnerships with cloud providers, hardware manufacturers, and application developers to explore retail and wholesale edge compute services.
Future Prospects
The future of edge computing in telecoms offers several exciting possibilities:
- Edge-AI Synergy: As artificial intelligence becomes more pervasive, edge computing will play a pivotal role in real-time AI processing, enhancing applications such as facial recognition, autonomous drones, and predictive maintenance. Additionally, AI/ML is emerging as a key value proposition in a number of telco CNFs, particularly in the access domain, where RAN intelligence is key to optimizing spectrum and energy usage while tailoring user experience.
- Industry-Specific Edge Solutions: Different industries will customize edge computing solutions to cater to their unique requirements. This could result in the development of specialized edge solutions for healthcare, manufacturing, transportation, and more.
- Edge-as-a-Service: Telecom operators are likely to offer edge services as a part of their portfolio, allowing enterprises to deploy and manage edge resources with ease.
- Regulatory Challenges: As edge computing becomes more integral to telecoms, regulatory challenges may arise, particularly regarding data privacy, security, and jurisdictional concerns.
New revenue streams can also be captured with the deployment of edge computing.
- For consumers, the lowest-hanging fruit in the short term is likely gaming. While hyperscalers and gaming companies have launched their own cloud gaming services, their success has been limited due to poor online experience. The most successful game franchises are Massively Multiplayer Online titles. They pitch dozens of players against each other and require very controlled latency between all players for fair and enjoyable gameplay. Only operators can provide controlled latency, if they deploy gaming servers at the edge. Even without a full-blown gaming service, providing game caching at the edge can drastically reduce download times for games, updates and patches, which dramatically increases player satisfaction.
- For enterprise users, edge computing has dozens of use cases that can be implemented today that are proven to provide superior experience compared to the cloud. These services range from high performance cloud storage, to remote desktop, video surveillance and recognition.
- Beyond operator-owned services, the largest opportunity is certainly the enablement of edge as a service (EaaS), allowing cloud developers to use edge resources as specific cloud availability zones.
Wednesday, October 18, 2023
Generative AI and Intellectual Property
Since the launch of ChatGPT, Generative Artificial Intelligence and Large Language Models have gained extraordinary popularity and agency in a very short amount of time. As we all play around with the most approachable use cases to generate texts, images and videos, governments, global organizations and companies are busy developing the technology, and racing to harness the early mover's advantage this disruption will bring to all areas of our society.
I am not a specialist in the field and my musings might be erroneous here, but it feels that the term Gen AI might be a little misleading, since a lot of the technology relies on vast datasets that are used to assemble composite final products. Essentially, the creation aspect is more an assembly than a pure creation. One could object that every music sheet is just an assembly of notes and that creation is still there, even as the author is influenced by their taste and exposure to other authors... Fair enough, but in the case of document or text creation, it feels that the use of public information to synthesize a document is not necessarily novel.
In any case, I am an information worker, most times a labourer, sometimes an artisan, but in any case I live from my intellectual property. I chose to make some of that intellectual property available license-free here on this blog, while a larger part is sold in the form of reports, workshops, consulting work, etc. This work might or might not be license-free, but it is always copyrighted, meaning that I hold the rights to the content and allow its distribution under specific covenants.
It strikes me that, as I see crawlers go through my blog and index the content I make publicly available, they serve two purposes at odds with each other. The first allows my content to be discovered and to reach a larger audience, which benefits me in terms of notoriety and increased business. The second, more insidious, not only indexes but mines my content for aggregation into LLMs, so that it can be regurgitated and reassembled by an AI. It could be extraordinarily difficult to apportion an AI's rendition of an aggregated document to its sources, but it feels unfair that copyrighted content is not attributed.
I have been playing with the idea of using LLMs to create content. Anyone can do that with prompts and some license-free software, but I am fascinated by the idea of an AI assistant that would be able to write like me, using my semantics and quirks, and that I could train through reinforcement learning from human feedback. Again, this poses some issues. To be effective, this AI would have to have access to my dataset, the collection of intellectual property I have created over the years. This content is protected and is my livelihood, so I cannot share it with a third party without strict conditions. That rules out free software that can reuse whatever content you give it to ingest.
With licensed software, I am still not sure the right mechanisms are in place for copyright and content protection and control, so that I can ensure that the content I feed to the LLM remains protected and accessible only to me, while the LLM can ingest other content from the license-free public domain to enrich its dataset.
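One pattern that keeps a protected corpus under the owner's control is local retrieval: the documents never leave the machine, and only a selected snippet would ever be passed to a model at generation time. A minimal bag-of-words retrieval sketch, where the corpus, document names and queries are all hypothetical placeholders:

```python
from collections import Counter
from math import sqrt

# Hypothetical local corpus: in practice this would be the protected
# reports and posts, stored only on the author's own machine.
corpus = {
    "edge-report": "edge computing latency use cases telecom operators",
    "gaming-post": "cloud gaming latency multiplayer controlled experience",
    "ip-post": "copyright intellectual property llm attribution",
}

def vectorize(text):
    """Bag-of-words term counts for a piece of text."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two term-count vectors."""
    dot = sum(a[w] * b[w] for w in a)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def best_match(query):
    """Return the id of the local document most similar to the query."""
    qv = vectorize(query)
    return max(corpus, key=lambda doc_id: cosine(qv, vectorize(corpus[doc_id])))

print(best_match("who holds the copyright on llm training data"))  # -> ip-post
```

This is only a sketch of the retrieval half of the problem; it does not by itself solve the licensing question of what a third-party model may do with the snippet it receives.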
Are other information workers worried that LLM/AI reuses their content without attribution? Is it time to have a conversation about Gen AI, digital rights management and copyright?
***This blog post was created organically, without assistance from Gen AI, except for the picture, created with Canva.com