Your single-model AI strategy is costing you millions
Are you using the same expensive artificial intelligence (AI) model for every single task? You're likely hemorrhaging money without even knowing it. In my latest blog, I reveal how telcos can slash AI costs by adopting a smarter, task-specific approach to AI model selection. From network operations to customer service, I walk through real-world scenarios where choosing the right model can mean the difference between budget breakthrough and budget breakdown. Give it a read now!
Episode 105: NVIDIA’s vision for AI and the RAN
AI is poised to help telcos turn underutilized network capacity into a new revenue stream. NVIDIA's Chris Penrose shares how AI-RAN can help operators unlock this potential and transform their business.
The annual Amazon Web Services (AWS) re:Invent conference was held earlier this month in Las Vegas. The major theme was generative AI (GenAI), of course, and how to empower AWS customers in their pursuit of it. For telcos, these are the major takeaways: CHIPS, MODELS, and SERVICES.
Starting with chips ... as you know, AWS regularly releases newer, better versions of its own custom-designed chips, optimized for the cloud and a variety of different workloads. At re:Invent, it announced new Elastic Compute Cloud (EC2) instances that will feature the expensive, coveted NVIDIA Blackwell chips, due out next year; EC2 Trn2 instances and EC2 Trn2 “ultra servers” powered by the AWS Trainium2 chips for training and deploying large language models (LLMs), now generally available; and the 4x faster Trainium3 chips, also coming next year.
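For the builders out there, here's a minimal sketch of what spinning up one of those new Trn2 instances could look like with boto3. The instance type name ("trn2.48xlarge") and the AMI ID are assumptions, so check the EC2 console or AWS docs for what's actually available in your region.

```python
# Sketch: request a Trainium2 (Trn2) instance with boto3.
# Instance type and AMI are assumptions -- verify against the EC2 catalog.
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")

response = ec2.run_instances(
    ImageId="ami-0123456789abcdef0",   # placeholder: use a Deep Learning AMI with the Neuron SDK
    InstanceType="trn2.48xlarge",      # assumed Trainium2 instance type name
    MinCount=1,
    MaxCount=1,
)
print(response["Instances"][0]["InstanceId"])
```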
There are new foundation models on the block, and they come from AWS itself. AWS announced its Nova family of GenAI models for text, images, and video that it says are on par with OpenAI’s GPT, Google’s Gemini, and Anthropic’s Claude in terms of cost and performance. This is MAJOR news—and a good reminder of why we should all keep our AI options open right now.
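If you want to kick the tires, here's a minimal sketch of calling a Nova model through Amazon Bedrock's Converse API in Python. The exact model ID ("amazon.nova-lite-v1:0") is an assumption, so confirm it, along with region availability, in the Bedrock model catalog.

```python
# Sketch: invoke a Nova model via Bedrock's Converse API.
# The model ID below is an assumption -- check the Bedrock catalog.
import boto3

bedrock = boto3.client("bedrock-runtime", region_name="us-east-1")

response = bedrock.converse(
    modelId="amazon.nova-lite-v1:0",  # assumed ID for Nova Lite
    messages=[{"role": "user", "content": [{"text": "Summarize last night's 5G core alarms."}]}],
    inferenceConfig={"maxTokens": 300, "temperature": 0.2},
)
print(response["output"]["message"]["content"][0]["text"])
```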
GenAI all over: As we saw at Microsoft’s and Google’s conferences earlier this year, GenAI has also seeped into a lot of different AWS products. SageMaker, a data, analytics, and AI platform, got a “next generation” upgrade to simplify the process of deploying AI and ML models, while providing a more integrated experience. The Bedrock LLM hosting service has a couple of new features to help control the cost of using LLMs, as well as a new marketplace for about 100 emerging LLMs. Amazon Q Developer and Amazon Q Business also got some AI-powered updates to help workers get busywork done faster and with less effort. Another notable: Amazon debuted Aurora DSQL, which AWS says is its fastest distributed SQL database to date.
With these moves, AWS is setting itself apart from the other hyperscalers and taking a different approach to GenAI. As I blogged about this week, AWS is focused on helping enterprises apply a specialized approach to AI, giving customers (YOU) more choices as you dive in and use it, and allowing everyone to optimize for model performance, accuracy, and cost as the technology continues to develop at a very fast pace.
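To make the task-specific idea concrete, here's a hypothetical sketch of routing each job to the cheapest model that can handle it, instead of sending everything to one premium model. The model names and cost figures are purely illustrative.

```python
# Illustrative sketch: pick a model per task instead of one model for everything.
# Model names and per-token costs are made up for the example.
ROUTES = {
    "classify_ticket":  {"model": "small-cheap-model", "cost_per_1k_tokens": 0.0002},
    "summarize_logs":   {"model": "mid-tier-model",    "cost_per_1k_tokens": 0.003},
    "draft_rfp_answer": {"model": "frontier-model",    "cost_per_1k_tokens": 0.03},
}

def pick_model(task: str) -> str:
    """Return the model assigned to a task, falling back to the cheapest route."""
    route = ROUTES.get(task, ROUTES["classify_ticket"])
    return route["model"]

# A high-volume customer-service classifier never touches the expensive model.
print(pick_model("classify_ticket"))   # small-cheap-model
print(pick_model("draft_rfp_answer"))  # frontier-model
```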
Can you believe it? It’s time to start gearing up for Mobile World Congress (MWC) 2025, running March 3-6 at the FIRA in beautiful Barcelona. I’ll be giving one of my must-see talks at the GenAI Summit on March 3 where Totogi is a sponsor. Our team will also have a booth in Hall 2 and be in the AWS space. Be sure to stop by for demos of our amazing products! Want more details? DM me on LinkedIn or X @TelcoDR.
Microsoft has sold Metaswitch, excluding the Nexus portfolio, to Alianza in a deal that includes 300 employees, 400+ patents, and the extensive Metaswitch product lineup. If you recall, Microsoft bought Metaswitch in 2020 for $270M; terms of the Alianza deal were not disclosed. We’re excited for Metaswitch customers to end up on a path to the public cloud, which is where Alianza plans to take them. Founder and CEO Brian Beutler told us all about the company’s cloud-centric strategy on the Telco in 20 podcast earlier this year, and he just recorded another episode with me about the transaction this week. Look for it in the new year!
Google’s Quantum AI team just unveiled Willow, a state-of-the-art quantum chip. Willow reduces errors as more qubits are added, achieving the first "below threshold" quantum error correction, a crucial step in making quantum systems robust for practical applications. It also completed a benchmark computation in under five minutes that would take the world’s fastest supercomputers 10 septillion years! The takeaway: this is a huge leap forward and has the potential to revolutionize AI and other fields.
And that’s not all Google has been up to—it also pulled back the curtain on Gemini 2.0, the latest version of its GenAI model that wants to be your digital assistant. It can process text, images, and audio simultaneously, and actually DO things for you, like searching the web or working with apps. Early testing shows that it’s twice as fast as Gemini 1.5 Pro, and it’s crushing AI benchmarks. Developers can start playing with Gemini 2.0 Flash now through Google AI Studio and Vertex AI! (Totogi developers are already testing this out!) 🤖
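For developers who want to try it, here's a minimal sketch using the google-generativeai SDK via Google AI Studio. The model name "gemini-2.0-flash-exp" is an assumption based on the experimental release, so check AI Studio for the current model IDs.

```python
# Sketch: call Gemini 2.0 Flash through Google AI Studio.
# Model name is an assumption -- confirm the current ID in AI Studio.
import google.generativeai as genai

genai.configure(api_key="YOUR_API_KEY")  # API key from Google AI Studio

model = genai.GenerativeModel("gemini-2.0-flash-exp")
response = model.generate_content("Explain AI-RAN in two sentences.")
print(response.text)
```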
OpenAI rolled out ChatGPT Pro, a $200/month plan that provides “scaled” access to the latest, greatest models and tools. Subscribers get unlimited access to the “smartest” model, OpenAI o1, plus o1-mini, GPT-4o, and Advanced Voice, as well as o1 pro mode, an advanced version of o1 that can handle more complex problems and has additional features to support sophisticated inquiries. This is a new way for OpenAI to monetize its products, and there’s speculation that it’s looking for more, possibly introducing ads. Like I said in an earlier blog, the GenAI market is advancing at a furious pace!