GenAI race hots up as Microsoft launches own AI chips
Summary
- Microsoft expects to roll out the new chips to its data centres early next year, where they will begin by powering the company’s services such as Microsoft Copilot and Azure OpenAI Service
Microsoft on Wednesday unveiled two custom-designed chips and accompanying systems at its Ignite 2023 event in Seattle. The move will help the Redmond-based company reduce its dependence on artificial intelligence (AI) chips from companies like Nvidia, leverage its investment in OpenAI, and stave off competition from chipmakers like Intel and AMD.
While Microsoft’s Azure Maia AI Accelerator has been optimized for AI- and generative AI-specific tasks, its Azure Cobalt CPU (central processing unit) is an Arm-based processor that will cater to general-purpose tasks on Microsoft Cloud.
Microsoft has reportedly been working on developing an AI chip since 2019. Code-named Athena internally, the AI chip was also made available to a small group of Microsoft and OpenAI employees for testing, but Microsoft never officially confirmed the development. In a July 2021 blog, Microsoft described ‘Project Maia’ as a deep learning framework that plays chess to explore the relationship between humans and AI.
For Maia, the company took a deep reinforcement learning neural network that had earlier been trained to predict the optimal move for a given chess board position, and retrained it to predict what a human player would do. Microsoft had then said that "the larger vision of Maia is to create a more productive relationship between humans and AI in chess, with the hope of applying these learnings to other domains".
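To make that retraining step concrete, here is a minimal, hypothetical sketch (not Microsoft's actual pipeline): a small policy network over board positions is trained with cross-entropy against the moves humans actually played, rather than against an engine's best move. The board encoding, move-vocabulary size and network shape are all assumptions for illustration.

```python
import torch
import torch.nn as nn

# Assumptions: boards encoded as 12 piece-planes on an 8x8 grid, and
# moves indexed into a fixed vocabulary (as in Leela-style chess nets).
NUM_MOVES = 1858  # assumed move-vocabulary size

class PolicyNet(nn.Module):
    """Tiny stand-in for the deep network described in the article."""
    def __init__(self):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(12, 64, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv2d(64, 64, kernel_size=3, padding=1), nn.ReLU(),
        )
        self.head = nn.Linear(64 * 8 * 8, NUM_MOVES)

    def forward(self, boards):
        x = self.conv(boards)
        return self.head(x.flatten(1))  # logits over the move vocabulary

model = PolicyNet()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

# Stand-in batch: in the real project these would be positions from
# human games, labelled with the move the human actually played.
boards = torch.randn(32, 12, 8, 8)
human_moves = torch.randint(0, NUM_MOVES, (32,))

loss = loss_fn(model(boards), human_moves)  # target is the human move, not the "best" one
loss.backward()
optimizer.step()
```

The key design choice is the training target: swapping an engine's optimal move for the human's observed move is what turns a strength-maximising network into one that models human play.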
Maia, in its current avatar, has already been tested by Microsoft-backed OpenAI. "We were excited when Microsoft first shared their designs for the Maia chip, and we’ve worked together to refine and test it with our models," said Sam Altman, CEO of OpenAI. "Azure’s end-to-end AI architecture, now optimized down to the silicon with Maia, paves the way for training more capable models and making those models cheaper for our customers."
Microsoft expects to roll out the new chips to its data centres early next year, where they will begin by powering the company’s services such as Microsoft Copilot and Azure OpenAI Service. Engineers at its Redmond campus, and at its multiple Centres of Excellence (COEs) around the world, including in India, are currently collaborating to test these chips and gauge how they work with customised server boards and tailor-made server racks that fit inside existing Microsoft data centres.
"Microsoft is building the infrastructure to support AI innovation, and we are reimagining every aspect of our datacenters to meet the needs of our customers," said Scott Guthrie, executive vice president of Microsoft’s Cloud + AI Group. According to Rani Borkar, corporate vice president for Azure Hardware Systems and Infrastructure (AHSI), Microsoft "wants to infuse AI into every experience and workload".
Insisting that while Microsoft is "known as a software company", "we are a systems company", she elaborated that the "building blocks" of a systems company include "servers, silicon, data centres, and networking". Borkar also underscored the importance of expanding industry partnerships, highlighting that Microsoft is now working with both Nvidia and AMD to bring their AI chips to Azure in a bid to give "options to our customers".
For instance, Microsoft's NC H100 v5 Virtual Machine Series has been built for Nvidia H100 Tensor Core GPUs, "offering greater performance, reliability and efficiency for mid-range AI training and generative AI inferencing (a system's ability to make predictions from new data)". The company will also be adding the AMD MI300X accelerated VMs to Azure. The ND MI300 virtual machines are "designed to accelerate the processing of AI workloads for high range AI model training and generative inferencing, and will feature AMD’s latest GPU, the AMD Instinct MI300X". Microsoft is simultaneously designing second-generation versions of the Azure Maia AI Accelerator series and the Azure Cobalt CPU series.
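For readers who want to check which of these accelerator-backed VM sizes a given Azure region actually offers, a minimal sketch using the Azure SDK for Python is shown below. The subscription ID and region are placeholders, and filtering on the "NC"/"ND" family prefixes is an assumption based on the series names above.

```python
from azure.identity import DefaultAzureCredential
from azure.mgmt.compute import ComputeManagementClient

# Placeholders: substitute your own subscription ID and region.
SUBSCRIPTION_ID = "00000000-0000-0000-0000-000000000000"
REGION = "eastus"

client = ComputeManagementClient(DefaultAzureCredential(), SUBSCRIPTION_ID)

# List every VM size offered in the region and keep the GPU-accelerated
# families mentioned in the article (prefix match is an assumption).
for size in client.virtual_machine_sizes.list(location=REGION):
    if size.name.startswith(("Standard_NC", "Standard_ND")):
        print(size.name, size.number_of_cores, "cores,",
              size.memory_in_mb, "MB RAM")
```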
"We have visibility into the entire stack, and silicon is just one of the ingredients," Borkar said. The "ingredients" include Azure Boost, a system that makes storage and networking faster by moving those processes off the host servers and onto purpose-built hardware and software. The Maia 100 AI Accelerator, for instance, has been designed specifically for the Azure hardware stack, according to Brian Harry, a Microsoft technical fellow leading the Azure Maia team.
The market opportunity for AI chips is huge. According to a 22 August note by research firm Gartner, semiconductors designed to execute AI workloads will represent a $53.4 billion revenue opportunity for the semiconductor industry in 2023, an increase of 20.9% from 2022.
Alan Priestley, VP Analyst at Gartner, attributes the growth to developments in generative AI and the increasing use of a wide range of AI-based applications in data centres, edge infrastructure and endpoint devices, all of which require the deployment of high-performance GPUs and optimized semiconductor devices. Gartner predicts that AI semiconductor revenue will touch $67.1 billion in 2024 and, by 2027, more than double the 2023 market size to reach $119.4 billion.
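A quick back-of-the-envelope check of those Gartner figures, using only the numbers quoted above, confirms the trajectory:

```python
rev_2023 = 53.4      # $bn, Gartner estimate for 2023
growth_2023 = 0.209  # 20.9% growth over 2022

# Implied 2022 base from the stated 2023 growth rate.
implied_2022 = rev_2023 / (1 + growth_2023)
print(f"Implied 2022 revenue: ${implied_2022:.1f}bn")  # ~ $44.2bn

# The 2027 forecast versus the 2023 market size.
rev_2027 = 119.4     # $bn, Gartner forecast for 2027
print(f"2027 vs 2023: {rev_2027 / rev_2023:.2f}x")  # ~ 2.24x, i.e. more than double
```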
The explosive growth of AI and generative AI has already transformed demand for Nvidia's graphics processing units, or GPUs, pushing the company's market capitalization to more than $1.2 trillion. Jensen Huang, Nvidia’s co-founder, president and chief executive officer, has been promoting 'accelerated computing', a term that blends CPUs, GPUs and other processors such as data processing units (DPUs) "together as equals in an architecture sometimes called heterogeneous computing".
Nvidia has also moved beyond supplying GPUs to just the gaming sector, leveraging the GPU architecture to create platforms for scientific computing, autonomous vehicles, robotics, the metaverse and 3D internet applications, among others. Its GPUs feed industries that are equally varied, from airports to food, besides powering OpenAI's ChatGPT, which has become the poster boy of generative AI.
But Nvidia is a fabless company that does not manufacture its own chips. Intel, on the other hand, has its own foundries, but it lost ground to manufacturing rival Taiwan Semiconductor Manufacturing Company (TSMC), whose fabs helped rivals AMD and Nvidia eat into Intel's market share. Intel, too, is trying to get back on track. Intel CEO Pat Gelsinger is pushing a term he coined, 'Siliconomy', to describe "...an evolving economy enabled by the magic of silicon where semiconductors are essential to maintaining and enabling modern economies".
At Intel Innovation 2023 this September in San Jose, California, Gelsinger said the company's "five-nodes-in-four-years process development program is progressing well...with Intel 7 already in high-volume manufacturing, Intel 4 manufacturing-ready and Intel 3 on track for the end of this year". Intel is also readying its 18A (1.8 nanometer-class) and 20A (2 nanometer-class) process nodes to stave off the competition. The company is committed "to address every phase of the AI continuum", according to Gelsinger, who added that this includes generative AI and large language models (LLMs). Intel also unveiled an array of technologies to make AI more accessible for individuals and companies, and for edge (computing closer to user devices), network and cloud workloads. These include AI-enabled Intel PCs that will ship in 2024.
These AI PCs will run on Intel Core Ultra processors, code-named Meteor Lake, which feature Intel’s first integrated neural processing unit (NPU) "for power-efficient AI acceleration and local inference on the PC". Acer is already working on powering its laptops with the Core Ultra processors, according to Jerry Kao, the company's chief operating officer.
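The "local inference" these NPUs are meant to accelerate typically looks something like the ONNX Runtime sketch below. The model file and input shape are hypothetical, and the CPU execution provider is used as a portable stand-in for whatever NPU-specific provider a given AI PC exposes.

```python
import numpy as np
import onnxruntime as ort

# Hypothetical model file; on an AI PC an NPU-specific execution provider
# would typically replace the CPU provider used here as a safe fallback.
session = ort.InferenceSession("model.onnx", providers=["CPUExecutionProvider"])

input_name = session.get_inputs()[0].name
x = np.random.rand(1, 3, 224, 224).astype(np.float32)  # assumed input shape

# Everything runs on the local device: no data leaves the PC.
outputs = session.run(None, {input_name: x})
print(outputs[0].shape)
```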
Given these developments, Microsoft's move to design its own chips may serve it well.
(The author is in Redmond at the invitation of Microsoft)