ChatGPT is made possible by tens of thousands of Nvidia GPUs, which Microsoft is now upgrading


Looking forward: A new report reveals the huge number of Nvidia GPUs Microsoft uses and the innovations required to arrange them so OpenAI could train ChatGPT. The news comes as Microsoft announces a major upgrade to its AI supercomputer to bolster its homegrown AI initiative.

According to Bloomberg, OpenAI trained ChatGPT on a supercomputer Microsoft built from tens of thousands of Nvidia A100 GPUs. This week, Microsoft announced a new array using Nvidia's latest H100 GPUs.

The challenge for the companies began in 2019, after Microsoft invested $1 billion in OpenAI and agreed to build a supercomputer for the AI startup. However, Microsoft didn't have the hardware in house for what OpenAI needed.

After acquiring the Nvidia chips, Microsoft had to rethink how it arranged such a huge number of GPUs to prevent overheating and power outages. The company won't say exactly what the endeavor cost, but Executive Vice President Scott Guthrie put the figure above several hundred million dollars.


Powering up all the A100s at the same time forced Redmond to think about how to position them and their power supplies. It also had to develop new software to increase efficiency, ensure networking equipment could handle massive amounts of data, design new cable racks that it could manufacture independently, and use multiple cooling methods. Depending on the climate, cooling techniques included evaporation, swamp coolers, and outside air.

Since ChatGPT’s initial success, Microsoft and some of its competitors have begun working on similar AI models for search engines and other applications. To speed up its generative AI, the company introduced the ND H100 v5 VM, a virtual machine that can scale from eight to thousands of Nvidia H100 GPUs.

The H100s communicate through NVSwitch and NVLink 4.0 with 3.6TB/s of bisectional bandwidth between each of the eight local GPUs within each virtual machine. Each GPU gets 400Gb/s of bandwidth through Nvidia Quantum-2 CX7 InfiniBand and a 64GB/s PCIe Gen5 connection to the host. Each virtual machine manages 3.2Tb/s through a non-blocking fat-tree network. The new Microsoft systems also feature 4th-generation Intel Xeon processors and 16-channel 4800MHz DDR5 RAM.
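The per-VM InfiniBand figure follows directly from the per-GPU links quoted above: eight local GPUs, each with a 400Gb/s Quantum-2 CX7 connection, add up to 3.2Tb/s of cross-node bandwidth per virtual machine. A quick sanity check of that arithmetic (variable names here are illustrative, not from any Azure API):

```python
# Aggregate cross-node InfiniBand bandwidth for one ND H100 v5 VM,
# derived from the per-GPU figures reported above.
gpus_per_vm = 8
ib_per_gpu_gbps = 400  # Gb/s per GPU via Nvidia Quantum-2 CX7 InfiniBand

aggregate_gbps = gpus_per_vm * ib_per_gpu_gbps  # 3200 Gb/s
aggregate_tbps = aggregate_gbps / 1000          # 3.2 Tb/s per VM

print(f"{aggregate_tbps} Tb/s per VM")
```

Note that the 3.6TB/s NVSwitch/NVLink figure is a separate number: it describes GPU-to-GPU bandwidth inside a VM, while the 3.2Tb/s figure is the InfiniBand bandwidth out of the VM into the fat-tree fabric.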

Microsoft plans to use the ND H100 v5 VM for its new AI-powered Bing search engine, the Edge web browser, and Microsoft Dynamics 365. The virtual machine is now available in preview and will become a standard offering in the Azure suite. Prospective users can request access.



