Hot news

Microsoft builds a supercomputer to run OpenAI's AI research


Microsoft built a supercomputer for the artificial intelligence (AI) research startup OpenAI to train large batches of models, integrating thousands of Nvidia A100 graphics chips to support ChatGPT and the Bing AI chatbot.
The Windows maker invested $1 billion in OpenAI in 2019 and agreed to build a "large, cutting-edge supercomputer."
The goal of building this supercomputer is to provide enough computing power to train and retrain an increasingly large set of AI models on large amounts of data over long periods of time, said Nidhi Chappell, Microsoft's product lead for Azure High-Performance Computing and Artificial Intelligence.
"So there was definitely a strong drive to train bigger models over a longer period of time, which means you not only need to have the biggest infrastructure, you need to be able to run them reliably for a long period of time."
At its Build developer conference in 2020, Microsoft announced that it was building a supercomputer, in collaboration with and exclusively for OpenAI, hosted in Azure and designed specifically to train AI models.
"Co-designing supercomputers with Azure has been critical to scaling our pressing AI training needs, making our search and alignment work on systems like ChatGPT possible," said Greg Brockman, President and Co-Founder of OpenAI.
Microsoft has amassed tens of thousands of Nvidia A100 graphics chips to train AI models and changed the way servers are racked to prevent blackouts.
"We created a system architecture that could work and be reliable at a very large scale, and that's what made ChatGPT possible. This is the model that came out of it. There will be many, many more," Chappell was quoted as saying by Bloomberg.
As for the price, Scott Guthrie, the Microsoft executive vice president who oversees cloud and artificial intelligence, said the cost of the project is "probably greater" than several hundred million dollars.
To process and train its models, OpenAI needs more than a supercomputer: it needs a powerful cloud setup.
Microsoft is already making Azure's AI capabilities even more powerful with new virtual machines that use Nvidia's H100 and A100 Tensor Core GPUs along with Nvidia Quantum-2 InfiniBand networking.
This will enable OpenAI and other companies that rely on Azure, including those now deploying AI chatbots, to train larger, more complex AI models.
"We saw that we would need to build special-purpose suites focused on enabling large training workloads, and OpenAI was one of the first proofs of concept for that," said Eric Boyd, Microsoft Azure AI Vice President. "We worked closely with them to find out what key things they were looking for as they built their training environments and what key things they needed to do."