Google: Our AI supercomputer is faster and more efficient than Nvidia's


Google on Tuesday released new details about the supercomputers it uses to train its artificial intelligence models, saying its systems are faster and more power-efficient than comparable systems from Nvidia, according to aitnews.

Google has designed its own custom chip called the Tensor Processing Unit, or TPU. It uses these chips for more than 90 percent of its work on training artificial intelligence, the process of feeding data through models to make them useful for tasks such as answering queries with human-like text or generating images.

The Google TPU is now in its fourth generation. On Tuesday, Google published a scientific paper detailing how it has linked more than 4,000 of the chips together into a supercomputer, using its own custom-developed optical switches to help connect individual machines.

Improving these connections has become a key point of competition among companies building AI supercomputers, because the so-called large language models that power technologies such as Google's Bard or OpenAI's ChatGPT have grown dramatically in size, which means they are too large to be stored on a single chip.

Instead, the models must be split across thousands of chips, which must then work together for weeks or longer to train the model. Google's PaLM model, the largest publicly disclosed language model to date, was trained by splitting it across two 4,000-chip supercomputers over 50 days.

Google said its supercomputers make it easy to reconfigure connections between chips on the fly, which helps to avoid problems and to tune the system for performance gains.

“Swapping circuits makes it easy to avoid failing components,” Google said of the system in a blog post. “This flexibility even allows us to change the communication topology between supercomputers to speed up machine learning model performance.”

Although Google is only now publishing details of its supercomputer, the system has been running inside the company since 2020 in a data center in the US state of Oklahoma.

Google said the startup Midjourney used the system to train its model, which generates new images from a few words of text.

In the paper, Google said that for systems of similar size, its supercomputer is up to 1.7 times faster and 1.9 times more power-efficient than a system based on the Nvidia A100 chip, which was on the market at the same time as the fourth-generation TPU.

Google said it did not compare its fourth-generation chip with Nvidia's current flagship H100 chip, because the H100 came to market after Google's chip and is made with newer technology.

Google hinted that it might be working on a new TPU that would compete with Nvidia's H100, but it did not provide any details.