Google unveiled a new AI accelerator chip called Ironwood at its Cloud Next conference. It's the company's seventh-generation Tensor Processing Unit (TPU) and the first designed specifically for inference. The chip will be available to Google Cloud customers this year in two configurations: a 256-chip cluster and a 9,216-chip cluster.
Ironwood delivers up to 4,614 TFLOPs of peak compute and pairs that with 192 GB of dedicated high-bandwidth memory offering up to 7.4 TB/s of bandwidth. It also includes an enhanced SparseCore, a specialized unit for the ultra-large embeddings common in ranking and recommendation workloads. The chip's architecture is designed to minimize on-chip data movement and latency, which in turn helps reduce power consumption.
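For a sense of scale, the headline figures compose neatly: multiplying the per-chip peak by the announced cluster sizes gives the pod-level totals, and dividing peak compute by memory bandwidth gives a rough roofline balance point. The sketch below is back-of-the-envelope arithmetic from the numbers above, not an official benchmark; the pod totals assume simple linear scaling, and the roofline estimate ignores caches and interconnect.

```python
# Back-of-the-envelope arithmetic from the figures quoted above.
# Pod totals assume linear scaling of the per-chip peak; they are
# sanity checks, not official benchmarks.

PEAK_TFLOPS_PER_CHIP = 4_614     # peak per-chip compute, in TFLOPs
HBM_PER_CHIP_GB = 192            # dedicated high-bandwidth memory per chip
HBM_BW_TB_PER_S = 7.4            # per-chip memory bandwidth, TB/s
POD_SIZES = (256, 9_216)         # the two announced configurations

for chips in POD_SIZES:
    exaflops = chips * PEAK_TFLOPS_PER_CHIP / 1e6   # TFLOPs -> exaFLOPs
    memory_tb = chips * HBM_PER_CHIP_GB / 1_000     # GB -> TB
    print(f"{chips:>5}-chip pod: ~{exaflops:.1f} exaFLOPs peak, "
          f"~{memory_tb:,.0f} TB of HBM")

# Classic roofline balance point: the arithmetic intensity (FLOPs per
# byte moved) a kernel needs before it becomes compute-bound rather
# than memory-bound on a single chip.
balance = PEAK_TFLOPS_PER_CHIP * 1e12 / (HBM_BW_TB_PER_S * 1e12)
print(f"Roofline balance point: ~{balance:.0f} FLOPs per byte")
```

By this linear estimate, the full 9,216-chip configuration lands at roughly 42.5 exaFLOPs of peak compute, in line with the pod-scale figure Google has cited for Ironwood.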
Ironwood will be a key part of Google's strategy to integrate AI across its services. The chip will plug into AI Hypercomputer, Google Cloud's modular cluster of computing resources, allowing large-scale AI workloads to be processed even more efficiently.
Google's announcement of Ironwood comes amid intensifying competition in the AI accelerator space. While NVIDIA leads the market, companies like Amazon and Microsoft are actively developing their own solutions. Amazon offers its Trainium, Inferentia, and Graviton processors through AWS, while Microsoft makes its Maia 100 AI accelerator available through Azure.
Google emphasizes that Ironwood is its most powerful and energy-efficient TPU to date, and that its architecture is built to run inference-focused AI models efficiently.