NVIDIA Blackwell AI Chips Face Delays Due to Design Flaws
Solomon Thompson / 5 months ago
NVIDIA’s latest AI chips, the Blackwell series, are experiencing delays due to design issues. Sources within the chip and server hardware production industry indicate that these setbacks could postpone the release by three months or more. Major customers like Meta, Google, and Microsoft may be affected, with significant shipments now expected in the first quarter of 2025.
Design Issues Discovered
The production issue stems from a flaw discovered by TSMC, which manufactures the chips. The problem involves the processor die that connects the two Blackwell GPUs on a GB200. NVIDIA now needs to redesign the chip and complete a new production test at TSMC before moving to mass production.
NVIDIA has hinted at a possible single-GPU version to speed up delivery. Meanwhile, TSMC’s production lines remain idle, awaiting the new design.
Lower Supply Expected
According to Dylan Patel (via TechPowerUp), the supply of Blackwell chips will be lower than expected in late 2024 and early 2025. The shortfall is attributed to TSMC’s transition from CoWoS-S to CoWoS-L packaging technology, which is crucial for NVIDIA’s advanced Blackwell chips.
NVIDIA appears to be prioritizing the production of GB200 NVL72 units over NVL36 units. The NVL72 design incorporates 72 GPUs, either in a single rack with 18 double GB200 compute nodes or spread across two racks, each containing 18 single nodes.
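The GPU counts in the two layouts can be checked with a little arithmetic. The sketch below is illustrative only: it assumes, as described above, that each GB200 carries two Blackwell GPUs, and the function and parameter names are hypothetical.

```python
# Assumption taken from the article: each GB200 carries two Blackwell GPUs.
GPUS_PER_GB200 = 2


def total_gpus(racks: int, nodes_per_rack: int, gb200_per_node: int) -> int:
    """Count the GPUs in a rack layout (hypothetical helper for illustration)."""
    return racks * nodes_per_rack * gb200_per_node * GPUS_PER_GB200


# One rack with 18 compute nodes, each holding two GB200s.
single_rack = total_gpus(racks=1, nodes_per_rack=18, gb200_per_node=2)

# Two racks with 18 compute nodes each, each node holding one GB200.
dual_rack = total_gpus(racks=2, nodes_per_rack=18, gb200_per_node=1)

print(single_rack, dual_rack)
```

Under that assumption, both layouts come out to the same 72 GPUs.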
NVIDIA is also under investigation for potential abuse of its position in the AI market, adding to its challenges. The company has started sampling units to customers but has not confirmed a new release date for the Blackwell chips.