AMD Mi-Next GPUs to feature 128GB per GPU
Pawsey Supercomputing Centre is to build the Setonix supercomputer, featuring AMD Milan CPUs and Mi-Next GPUs.
At ISC 2021, Ugo Varetto, Pawsey’s CTO, revealed that Pawsey’s supercomputer, named Setonix, will feature more than 200,000 AMD Milan CPU cores and more than 750 AMD Mi-Next GPUs. The announcement would not be especially notable were it not for Varetto’s disclosure that each of those GPUs will feature 128GB of memory.
The supercomputer, acquired with 70 million dollars in funding, will improve Pawsey’s work in data ingestion, data visualization, data lifecycle management, and data sharing, or, as HPCWire put it, data work. Operating on large data sets requires a large memory pool, which in combination with GPU acceleration can greatly improve the efficiency of the system.
Pawsey’s SC Setonix supercomputer, Source: Ugo Varetto
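To put the memory pool in perspective, a lower bound on Setonix's aggregate GPU memory follows directly from the figures quoted above. This is only a rough sketch: it assumes exactly 750 GPUs (the announcement says 750+), uses decimal units, and the function name is purely illustrative.

```python
def aggregate_gpu_memory_tb(gpu_count: int, gb_per_gpu: int) -> float:
    """Total GPU memory in TB, using decimal units (1 TB = 1000 GB)."""
    return gpu_count * gb_per_gpu / 1000


# 750 is the lower bound from the announcement ("750+"), not a confirmed count.
print(aggregate_gpu_memory_tb(750, 128))  # 96.0
```

Under those assumptions the GPUs alone would hold at least 96 TB of memory.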
AMD Mi-Next, namely the Instinct MI200, will undoubtedly be the largest GPU AMD has made so far. The GPU, known as Aldebaran, is to feature a Multi-Chip-Module (MCM) design. According to Pawsey’s announcement, this GPU will feature 128GB of HBM2 memory, four times as much as its predecessor, the MI100.
Just a few days ago, Locuza created a block diagram of the MI200 GPU. It shows how eight stacks of HBM2e memory would be attached across both GPU dies:
AMD Aldebaran GPU Block Diagram, Source: @Locuza_
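The capacity figures can be cross-checked with simple arithmetic. The sketch below assumes the eight-stack, two-die layout from Locuza's unofficial diagram, which AMD has not confirmed; the helper name is hypothetical.

```python
def gb_per_unit(total_gb: int, units: int) -> float:
    """Split a total capacity evenly across a number of stacks or dies."""
    return total_gb / units


TOTAL_GB = 128   # per-GPU capacity quoted by Pawsey
STACKS = 8       # assumption: taken from Locuza's diagram, not confirmed by AMD
DIES = 2         # assumption: two GPU dies in the MCM package
MI100_GB = 32    # MI100 capacity, for comparison

print(gb_per_unit(TOTAL_GB, STACKS))  # 16.0 GB per HBM2e stack
print(gb_per_unit(TOTAL_GB, DIES))    # 64.0 GB per GPU die
print(TOTAL_GB / MI100_GB)            # 4.0 times the MI100
```

If the diagram is accurate, each stack would hold 16 GB and each die would have 64 GB attached to it.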
AMD has not yet confirmed the specifications of its MI200 accelerator, but according to leaks the full Aldebaran GPU features 128 Compute Units. The number of active CUs on the MI200 specifically has not been confirmed.
AMD Instinct Accelerators
| Accelerator Name | AMD Radeon Instinct MI60 | AMD Instinct MI100 | AMD Instinct MI200 |
|---|---|---|---|
| Architecture | 7nm GCN5 | 7nm CDNA1 (GFX908) | CDNA2 (GFX90A) |
| GPU | Vega 20 | Arcturus | Aldebaran (MCM) |
| GPU Clock Speed | 1800 MHz | ~1500 MHz | TBC |
| FP16 Compute | 29.5 TFLOPS | 185 TFLOPS | TBC |
| FP32 Compute | 14.7 TFLOPS | 23.1 TFLOPS | TBC |
| FP64 Compute | 7.4 TFLOPS | 11.5 TFLOPS | TBC |
| VRAM | 32 GB HBM2 | 32 GB HBM2 | 128 GB HBM2E |
| Memory Clock | 1000 MHz | 1200 MHz | TBC |
| Memory Bus | 4096-bit | 4096-bit | TBC |
| Memory Bandwidth | 1 TB/s | 1.23 TB/s | TBC |
| Form Factor | Dual Slot, Full Length | Dual Slot, Full Length | OAM |
| Cooling | Passive Cooling | Passive Cooling | TBC |
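The memory bandwidth figures for the MI60 and MI100 follow from the memory clock and bus width listed in the table. The sketch below assumes HBM2 transfers data twice per clock cycle (double data rate) and uses decimal terabytes; the function name is illustrative, and no MI200 figure is computed because its memory clock and bus width are still TBC.

```python
def hbm_bandwidth_tbs(bus_width_bits: int, memory_clock_mhz: float) -> float:
    """Peak memory bandwidth in TB/s for a double-data-rate memory interface."""
    bytes_per_transfer = bus_width_bits / 8
    # Two transfers per clock cycle (double data rate).
    transfers_per_second = memory_clock_mhz * 1e6 * 2
    return bytes_per_transfer * transfers_per_second / 1e12


print(round(hbm_bandwidth_tbs(4096, 1000), 2))  # MI60:  1.02
print(round(hbm_bandwidth_tbs(4096, 1200), 2))  # MI100: 1.23
```

Both results line up with the table, where the MI60 value is rounded down to 1 TB/s.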