Please note that this post is tagged as a rumor.
AMD MI300 might be the start of high-performance APUs
A new leak links a new socket name to both MI300 and a Zen4 CPU socket.
ExecutableFix discovered a new socket name, “SH5”, which is internally called MI300. This name likely refers to the Instinct MI300 accelerator, the successor to the MI200, which has not yet been released. The MI300 is now rumored to feature 4 GPU chiplets, double what the MI200 is said to offer.
The leaker claims that SH5 refers to a Zen4 socket, which suggests that MI300, or at least some of its variants, might have a Zen4 CPU chiplet on the same package.
There's a server socket called "SH5" that features CPU with CPUID 0xA80F00… called "MI300"
— ExecutableFix (@ExecuFix) September 16, 2021
As others have pointed out, the MI300 is not expected to launch for at least another year, which makes these early rumors almost impossible to confirm at this point. It is, however, interesting to see what AMD is focusing on in the relatively near future.
This is not the first time we are hearing about exascale APUs. In the paper “Design and Analysis of an APU for Exascale Computing”, AMD described a proposed high-performance design featuring HBM memory stacked on top of GPU chiplets, which would be combined with CPU chiplets.
It is also worth noting that back in 2019, nearly 2 years ago, known leaker Komachi discovered the first traces of a “BIG APU Mode” for the AMD MI200. In that case, however, it might mean something else, such as a number of GPUs working in tandem with attached EPYC CPUs.
What is BIG APU Mode for MI200??
— 遠坂小町@Komachi (@KOMACHI_ENSAKA) December 26, 2019
AMD Instinct Accelerators

Accelerator Name | AMD Radeon Instinct MI60 | AMD Instinct MI100 | AMD Instinct MI200 | AMD Instinct MI300 |
---|---|---|---|---|
Architecture | 7nm GCN5 (GFX906) | 7nm CDNA1 (GFX908) | CDNA2 (GFX90A) | CDNA3 (?) |
CPU | – | – | – | Zen4 (?) |
GPU | Vega 20 | Arcturus | Aldebaran (MCM) | ? (MCM) |
Compute Tiles | 1 | 1 | 2 | 4 |
Compute Units | 64 (64) | 120 | 2x 110 or 2x 55 | 4x (?) |
FP32 Cores (Full GPU) | 4096 (4096) | 7680 (8192) | TBC | 4x (?) |
GPU Clock Speed | 1800 MHz | ~1500 MHz | TBC | TBC |
FP16 Compute | 29.5 TFLOPS | 185 TFLOPS | TBC | TBC |
FP32 Compute | 14.7 TFLOPS | 23.1 TFLOPS | TBC | TBC |
FP64 Compute | 7.4 TFLOPS | 11.5 TFLOPS | TBC | TBC |
VRAM | 32 GB HBM2 | 32 GB HBM2 | 128 GB HBM2E | TBC |
Memory Clock | 1000 MHz | 1200 MHz | TBC | TBC |
Memory Bus | 4096-bit | 4096-bit | TBC | TBC |
Memory Bandwidth | 1 TB/s | 1.23 TB/s | TBC | TBC |
Form Factor | Dual Slot, Full Length | Dual Slot, Full Length | OAM | TBC |
Cooling | Passive Cooling | Passive Cooling | TBC | TBC |
TDP | 300W | 300W | TBC | TBC |
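
The peak FP32 and memory bandwidth figures for the two released cards can be roughly reproduced from the core counts, clocks and bus widths in the table. The sketch below is only a sanity check, not AMD's own method. It assumes two FLOPs per core per clock (one fused multiply-add) and that HBM2 transfers data twice per memory clock. The function names `fp32_tflops` and `bandwidth_gbs` are made up for illustration.

```python
# Rough sanity check of the table's derived figures.
# Assumptions (not stated in the article):
#   - each FP32 core retires one fused multiply-add (2 FLOPs) per clock
#   - HBM2 moves data twice per memory clock (double data rate)

def fp32_tflops(cores: int, clock_mhz: float) -> float:
    """Peak FP32 throughput in TFLOPS."""
    return cores * 2 * clock_mhz * 1e6 / 1e12


def bandwidth_gbs(bus_bits: int, mem_clock_mhz: float) -> float:
    """Peak memory bandwidth in GB/s (1 GB = 1e9 bytes)."""
    bytes_per_transfer = bus_bits / 8
    return bytes_per_transfer * mem_clock_mhz * 1e6 * 2 / 1e9


# Values copied from the table; the MI100 clock is listed as "~1500 MHz".
specs = {
    "MI60": {"cores": 4096, "clock_mhz": 1800, "bus_bits": 4096, "mem_clock_mhz": 1000},
    "MI100": {"cores": 7680, "clock_mhz": 1500, "bus_bits": 4096, "mem_clock_mhz": 1200},
}

for name, s in specs.items():
    tflops = fp32_tflops(s["cores"], s["clock_mhz"])
    gbs = bandwidth_gbs(s["bus_bits"], s["mem_clock_mhz"])
    print(f"{name}: {tflops:.2f} TFLOPS FP32, {gbs:.1f} GB/s")
```

Under these assumptions the MI60 works out to about 14.7 TFLOPS and 1,024 GB/s, and the MI100 to about 23.0 TFLOPS and 1,229 GB/s. The small gap to the listed 23.1 TFLOPS is consistent with the MI100 clock being an approximate value. The same arithmetic cannot yet be applied to MI200 or MI300, since their clocks and core counts are still listed as TBC.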