Aldebaran could be AMD’s Multi-Chip Instinct MI200 accelerator with HBM2E memory

Published: 25th Feb 2021, 09:36 GMT

Aldebaran, AMD’s first MCM accelerator?

A successor to MI100 is already receiving proper Linux patches.

Earlier this week we reported that AMD has begun implementing kernel patches for the upcoming Instinct MI200, the rumored next-gen accelerator for data centers. Last year AMD introduced the MI100, the first graphics card based on CDNA, a compute-oriented architecture designed to compete in the supercomputing space. It was an important step for AMD, as for the first time the manufacturer was not reusing a gaming chip for the server market.

The next step for AMD is to introduce its first multi-chip GPU. Rumors about the Instinct MI200 date back to July 2020. In fact, there are already rumors about MI300, but no details are available at this time. The MI200, on the other hand, is slowly revealing itself in new Linux kernel patches.

According to the latest entry, the MI200 GPU could be codenamed Aldebaran. This is the name of a giant star in the zodiac constellation Taurus. Aldebaran has a radius of 44.13 solar radii, nearly 75% more than Arcturus. That probably doesn’t mean anything for a graphics chip named after the star, yet it might be worth sharing.

AMD Aldebaran GPU support, Source: Freedesktop

AMD chooses GPU codenames randomly, but sometimes developers make suggestions to the legal department. In this case, the codename was suggested by an AMD Linux developer nearly a year ago. It seems that it has indeed been selected.


HBM2E and newer SDMA engines

The patches do not reveal the full specifications of the GPU; however, they can help us understand what AMD is planning for the chip. Alongside other patches, AMD Linux developers have implemented HBM2E memory support. This suggests that Aldebaran will use a newer HBM standard than Arcturus. HBM2E allows up to 16GB per stack, doubling the per-stack capacity over Arcturus.
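
The capacity claim can be checked with simple arithmetic. The sketch below is only an illustration: it relies on the 1024-bit interface that each HBM2/HBM2E stack exposes, and the helper names are made up for this example. MI100's 4096-bit bus and 32 GB of memory imply four stacks of 8 GB each. The patches do not say how many stacks Aldebaran uses, so the four-stack figure for it is purely an assumption.

```c
#include <assert.h>

/* Each HBM2/HBM2E stack exposes a 1024-bit interface. */
#define HBM_BITS_PER_STACK 1024u

/* Number of HBM stacks implied by a total memory bus width (hypothetical helper). */
unsigned hbm_stack_count(unsigned bus_width_bits)
{
    assert(bus_width_bits % HBM_BITS_PER_STACK == 0);
    return bus_width_bits / HBM_BITS_PER_STACK;
}

/* Total capacity in GB for a stack count and a per-stack capacity (hypothetical helper). */
unsigned hbm_total_gb(unsigned stacks, unsigned gb_per_stack)
{
    return stacks * gb_per_stack;
}
```

With these helpers, `hbm_stack_count(4096)` gives 4 stacks for MI100, `hbm_total_gb(4, 8)` gives its 32 GB, and `hbm_total_gb(4, 16)` gives 64 GB for a hypothetical four-stack HBM2E configuration.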

The patches have also revealed that Aldebaran has fewer SDMA (System Direct Memory Access) engines. These are used to transfer data over interfaces such as PCIe or XGMI (Infinity Fabric). Aldebaran will have the same number of SDMA engines used for GPU-to-CPU communication (2 engines), but the number of XGMI SDMA engines has decreased from 6 to 3.

ARCTURUS
.asic_family = CHIP_ARCTURUS,
.asic_name = "arcturus",
.max_pasid_bits = 16,
.max_no_of_hqd = 24,
.doorbell_size = 8,
.ih_ring_entry_size = 8 * sizeof(uint32_t),
.event_interrupt_class = &event_interrupt_class_v9,
.num_of_watch_points = 4,
.mqd_size_aligned = MQD_SIZE_ALIGNED,
.supports_cwsr = true,
.needs_iommu_device = false,
.needs_pci_atomics = false,
.num_sdma_engines = 2,
.num_xgmi_sdma_engines = 6,
.num_sdma_queues_per_engine = 8,

ALDEBARAN
.asic_family = CHIP_ALDEBARAN,
.asic_name = "aldebaran",
.max_pasid_bits = 16,
.max_no_of_hqd = 24,
.doorbell_size = 8,
.ih_ring_entry_size = 8 * sizeof(uint32_t),
.event_interrupt_class = &event_interrupt_class_v9,
.num_of_watch_points = 4,
.mqd_size_aligned = MQD_SIZE_ALIGNED,
.supports_cwsr = true,
.needs_iommu_device = false,
.needs_pci_atomics = false,
.num_sdma_engines = 2,
.num_xgmi_sdma_engines = 3,
.num_sdma_queues_per_engine = 8,
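
To make the difference between the two entries easier to see, here is a minimal, self-contained sketch that copies only the three SDMA fields quoted above into a small illustrative struct. The struct `sdma_info` and both helper functions are hypothetical and are not the kernel's own definitions. The patch excerpt also does not say whether the counts apply per die or per package, so the totals should be read with that caveat.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative subset of the quoted fields; NOT the kernel's real device-info struct. */
struct sdma_info {
    const char *asic_name;
    unsigned num_sdma_engines;           /* engines for GPU-to-CPU transfers, per the article */
    unsigned num_xgmi_sdma_engines;      /* engines dedicated to XGMI transfers */
    unsigned num_sdma_queues_per_engine;
};

/* Values taken from the two entries quoted in the article. */
const struct sdma_info arcturus_sdma = {
    .asic_name = "arcturus",
    .num_sdma_engines = 2,
    .num_xgmi_sdma_engines = 6,
    .num_sdma_queues_per_engine = 8,
};

const struct sdma_info aldebaran_sdma = {
    .asic_name = "aldebaran",
    .num_sdma_engines = 2,
    .num_xgmi_sdma_engines = 3,
    .num_sdma_queues_per_engine = 8,
};

/* Sum of regular and XGMI SDMA engines (hypothetical helper). */
unsigned total_sdma_engines(const struct sdma_info *info)
{
    assert(info != NULL);
    return info->num_sdma_engines + info->num_xgmi_sdma_engines;
}

/* Total SDMA queues across all engines (hypothetical helper). */
unsigned total_sdma_queues(const struct sdma_info *info)
{
    return total_sdma_engines(info) * info->num_sdma_queues_per_engine;
}
```

Under these definitions Arcturus comes out at 8 engines and 64 queues, while the Aldebaran entry comes out at 5 engines and 40 queues.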

Multi-Die seemingly confirmed

One of the developers explains a new performance determinism patch for Aldebaran. The description refers to per-die control of the feature, which suggests that the new accelerator has multiple dies.

Performance Determinism is a new mode in Aldebaran where PMFW tries to maintain sustained performance level. It can be enabled on a per-die basis on aldebaran. To guarantee that it remains within the power cap, a max GFX frequency needs to be specified in this mode.
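
The rule in that description (per-die enablement, plus a mandatory maximum GFX frequency whenever the mode is on) can be sketched as a small validity check. Everything below is hypothetical: the struct, the field names and the functions are invented for illustration and do not reflect the actual amdgpu or PMFW interface, which the patch excerpt does not show.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-die setting; names are illustrative, not from the driver. */
struct die_perf_config {
    bool determinism_enabled;
    unsigned max_gfx_mhz;    /* 0 means "no maximum GFX frequency specified" */
};

/* A die's config is valid if determinism is off, or if a max GFX clock is given. */
bool die_perf_config_valid(const struct die_perf_config *cfg)
{
    assert(cfg != NULL);
    if (!cfg->determinism_enabled)
        return true;
    return cfg->max_gfx_mhz > 0;
}

/* Count how many dies in the array have determinism enabled. */
size_t count_deterministic_dies(const struct die_perf_config *dies, size_t n)
{
    size_t count = 0;
    for (size_t i = 0; i < n; i++) {
        if (dies[i].determinism_enabled)
            count++;
    }
    return count;
}
```

In this sketch a two-die package could enable the mode on one die only, for example `{ true, 1500 }` and `{ false, 0 }`, while `{ true, 0 }` would be rejected because no frequency cap was supplied. The 1500 MHz value is an arbitrary placeholder.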

Instinct MI200 is now expected to launch later this year alongside AMD EPYC CPUs codenamed Trento. It would compete with Intel Xe-HP and NVIDIA Hopper MCM-based architectures. Everything we know about Instinct MI200 so far:

AMD Instinct Accelerators
Accelerator Name | AMD Radeon Instinct MI60 | AMD Instinct MI100     | AMD Instinct MI200
-----------------|--------------------------|------------------------|-------------------
Architecture     | 7nm GCN5                 | 7nm CDNA1 (GFX908)     | CDNA2 (GFX90A) ?
GPU              | Vega 20                  | Arcturus               | Aldebaran
GPU Cores        | 4096                     | 7680                   | MCM
GPU Clock Speed  | 1800 MHz                 | ~1500 MHz              | TBC
FP16 Compute     | 29.5 TFLOPs              | 185 TFLOPs             | TBC
FP32 Compute     | 14.7 TFLOPs              | 23.1 TFLOPs            | TBC
FP64 Compute     | 7.4 TFLOPs               | 11.5 TFLOPs            | TBC
VRAM             | 32 GB HBM2               | 32 GB HBM2             | HBM2E
Memory Clock     | 1000 MHz                 | 1200 MHz               | TBC
Memory Bus       | 4096-bit bus             | 4096-bit bus           | TBC
Memory Bandwidth | 1 TB/s                   | 1.23 TB/s              | TBC
Form Factor      | Dual Slot, Full Length   | Dual Slot, Full Length | OAM
Cooling          | Passive Cooling          | Passive Cooling        | TBC
TDP              | 300W                     | 300W                   | TBC

Source: Freedesktop, Coelacanth’s Dream



