Big Island GPGPU in mass production
Tianshu Zhixin (also known as Iluvatar CoreX, the name used in the table below) has announced that its Big Island general-purpose graphics processing unit (GPGPU) has entered mass production.
Development of the new GPGPU started back in 2018, but it was not until January this year that the Chinese company announced that a new compute accelerator was in development. Just three months later, the company has officially announced that the chip has entered mass production.
The GPGPU uses CoWoS (Chip-on-Wafer-on-Substrate) packaging technology and is manufactured on TSMC’s 7nm FinFET process, the company announced on Wednesday. It is an advanced accelerator with over 24 billion transistors.
The manufacturer revealed some performance figures for the Big Island GPGPU, covering FP32, FP16/BF16, and INT32/16/8. These allow a side-by-side comparison with existing high-performance chips from better-known suppliers such as AMD and NVIDIA.
According to the new slides, the Big Island GPGPU will offer 37 TFLOPS of single-precision (FP32) compute power. The chip will also offer up to 147 TFLOPS in FP16/BF16 half-precision calculations. Furthermore, it will reach 317, 147, and 295 TOPS in INT32, INT16, and INT8 calculations respectively. The company did not provide any clock speeds or GPU configurations, which makes it impossible to tell how many cores the chip has.
In terms of performance, the new chip does well in FP32 calculations, assuming Big Island’s official figure does not rely on matrix operations or sparsity. Its FP32 performance is higher than that of both the AMD Instinct MI100 and the NVIDIA A100. Its FP16 throughput is lower than both, although its BF16 figure is ahead of the MI100’s.
The chip also offers higher 8-bit integer (INT8) performance than the MI100, by nearly 60%, but that is still about half of what the A100 has to offer. The company did not provide double-precision performance figures.
|2021 Compute Accelerators|
|Solution||Iluvatar CoreX||AMD Instinct MI100||NVIDIA A100|
|Memory||32GB HBM2||32GB HBM2||40GB / 80GB HBM2|
|Memory Bandwidth||TBC||1.2 TB/s||1.6 TB/s / 2.0 TB/s|
|TDP||300W||300W||250W / 400W|
|Interface||PCIe x16 Gen4||PCIe x16 Gen4||PCIe x16 Gen4 / NVLink|
|FP64||TBC||11.5 TFLOPS||9.7 TFLOPS|
|FP32||37 TFLOPS||23.1 TFLOPS||19.5 TFLOPS|
|– Matrix||–||46.1 TFLOPS||–|
|– Tensor||–||–||156 TFLOPS|
|– Sparsity||–||–||312 TFLOPS|
|FP16||147 TFLOPS||184.6 TFLOPS||312 TFLOPS|
|BFLOAT16||147 TFLOPS||92.3 TFLOPS||312 TFLOPS|
|– Sparsity||–||–||624 TFLOPS|
|INT8||295 TOPS||184.6 TOPS||624 TOPS|
|– Sparsity||–||–||1248 TOPS|
|INT4||TBC||184.6 TOPS||1248 TOPS|
|– Sparsity||–||–||2496 TOPS|
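The comparisons drawn in the text can be reproduced from the table with simple division. The following is a minimal Python sketch. The `PEAK` dictionary and the `ratio` helper are illustrative names, not anything published by the vendors. It uses only the dense (non-sparsity) figures exactly as the table lists them, so it inherits any differences in how each vendor measures peak throughput.

```python
# Peak throughput as quoted in the table above (TFLOPS for FP32/FP16/BF16,
# TOPS for INT8). Dense figures only; sparsity rows are deliberately left out.
PEAK = {
    "Big Island": {"FP32": 37.0, "FP16": 147.0, "BF16": 147.0, "INT8": 295.0},
    "MI100": {"FP32": 23.1, "FP16": 184.6, "BF16": 92.3, "INT8": 184.6},
    "A100": {"FP32": 19.5, "FP16": 312.0, "BF16": 312.0, "INT8": 624.0},
}


def ratio(chip, baseline, fmt):
    """Return the chip's peak throughput divided by the baseline's for one format."""
    return PEAK[chip][fmt] / PEAK[baseline][fmt]


if __name__ == "__main__":
    for fmt in ("FP32", "FP16", "BF16", "INT8"):
        vs_mi100 = ratio("Big Island", "MI100", fmt)
        vs_a100 = ratio("Big Island", "A100", fmt)
        print(f"{fmt:5s} vs MI100: {vs_mi100:.2f}x   vs A100: {vs_a100:.2f}x")
```

On these numbers the INT8 ratio comes out at roughly 1.6x against the MI100 and a little under 0.5x against the A100, in line with the "nearly 60% faster" and "about half" statements. Paper peak figures like these say nothing about sustained performance in real workloads.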
Tianshu Zhixin has presented various implementations of its Big Island GPU. The accelerators will be offered in a traditional dual-slot PCIe Gen4 x16 form factor with passive cooling as well as a standard mezzanine board form factor.