Tesla boosts its AI efforts with D1 Chip
Project Dojo puts Tesla back into the race for high-performance artificial intelligence training hardware.
At Tesla AI Day, the company announced its new D1 chip, a custom processor built on a 7 nm process with 50 billion transistors. The chip has a die area of 645 mm², smaller than both the NVIDIA A100 (826 mm²) and AMD Arcturus (750 mm²). It is equipped with 354 training nodes based on a 64-bit superscalar CPU with 4 cores. The nodes are designed specifically for 8×8 multiplications and support a wide range of data formats used for AI training, including FP32, BF16, CFP8, INT32, INT16, and INT8.
According to Tesla, the D1 chip offers 22.6 TFLOPS of single-precision (FP32) compute performance and up to 362 TFLOPS in BF16/CFP8. This performance is achieved within a TDP of 400 W for a single D1 chip. Scalability is an important aspect of AI training, which is why Tesla designed a high-bandwidth, low-latency switch fabric with up to 10 TB/s of bandwidth. The I/O ring around the chip has 576 lanes, each offering 112 Gbit/s of bandwidth.
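As a rough cross-check of the per-chip figures above, the following minimal Python sketch derives two numbers the article does not state directly: peak reduced-precision throughput per watt and the raw sum of the I/O lane rates. The function names are illustrative. The lane sum ignores any encoding or protocol overhead, and the article does not say how it relates to the 10 TB/s fabric figure, so treat it as an upper-bound estimate only.

```python
# Per-chip figures as quoted in the article (Tesla's stated numbers).
PEAK_TFLOPS = 362.0      # peak reduced-precision (CFP8) throughput per D1 chip
TDP_WATTS = 400.0        # thermal design power per D1 chip
IO_LANES = 576           # lanes in the I/O ring
GBIT_PER_LANE = 112.0    # raw rate per lane, in Gbit/s


def tflops_per_watt(tflops: float, watts: float) -> float:
    """Peak throughput divided by TDP (a derived figure, not a Tesla claim)."""
    return tflops / watts


def aggregate_io_tbytes_per_s(lanes: int, gbit_per_lane: float) -> float:
    """Raw sum of lane rates, converted from Gbit/s to TB/s (decimal units)."""
    gbit_total = lanes * gbit_per_lane
    return gbit_total / 8.0 / 1000.0


efficiency = tflops_per_watt(PEAK_TFLOPS, TDP_WATTS)
io_bandwidth = aggregate_io_tbytes_per_s(IO_LANES, GBIT_PER_LANE)

print(f"Peak efficiency: {efficiency:.3f} TFLOPS/W")
print(f"Raw I/O lane sum: {io_bandwidth:.3f} TB/s")
```

Under these assumptions the chip works out to roughly 0.9 TFLOPS per watt at peak, and the lanes sum to about 8 TB/s of raw bandwidth.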
D1 chips can be linked through the Dojo Interface Processor. The chips are mounted on Training Tiles, each featuring 25 D1 chips. A tile is made using a fan-out wafer process and forms a complete cuboid unit with integrated cooling and power delivery. Tiles can in turn be connected to one another, creating a large network of Training Tiles.
Tesla demonstrated a working Training Tile in its laboratory operating at 2 GHz. Each Training Tile offers up to 9 PFLOPS of compute performance.
Finally, Tesla revealed its plans for a full supercomputer built from D1 chips. The ExaPOD is based on 120 Training Tiles with 3,000 D1 chips, for a total of 1,062,000 nodes. This configuration offers up to 1.1 ExaFLOPS of BF16/CFP8 compute performance. Tesla says that, upon its completion, ExaPOD will become the fastest AI training supercomputer, with 4x higher performance, 1.3x better performance per watt, and a 5x smaller footprint than Tesla’s current NVIDIA-based supercomputers.
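The tile and ExaPOD totals follow from the per-chip figures by simple multiplication. The short Python sketch below reproduces that arithmetic, assuming ideal linear scaling of peak throughput (no interconnect or utilization losses); the variable names are illustrative.

```python
# Building-block figures as quoted in the article.
NODES_PER_CHIP = 354          # training nodes per D1 chip
PEAK_TFLOPS_PER_CHIP = 362.0  # peak reduced-precision throughput per chip
CHIPS_PER_TILE = 25           # D1 chips per Training Tile
TILES_PER_EXAPOD = 120        # Training Tiles per ExaPOD

# Counts scale by straight multiplication.
total_chips = CHIPS_PER_TILE * TILES_PER_EXAPOD
total_nodes = total_chips * NODES_PER_CHIP

# Peak throughput, assuming ideal linear scaling (an assumption, not a measurement).
tile_pflops = CHIPS_PER_TILE * PEAK_TFLOPS_PER_CHIP / 1_000        # TFLOPS -> PFLOPS
exapod_eflops = total_chips * PEAK_TFLOPS_PER_CHIP / 1_000_000     # TFLOPS -> EFLOPS

print(f"Chips per ExaPOD: {total_chips:,}")
print(f"Nodes per ExaPOD: {total_nodes:,}")
print(f"Peak per tile:    {tile_pflops:.2f} PFLOPS")
print(f"Peak per ExaPOD:  {exapod_eflops:.3f} EFLOPS")
```

The results (3,000 chips, 1,062,000 nodes, about 9 PFLOPS per tile, and about 1.1 ExaFLOPS per ExaPOD) match the rounded figures Tesla quoted.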