Tesla D1 chip features 50 billion transistors, scales up to 1.1 ExaFLOPS with ExaPOD

Published: Aug 20th 2021, 16:30 GMT

Tesla boosts its AI efforts with D1 Chip

Project Dojo puts Tesla back into a race for high-performance solutions for artificial intelligence training.

At Tesla AI Day, the company announced its new D1 chip, a custom processor built on a 7 nm process with 50 billion transistors. The chip has a die area of 645 mm², smaller than both NVIDIA's A100 (826 mm²) and AMD's Arcturus (750 mm²). Specifications-wise, the chip packs 354 training nodes, each based on a 64-bit superscalar CPU with four cores. The nodes are designed specifically for 8×8 matrix multiplications and support a wide range of data formats used in AI training, including FP32, BFP16, CFP8, INT32, INT16, and INT8.

According to Tesla, the D1 chip offers 22.6 TFLOPS of single-precision (FP32) compute performance and up to 362 TFLOPS in BF16/CFP8, all within a TDP of 400W for a single chip. For AI training, scalability is a key aspect, which is why Tesla designed high-bandwidth, low-latency switch-fabric interconnects delivering up to 10 TB/s. The I/O ring around the chip has 576 lanes, each offering 112 Gbit/s of bandwidth.
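As a quick sanity check on the published I/O figures, the aggregate bandwidth of the 576-lane ring can be computed from the per-lane rate (the 10 TB/s figure Tesla quotes refers to the switch fabric and need not match this exactly):

```python
# Aggregate I/O ring bandwidth of the D1 chip, from the published lane specs.
lanes = 576
gbit_per_lane = 112  # Gbit/s per lane

total_gbit_s = lanes * gbit_per_lane      # 64,512 Gbit/s
total_tb_s = total_gbit_s / 8 / 1000      # convert Gbit/s -> GB/s -> TB/s

print(f"Aggregate I/O bandwidth: {total_tb_s:.3f} TB/s")  # ≈ 8.064 TB/s
```

So the ring itself totals roughly 8 TB/s of off-chip bandwidth.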

[Video: Tesla AI Day 2021]

Tesla D1 chips can be linked through the Dojo Interface Processor and assembled onto Training Tiles, each featuring 25 D1 chips. A Tile is made using a fan-out wafer process and forms a complete cuboid with integrated cooling and power delivery. These tiles can in turn be connected to further cuboids, creating a large network of Training Tiles.

Tesla demonstrated a working Tile operating at 2 GHz in its laboratory. A Training Tile offers up to 9 PFLOPS of compute performance.

Finally, Tesla revealed its plans for a full supercomputer built from D1 chips. The ExaPOD is based on 120 Training Tiles with 3,000 D1 chips, offering 1,062,000 nodes. This configuration delivers up to 1.1 ExaFLOPS of BF16/CFP8 compute performance. Upon its completion, the ExaPOD is expected to become the fastest AI training supercomputer, with 4x higher performance, 1.3x better performance per watt, and a 5x smaller footprint than Tesla's current NVIDIA-based supercomputer.
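The ExaPOD figures follow directly from the per-chip numbers; a short calculation confirms they are internally consistent:

```python
# Scaling from one D1 chip up to the full ExaPOD, using the published specs.
chips_per_tile = 25
tiles = 120
nodes_per_chip = 354
tflops_per_chip = 362  # BF16/CFP8

chips = tiles * chips_per_tile                         # 3,000 chips
nodes = chips * nodes_per_chip                         # 1,062,000 nodes
tile_pflops = chips_per_tile * tflops_per_chip / 1000  # ≈ 9.05 PFLOPS per Tile
exapod_eflops = chips * tflops_per_chip / 1_000_000    # ≈ 1.086 ExaFLOPS

print(f"{chips} chips, {nodes:,} nodes")
print(f"Tile: {tile_pflops:.2f} PFLOPS, ExaPOD: {exapod_eflops:.3f} EFLOPS")
```

The result rounds to the quoted ~9 PFLOPS per Tile and ~1.1 ExaFLOPS for the full ExaPOD.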

Source: Tesla via ComputerBase
