Tesla D1 chip features 50 billion transistors, scales up to 1.1 ExaFLOPS with ExaPOD

Published: 20th Aug 2021, 16:30 GMT

Tesla boosts its AI efforts with D1 Chip

Project Dojo puts Tesla in the race for high-performance artificial intelligence training solutions.

At Tesla AI Day, the company announced its new D1 chip, a custom processor built on 7 nm process technology with 50 billion transistors. The chip has a die area of 645 mm², smaller than both NVIDIA A100 (826 mm²) and AMD Arcturus (750 mm²). Specifications-wise, the chip is equipped with 354 training nodes, each based on a 64-bit superscalar CPU with four cores. These nodes are designed specifically for 8×8 matrix multiplications and support a wide range of data formats used for AI training, including FP32, BFP16, CFP8, INT32, INT16, and INT8.

According to Tesla, the D1 chip offers 22.6 TFLOPS of single-precision (FP32) compute performance and up to 362 TFLOPS in BF16/CFP8, all within a TDP of 400 W for a single chip. For AI training, scalability is a key aspect, which is why Tesla designed high-bandwidth, low-latency switch-fabric interconnects offering up to 10 TB/s. The I/O ring around the chip has 576 lanes, each offering 112 Gbit/s of bandwidth.
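
As a rough sanity check, some of the headline figures can be reproduced from the published numbers. The short Python sketch below is our own back-of-the-envelope arithmetic, not an official Tesla breakdown; the per-node and aggregate-bandwidth figures are estimates derived from the chip-level specs quoted above.

```python
# Back-of-the-envelope arithmetic for the D1 chip figures quoted above.
# These are our own estimates derived from the published numbers,
# not official Tesla breakdowns.

TRAINING_NODES = 354    # training nodes per D1 chip
BF16_TFLOPS    = 362    # chip-level BF16/CFP8 throughput (TFLOPS)
FP32_TFLOPS    = 22.6   # chip-level FP32 throughput (TFLOPS)
IO_LANES       = 576    # SerDes lanes in the I/O ring
LANE_GBITS     = 112    # per-lane bandwidth (Gbit/s)

# Per-node BF16 throughput implied by the chip total.
per_node_tflops = BF16_TFLOPS / TRAINING_NODES
print(f"~{per_node_tflops:.2f} TFLOPS BF16 per training node")   # ~1.02 TFLOPS

# BF16-to-FP32 throughput ratio (roughly 16x, typical for training hardware).
print(f"BF16/FP32 ratio: ~{BF16_TFLOPS / FP32_TFLOPS:.0f}x")

# Aggregate bandwidth of the I/O ring, summed over all lanes.
total_tb_s = IO_LANES * LANE_GBITS / 8 / 1000   # Gbit/s -> TB/s
print(f"~{total_tb_s:.1f} TB/s aggregate I/O bandwidth")          # ~8.1 TB/s
```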

Video: Tesla AI Day (Tesla)

Tesla D1 chips can be linked through the Dojo Interface Processor. The chips are mounted on Training Tiles, each featuring 25 D1 chips. The Tile is manufactured using a fan-out wafer process and comes as a complete cuboid with integrated cooling and power delivery. These tiles can then be connected to one another, creating a large network of Training Tiles.

Tesla demonstrated a working Tile in its laboratory operating at 2 GHz. A Training Tile offers up to 9 PFLOPS of compute performance.
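
The tile-level figure follows directly from the per-chip throughput; a minimal check, using the chip-level BF16/CFP8 number quoted earlier:

```python
# Training Tile throughput implied by 25 D1 chips (our own arithmetic,
# based on the 362 TFLOPS BF16/CFP8 per-chip figure quoted above).
D1_BF16_TFLOPS = 362
CHIPS_PER_TILE = 25

tile_pflops = D1_BF16_TFLOPS * CHIPS_PER_TILE / 1000   # TFLOPS -> PFLOPS
print(f"~{tile_pflops:.2f} PFLOPS BF16/CFP8 per Training Tile")  # ~9.05 PFLOPS
```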

Finally, Tesla revealed its plans for a full supercomputer based on D1 chips. The ExaPOD comprises 120 Training Tiles with 3,000 D1 chips, offering 1,062,000 training nodes. This configuration delivers up to 1.1 ExaFLOPS of BF16/CFP8 compute performance. Upon its completion, the ExaPOD is expected to become the fastest AI training supercomputer, with 4x higher performance, 1.3x better performance per watt, and a 5x smaller footprint than Tesla's current NVIDIA-based supercomputers.
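
The ExaPOD totals also follow from the counts above; a quick check of the arithmetic, again using only the figures quoted in this article:

```python
# ExaPOD figures implied by the tile and chip counts above (our own arithmetic).
TILES            = 120
CHIPS_PER_TILE   = 25
NODES_PER_CHIP   = 354
D1_BF16_TFLOPS   = 362

chips    = TILES * CHIPS_PER_TILE                # 3,000 D1 chips
nodes    = chips * NODES_PER_CHIP                # 1,062,000 training nodes
exaflops = chips * D1_BF16_TFLOPS / 1_000_000    # TFLOPS -> EFLOPS

print(f"{chips} chips, {nodes:,} nodes, ~{exaflops:.2f} EFLOPS BF16/CFP8")
# -> 3000 chips, 1,062,000 nodes, ~1.09 EFLOPS (rounded to "up to 1.1 ExaFLOPS")
```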

Source: Tesla via ComputerBase



