NVIDIA announces TESLA V100 with 5120 CUDA cores


NVIDIA has just announced its first Volta computing card, the Tesla V100.

TESLA V100 has 5120 CUDA Cores

The world's first 12 nm FFN GPU has just been announced by Jensen Huang at GTC 2017. The new Tesla features second-generation NVLink with a bandwidth of 300 GB/s, and Tesla V100 carries 16 GB of HBM2 operating at 900 GB/s.
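
Both headline figures are aggregates rather than single-link numbers: the 300 GB/s of NVLink bandwidth comes from six NVLink 2.0 links at roughly 50 GB/s of bidirectional bandwidth each, and the 900 GB/s of memory bandwidth follows from the 4096-bit HBM2 interface running at about 1.75 Gbps per pin (4096 bits × 1.75 Gbps ÷ 8 ≈ 900 GB/s).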

The card is powered by the new Volta GPU, which features 5120 CUDA cores and 21 billion transistors. It is the biggest GPU ever made, with a die size of 815 mm².

Volta GV100 features a new type of computing core called the Tensor Core, dedicated to the matrix arithmetic at the heart of deep learning workloads.
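
NVIDIA exposes these units to programmers through the WMMA (warp matrix multiply-accumulate) API announced alongside Volta in CUDA 9. The kernel below is a minimal sketch, not NVIDIA's own code: it assumes an sm_70 target, a single 16×16×16 tile, and illustrative names, and shows the mixed-precision D = A×B + C operation a Tensor Core performs.

#include <mma.h>
#include <cuda_fp16.h>
using namespace nvcuda;

// One warp computes a 16x16 output tile as D = A*B + C, with A and B in
// FP16 and the accumulator in FP32 -- the mixed-precision operation the
// Tensor Cores are built for. Compile with -arch=sm_70.
__global__ void tensor_core_tile(const half *a, const half *b,
                                 const float *c, float *d) {
    wmma::fragment<wmma::matrix_a, 16, 16, 16, half, wmma::row_major> a_frag;
    wmma::fragment<wmma::matrix_b, 16, 16, 16, half, wmma::col_major> b_frag;
    wmma::fragment<wmma::accumulator, 16, 16, 16, float> acc_frag;

    wmma::load_matrix_sync(a_frag, a, 16);                        // load A tile
    wmma::load_matrix_sync(b_frag, b, 16);                        // load B tile
    wmma::load_matrix_sync(acc_frag, c, 16, wmma::mem_row_major); // load C tile

    wmma::mma_sync(acc_frag, a_frag, b_frag, acc_frag);           // Tensor Core multiply-accumulate

    wmma::store_matrix_sync(d, acc_frag, 16, wmma::mem_row_major); // write D
}

A full GEMM would tile larger matrices across many such warps; per the table below, each Volta SM packs eight of these units, for 640 Tensor Cores on the whole GPU.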

Jensen Huang said that developing Tesla V100 cost 3 billion dollars.

NVIDIA Tesla V100
VideoCardz.com               NVIDIA Tesla V100       NVIDIA Tesla P100
Die Size                     815 mm²                 610 mm²
FP32 Computing Performance   15.0 TFLOPS             10.6 TFLOPS
CUDA Cores                   5120                    3584
Core Clock                   1455 MHz                1480 MHz
Memory Type                  4096-bit 16 GB HBM2     4096-bit 16 GB HBM2
Interface                    NVLink 2.0              NVLink 1.0 / PCI-e 3.0
Memory Bandwidth             900 GB/s                720 GB/s
TDP                          300 W                   300 W

Key features of Tesla V100:

  • New Streaming Multiprocessor (SM) Architecture Optimized for Deep Learning
  • Second-Generation NVLink™
  • HBM2 Memory: Faster, Higher Efficiency
  • Volta Multi-Process Service
  • Enhanced Unified Memory and Address Translation Services
  • Cooperative Groups and New Cooperative Launch APIs (see the sketch after this list)
  • Maximum Performance and Maximum Efficiency Modes
  • Volta Optimized Software
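
Of these, Cooperative Groups is the most visible change on the software side: CUDA 9 adds an API for naming and synchronizing groups of threads at sub-block and multi-block granularity. The kernel below is a minimal sketch with illustrative names, assuming CUDA 9's cooperative_groups header, of a per-tile reduction written against that API.

#include <cooperative_groups.h>
namespace cg = cooperative_groups;

// A block-wide sum written with Cooperative Groups: each 32-thread tile
// reduces its values with register shuffles, then one thread per tile
// adds the partial result to the global total.
__device__ float tile_sum(cg::thread_block_tile<32> tile, float val) {
    for (int offset = tile.size() / 2; offset > 0; offset /= 2)
        val += tile.shfl_down(val, offset);   // exchange partial sums within the tile
    return val;                               // lane 0 ends up with the tile's total
}

__global__ void block_sum(const float *in, float *out, int n) {
    cg::thread_block block = cg::this_thread_block();
    cg::thread_block_tile<32> tile = cg::tiled_partition<32>(block);

    int i = blockIdx.x * blockDim.x + threadIdx.x;
    float v = (i < n) ? in[i] : 0.0f;
    v = tile_sum(tile, v);
    if (tile.thread_rank() == 0)
        atomicAdd(out, v);                    // one atomic per 32-thread tile
}

Naming the group explicitly matters on Volta, whose independent thread scheduling no longer guarantees the implicit warp-lockstep behavior older code relied on.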
NVIDIA Tesla
Tesla Product              Tesla K40           Tesla M40           Tesla P100          Tesla V100
GPU                        GK110 (Kepler)      GM200 (Maxwell)     GP100 (Pascal)      GV100 (Volta)
SMs                        15                  24                  56                  80
TPCs                       15                  24                  28                  40
FP32 Cores / SM            192                 128                 64                  64
FP32 Cores / GPU           2880                3072                3584                5120
FP64 Cores / SM            64                  4                   32                  32
FP64 Cores / GPU           960                 96                  1792                2560
Tensor Cores / SM          NA                  NA                  NA                  8
Tensor Cores / GPU         NA                  NA                  NA                  640
GPU Boost Clock            810/875 MHz         1114 MHz            1480 MHz            1455 MHz
Peak FP32 TFLOP/s*         5.04                6.8                 10.6                15
Peak FP64 TFLOP/s*         1.68                2.1                 5.3                 7.5
Peak Tensor Core TFLOP/s*  NA                  NA                  NA                  120
Texture Units              240                 192                 224                 320
Memory Interface           384-bit GDDR5       384-bit GDDR5       4096-bit HBM2       4096-bit HBM2
Memory Size                Up to 12 GB         Up to 24 GB         16 GB               16 GB
L2 Cache Size              1536 KB             3072 KB             4096 KB             6144 KB
Shared Memory Size / SM    16 KB/32 KB/48 KB   96 KB               64 KB               Configurable up to 96 KB
Register File Size / SM    256 KB              256 KB              256 KB              256 KB
Register File Size / GPU   3840 KB             6144 KB             14336 KB            20480 KB
TDP                        235 W               250 W               300 W               300 W
Transistors                7.1 billion         8 billion           15.3 billion        21.1 billion
GPU Die Size               551 mm²             601 mm²             610 mm²             815 mm²
Manufacturing Process      28 nm               28 nm               16 nm FinFET+       12 nm FFN

NVIDIA Volta GV100

With Tesla V100, NVIDIA introduces the GV100 graphics processor. It is the biggest GPU ever made, with 5376 FP32 CUDA cores on the full die (84 SMs × 64 cores per SM), of which 5120 are enabled on Tesla V100. GV100 features a new type of Streaming Multiprocessor, the Volta SM, equipped with mixed-precision Tensor Cores and improvements to power efficiency, clock speeds and the L1 data cache.

In Tesla V100 the GPU is clocked at 1455 MHz (boost), with peak computing power of 15 TFLOPS in 32-bit (FP32) operations.
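
That figure follows directly from the shader configuration: each CUDA core can retire one fused multiply-add (two floating-point operations) per clock, so 5120 cores × 2 FLOPs × 1.455 GHz ≈ 14.9 TFLOPS, which rounds to the advertised 15 TFLOPS.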


by WhyCry
