NVIDIA’s most anticipated chip, GP100 (aka ‘Big Pascal’), could deliver a massive performance upgrade over the previous generation.
NVIDIA Pascal GP100: 12 TFLOPS SP, 4 TFLOPS DP?
A presentation dated June 2015, created by Manuel Ujaldon, a Spanish university professor and ‘CUDA Fellow’, reveals that NVIDIA is planning a new chip based on the Pascal architecture that could offer 4 TFLOPS of double-precision (DP) computing performance and three times that figure in single precision (SP): 12 TFLOPS.
This means Pascal could feature a 1/3 DP-to-SP ratio, similar to the Kepler architecture, but unlike the ‘Single Precision Oriented’ Maxwell.
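As a quick check of that ratio, here is a minimal Python sketch. The 4 and 12 TFLOPS figures come from the presentation, and the FP64 rates for older chips are the ones listed in the comparison table in this article. The variable names are ours, purely for illustration.

```python
from fractions import Fraction

# Rumored GP100 throughput from the presentation (not confirmed by NVIDIA).
DP_TFLOPS = 4
SP_TFLOPS = 12

# FP64 rates of earlier big chips, as listed in this article's table.
FP64_RATES = {
    "GM200 (Big Maxwell)": Fraction(1, 32),
    "GK110 (Big Kepler)": Fraction(1, 3),
    "GF110 (Big Fermi)": Fraction(1, 2),
    "GT200 (Big Tesla)": Fraction(1, 8),
}

ratio = Fraction(DP_TFLOPS, SP_TFLOPS)
same_ratio = [name for name, rate in FP64_RATES.items() if rate == ratio]

print(ratio)       # 1/3
print(same_ratio)  # ['GK110 (Big Kepler)']
```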
NVIDIA Pascal GP100 Computing Performance Ratio

GPU | Fabrication Process | Die Size | Native FP64 Rate
---|---|---|---
GP100 (Big Pascal) | FinFET | ? | 1/3
GM200 (Big Maxwell) | 28 nm | 601 mm² | 1/32
GK110 (Big Kepler) | 28 nm | 551 mm² | 1/3
GF110 (Big Fermi) | 40 nm | 520 mm² | 1/2
GT200 (Big Tesla) | 55 nm | 576 mm² | 1/8
More importantly, the computing performance gives us a hint as to how many CUDA cores Big Pascal could have. Assuming GP100 has a 1000 MHz core clock and each CUDA core performs two floating-point operations per clock, and taking 12 TFLOPS as 12,288 GFLOPS, the CUDA core count would be 6144, twice as many as GM200. Of course, the numbers shared by the professor are probably not very accurate, but we do believe they are based on information shared by NVIDIA itself, so it won’t hurt to analyze them further.
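The arithmetic behind that estimate can be sketched in a few lines of Python. It rests on two assumptions of ours rather than anything stated in the presentation: each CUDA core retires one fused multiply-add (two floating-point operations) per clock, and 12 TFLOPS is taken as 12 × 1024 = 12,288 GFLOPS. The function name `cuda_cores_from_flops` is hypothetical.

```python
def cuda_cores_from_flops(sp_gflops: float, clock_mhz: float,
                          flops_per_core_per_clock: int = 2) -> float:
    """Estimate the core count implied by a peak single-precision figure.

    Peak GFLOPS = cores * FLOPs per core per clock * clock in GHz,
    so cores = GFLOPS / (FLOPs per core per clock * GHz).
    The default of 2 assumes one fused multiply-add per core per clock.
    """
    clock_ghz = clock_mhz / 1000.0
    return sp_gflops / (flops_per_core_per_clock * clock_ghz)


SP_GFLOPS = 12_288   # rumored 12 TFLOPS, taken as 12 * 1024 GFLOPS (assumption)
GM200_CORES = 3072   # CUDA cores in a full GM200

cores = cuda_cores_from_flops(SP_GFLOPS, clock_mhz=1000)
print(cores)                # 6144.0
print(cores / GM200_CORES)  # 2.0
```

A lower clock pushes the implied core count up and a higher clock pulls it down, which is why the clock assumption matters so much here.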
Here’s an overview of how many cores GP100 could have, assuming:
- 12 TFLOPS of single-precision computing performance,
- various GPU clock speeds,
- Pascal Streaming Multiprocessors featuring 128 CUDA cores each (similar to Maxwell).
NVIDIA Pascal GP100 CUDA Cores prediction (SP Performance = 12,288 GFLOPS)

GPU Base Clock | CUDA Core Count | SM Count
---|---|---
800 MHz | 7680 | 60
850 MHz | 7168 | 56
900 MHz | 6784 | 53
950 MHz | 6400 | 50
1000 MHz | 6144 | 48
1050 MHz | 5760 | 45
1100 MHz | 5504 | 43
1150 MHz | 5248 | 41
1200 MHz | 5120 | 40
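The table can be reproduced with a short Python sketch. It assumes two floating-point operations per core per clock and rounds the result down to a whole number of 128-core multiprocessors, which is the rounding that matches every row above. The constant and function names are ours, not NVIDIA’s.

```python
SP_MFLOPS = 12_288_000        # 12,288 GFLOPS expressed in MFLOPS
CORES_PER_SM = 128            # per-multiprocessor core count assumed in the article
FLOPS_PER_CORE_PER_CLOCK = 2  # assumption: one fused multiply-add per clock


def predict(clock_mhz: int) -> tuple[int, int]:
    """Return (CUDA core count, multiprocessor count) for a given base clock.

    The multiprocessor count is rounded down, so the core count is always
    a multiple of CORES_PER_SM and never exceeds the 12,288 GFLOPS budget.
    """
    sm_count = SP_MFLOPS // (FLOPS_PER_CORE_PER_CLOCK * clock_mhz * CORES_PER_SM)
    return sm_count * CORES_PER_SM, sm_count


for clock_mhz in range(800, 1201, 50):
    cores, sms = predict(clock_mhz)
    print(f"{clock_mhz} MHz | {cores} | {sms}")
```

Only the 800, 1000 and 1200 MHz rows divide evenly. The other rows land slightly below 12,288 GFLOPS because a partial multiprocessor cannot exist.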
In our predictions, we have assumed that 5120 CUDA cores is the most likely configuration for GP100. Of course, such a core count would be reserved for the professional line at launch, with a new TITAN successor coming at a later date.
GP104 could therefore have 20 SMs with 2560 CUDA cores on board, which should be enough to compete against the GTX 980 Ti, as we are also getting HBM and FinFET with next-generation GPUs.