NVIDIA Pascal GP100 targeting 12 TFLOPs in Single Precision computing performance

Published: 1 year ago

NVIDIA’s most anticipated chip, GP100 aka ‘Big Pascal’, could feature a massive performance upgrade compared to the previous generation.

NVIDIA Pascal GP100: 12 TFLOPS SP, 4 TFLOPS DP?

NVIDIA Pascal Computing Performance Chart

A presentation dated June 2015, created by Manuel Ujaldon, a Spanish university professor and ‘CUDA Fellow’, reveals that NVIDIA is planning a new chip based on the Pascal architecture that could feature 4 TFLOPs of double precision computing performance and three times higher single precision performance (12 TFLOPs).

This means Pascal could feature a 1/3 DP to SP ratio, similar to the Kepler architecture, but different from the ‘Single Precision Oriented’ Maxwell.

NVIDIA Pascal GP100 Computing Performance Ratio

| GPU | Fabrication Process | Die Size | Native FP64 Rate |
|---|---|---|---|
| GP100 (Big Pascal) | FinFET | ? | 1/3 |
| GM200 (Big Maxwell) | 28 nm | 601 mm² | 1/32 |
| GK110 (Big Kepler) | 28 nm | 551 mm² | 1/3 |
| GF110 (Big Fermi) | 40 nm | 520 mm² | 1/2 |
| GT200 (Big Tesla) | 55 nm | 576 mm² | 1/8 |
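
To make the ratio column concrete, here is a minimal Python sketch that converts a single precision figure into the implied double precision figure. The ratios are copied from the table; the dictionary `FP64_RATE` and the helper `dp_tflops` are hypothetical names used only for illustration, and the GP100 entry is the rumored value, not a confirmed specification.

```python
from fractions import Fraction

# FP64:FP32 throughput ratios as listed in the table (GP100 is a rumor).
FP64_RATE = {
    "GP100": Fraction(1, 3),
    "GM200": Fraction(1, 32),
    "GK110": Fraction(1, 3),
    "GF110": Fraction(1, 2),
    "GT200": Fraction(1, 8),
}

def dp_tflops(sp_tflops, chip):
    """Double precision TFLOPs implied by a single precision figure."""
    return float(sp_tflops * FP64_RATE[chip])

# The rumored 12 TFLOPs SP at a 1/3 rate gives the rumored 4 TFLOPs DP.
print(dp_tflops(12, "GP100"))
# Hypothetical comparison: the same SP figure at Maxwell's 1/32 rate.
print(dp_tflops(12, "GM200"))
```

The second call is only a what-if: it shows how much double precision throughput a Maxwell-style ratio would leave on a chip with the same single precision performance.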

More importantly, the computing performance gives us a hint about how many CUDA cores Big Pascal could have. Assuming GP100 has a 1000 MHz core clock, the CUDA core count would be 6144, twice as many as GM200. Of course the numbers shared by the professor are probably not very accurate, but we do believe they are based on information shared by NVIDIA itself, so it won’t hurt if we analyze them further.
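
The 6144 figure can be reproduced with a short worked equation. It assumes that each CUDA core is credited with two floating point operations per clock (one fused multiply-add), which is the usual way peak throughput is quoted but is not stated in the presentation, and it takes "12 TFLOPs" to mean 12288 GFLOPS. The symbols are introduced here for illustration only.

```latex
\[
N_{\text{cores}}
  = \frac{P_{\text{SP}}}{2 \cdot f_{\text{clock}}}
  = \frac{12288 \times 10^{9}\ \text{FLOPS}}{2 \times 1000 \times 10^{6}\ \text{Hz}}
  = 6144
\]
```

GM200 has 3072 CUDA cores, so 6144 is exactly double.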

Here’s an overview of how many cores GP100 could have, assuming:

  • 12 TFLOPs computing performance,
  • various GPU clock speeds,
  • Pascal Streaming Multiprocessors featuring 128 CUDA cores each (similar to Maxwell).

NVIDIA Pascal GP100 CUDA Cores prediction (SP Performance = 12,288 GFLOPS)

| GPU Base Clock | CUDA Core Count | SMX Count |
|---|---|---|
| 800 MHz | 7680 | 60 |
| 850 MHz | 7168 | 56 |
| 900 MHz | 6784 | 53 |
| 950 MHz | 6400 | 50 |
| 1000 MHz | 6144 | 48 |
| 1050 MHz | 5760 | 45 |
| 1100 MHz | 5504 | 43 |
| 1150 MHz | 5248 | 41 |
| 1200 MHz | 5120 | 40 |
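
The table can be regenerated with a few lines of Python. This is a sketch under stated assumptions rather than the method used for the article: it assumes two FLOPs per CUDA core per clock (one fused multiply-add), 128 cores per multiprocessor as listed above, and rounding down to a whole number of multiprocessors, which is the rule the published figures appear to follow. The function name `predict` and the constants are hypothetical.

```python
SP_GFLOPS = 12288          # target single precision throughput (12.288 TFLOPs)
FLOPS_PER_CORE_CLOCK = 2   # assumption: one fused multiply-add per core per clock
CORES_PER_SM = 128         # assumption from the article (Maxwell-like layout)

def predict(clock_mhz):
    """Return (sm_count, cuda_cores) for a given base clock in MHz."""
    # GFLOPS * 1000 gives MFLOPS, so dividing by MHz yields operations per clock.
    flops_per_clock_per_sm = FLOPS_PER_CORE_CLOCK * CORES_PER_SM
    # Integer division rounds down to a whole number of multiprocessors.
    sm_count = (SP_GFLOPS * 1000) // (flops_per_clock_per_sm * clock_mhz)
    return sm_count, sm_count * CORES_PER_SM

for clock in range(800, 1201, 50):
    sms, cores = predict(clock)
    print(f"{clock} MHz  {cores} cores  {sms} SMs")
```

Because of the rounding, only the 800, 1000 and 1200 MHz rows hit 12,288 GFLOPS exactly; the other rows land slightly below the target.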

In our predictions we have assumed that a 5120 CUDA core count is the most plausible figure for GP100. Of course, such a core count would be reserved for the professional line at launch, with a new TITAN successor coming at a later date.

The GP104 could therefore have 20 SMX (half of the 40 assumed for GP100) with 2560 CUDA cores on board, which should be enough to compete against the GTX 980 Ti, as we are also getting HBM and FinFET with next-generation GPUs.

Source: 3DCenter, PDF


by WhyCry
