NVIDIA Pascal GP100 targeting 12 TFLOPs in Single Precision computing performance

Published: 17th Feb 2016

NVIDIA’s most anticipated chip, GP100 aka ‘Big Pascal’, could deliver a massive performance upgrade over the previous generation.

NVIDIA Pascal GP100: 12 TFLOPS SP, 4 TFLOPS DP?

[Figure: NVIDIA Pascal Computing Performance Chart]

A presentation dated June 2015 by Manuel Ujaldon, a Spanish university professor and ‘CUDA Fellow’, reveals that NVIDIA is planning a new chip based on the Pascal architecture that could offer 4 TFLOPS of double precision computing performance and three times as much in single precision (12 TFLOPS).

This means Pascal could have a 1/3 DP to SP ratio, similar to the Kepler architecture but unlike the ‘Single Precision Oriented’ Maxwell.

NVIDIA Pascal GP100 Computing Performance Ratio
GPU                    Fabrication Process    Die Size    Native FP64 Rate
GP100 (Big Pascal)     FinFET                 ?           1/3
GM200 (Big Maxwell)    28 nm                  601 mm²     1/32
GK110 (Big Kepler)     28 nm                  551 mm²     1/3
GF110 (Big Fermi)      40 nm                  520 mm²     1/2
GT200 (Big Tesla)      55 nm                  576 mm²     1/8
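
To make the ratios in the table more tangible, the Python sketch below applies each native FP64 rate to the rumored 12 TFLOPS single precision figure. Only the GP100 row reflects the presentation. Applying the older chips' rates to the same SP number is purely hypothetical and only shows how much the ratio matters. The names `FP64_RATE` and `dp_tflops` are made up for this example.

```python
from fractions import Fraction

# Native FP64:FP32 rates taken from the table above.
FP64_RATE = {
    "GP100": Fraction(1, 3),   # rumored
    "GM200": Fraction(1, 32),
    "GK110": Fraction(1, 3),
    "GF110": Fraction(1, 2),
    "GT200": Fraction(1, 8),
}


def dp_tflops(sp_tflops, rate):
    """Peak double precision TFLOPS implied by an SP figure and a DP:SP rate."""
    return sp_tflops * rate


if __name__ == "__main__":
    sp = 12  # rumored GP100 single precision TFLOPS
    for chip, rate in FP64_RATE.items():
        print(f"{chip} rate {rate}: {float(dp_tflops(sp, rate)):.3f} TFLOPS DP")
```

With the rumored 1/3 rate this gives the 4 TFLOPS DP quoted in the presentation, while a Maxwell-style 1/32 rate on the same chip would leave well under 1 TFLOPS.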

More importantly, the computing performance gives us a hint at how many CUDA cores Big Pascal could have. Assuming GP100 has a 1000 MHz core clock and each CUDA core performs one fused multiply-add (2 FLOPs) per clock, the CUDA core count would be 6144, twice as many as GM200. Of course the numbers shared by the professor are probably not very accurate, but we do believe they are based on information shared by NVIDIA itself, so it won’t hurt to analyze them further.

Here’s an overview of how many cores GP100 could have, assuming:
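
The 6144 figure follows from the usual peak-throughput rule of thumb: one fused multiply-add, counted as 2 FLOPs, per CUDA core per clock. Below is a minimal Python sketch under that assumption, reading "12 TFLOPs" as 12.288 TFLOPS (12 × 1024 GFLOPS), which is what the 6144 result implies. The function name is hypothetical.

```python
def cuda_cores_for(sp_mflops, clock_mhz, flops_per_core_per_clock=2):
    """CUDA cores needed to hit a peak SP rate at a given core clock.

    Assumes one FMA (2 FLOPs) per core per clock, the usual way peak
    single precision throughput is quoted.
    """
    return sp_mflops / (flops_per_core_per_clock * clock_mhz)


GP100_SP_MFLOPS = 12_288_000  # 12.288 TFLOPS (assumed reading of "12 TFLOPs")
GM200_CORES = 3072            # full GM200

cores = cuda_cores_for(GP100_SP_MFLOPS, 1000)
print(f"{cores:.0f} cores, {cores / GM200_CORES:.1f}x GM200")
```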

  • 12 TFLOPs computing performance,
  • various GPU clock speeds,
  • Pascal Streaming Multiprocessors featuring 128 CUDA cores each (as in Maxwell).
NVIDIA Pascal GP100 CUDA Cores prediction (SP Performance = 12,288,000 MFLOPS, i.e. 12.288 TFLOPS)
GPU Base Clock    CUDA Core Count    SMX Count
800 MHz           7680               60
850 MHz           7168               56
900 MHz           6784               53
950 MHz           6400               50
1000 MHz          6144               48
1050 MHz          5760               45
1100 MHz          5504               43
1150 MHz          5248               41
1200 MHz          5120               40
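
The table can be reproduced with a few lines of Python. This is a reconstruction, not necessarily how the numbers were originally computed: it reads the 12288000 in the table heading as MFLOPS, assumes 2 FLOPs per core per clock, and rounds the SM count down so that the predicted chip never exceeds the SP target. All names are hypothetical.

```python
SP_MFLOPS = 12_288_000        # the 12288000 from the table heading, read as MFLOPS
CORES_PER_SM = 128            # assumed, as in Maxwell
FLOPS_PER_CORE_PER_CLOCK = 2  # one FMA per clock (assumption)


def predict(clock_mhz):
    """Return (cuda_cores, sm_count) for a base clock given in MHz.

    Integer division rounds the SM count down to a whole number of SMs.
    """
    exact_cores = SP_MFLOPS // (FLOPS_PER_CORE_PER_CLOCK * clock_mhz)
    sm_count = exact_cores // CORES_PER_SM
    return sm_count * CORES_PER_SM, sm_count


if __name__ == "__main__":
    for mhz in range(800, 1201, 50):
        cores, sms = predict(mhz)
        print(f"{mhz} MHz  {cores} cores  {sms} SMs")
```

Because of the rounding, only the 800, 1000 and 1200 MHz rows hit 12.288 TFLOPS exactly. The other clocks land slightly below it.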

In our predictions we have assumed that 5120 CUDA cores is the most suitable number for GP100. Of course, such a core count would be reserved for the professional line at launch, with a new TITAN successor coming at a later date.

The GP104 could therefore have 20 SMX with 2560 CUDA cores on board, which should be enough to compete against the GTX 980 Ti, as we are also getting HBM and FinFET with next-generation GPUs.

Source: 3DCenter, PDF

by WhyCry
