NVIDIA Pascal GP100 targeting 12 TFLOPs in Single Precision computing performance

Published: 17th Feb 2016, 13:26 GMT

NVIDIA’s most anticipated chip, GP100 aka ‘Big Pascal’, could feature a massive performance upgrade compared to the previous generation.

NVIDIA Pascal GP100: 12 TFLOPS SP, 4 TFLOPS DP?

[Figure: NVIDIA Pascal Computing Performance Chart]

A presentation from June 2015 created by Manuel Ujaldon, a Spanish university professor and ‘CUDA Fellow’, reveals that NVIDIA is planning a new chip based on the Pascal architecture that could feature 4 TFLOPs of double precision computing performance and three times higher single precision performance (12 TFLOPs).

This means Pascal could feature a 1/3 DP to SP ratio, similar to the Kepler architecture, but different from the ‘Single Precision Oriented’ Maxwell.

NVIDIA Pascal GP100 Computing Performance Ratio
GPU                   Fabrication Process   Die Size   Native FP64 Rate
GP100 (Big Pascal)    FinFET                ?          1/3
GM200 (Big Maxwell)   28 nm                 601 mm²    1/32
GK110 (Big Kepler)    28 nm                 551 mm²    1/3
GF110 (Big Fermi)     40 nm                 520 mm²    1/2
GT200 (Big Tesla)     55 nm                 576 mm²    1/8

More importantly, the computing performance gives us a hint of how many CUDA cores Big Pascal could have. Assuming GP100 has a 1000 MHz core clock, the CUDA core count would be 6144, twice as many as GM200. Of course, the numbers shared by the professor are probably not very accurate, but we do believe they are based on information shared by NVIDIA itself, so it won’t hurt to analyze them further.
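The arithmetic behind that estimate can be sketched as follows. This is only an illustration, assuming the usual convention that each CUDA core completes one fused multiply-add (counted as 2 floating-point operations) per clock. The function names are ours and do not come from the presentation.

```python
# Assumption: one FMA per CUDA core per clock, counted as 2 FLOPs.
FLOPS_PER_CORE_PER_CLOCK = 2


def peak_sp_tflops(cuda_cores, clock_mhz):
    """Theoretical single precision throughput in TFLOPS."""
    mflops = cuda_cores * FLOPS_PER_CORE_PER_CLOCK * clock_mhz
    return mflops / 1_000_000


def peak_dp_tflops(sp_tflops, fp64_ratio):
    """Double precision throughput derived from the DP to SP ratio."""
    return sp_tflops * fp64_ratio


# 6144 cores at 1000 MHz, with the 1/3 ratio discussed above.
sp = peak_sp_tflops(6144, 1000)
dp = peak_dp_tflops(sp, 1 / 3)
print(f"SP: {sp:.3f} TFLOPS, DP: {dp:.3f} TFLOPS")
```

With 6144 cores at 1000 MHz this lands slightly above the rounded 12 TFLOPs and 4 TFLOPs figures, which is consistent with the presentation quoting round numbers.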

Here’s an overview of how many cores GP100 could have, assuming:

  • 12 TFLOPs computing performance,
  • various GPU clock speeds,
  • Pascal Streaming Multiprocessors featuring 128 CUDA cores each (similar to Maxwell).
NVIDIA Pascal GP100 CUDA Cores prediction (SP Performance = 12,288,000 MFLOPS, i.e. 12.288 TFLOPS)
GPU Base Clock   CUDA Core Count   SMX Count
800 MHz          7680              60
850 MHz          7168              56
900 MHz          6784              53
950 MHz          6400              50
1000 MHz         6144              48
1050 MHz         5760              45
1100 MHz         5504              43
1150 MHz         5248              41
1200 MHz         5120              40
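The predictions in this table can be reproduced with a short script. This is a sketch under stated assumptions: a target of 12.288 TFLOPS (6144 cores at 1000 MHz), 2 floating-point operations per core per clock, 128 cores per multiprocessor, and rounding down to whole multiprocessors. The rounding rule is our inference from the table values, not something stated in the presentation, and the names used are hypothetical.

```python
TARGET_SP_MFLOPS = 12_288_000   # assumed target: 12.288 TFLOPS
CORES_PER_SM = 128              # assumed, as in Maxwell
FLOPS_PER_CORE_PER_CLOCK = 2    # assumed: one FMA per core per clock


def predict_config(clock_mhz):
    """Return (cuda_cores, sm_count) that fit under the target at this clock."""
    # Largest core count that does not exceed the target throughput.
    max_cores = TARGET_SP_MFLOPS // (FLOPS_PER_CORE_PER_CLOCK * clock_mhz)
    # Cores only come in whole multiprocessors, so round down.
    sm_count = max_cores // CORES_PER_SM
    return sm_count * CORES_PER_SM, sm_count


for clock_mhz in range(800, 1201, 50):
    cores, sms = predict_config(clock_mhz)
    print(f"{clock_mhz} MHz  {cores} cores  {sms} SMs")
```

Because of the rounding, most rows fall slightly short of the target throughput. Only the 800, 1000 and 1200 MHz rows hit it exactly.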

In our predictions we have assumed that a 5120 CUDA core count is the most suitable number for GP100. Of course, such a core count would be reserved for the professional line at launch, with a new TITAN successor coming at a later date.

The GP104 could therefore have 20 SMX with 2560 CUDA cores on board (half of the 40 SMX and 5120 cores assumed for GP100), which should be enough to compete against the GTX 980 Ti, as we are also getting HBM and FinFET with next-generation GPUs.

Source: 3DCenter, PDF



