September 5th, 2012
NVIDIA Kepler GK106 GPU Detailed
Who wants to see some GPU pr0n? Before I begin, let me explain that I made this diagram myself, but it’s based on quite solid knowledge about the GPU. So if you want to learn what this new GPU adds to the Kepler family, buckle up and read on.
NVIDIA Kepler GK106
NVIDIA’s GK106 is another iteration of the 28nm Kepler GPU, and its architecture is quite simple. This mid-range silicon holds 960 CUDA cores in 5 Streaming Multiprocessors (SMXs). Those SMXs are divided among 3 Graphics Processing Clusters (GPCs): two GPCs hold two SMXs each, and the third holds only one. Each SMX holds 192 CUDA cores and 16 Texture Mapping Units (TMUs). With some very sophisticated calculations (5 × 192 and 5 × 16) we learn that the GPU has 960 CUDA cores and 80 TMUs. We can also see that the GPU has three Raster Engines. The 192-bit memory interface is controlled by three 64-bit memory controllers. Each memory controller is tied to 128KB of L2 cache (384KB in total) and 8 Raster Operating Units (ROPs), so the GK106 has 24 ROPs. The whole GPU is packed with 2.54 billion transistors (that’s exactly one billion fewer than GK104, a decrease of about 28%).
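For anyone who wants to check the arithmetic, here is a minimal sketch that derives the chip-wide totals from the per-unit figures quoted above. The variable names are my own and purely illustrative; the numbers are the ones from the paragraph.

```python
# Per-unit figures for GK106 as quoted in the text (names are illustrative).
SMX_COUNT = 5
CORES_PER_SMX = 192
TMUS_PER_SMX = 16
MEMORY_CONTROLLERS = 3
BUS_BITS_PER_CONTROLLER = 64
L2_KB_PER_CONTROLLER = 128
ROPS_PER_CONTROLLER = 8

# Chip-wide totals are plain products of unit count and per-unit resources.
totals = {
    "cuda_cores": SMX_COUNT * CORES_PER_SMX,
    "tmus": SMX_COUNT * TMUS_PER_SMX,
    "bus_width_bits": MEMORY_CONTROLLERS * BUS_BITS_PER_CONTROLLER,
    "l2_cache_kb": MEMORY_CONTROLLERS * L2_KB_PER_CONTROLLER,
    "rops": MEMORY_CONTROLLERS * ROPS_PER_CONTROLLER,
}

# Transistor counts quoted in the text for GK106 and GK104.
GK106_TRANSISTORS = 2_540_000_000
GK104_TRANSISTORS = 3_540_000_000

# Relative reduction going from GK104 to GK106, in percent.
reduction_pct = (GK104_TRANSISTORS - GK106_TRANSISTORS) / GK104_TRANSISTORS * 100

for name, value in totals.items():
    print(f"{name}: {value}")
print(f"transistor reduction vs GK104: {reduction_pct:.1f}%")
```

The totals come out at 960 CUDA cores, 80 TMUs, a 192-bit bus, 384KB of L2 and 24 ROPs, and the transistor reduction works out to roughly 28%.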
I have combined all the details below for your reading pleasure. What I don’t know yet is the exact size of the die.
| Specification | GK107* | GK106 | GK104** | GK110*** |
|---|---|---|---|---|
| Transistor Count | 1.3 billion | 2.54 billion | 3.54 billion | 7.1 billion |
| Graphics Processing Clusters (GPCs) | 1 | 3 | 4 | - |
| Streaming Multiprocessors (SMXs) | 2 | 5 | 8 | 15 |
| Texture Mapping Units (TMUs) | 32 | 80 | 128 | 240 |
| Raster Operating Units (ROPs) | 16 | 24 | 32 | - |
| Texel Fill-rate | 13 Gigatexels/s | 78 Gigatexels/s | 128 Gigatexels/s | - |
| Memory Bandwidth | 28 GB/s | 144 GB/s | 192 GB/s | - |
| Base Clock | 900 MHz | 980 MHz | 1006 MHz | - |
| Boost Clock | - | 1033 MHz | 1058 MHz | - |
| Effective Memory Clock | 1782 MHz | 6008 MHz | 6008 MHz | - |
* GK107 paired with GDDR3 memory.
** Full GK104-400 GPU from the GTX 680.
*** Some cards with GK110 will feature 13 or 14 SMXs; this is the full one from Tesla K20.
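The throughput rows in the table can be reproduced from the other rows. This is a rough sketch, assuming the texel fill-rate is TMUs multiplied by the base clock and the memory bandwidth is bus width in bytes multiplied by the effective memory clock. The function names are my own, and the 256-bit bus width for GK104 is an assumption on my part, since only GK106's 192-bit interface is stated in the text.

```python
def texel_fill_rate_gtexels(tmus: int, base_clock_mhz: float) -> float:
    """Texels per second in billions: one texel per TMU per base clock cycle."""
    return tmus * base_clock_mhz / 1000.0


def memory_bandwidth_gbs(bus_width_bits: int, effective_clock_mhz: float) -> float:
    """Bandwidth in GB/s: bytes moved per transfer times transfers per second."""
    return bus_width_bits / 8 * effective_clock_mhz / 1000.0


# GK106: 80 TMUs, 980 MHz base clock, 192-bit bus, 6008 MHz effective memory clock.
gk106_fill = texel_fill_rate_gtexels(80, 980)       # about 78.4
gk106_bw = memory_bandwidth_gbs(192, 6008)          # about 144.2

# GK104: 128 TMUs, 1006 MHz base clock, 6008 MHz effective memory clock.
# The 256-bit bus width is an assumption, not a figure taken from the table.
gk104_fill = texel_fill_rate_gtexels(128, 1006)     # about 128.8
gk104_bw = memory_bandwidth_gbs(256, 6008)          # about 192.3

print(f"GK106: {gk106_fill:.1f} Gigatexels/s, {gk106_bw:.1f} GB/s")
print(f"GK104: {gk104_fill:.1f} Gigatexels/s, {gk104_bw:.1f} GB/s")
```

Rounded down, these agree with the 78 and 128 Gigatexels/s and the 144 and 192 GB/s listed for GK106 and GK104.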
For reference, here are the GK104 and GK107.