The new features of NVIDIA Turing architecture

Published: 11th Sep 2018, 21:00 GMT

This post is a very short summary of the NVIDIA Turing Architecture whitepaper (available on September 14th).

Key Features of Turing

INT32 Cores (Concurrent execution of floating point and integer instructions)

The Turing architecture adds a new execution unit (INT32). This unit enables Turing GPUs to execute floating-point and integer instructions in parallel. NVIDIA claims that this should theoretically provide 36% additional throughput for floating-point operations.

The parallel execution is made possible by a new unified architecture for shared memory, L1 and texture caching. NVIDIA claims that the INT32/FP32 core design and other changes to the new streaming multiprocessor provide a “50% improvement in delivered performance per CUDA core”.
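
One way to read the 36% figure is as an instruction mix: if a workload issues roughly 36 integer instructions for every 100 floating-point instructions (an assumption made here for illustration, not a number stated in this post), then moving the integer work to its own pipe frees the floating-point pipe by the same proportion. Below is a minimal Python sketch of that issue-slot arithmetic. The function names are hypothetical and the model is idealized: one instruction per slot, no stalls, no dependencies.

```python
def shared_pipe_cycles(fp_ops, int_ops):
    """Issue slots needed when FP and INT instructions share one pipe."""
    return fp_ops + int_ops


def dual_pipe_cycles(fp_ops, int_ops):
    """Issue slots needed when FP and INT instructions run on separate pipes."""
    return max(fp_ops, int_ops)


def fp_throughput_gain(fp_ops, int_ops):
    """Relative increase in FP instructions retired per slot (0.36 means +36%)."""
    before = fp_ops / shared_pipe_cycles(fp_ops, int_ops)
    after = fp_ops / dual_pipe_cycles(fp_ops, int_ops)
    return after / before - 1.0


if __name__ == "__main__":
    # Assumed mix: 36 integer instructions per 100 floating-point instructions.
    gain = fp_throughput_gain(fp_ops=100, int_ops=36)
    print(f"FP throughput gain: {gain:.0%}")
```

In this model the gain always equals the ratio of integer to floating-point instructions, as long as the integer work does not exceed the floating-point work. Real workloads will land below the theoretical figure.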

New Shading Advancements

  • Mesh Shading: new shader model for vertex, tessellation and geometry shading (more objects per scene)
  • Variable Rate Shading (VRS): developer control over shading rates (to limit shading where it does not provide a visual benefit)
  • Texture-Space Shading: storing shading results in memory (no need to repeat shading work that has already been done)
  • Multi-View Rendering (MVR): extends Pascal’s Single Pass Stereo to multiple views in a single pass
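
To give a feel for what Variable Rate Shading saves, the sketch below counts pixel-shader invocations when one invocation covers a square block of pixels. It is an illustration only: the region layout, the square-only rates and the function names are assumptions, not NVIDIA's API or the actual rate set exposed by the hardware.

```python
def shader_invocations(width, height, rate):
    """Invocations needed when each one shades a rate x rate pixel block."""
    blocks_x = -(-width // rate)   # ceiling division
    blocks_y = -(-height // rate)
    return blocks_x * blocks_y


def frame_invocations(regions):
    """Sum invocations over (width, height, rate) regions of a frame."""
    return sum(shader_invocations(w, h, r) for w, h, r in regions)


if __name__ == "__main__":
    # Hypothetical 1920x1080 frame split into three horizontal bands:
    # sky at 4x4, the area of interest at full rate, the ground at 2x2.
    bands = [
        (1920, 270, 4),
        (1920, 540, 1),
        (1920, 270, 2),
    ]
    full_rate = shader_invocations(1920, 1080, 1)
    mixed = frame_invocations(bands)
    print(f"full rate: {full_rate} invocations")
    print(f"mixed rate: {mixed} invocations ({mixed / full_rate:.0%} of full)")
```

The saving comes entirely from the coarsely shaded regions, so it depends on how much of the frame the developer is willing to shade at a reduced rate.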

Turing Memory Compression

The Turing architecture brings new lossless compression techniques. NVIDIA claims that further improvements to its ‘state of the art’ Pascal algorithms provide (in NVIDIA’s own words) a ‘50% increase in effective bandwidth on Turing compared to Pascal’.
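
The whitepaper summary above does not describe the algorithms themselves. As a rough intuition for why lossless compression raises effective bandwidth, here is a toy delta encoder in Python. It is not NVIDIA's method: the encoding scheme, the 8-bit base value and all names are assumptions chosen for illustration.

```python
def delta_encode(values):
    """Keep the first value, then store each value as a difference from its neighbour."""
    if not values:
        return []
    return [values[0]] + [cur - prev for prev, cur in zip(values, values[1:])]


def delta_decode(deltas):
    """Invert delta_encode by accumulating the differences."""
    out = []
    for i, d in enumerate(deltas):
        out.append(d if i == 0 else out[-1] + d)
    return out


def encoded_bits(deltas, base_bits=8):
    """Bits used if the base is stored at full width and every delta
    uses the signed width of the largest delta."""
    if not deltas:
        return 0
    rest = deltas[1:]
    if not rest:
        return base_bits
    width = max(abs(d) for d in rest).bit_length() + 1  # +1 for the sign bit
    return base_bits + width * len(rest)


if __name__ == "__main__":
    # Hypothetical run of eight similar 8-bit pixel values.
    tile = [100, 101, 103, 102, 102, 104, 105, 105]
    deltas = delta_encode(tile)
    assert delta_decode(deltas) == tile  # lossless round trip
    raw_bits = 8 * len(tile)
    print(f"raw: {raw_bits} bits, encoded: {encoded_bits(deltas)} bits")
```

Fewer bits moved per tile means more tiles fit through the same memory interface, which is what "effective bandwidth" refers to. Noisy data with large neighbour differences gains little or nothing from a scheme like this.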

Video and Display Engine

The new video engine supports DisplayPort 1.4a (8K at 60 Hz). Turing graphics cards can drive two 8K displays at 60 Hz (either through DP or USB-C). The new engine features an enhanced NVENC encoder (can encode an H.265 stream at 8K/30 FPS) and a new NVDEC decoder with HEVC YUV444 10/12b HDR, H.264 8K and VP9 10/12b HDR support.

NVLINK (only 2-way)

The TU102 GPU features two x8 2nd Gen NVLINK links, while TU104 is equipped with a single x8 link. The TU106 does not support NVLINK. Unfortunately, NVIDIA decided to end 3-way and 4-way SLI support with Turing.

NVIDIA TU102 vs TU104 vs TU106

The NVIDIA GeForce RTX 2070 is the only graphics card from the new series to utilize the full silicon. It is not, as previously speculated, based on a cut-down TU104. NVIDIA confirmed that its new xx70 model will, in fact, feature the TU106 GPU.

Specs-wise, Turing TU102 essentially doubles the core counts of TU106 (GPCs, TPCs, SMs and every core type), though not its ROPs, memory interface or L2 cache. The TU104 is the only Turing chip to feature four TPCs per GPC (unlike TU102 and TU106, which have six per GPC).

Is TU106 a mid-range chip?

According to NVIDIA’s own naming convention, the TU106 should be a mid-range chip. What is worth noting, however, is that the TU106 GPU is 131 mm² bigger than GP104 (Pascal). The theory is that NVIDIA shifted its naming down a tier, so that what would have been TU100 became TU102 and what would have been TU102 became TU104. As far as die size is concerned, the TU106 could easily have been a high-end chip.
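
The die-size comparison is simple arithmetic, sketched below in Python. The 445 mm² figure for TU106 comes from the spec table in this post and the 131 mm² difference from the paragraph above; the GP104 area is derived from those two numbers rather than quoted, and the function names are hypothetical.

```python
TU106_MM2 = 445              # from the spec table in this post
TU106_MINUS_GP104_MM2 = 131  # difference quoted in the text


def implied_gp104_area(tu106=TU106_MM2, delta=TU106_MINUS_GP104_MM2):
    """GP104 die area implied by the two figures above, in mm^2."""
    return tu106 - delta


def relative_growth(new_area, old_area):
    """Fractional increase in die area (0.25 means 25% larger)."""
    return new_area / old_area - 1.0


if __name__ == "__main__":
    gp104 = implied_gp104_area()
    growth = relative_growth(TU106_MM2, gp104)
    print(f"implied GP104 area: {gp104} mm^2")
    print(f"TU106 is {growth:.0%} larger than GP104")
```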

NVIDIA TURING GPUs

VideoCardz.com          | TU102           | TU104           | TU106
Fabrication Node        | 12nm FFN        | 12nm FFN        | 12nm FFN
Die Size                | 754 mm²         | 545 mm²         | 445 mm²
Transistors             | 18.6 Billion    | 13.6 Billion    | 10.6 Billion
NVIDIA SKU w/ full chip | Quadro RTX 6000 | Quadro RTX 5000 | GeForce RTX 2070
GPCs                    | 6               | 6               | 3
TPCs                    | 36              | 24              | 18
SMs                     | 72 (12 per GPC) | 48 (8 per GPC)  | 36 (12 per GPC)
Tensor Cores            | 576             | 384             | 288
RT Cores                | 72              | 48              | 36
FP32 Cores (CUDAs)      | 4,608           | 3,072           | 2,304
INT32 Cores             | 4,608           | 3,072           | 2,304
ROPs                    | 96              | 64              | 64
TMUs                    | 288             | 192             | 144
Memory Interface        | 384-bit         | 256-bit         | 256-bit
L2 Cache                | 6144 KB         | 4096 KB         | 4096 KB
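
The per-GPC and per-SM ratios implied by these figures can be checked with a few lines of Python. The numbers are copied from the table; the dictionary layout and function names are only illustrative.

```python
# Figures for the full chips, copied from the table in this post.
TURING = {
    "TU102": {"gpcs": 6, "tpcs": 36, "sms": 72, "fp32": 4608, "tensor": 576, "rt": 72},
    "TU104": {"gpcs": 6, "tpcs": 24, "sms": 48, "fp32": 3072, "tensor": 384, "rt": 48},
    "TU106": {"gpcs": 3, "tpcs": 18, "sms": 36, "fp32": 2304, "tensor": 288, "rt": 36},
}


def ratios(chip):
    """Derive per-cluster and per-SM ratios for one chip."""
    c = TURING[chip]
    return {
        "tpcs_per_gpc": c["tpcs"] / c["gpcs"],
        "sms_per_tpc": c["sms"] / c["tpcs"],
        "fp32_per_sm": c["fp32"] / c["sms"],
        "tensor_per_sm": c["tensor"] / c["sms"],
        "rt_per_sm": c["rt"] / c["sms"],
    }


def is_double(big, small, keys=("gpcs", "tpcs", "sms", "fp32", "tensor", "rt")):
    """True if every listed count of `big` is exactly twice that of `small`."""
    return all(TURING[big][k] == 2 * TURING[small][k] for k in keys)


if __name__ == "__main__":
    for name in TURING:
        print(name, ratios(name))
    print("TU102 doubles TU106 core counts:", is_double("TU102", "TU106"))
```

All three chips share the same SM layout (two SMs per TPC, with identical core counts per SM), so the differences between them come down to how many TPCs sit in each GPC and how many GPCs the chip has.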

Turing GPUs block diagrams

These are simplified versions of NVIDIA’s original block diagrams of Turing GPUs (they are basically 99% the same, except mine are a lot sexier).


by WhyCry
