NVIDIA updates GeForce GTX 970 specifications to 56 ROPs after reports of 3.5GB issue

Published: Jan 26th 2015, 21:33 GMT

Some surprising news came from PCPerspective today. After a long debate and hundreds of reports of a slower memory buffer on the GTX 970, NVIDIA officially admitted that there was a miscommunication between its marketing and engineering teams.

NVIDIA GeForce GTX 970 3.5 GB memory issue

The GM204 diagram below was made by NVIDIA’s Jonah Alben (SVP of GPU engineering) specifically to explain the differences between the GTX 970 and GTX 980 GPUs. What was not known until today, and what NVIDIA had falsely advertised, is that the GTX 970 has only 56 ROPs and a smaller L2 cache than the GTX 980. The updated specs clarify that the 970 has one of its eight L2 modules disabled, so the total L2 cache is not 2048 KB but 1792 KB. On its own that would probably not change much; however, this particular L2 module is directly connected to a 0.5 GB DRAM module.
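The corrected figures follow directly from disabling one of the eight modules. Here is a minimal sketch of that arithmetic, assuming the L2 cache and ROPs are split evenly across the eight modules (the function name and constants are illustrative, not NVIDIA's):

```python
# Figures from the article: a full GM204 has eight L2 modules,
# 2048 KB of L2 cache and 64 ROPs in total.
# Assumption: L2 and ROPs are divided evenly between the modules.
FULL_MODULES = 8
FULL_L2_KB = 2048
FULL_ROPS = 64


def corrected_specs(disabled_modules: int = 1) -> dict:
    """Return the L2 size and ROP count left after disabling L2/ROP modules."""
    active = FULL_MODULES - disabled_modules
    l2_per_module = FULL_L2_KB // FULL_MODULES    # 256 KB per module
    rops_per_module = FULL_ROPS // FULL_MODULES   # 8 ROPs per module
    return {"l2_kb": active * l2_per_module, "rops": active * rops_per_module}


print(corrected_specs())   # GTX 970: {'l2_kb': 1792, 'rops': 56}
print(corrected_specs(0))  # GTX 980: {'l2_kb': 2048, 'rops': 64}
```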

To put this as simply as possible: the GeForce GTX 970 has two memory pools: 3.5 GB running at full speed, and 0.5 GB that is used only when the 3.5 GB pool is exhausted. However, the second pool runs at 1/7th the speed of the main pool.

So technically, until you deplete the memory available in the first pool, you will be using a 3.5 GB buffer with a 224-bit interface.
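The 224-bit and 1/7th figures can be reproduced with a quick calculation. The sketch below assumes the 256-bit bus is made of eight 32-bit controllers, seven serving the 3.5 GB pool and one serving the 0.5 GB pool, and it assumes GDDR5 running at 7 Gbps effective per pin, a figure that does not appear in this article. The results are theoretical peaks, not measured bandwidth.

```python
# Assumption (not stated in the article): GDDR5 at 7 Gbps effective per pin.
DATA_RATE_GBPS = 7.0
# 256-bit bus split across eight controllers gives 32 bits each.
CONTROLLER_WIDTH_BITS = 256 // 8


def peak_bandwidth_gbs(bus_width_bits: int,
                       data_rate_gbps: float = DATA_RATE_GBPS) -> float:
    """Theoretical peak bandwidth in GB/s for a given bus width."""
    return bus_width_bits * data_rate_gbps / 8  # 8 bits per byte


fast_width = 7 * CONTROLLER_WIDTH_BITS   # 224-bit interface, 3.5 GB pool
slow_width = 1 * CONTROLLER_WIDTH_BITS   # 32-bit interface, 0.5 GB pool

fast = peak_bandwidth_gbs(fast_width)    # 196.0 GB/s
slow = peak_bandwidth_gbs(slow_width)    # 28.0 GB/s

print(fast_width, fast)
print(slow_width, slow)
print(slow / fast)  # one seventh, matching the "1/7th speed" claim
```

Whatever the actual memory clock, the ratio between the two pools depends only on the bus widths (32 / 224), which is where the 1/7th figure comes from.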

Ryan Shrout explains:

In a GTX 980, each block of L2 / ROPs directly communicate through a 32-bit portion of the GM204 memory interface and then to a 512MB section of on-board memory. When designing the GTX 970, NVIDIA used a new capability of Maxwell to implement the system in an improved fashion than would not have been possible with Kepler or previous architectures. Maxwell’s configurability allowed NVIDIA to disable a portion of the L2 cache and ROP units while using a “buddy interface” to continue to light up and use all of the memory controller segments. Now, the SMMs use a single L2 interface to communicate with both banks of DRAM (on the far right) which does create a new concern. (…)

And since the vast majority of gaming situations occur well under the 3.5GB memory size this determination makes perfect sense. It is those instances where memory above 3.5GB needs to be accessed where things get more interesting.

Let’s be blunt here: access to the 0.5GB of memory, on its own and in a vacuum, would occur at 1/7th of the speed of the 3.5GB pool of memory. If you look at the Nai benchmarks (EDIT: picture here) floating around, this is what you are seeing.

[Image: GM204_arch_0, NVIDIA's GM204 diagram for the GTX 970]

NVIDIA GeForce GTX 970 Corrected Specifications
| | GeForce GTX 970 | GeForce GTX 970 ‘Corrected’ |
|---|---|---|
| GPU | 28nm GM204-200 | 28nm GM204-200 |
| CUDA Cores | 1664 | 1664 |
| TMUs | 104 | 104 |
| ROPs | 64 | 56 |
| L2 Cache | 2048 KB | 1792 KB |
| Memory Bus | 256-bit | 256-bit |
| Memory Size | 4GB | 4GB (3.5GB + 0.5GB) |
| TDP | 145W | 145W |

Check this video from PCPerspective:

Source: PCPerspective



