Graphics Core Next

[3] GCN requires considerably more transistors than TeraScale, but offers advantages for general-purpose GPU (GPGPU) computation due to a simpler compiler.

GCN was also used in the graphics portion of Accelerated Processing Units (APUs), including those in the PlayStation 4 and Xbox One.

[7] MIAOW is an open-source RTL implementation of the AMD Southern Islands GPGPU microarchitecture.

In November 2015, AMD announced its Boltzmann Initiative, which aims to enable the porting of CUDA-based applications to a common C++ programming model.

AMD has claimed that each GCN compute unit (CU) has 64 KiB of Local Data Share (LDS).

[16] The CU scheduler is the hardware functional block that chooses which wavefronts the SIMD-VU executes.

AMD and Nvidia chose similar approaches to hide this unavoidable latency: the grouping of multiple threads.

A group of threads is the most basic unit of scheduling of GPUs that implement this approach to hide latency.

In the context of SSE instructions, this most basic level of parallelism is often called the "vector width".
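The latency-hiding idea described above can be illustrated with a toy model. The sketch below is not a description of AMD's hardware: it assumes a wavefront size of 64 threads (the GCN value, treated here as a model parameter), a SIMD that issues at most one instruction per cycle, and a fixed stall of `mem_latency` cycles after every instruction. The names `Wavefront`, `wavefronts_needed` and `simulate` are hypothetical.

```python
from dataclasses import dataclass

WAVEFRONT_SIZE = 64  # threads per wavefront; assumed model parameter


@dataclass
class Wavefront:
    wid: int
    remaining: int     # instructions still to be issued
    ready_at: int = 0  # first cycle at which this wavefront may issue again


def wavefronts_needed(num_threads: int) -> int:
    """Number of wavefronts required to cover num_threads (ceiling division)."""
    return -(-num_threads // WAVEFRONT_SIZE)


def simulate(num_wavefronts: int, instrs_per_wave: int, mem_latency: int) -> int:
    """Return the number of cycles until the last instruction has been issued.

    Each cycle the toy scheduler picks the first wavefront that is not
    stalled; after issuing, that wavefront waits mem_latency cycles.
    """
    waves = [Wavefront(i, instrs_per_wave) for i in range(num_wavefronts)]
    cycle = 0
    while any(w.remaining for w in waves):
        ready = next(
            (w for w in waves if w.remaining and w.ready_at <= cycle), None
        )
        if ready is not None:
            ready.remaining -= 1
            ready.ready_at = cycle + 1 + mem_latency
        cycle += 1  # an idle cycle if no wavefront was ready
    return cycle
```

In this model, `simulate(1, 4, 3)` takes 13 cycles because the single wavefront idles during every stall, while `simulate(4, 4, 3)` issues all 16 instructions in 16 cycles: with enough wavefronts in flight, the scheduler always finds one that is ready.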

VCE 3.0 formed a part of the third generation of GCN, adding high-quality video scaling and the HEVC (H.265) codec.

In a 2011 preview, AnandTech wrote about the unified virtual memory supported by Graphics Core Next.

[22] This very first implementation focuses on a single "Kaveri" APU and works alongside the existing Radeon kernel graphics driver (kgd).

[25] A driver update has enabled the hardware schedulers in third generation GCN parts for production use.

A Shader Engine comprises one geometry processor, up to 11 CUs (the Hawaii chip has 44 CUs across four Shader Engines), rasterizers, ROPs, and L1 cache.

[32] At the AMD Developer Summit (APU13) in November 2013, Michael Mantor presented the Radeon R9 290X.

[33] GCN 3rd generation[34] was introduced in 2014 with the Radeon R9 285 and R9 M295X, which use the "Tonga" GPU.

It features improved tessellation performance, lossless delta color compression to reduce memory bandwidth usage, an updated and more efficient instruction set, a new high-quality scaler for video, HEVC encoding (VCE 3.0) and HEVC decoding (UVD 6.0), and a new multimedia engine (video encoder/decoder).

All Polaris-based chips other than the Polaris 30 are produced on the 14 nm FinFET process, developed by Samsung Electronics and licensed to GlobalFoundries.

[39] The slightly newer refreshed Polaris 30 is built on the 12 nm LP FinFET process node, developed by Samsung and GlobalFoundries.

The fourth-generation GCN architecture is optimized for the 14 nm FinFET process, enabling higher GPU clock speeds than the third GCN generation.

[40] Architectural improvements include new hardware schedulers, a new primitive discard accelerator, a new display controller, and an updated UVD that can decode HEVC at 4K resolutions at 60 frames per second with 10 bits per color channel.

AMD began releasing details of their next generation of GCN Architecture, termed the 'Next-Generation Compute Unit', in January 2017.

The discrete graphics chips also include a High Bandwidth Cache Controller (HBCC); the variants integrated into APUs do not.

[47] Additionally, the new chips were expected to include improvements in the rasterisation and render output units.

GCN command processing: Each Asynchronous Compute Engine (ACE) can parse incoming commands and dispatch work to the Compute Units (CUs). Each ACE can manage up to 8 independent queues. The ACEs can operate in parallel with the graphics command processor and two DMA engines. The graphics command processor handles graphics queues, the ACEs handle compute queues, and the DMA engines handle copy queues. Each queue can dispatch work items without waiting for other tasks to complete, allowing independent command streams to be interleaved on the GPU's shaders.
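The queue arrangement described above can be sketched as a small model. This is only an illustration under stated assumptions: the limit of 8 queues per ACE comes from the text, but the `Engine` class, the `interleave` function, the single queue given to the graphics command processor in the usage note, and the round-robin order are all hypothetical. The text does not describe how the hardware arbitrates between queues.

```python
from collections import deque

MAX_QUEUES_PER_ACE = 8  # from the text: each ACE manages up to 8 queues


class Engine:
    """A front-end engine that owns independent FIFO queues (hypothetical model)."""

    def __init__(self, name, max_queues):
        self.name = name
        self.max_queues = max_queues
        self.queues = []

    def add_queue(self, commands):
        """Attach a new queue, refusing to exceed the engine's queue limit."""
        if len(self.queues) >= self.max_queues:
            raise ValueError(
                f"{self.name} supports at most {self.max_queues} queues"
            )
        self.queues.append(deque(commands))


def interleave(engines):
    """Visit every queue of every engine in turn, taking one command per visit.

    Round-robin is an assumption made for illustration; the point is only
    that no queue has to wait for another queue to drain.
    """
    order = []
    queues = [(e.name, q) for e in engines for q in e.queues]
    while any(q for _, q in queues):
        for name, q in queues:
            if q:
                order.append((name, q.popleft()))
    return order
```

For example, a graphics engine with one queue of two draws, an ACE created with `MAX_QUEUES_PER_ACE` and holding two compute queues, and a DMA engine with one copy queue would produce a dispatch order in which draw, compute and copy commands alternate rather than running one stream to completion.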
Geometry processor
GCN includes special-purpose function blocks to be used by HSA. Support for these function blocks is available through amdkfd since Linux kernel 3.19. [20]
Die shot of the Tahiti GPU used in Radeon HD 7950 GHz Edition graphics cards
AMD PowerTune "Bonaire"
Die shot of the Hawaii GPU used in Radeon R9 290 graphics cards
Die shot of the Fiji GPU used in Radeon R9 Nano graphics cards
Die shot of the Polaris 11 GPU used in Radeon RX 460 graphics cards
Die shot of the Polaris 10 GPU used in Radeon RX 470 graphics cards
Die shot of the Vega 10 GPU used in Radeon RX Vega 64 graphics cards