Memory access pattern

Memory access patterns differ in their level of locality of reference and drastically affect cache performance,[1] and also have implications for the approach to parallelism[2][3] and the distribution of workload in shared memory systems.[4]

Further, cache coherency issues can affect multiprocessor performance,[5] which means that certain memory access patterns place a ceiling on parallelism (which manycore approaches seek to break).

Strided or simple 2D/3D access patterns (e.g., stepping through multi-dimensional arrays) are easy to predict, and are found in implementations of linear algebra algorithms and image processing.
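As a concrete illustration (a minimal sketch; the function name and row-major layout are assumptions, not from the source), stepping down one column of a row-major 2D array produces a fixed-stride pattern:

    #include <cstddef>

    // Sketch: summing one column of a row-major 2D array. Successive
    // accesses are separated by a fixed stride of `cols` elements, a
    // pattern that hardware prefetchers can typically predict.
    double sum_column(const double* a, std::size_t rows, std::size_t cols,
                      std::size_t col) {
        double sum = 0.0;
        for (std::size_t r = 0; r < rows; ++r)
            sum += a[r * cols + col];  // stride of `cols` per iteration
        return sum;
    }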

In a gather memory access pattern, reads are randomly addressed or indexed, whilst the writes are sequential (or linear).

Compared to scatter, the disadvantage is that caching (and hiding latencies) is now essential for efficient reads of small elements; however, a gather is easier to parallelise since the writes are guaranteed not to overlap.
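A minimal sketch of a gather (the names and container choices are illustrative, not from the source): reads are indexed while writes land sequentially, and since each output element is written exactly once, the iterations could safely run in parallel.

    #include <cstddef>
    #include <vector>

    // Sketch of a gather: indexed (potentially random) reads, sequential
    // writes. Each out[i] is written exactly once, so the loop has no
    // write conflicts and could be parallelised.
    std::vector<float> gather(const std::vector<float>& src,
                              const std::vector<std::size_t>& idx) {
        std::vector<float> out(idx.size());
        for (std::size_t i = 0; i < idx.size(); ++i)
            out[i] = src[idx[i]];  // random read, sequential write
        return out;
    }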

This is essentially the full operation of a GPU pipeline when performing 3D rendering: gathering indexed vertices and textures, and scattering shaded pixels into screen space.

Given a truly random memory access pattern, it may be possible to break it down into phases (including scatter or gather stages, or other intermediate sorting), which may improve locality overall; this is often a prerequisite for parallelizing.
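One illustrative decomposition (a sketch; the function and variable names are hypothetical) is to add an intermediate sorting stage before a gather: sorting the indices first turns the subsequent reads into a near-sequential sweep, at the cost of the sort itself and of the original output order.

    #include <algorithm>
    #include <cstddef>
    #include <vector>

    // Sketch: inserting a sorting stage before a gather. After the sort,
    // reads sweep through `src` in increasing address order, improving
    // locality; the trade-off is the extra sort and the loss of the
    // original output order.
    std::vector<float> gather_sorted(const std::vector<float>& src,
                                     std::vector<std::size_t> idx) {
        std::sort(idx.begin(), idx.end());  // intermediate sorting stage
        std::vector<float> out(idx.size());
        for (std::size_t i = 0; i < idx.size(); ++i)
            out[i] = src[idx[i]];           // near-sequential reads
        return out;
    }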

A programmer will change the memory access pattern (by reworking algorithms) to improve the locality of reference,[29] and/or to increase potential for parallelism.
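A classic example of such a rework (illustrative, not taken from the source) is loop interchange: summing a row-major matrix row-by-row instead of column-by-column replaces a strided pattern with a sequential one while computing the same result.

    #include <cstddef>

    // Sketch: the same reduction with two loop orders. Both return the
    // same total; the second visits memory sequentially and is typically
    // far friendlier to the cache on a row-major layout.
    double sum_column_major(const double* a, std::size_t rows, std::size_t cols) {
        double s = 0.0;
        for (std::size_t c = 0; c < cols; ++c)
            for (std::size_t r = 0; r < rows; ++r)
                s += a[r * cols + c];  // stride-`cols` accesses
        return s;
    }

    double sum_row_major(const double* a, std::size_t rows, std::size_t cols) {
        double s = 0.0;
        for (std::size_t r = 0; r < rows; ++r)
            for (std::size_t c = 0; c < cols; ++c)
                s += a[r * cols + c];  // unit-stride accesses
        return s;
    }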

A programmer or system designer may create frameworks or abstractions (e.g., C++ templates or higher-order functions) that encapsulate a specific memory access pattern.[26]
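For instance, a hypothetical C++ template (the name for_each_strided is an assumption, not from the source) can fix the traversal itself while callers supply only the per-element operation:

    #include <cstddef>

    // Sketch: a template that encapsulates a strided access pattern.
    // Callers supply only the per-element operation `f`; the traversal
    // (stride and count) is fixed by the abstraction.
    template <typename T, typename Fn>
    void for_each_strided(T* data, std::size_t count, std::size_t stride, Fn f) {
        for (std::size_t i = 0; i < count; ++i)
            f(data[i * stride]);  // the encapsulated strided access
    }

    // Usage: double every 4th element of a buffer `buf` of length n.
    // for_each_strided(buf, n / 4, 4, [](double& x) { x *= 2.0; });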

Some publications incorrectly draw sequential and linear patterns as counterparts to each other, while real-world workloads contain almost innumerable distinct patterns.[18]