Multiple object tracking

The results of multiple object tracking (MOT) experiments have revealed limitations on humans' ability to monitor multiple moving objects simultaneously.

In the 1970s, researcher Zenon Pylyshyn postulated the existence of a "primitive visual process" in the human brain capable of "indexing and tracking features or feature-clusters".

[1] Data collected with Pylyshyn's MOT protocol and published in 1988 provided the first formal demonstration that the mind can keep track of the changing positions of multiple moving objects.

With that constraint, MOT task variations have been designed to probe specific aspects of how the mind tracks moving objects.

Even for a fixed set of display parameters, performance falls gradually as the number of targets increases rather than dropping off at a clear limit.

[6] Such findings undermine Pylyshyn's FINST theory that tracking is mediated by a fixed set of discrete pointers.

For example, if there is only one target, one can bring one's full cognitive abilities to bear, such as in predicting future positions, to facilitate tracking.

[12][13] Temporal crowding refers to an impairment caused by distractors visiting a target's former location within a short interval.

The phenomenon was revealed in a study with a display where distractors were evenly spaced along a circular trajectory that was also shared by a target.
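The timing in such a display follows from simple arithmetic, sketched below. The sketch assumes that the objects are evenly spaced on the ring and revolve together at a constant rate; the function name and the example values are illustrative and are not taken from the study.

```python
def revisit_interval(n_objects: int, revolutions_per_sec: float) -> float:
    """Seconds between successive objects passing any fixed point on the ring.

    Assumes n_objects evenly spaced on one circular trajectory, all
    revolving together at revolutions_per_sec (illustrative parameters).
    """
    if n_objects < 1 or revolutions_per_sec <= 0:
        raise ValueError("need at least one object and a positive rate")
    # Each location is visited n_objects times per revolution.
    visits_per_sec = n_objects * revolutions_per_sec
    return 1.0 / visits_per_sec


# Hypothetical example: 4 objects at 0.5 revolutions per second.
interval = revisit_interval(4, 0.5)
```

Under these assumptions, adding objects to the ring or raising the revolution rate both shorten the interval before a distractor arrives at a location the target has just left.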

[16][17] In the case of multiple object tracking, however, several MOT studies have found evidence against extrapolation of future positions.

However, the benefit seems to disappear when there are more than one or two targets,[23][24][22] suggesting that any prediction happening is more limited in processing capacity than other aspects of object tracking.

In such experiments, the difference in targets' and distractors' motion directions or accelerations may be the facilitator of tracking rather than prediction of future positions.

[23][24] The human brain represents the positions of objects with multiple reference frames or coordinate systems.

Early stages of the visual system represent the locations of objects relative to the direction the eyes are pointing (retinotopic coordinates).

[28] One study measured an electrical brain response, the event-related potential (ERP), to a probe that was flashed while the objects were moving.

[31] To assess maintenance of knowledge of object identities, one series of experiments used cartoon animals as targets and distractors that all moved about the screen.

[36][37] That, however, may only mean that nothing is comparing the features present before and after the change; it does not necessarily mean that object representations are not updated, so other studies are needed.

When individual ends of multiple dumbbell-shaped drawings are designated as targets, tracking performance is poor.

Stuart Anstis has shown that people are unable to track the intersection of two lines sliding over each other, except possibly at very slow speeds.

[39] Overlap among the processes underlying mental abilities can be revealed by what types of concurrent tasks interfere with each other.

The increase in pupil size, which is also caused by mental effort in other tasks, may reflect norepinephrine release from the locus coeruleus.

One such study found a robust correlation between tracking performance and the effect of number of targets on the N2pc event-related potential and also on contralateral delay activity.

[65][66] Adults with Williams Syndrome have profound deficits on certain spatial assembly tasks, such as copying a four-block checkerboard pattern.

[68][65][66] In contrast, their ability to remember the locations of MOT targets when the targets do not move is more comparable to that of typically developing 6-year-olds, which has led to the suggestion that maintaining attentional selection is a particular problem in Williams Syndrome.

[74] Although some researchers have used MOT in an attempt to ensure that study participants sustain their attention over a long interval, a study with a large number of participants found that MOT performance correlated little with performance on a continuous performance task specifically designed to measure lapses in attention.

[80] Another reason for skepticism of such claims is the poor track record of other commercial "brain training" products advertised for their cognitive-enhancing effects.

No published theory purports to explain all four of the following: the difficulty with tracking parts of objects, the role of temporal interference, the dissociation between position and non-positional features, and the pattern of performance decline with increasing number of targets.

A serial selection process is also included, which operates on only one object at a time and enables access to a target's motion history and other features.

[88] Central to Pylyshyn's FINST theory is the claim that a small set of discrete pointers mediates multiple object tracking.

This article was adapted from the following source under a CC BY 4.0 license (2023): Alex O. Holcombe (15 April 2023).

Sequence of events in a typical MOT task. Target objects are initially highlighted before becoming identical to the distractors. When the objects stop moving, the challenge is to identify which objects were the targets.
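The sequence of events in the caption can be sketched as a small simulation. This is a minimal illustration under stated assumptions, not Pylyshyn's protocol or any published implementation: the unit-square arena, straight-line motion with wall bounces, the class and function names, and all parameter values are hypothetical. Real displays typically also control object spacing and overlap, which this sketch ignores.

```python
import math
import random
from dataclasses import dataclass


@dataclass
class TrackedObject:
    x: float
    y: float
    vx: float
    vy: float
    is_target: bool  # shown to the observer only during the highlighting phase


def make_trial(n_objects, n_targets, speed, rng):
    """Place objects at random in a unit square; the first n_targets are targets."""
    objects = []
    for i in range(n_objects):
        heading = rng.uniform(0.0, 2.0 * math.pi)
        objects.append(TrackedObject(
            x=rng.random(), y=rng.random(),
            vx=speed * math.cos(heading), vy=speed * math.sin(heading),
            is_target=i < n_targets,
        ))
    return objects


def _bounce(pos, vel):
    """Reflect a coordinate off the walls at 0 and 1 (assumes |vel * dt| < 1)."""
    if pos < 0.0:
        return -pos, -vel
    if pos > 1.0:
        return 2.0 - pos, -vel
    return pos, vel


def step(objects, dt):
    """Movement phase: all objects look identical and move for one time step."""
    for obj in objects:
        obj.x, obj.vx = _bounce(obj.x + obj.vx * dt, obj.vx)
        obj.y, obj.vy = _bounce(obj.y + obj.vy * dt, obj.vy)


def proportion_correct(objects, chosen):
    """Response phase: fraction of targets among the observer's chosen indices."""
    targets = {i for i, obj in enumerate(objects) if obj.is_target}
    return len(targets & set(chosen)) / len(targets)


rng = random.Random(0)  # fixed seed so the sketch is reproducible
trial = make_trial(n_objects=8, n_targets=4, speed=0.3, rng=rng)
for _ in range(250):    # hypothetical 5-second movement phase at dt = 0.02
    step(trial, dt=0.02)
score = proportion_correct(trial, chosen=[0, 1, 4, 5])
```

Here the response `[0, 1, 4, 5]` stands in for an observer who kept two of the four targets and picked two distractors, which gives a score of 0.5.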