Object recognition (cognitive science)

[5] A widely recognized bottom-up hierarchical theory is James DiCarlo's Untangling description,[6] whereby each stage of the hierarchically arranged ventral visual pathway performs operations that gradually transform object representations into an easily extractable format.

Possible interpretations of the crude visual input are generated in the PFC and then sent to the inferotemporal cortex (IT), subsequently activating relevant object representations, which are then incorporated into the slower, bottom-up process.

[9] Participants who performed categorization and recognition tasks while undergoing functional magnetic resonance imaging (fMRI) showed increased blood flow, indicating activation, in specific regions of the brain.

The brain regions implicated in mental rotation, such as the ventral and dorsal visual pathways and the prefrontal cortex, showed the greatest increase in blood flow during these tasks, demonstrating that they are critical for the ability to view objects from multiple angles.

[10] Participants in a study were presented with one encoding view from each of 24 preselected objects, as well as five filler images.

Binocular disparity cues were displayed on the screen by rendering stimuli as green-red anaglyphs and the slant-tilt curves ranged from 0 to 330.

Recognition is achieved when the observed object viewpoint is mentally rotated to match the stored canonical description.
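The rotate-to-match idea can be illustrated with a toy sketch. It is not a model from the literature: the two-dimensional "L" shape, the 5-degree step, the tolerance, and all function names are assumptions made for illustration, and the sketch assumes the points of the observed and stored views are already in correspondence.

```python
import math


def rotate(points, degrees):
    """Rotate 2D points counter-clockwise about the origin."""
    theta = math.radians(degrees)
    c, s = math.cos(theta), math.sin(theta)
    return [(c * x - s * y, s * x + c * y) for x, y in points]


def mismatch(a, b):
    """Mean Euclidean distance between corresponding points."""
    return sum(math.dist(p, q) for p, q in zip(a, b)) / len(a)


def recognize_by_rotation(observed, canonical, step=5, tolerance=0.05):
    """Rotate the observed view in fixed steps until it matches the canonical view.

    Returns (matched, degrees_rotated); degrees_rotated is None without a match.
    """
    for degrees in range(0, 360, step):
        if mismatch(rotate(observed, degrees), canonical) <= tolerance:
            return True, degrees
    return False, None


# Hypothetical stored canonical description of an "L"-like object.
CANONICAL_L = [(0.0, 0.0), (0.0, 2.0), (1.0, 0.0)]

# The same object seen rotated 40 degrees clockwise.
observed = rotate(CANONICAL_L, -40)
matched, degrees = recognize_by_rotation(observed, CANONICAL_L)
print(matched, degrees)
```

In this sketch the number of rotation steps needed grows with the angular distance from the canonical view, which loosely mirrors the idea that more misoriented views take longer to recognize.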

[13] This theory of recognition is based on a more holistic system rather than on parts, suggesting that objects are stored in memory with multiple viewpoints and angles.
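As a rough illustration of storing multiple viewpoints, the sketch below keeps several views of each object in memory and recognizes an input by finding the closest stored view, with no part decomposition. The shapes, the four stored angles, and the function names are hypothetical choices for the example rather than details from the cited work, and point correspondence between views is assumed.

```python
import math


def view_of(points, degrees):
    """Return the 2D points as seen after a counter-clockwise rotation."""
    theta = math.radians(degrees)
    c, s = math.cos(theta), math.sin(theta)
    return [(c * x - s * y, s * x + c * y) for x, y in points]


def distance(a, b):
    """Mean Euclidean distance between corresponding points of two views."""
    return sum(math.dist(p, q) for p, q in zip(a, b)) / len(a)


# Hypothetical objects, each described holistically as a small point set.
SHAPES = {
    "L": [(0.0, 0.0), (0.0, 2.0), (1.0, 0.0)],
    "T": [(-1.0, 2.0), (1.0, 2.0), (0.0, 0.0)],
}

# Memory holds several stored views per object (assumed angles).
MEMORY = {
    name: [view_of(points, d) for d in (0, 90, 180, 270)]
    for name, points in SHAPES.items()
}


def recognize_by_stored_views(observed):
    """Return (label, distance) for the stored view closest to the input."""
    candidates = (
        (name, distance(observed, view))
        for name, views in MEMORY.items()
        for view in views
    )
    return min(candidates, key=lambda pair: pair[1])


# A novel view of the "L", 10 degrees away from the nearest stored view.
label, error = recognize_by_stored_views(view_of(SHAPES["L"], 100))
print(label, round(error, 3))
```

The residual distance is small but not zero for views that fall between stored ones, so in this toy setting recognition is easiest for views close to those already in memory.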

[15] Since this initial proposal, it has been alternatively suggested that the dorsal pathway should be known as the 'How' pathway, as the visual spatial information processed here provides us with information about how to interact with objects.[16] For the purpose of object recognition, the neural focus is on the ventral stream.

The brain regions most consistently found to display functional specialization are the fusiform face area (FFA), which shows increased activation for faces when compared with objects, the parahippocampal place area (PPA) for scenes vs. objects, the extrastriate body area (EBA) for body parts vs. objects, MT+/V5 for moving stimuli vs. static stimuli, and the lateral occipital complex (LOC) for discernible shapes vs. scrambled stimuli.

[18] In a related fMRI study, the activation of the LOC, which occurred regardless of the presented object's visual cues such as motion, texture, or luminance contrasts, suggests that the different low-level visual cues used to define an object converge in "object-related areas" to assist in the perception and recognition process.

[20] Further experiments have proposed that the LOC consists of a hierarchical system for shape selectivity, with greater selective activation in the posterior regions for fragments of objects, whereas the anterior regions show greater activation for full or partial objects.

[21] This is consistent with previous research that suggests a hierarchical representation in the ventral temporal cortex, where primary feature processing occurs in the posterior regions and the integration of these features into a whole and meaningful object occurs in the anterior regions.

Research also indicates that visual semantic information converges in the fusiform gyri of the inferotemporal lobes.

In a study that compared semantic knowledge of category with semantic knowledge of attributes, the two were found to play separate roles in how they contribute to recognition.

These results suggest that the type of object category determines which region of the fusiform gyrus is activated for processing semantic recognition, whereas the attributes of an object determine whether the left or the right fusiform gyrus is activated, depending on whether global form or local detail is processed.

[26] In addition, it has been proposed that activation in anterior regions of the fusiform gyri indicates successful recognition.

[28] This is attributed to the greater difficulty of distinguishing between natural objects, whose very similar structural properties make them harder to identify than artefacts.

[29] Based on results from a study using fMRI, it has been proposed that there is a "context network" in the brain for contextually associated objects, with activity largely found in the parahippocampal cortex (PHC) and the retrosplenial complex (RSC).

One notable characteristic of visual recognition memory is its remarkable capacity: even after seeing thousands of images on single trials, humans perform at high accuracy in subsequent memory tests and remember considerable detail about the images that they have seen.[31] Context allows for much greater accuracy in object recognition.

This phenomenon remains true across all age groups and cultures, signifying that context is essential in accurately identifying facial emotion for all individuals.

[35] Recollection shares many similarities with familiarity; however, it is context-dependent, requiring specific information from the incident in question.

When object agnosia occurs from a lesion in the dominant hemisphere, there is often a profound associated language disturbance, including loss of word meaning.

For example, it was found that lesions to the perirhinal cortex in rats cause impairments in object recognition, especially with an increase in feature ambiguity.

[38] Combined amygdalohippocampal (A + H) lesions in rats impaired performance on an object recognition task when the retention intervals were increased beyond 0 s and when test stimuli were repeated within a session.

[39] In an object recognition task, the level of discrimination was significantly lower in rats with electrolytic lesions of the globus pallidus (part of the basal ganglia) than in rats with lesions of the substantia innominata/ventral pallidum, which in turn performed worse than the control and medial septum/vertical diagonal band of Broca groups; however, only the globus pallidus group failed to discriminate between new and familiar objects.

Agnosia is rare and can result from a stroke, dementia, head injury, or brain infection, or can be hereditary.

[34] Similarly, associative visual agnosia is the inability to understand the significance of objects; however, this time the deficit is in semantic memory.

Figure 1. This image, created based on Biederman's (1987) Recognition by Components theory, is an example of how objects can be broken down into geons.
The Dorsal Stream is shown in green and the Ventral Stream in purple.