If y and y' are homogeneous normalized image coordinates in the first and second image, respectively, they satisfy the constraint (y')^T E y = 0 whenever they correspond to the same 3D point in the scene (not an "if and only if": all points that lie on the same epipolar line in the first image are mapped to the same epipolar line in the second image, so the constraint also holds for some pairs of points that do not correspond to the same 3D point).
The above relation, which defines the essential matrix, was published in 1981 by H. Christopher Longuet-Higgins, introducing the concept to the computer vision community.
Richard Hartley and Andrew Zisserman's book reports that an analogous matrix appeared in photogrammetry long before that.
Longuet-Higgins' paper includes an algorithm for estimating E from a set of corresponding normalized image coordinates, as well as an algorithm for determining the relative position and orientation of the two cameras given that E is known.
Finally, it shows how the 3D coordinates of the image points can be determined with the aid of the essential matrix.
Two normalized cameras project the 3D world onto their respective image planes.
Another consequence of the normalized cameras is that their respective coordinate systems are related by means of a translation and rotation: a point with coordinates x relative to the first camera has coordinates x' = R (x - t) relative to the second, where R is a rotation matrix and t a translation vector.
Here we are only interested in the orientations of the normalized image coordinates [1] (see also: triple product). As such we do not need the translational component when substituting image coordinates into the essential equation.
This gives (y')^T E y = 0 with E = R [t]×, where [t]× denotes the matrix representation of the cross product with t. This is the constraint that the essential matrix defines between corresponding image points.
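As a numeric sanity check, the sketch below (our own illustration; all names are hypothetical, and the conventions E = R [t]× and x' = R (x - t) are assumed) constructs an essential matrix from a known relative pose and verifies the constraint for a pair of corresponding normalized image points:

```python
import numpy as np

def cross_matrix(t):
    """Matrix representation [t]x of the cross product, [t]x v = t x v."""
    return np.array([[0, -t[2], t[1]],
                     [t[2], 0, -t[0]],
                     [-t[1], t[0], 0.0]])

# A hypothetical relative pose between the two normalized cameras.
c, s = np.cos(0.3), np.sin(0.3)
R = np.array([[c, -s, 0], [s, c, 0], [0, 0, 1.0]])   # rotation about z
t = np.array([1.0, 0.2, 0.1])                        # translation (baseline)
E = R @ cross_matrix(t)                              # essential matrix E = R [t]x

# The same 3D point in each camera's coordinate system: x' = R (x - t).
x = np.array([0.5, -0.3, 4.0])
x_prime = R @ (x - t)

# Normalized homogeneous image coordinates: divide by the depth.
y = x / x[2]
y_prime = x_prime / x_prime[2]

print(y_prime @ E @ y)   # numerically zero for corresponding points
```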
The properties described here are sometimes referred to as internal constraints of the essential matrix.
If the essential matrix E is multiplied by a non-zero scalar, the result is again an essential matrix which defines exactly the same constraint as E does. This means that E can be seen as an element of a projective space; that is, two such matrices are considered equivalent if one is a non-zero scalar multiple of the other.
These constraints (a vanishing determinant, det E = 0, and the cubic constraint 2 E E^T E - trace(E E^T) E = 0) are often used for determining the essential matrix from only five corresponding point pairs.
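A quick numerical check (again a sketch of our own, under the E = R [t]× convention) that a true essential matrix satisfies both the determinant and the cubic constraint:

```python
import numpy as np

def cross_matrix(t):
    return np.array([[0, -t[2], t[1]],
                     [t[2], 0, -t[0]],
                     [-t[1], t[0], 0.0]])

# An arbitrary rotation (about the z-axis) and translation.
c, s = np.cos(0.7), np.sin(0.7)
R = np.array([[c, -s, 0], [s, c, 0], [0, 0, 1.0]])
t = np.array([0.3, -1.0, 0.5])
E = R @ cross_matrix(t)                  # a valid essential matrix

print(np.linalg.det(E))                  # vanishing determinant
residual = 2 * E @ E.T @ E - np.trace(E @ E.T) * E
print(np.abs(residual).max())            # cubic constraint, zero up to rounding
```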
The essential matrix has five or six degrees of freedom, depending on whether or not it is seen as a projective element. The rotation matrix and the translation vector have three degrees of freedom each, six in total. If the essential matrix is considered as a projective element, however, one degree of freedom related to scalar multiplication must be subtracted, leaving five degrees of freedom in total.
However, if the image points are subject to noise, which is the common case in any practical situation, it is not possible to find an essential matrix which satisfies all constraints exactly.
Depending on how the error related to each constraint is measured, it is possible to determine or estimate an essential matrix which optimally satisfies the constraints for a given set of corresponding image points.
The most straightforward approach is to set up a total least squares problem, commonly known as the eight-point algorithm.
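A minimal sketch of this total least squares formulation (function and variable names are our own): each correspondence contributes one linear equation in the nine entries of E, and the estimate is the right singular vector associated with the smallest singular value of the stacked system.

```python
import numpy as np

def eight_point(y, y_prime):
    """Estimate E from n >= 8 correspondences; inputs are (n, 3) arrays
    of homogeneous normalized image coordinates."""
    # Row i is the Kronecker product y'_i (x) y_i, so that
    # A @ E.flatten() stacks the constraints (y'_i)^T E y_i = 0.
    A = np.stack([np.kron(yp, yi) for yi, yp in zip(y, y_prime)])
    _, _, Vt = np.linalg.svd(A)
    return Vt[-1].reshape(3, 3)      # total least squares solution, up to scale

def cross_matrix(t):
    """Matrix representation [t]x of the cross product with t."""
    return np.array([[0, -t[2], t[1]],
                     [t[2], 0, -t[0]],
                     [-t[1], t[0], 0.0]])

# Synthetic, noise-free test data from a known pose.
c, s = np.cos(0.2), np.sin(0.2)
R_true = np.array([[c, -s, 0], [s, c, 0], [0, 0, 1.0]])
t_true = np.array([1.0, 0.0, 0.2])
E_true = R_true @ cross_matrix(t_true)

rng = np.random.default_rng(0)
X = rng.uniform([-1, -1, 4], [1, 1, 8], size=(12, 3))   # 3D points, camera 1 frame
y = X / X[:, 2:3]
Xp = (X - t_true) @ R_true.T                            # x' = R (x - t), row-wise
y_prime = Xp / Xp[:, 2:3]

E_est = eight_point(y, y_prime)

# The estimate matches the ground truth up to scale (and sign).
E_est /= np.linalg.norm(E_est)
E_ref = E_true / np.linalg.norm(E_true)
print(min(np.abs(E_est - E_ref).max(), np.abs(E_est + E_ref).max()))
```

With noisy correspondences the null space is no longer exact, which is why the smallest-singular-vector solution is the natural total least squares estimate.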
The singular value decomposition of the resulting matrix, E = U S V^T, yields a diagonal matrix S of singular values which, according to the internal constraints of the essential matrix, must consist of two identical and one zero value.
An estimate obtained this way may not completely fulfill the constraints when dealing with real-world data (for example, image coordinates perturbed by measurement noise).
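A standard remedy, not spelled out in the text above, is to replace the estimate by the closest matrix (in the Frobenius-norm sense) whose singular values have the required form (s, s, 0); the sketch and the noisy example matrix below are our own:

```python
import numpy as np

def enforce_essential(E):
    """Project E onto the set of matrices with singular values (s, s, 0)."""
    U, S, Vt = np.linalg.svd(E)
    s = (S[0] + S[1]) / 2.0              # average the two largest values
    return U @ np.diag([s, s, 0.0]) @ Vt

# A hypothetical noisy estimate that violates the internal constraints.
E_noisy = np.array([[0.00, -1.00, 0.10],
                    [0.90, 0.05, -0.30],
                    [-0.10, 0.30, 0.02]])
E_fixed = enforce_essential(E_noisy)

print(np.linalg.svd(E_fixed, compute_uv=False))  # two equal values and a zero
```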
Note also that the translation t lies in the null space of E, since E t = R [t]× t = 0 according to the general properties of the matrix representation of the cross product ([t]× t = t × t = 0). It then follows that t can be recovered from E only up to an unknown scale and sign.
For the translation vector this only causes a change of sign, which has already been described as a possibility.
For the rotation, on the other hand, this will produce a different transformation, at least in the general case.
In total this gives four classes of solutions for the rotation and translation between the two camera coordinate systems.
Given a pair of corresponding image coordinates, three of the solutions will always produce a 3D point which lies behind at least one of the two cameras and therefore cannot be seen.
Only one of the four classes will consistently produce 3D points which are in front of both cameras.
Even then, however, the result has an undetermined positive scaling related to the translation component.
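The four classes can be enumerated from the singular value decomposition of E. The sketch below follows the standard construction with the 90-degree rotation W (names are our own, and the convention E = R [t]×, x' = R (x - t) is assumed, so t spans the null space of E):

```python
import numpy as np

# 90-degree rotation about the z-axis used in the standard decomposition.
W = np.array([[0, -1, 0],
              [1, 0, 0],
              [0, 0, 1.0]])

def decompose_essential(E):
    """Return the four candidate (R, t) classes encoded by E."""
    U, _, Vt = np.linalg.svd(E)
    if np.linalg.det(U @ W @ Vt) < 0:
        U = -U                       # make the rotation candidates proper
    t = Vt[2]                        # unit null vector of E, sign unknown
    R1, R2 = U @ W @ Vt, U @ W.T @ Vt
    return [(R1, t), (R1, -t), (R2, t), (R2, -t)]

def cross_matrix(t):
    return np.array([[0, -t[2], t[1]],
                     [t[2], 0, -t[0]],
                     [-t[1], t[0], 0.0]])

# Build E from a known pose; one candidate class reproduces it.
c, s = np.cos(0.4), np.sin(0.4)
R_true = np.array([[c, -s, 0], [s, c, 0], [0, 0, 1.0]])
t_true = np.array([0.6, 0.0, 0.8])   # unit length, matching the unit t above
E = R_true @ cross_matrix(t_true)

candidates = decompose_essential(E)
```

In practice the correct class is selected as described above: triangulate one or more points with each candidate and keep the class that places them in front of both cameras.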
When the essential matrix has been estimated from real (and noisy) image data, it can only be assumed to satisfy the internal constraints approximately.
The 3D coordinates of a point can be computed from a pair of corresponding normalized image coordinates if the essential matrix is known and the corresponding rotation and translation transformations have been determined.
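One way to carry this out is linear triangulation; the sketch below (our own minimal variant, assuming x' = R (x - t)) solves the stacked projection equations for the 3D point in the first camera's coordinate system:

```python
import numpy as np

def triangulate(y, y_prime, R, t):
    """Solve the stacked projection equations for the 3D point x
    (in the first camera's frame) from normalized image points."""
    # Camera 1: x1 = y[0] * x3 and x2 = y[1] * x3.
    # Camera 2: same equations on x' = R (x - t), using the rows of R.
    A = np.array([[1.0, 0.0, -y[0]],
                  [0.0, 1.0, -y[1]],
                  R[0] - y_prime[0] * R[2],
                  R[1] - y_prime[1] * R[2]])
    b = np.array([0.0, 0.0, A[2] @ t, A[3] @ t])
    x, *_ = np.linalg.lstsq(A, b, rcond=None)
    return x

# Hypothetical usage with a known ground-truth point and pose.
c, s = np.cos(0.3), np.sin(0.3)
R = np.array([[c, -s, 0], [s, c, 0], [0, 0, 1.0]])
t = np.array([1.0, 0.2, 0.0])
x_true = np.array([0.4, -0.2, 5.0])

y = x_true / x_true[2]                 # normalized image point, camera 1
xp = R @ (x_true - t)
y_prime = xp / xp[2]                   # normalized image point, camera 2

x_rec = triangulate(y, y_prime, R, t)
print(x_rec)                           # recovers x_true
```

With noisy data the four equations are no longer consistent, and the least squares solution returned by lstsq serves as the triangulated estimate; the result still carries the overall scale ambiguity of t.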