Stereo Vision

Stereo vision for depth estimation

Stereo vision is the process of extracting 3D information from multiple 2D views of a scene.

The 3D information can be obtained from a pair of images, also known as a stereo pair, by estimating the relative depth of points in the scene. These estimates are represented in a stereo disparity map, which is constructed by matching corresponding points in the stereo pair.

Reconstructing a scene using a pair of stereo images (top left and top right). To visualize the disparity, the right channel is combined with the left channel to create a composite (middle left). Also shown are a disparity map of the scene (middle right) and a 3D rendering of the scene (bottom center). See example for MATLAB code and explanation.
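How a disparity value turns into a depth can be sketched with the standard relation Z = f * B / d. This relation assumes rectified pinhole cameras with parallel optical axes, a focal length f in pixels, a baseline B in meters, and a disparity d in pixels. The function name and the rig parameters below are hypothetical and chosen only for illustration; they do not come from the MATLAB example.

```python
def disparity_to_depth(disparity_px, focal_length_px, baseline_m):
    """Convert a disparity in pixels to a depth in meters using Z = f * B / d."""
    if disparity_px <= 0:
        # Zero disparity corresponds to a point at infinity;
        # this sketch also maps invalid (negative) matches to infinity.
        return float("inf")
    return focal_length_px * baseline_m / disparity_px


# Hypothetical rig: 700 px focal length, 0.12 m baseline.
for d in (84.0, 42.0, 21.0):
    print(d, disparity_to_depth(d, 700.0, 0.12))
```

Depth is inversely proportional to disparity, so halving the disparity doubles the estimated depth. This is why nearby objects appear bright and distant objects dark in a typical disparity map.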

Stereo images are rectified to simplify matching, so that a corresponding point in one image can be found in the same row in the other image. This reduces the 2D stereo correspondence problem to a 1D problem. Stereo image rectification is achieved by determining a set of matched interest points, estimating the fundamental matrix, and then deriving two projective transformations.
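Why rectification puts corresponding points on the same row can be illustrated with the epipolar constraint x_right^T F x_left = 0. For an ideally rectified pair, the fundamental matrix takes (up to scale) the simple form used below, and the constraint reduces to equal row coordinates. This is a minimal sketch with made-up pixel coordinates. It does not estimate F from matched points or derive the projective transformations.

```python
# Fundamental matrix of an ideally rectified pair (up to scale):
# every epipolar line is an image row.
F_RECTIFIED = [
    [0.0, 0.0, 0.0],
    [0.0, 0.0, -1.0],
    [0.0, 1.0, 0.0],
]


def epipolar_residual(F, point_left, point_right):
    """Evaluate x_right^T * F * x_left for pixel coordinates given as (x, y)."""
    xl = (point_left[0], point_left[1], 1.0)    # homogeneous left point
    xr = (point_right[0], point_right[1], 1.0)  # homogeneous right point
    Fx = [sum(F[i][j] * xl[j] for j in range(3)) for i in range(3)]
    return sum(xr[i] * Fx[i] for i in range(3))


# Hypothetical pixel coordinates, for illustration only.
print(epipolar_residual(F_RECTIFIED, (120.0, 45.0), (97.0, 45.0)))  # same row
print(epipolar_residual(F_RECTIFIED, (120.0, 45.0), (97.0, 48.0)))  # rows differ
```

With this F the residual equals y_left - y_right, so it is zero exactly when both points lie on the same row, whatever their column coordinates are.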

Rectified stereo image pair. Notice that matching points reside on the same row. See example for MATLAB code and explanation.
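Once matching points share a row, the correspondence search for each pixel is a 1D scan along that row. The sketch below uses a sum of absolute differences (SAD) over a small window, which is one common matching cost and not necessarily the one used in the MATLAB example. The function name, window size, and synthetic intensity values are assumptions made for illustration.

```python
def match_row(left_row, right_row, half_window=1, max_disparity=4):
    """Estimate a disparity per pixel for one rectified scanline using SAD."""
    width = len(left_row)
    disparities = [0] * width  # border pixels keep a default of 0
    for x in range(half_window, width - half_window):
        best_cost, best_d = None, 0
        # A point at column x in the left row is searched at column x - d
        # in the right row, for d = 0 .. max_disparity.
        for d in range(max_disparity + 1):
            if x - d - half_window < 0:
                break  # the window would fall off the left edge of the right row
            cost = sum(
                abs(left_row[x + k] - right_row[x - d + k])
                for k in range(-half_window, half_window + 1)
            )
            if best_cost is None or cost < best_cost:
                best_cost, best_d = cost, d
        disparities[x] = best_d
    return disparities


# Synthetic scanline: the right row is the left row shifted by 2 pixels.
left_row = [10, 20, 80, 30, 90, 50, 70, 40, 60, 15]
right_row = left_row[2:] + [0, 0]
print(match_row(left_row, right_row)[3:9])  # → [2, 2, 2, 2, 2, 2]
```

Applying the same function to every row of a rectified pair yields a dense disparity map. Practical implementations typically add larger windows, sub-pixel refinement, and consistency checks, all of which are omitted here.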

Stereo vision is used in many applications such as robot navigation, 3D movie recording and production, object tracking, machine vision, and range sensing. For more information on stereo vision, see Computer Vision System Toolbox.

See also: camera calibration, object detection, object tracking, image and video processing, RANSAC, feature matching, feature extraction