Note: This page has been translated by MathWorks. Please click here

To view all translated materals including this page, select Japan from the country navigator on the bottom of this page.

To view all translated materals including this page, select Japan from the country navigator on the bottom of this page.

Structure from motion (SfM) is the process of estimating the 3-D structure of a scene from a set of 2-D images. This example shows you how to estimate the poses of a calibrated camera from two images, reconstruct the 3-D structure of the scene up to an unknown scale factor, and then recover the actual scale factor by detecting an object of a known size.

This example shows how to reconstruct a 3-D scene from a pair of 2-D images taken with a camera calibrated using the Camera Calibrator app. The algorithm consists of the following steps:

Match a sparse set of points between the two images. There are multiple ways of finding point correspondences between two images. This example detects corners in the first image using the

`detectMinEigenFeatures`

function, and tracks them into the second image using`vision.PointTracker`

. Alternatively you can use`extractFeatures`

followed by`matchFeatures`

.Estimate the fundamental matrix using

`estimateFundamentalMatrix`

.Compute the motion of the camera using the

`relativeCameraPose`

function.Match a dense set of points between the two images. Re-detect the point using

`detectMinEigenFeatures`

with a reduced`'MinQuality'`

to get more points. Then track the dense points into the second image using`vision.PointTracker`

.Determine the 3-D locations of the matched points using

`triangulate`

.Detect an object of a known size. In this scene there is a globe, whose radius is known to be 10cm. Use

`pcfitsphere`

to find the globe in the point cloud.Recover the actual scale, resulting in a metric reconstruction.

Load a pair of images into the workspace.

imageDir = fullfile(toolboxdir('vision'), 'visiondata','upToScaleReconstructionImages'); images = imageDatastore(imageDir); I1 = readimage(images, 1); I2 = readimage(images, 2); figure imshowpair(I1, I2, 'montage'); title('Original Images');

This example uses the camera parameters calculated by the cameraCalibrator app. The parameters are stored in the `cameraParams`

object, and include the camera intrinsics and lens distortion coefficients.

% Load precomputed camera parameters load upToScaleReconstructionCameraParameters.mat

Lens distortion can affect the accuracy of the final reconstruction. You can remove the distortion from each of the images using the `undistortImage`

function. This process straightens the lines that are bent by the radial distortion of the lens.

I1 = undistortImage(I1, cameraParams); I2 = undistortImage(I2, cameraParams); figure imshowpair(I1, I2, 'montage'); title('Undistorted Images');

Detect good features to track. Reduce `'MinQuality'`

to detect fewer points, which would be more uniformly distributed throughout the image. If the motion of the camera is not very large, then tracking using the KLT algorithm is a good way to establish point correspondences.

% Detect feature points imagePoints1 = detectMinEigenFeatures(rgb2gray(I1), 'MinQuality', 0.1); % Visualize detected points figure imshow(I1, 'InitialMagnification', 50); title('150 Strongest Corners from the First Image'); hold on plot(selectStrongest(imagePoints1, 150)); % Create the point tracker tracker = vision.PointTracker('MaxBidirectionalError', 1, 'NumPyramidLevels', 5); % Initialize the point tracker imagePoints1 = imagePoints1.Location; initialize(tracker, imagePoints1, I1); % Track the points [imagePoints2, validIdx] = step(tracker, I2); matchedPoints1 = imagePoints1(validIdx, :); matchedPoints2 = imagePoints2(validIdx, :); % Visualize correspondences figure showMatchedFeatures(I1, I2, matchedPoints1, matchedPoints2); title('Tracked Features');

Use the `estimateEssentialMatrix`

function to compute the essential matrix and find the inlier points that meet the epipolar constraint.

% Estimate the fundamental matrix [E, epipolarInliers] = estimateEssentialMatrix(... matchedPoints1, matchedPoints2, cameraParams, 'Confidence', 99.99); % Find epipolar inliers inlierPoints1 = matchedPoints1(epipolarInliers, :); inlierPoints2 = matchedPoints2(epipolarInliers, :); % Display inlier matches figure showMatchedFeatures(I1, I2, inlierPoints1, inlierPoints2); title('Epipolar Inliers');

Compute the location and orientation of the second camera relative to the first one. Note that `t`

is a unit vector, because translation can only be computed up to scale.

[orient, loc] = relativeCameraPose(E, cameraParams, inlierPoints1, inlierPoints2);

Re-detect points in the first image using lower `'MinQuality'`

to get more points. Track the new points into the second image. Estimate the 3-D locations corresponding to the matched points using the `triangulate`

function, which implements the Direct Linear Transformation (DLT) algorithm [1]. Place the origin at the optical center of the camera corresponding to the first image.

% Detect dense feature points. Use an ROI to exclude points close to the % image edges. roi = [30, 30, size(I1, 2) - 30, size(I1, 1) - 30]; imagePoints1 = detectMinEigenFeatures(rgb2gray(I1), 'ROI', roi, ... 'MinQuality', 0.001); % Create the point tracker tracker = vision.PointTracker('MaxBidirectionalError', 1, 'NumPyramidLevels', 5); % Initialize the point tracker imagePoints1 = imagePoints1.Location; initialize(tracker, imagePoints1, I1); % Track the points [imagePoints2, validIdx] = step(tracker, I2); matchedPoints1 = imagePoints1(validIdx, :); matchedPoints2 = imagePoints2(validIdx, :); % Compute the camera matrices for each position of the camera % The first camera is at the origin looking along the Z-axis. Thus, its % rotation matrix is identity, and its translation vector is 0. camMatrix1 = cameraMatrix(cameraParams, eye(3), [0 0 0]); % Compute extrinsics of the second camera [R, t] = cameraPoseToExtrinsics(orient, loc); camMatrix2 = cameraMatrix(cameraParams, R, t); % Compute the 3-D points points3D = triangulate(matchedPoints1, matchedPoints2, camMatrix1, camMatrix2); % Get the color of each reconstructed point numPixels = size(I1, 1) * size(I1, 2); allColors = reshape(I1, [numPixels, 3]); colorIdx = sub2ind([size(I1, 1), size(I1, 2)], round(matchedPoints1(:,2)), ... round(matchedPoints1(:, 1))); color = allColors(colorIdx, :); % Create the point cloud ptCloud = pointCloud(points3D, 'Color', color);

Use the `plotCamera`

function to visualize the locations and orientations of the camera, and the `pcshow`

function to visualize the point cloud.

% Visualize the camera locations and orientations cameraSize = 0.3; figure plotCamera('Size', cameraSize, 'Color', 'r', 'Label', '1', 'Opacity', 0); hold on grid on plotCamera('Location', loc, 'Orientation', orient, 'Size', cameraSize, ... 'Color', 'b', 'Label', '2', 'Opacity', 0); % Visualize the point cloud pcshow(ptCloud, 'VerticalAxis', 'y', 'VerticalAxisDir', 'down', ... 'MarkerSize', 45); % Rotate and zoom the plot camorbit(0, -30); camzoom(1.5); % Label the axes xlabel('x-axis'); ylabel('y-axis'); zlabel('z-axis') title('Up to Scale Reconstruction of the Scene');

Find the globe in the point cloud by fitting a sphere to the 3-D points using the `pcfitsphere`

function.

% Detect the globe globe = pcfitsphere(ptCloud, 0.1); % Display the surface of the globe plot(globe); title('Estimated Location and Size of the Globe'); hold off

The actual radius of the globe is 10cm. You can now determine the coordinates of the 3-D points in centimeters.

% Determine the scale factor scaleFactor = 10 / globe.Radius; % Scale the point cloud ptCloud = pointCloud(points3D * scaleFactor, 'Color', color); loc = loc * scaleFactor; % Visualize the point cloud in centimeters cameraSize = 2; figure plotCamera('Size', cameraSize, 'Color', 'r', 'Label', '1', 'Opacity', 0); hold on grid on plotCamera('Location', loc, 'Orientation', orient, 'Size', cameraSize, ... 'Color', 'b', 'Label', '2', 'Opacity', 0); % Visualize the point cloud pcshow(ptCloud, 'VerticalAxis', 'y', 'VerticalAxisDir', 'down', ... 'MarkerSize', 45); camorbit(0, -30); camzoom(1.5); % Label the axes xlabel('x-axis (cm)'); ylabel('y-axis (cm)'); zlabel('z-axis (cm)') title('Metric Reconstruction of the Scene');

This example showed how to recover camera motion and reconstruct the 3-D structure of a scene from two images taken with a calibrated camera.

[1] Hartley, Richard, and Andrew Zisserman. Multiple View Geometry in Computer Vision. Second Edition. Cambridge, 2000.