Estimate object velocities
The Optical Flow block estimates the direction and speed of object motion from one image to another or from one video frame to another using either the Horn-Schunck or the Lucas-Kanade method.
Port | Output | Supported Data Types | Complex Values Supported |
---|---|---|---|
I/I1 | Scalar, vector, or matrix of intensity values | | No |
I2 | Scalar, vector, or matrix of intensity values | Same as I port | No |
\|V\|^2 | Matrix of squared velocity magnitudes | Same as I port | No |
V | Matrix of velocity components in complex form | Same as I port | Yes |
To compute the optical flow between two images, you must solve the following optical flow constraint equation:
$${I}_{x}u+{I}_{y}v+{I}_{t}=0$$
In this equation, the following values are represented:
$${I}_{x}$$, $${I}_{y}$$ and $${I}_{t}$$ are the spatiotemporal image brightness derivatives
u is the horizontal optical flow
v is the vertical optical flow
Because this equation is underconstrained, there are several methods to solve for u and v:
Horn-Schunck Method
Lucas-Kanade Method
See the following two sections for descriptions of these methods.
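To illustrate why a single constraint equation cannot determine both u and v, the following minimal sketch evaluates the constraint at one pixel for several candidate flows. The derivative values and the helper name `constraint_residual` are hypothetical and chosen only for illustration.

```python
def constraint_residual(ix, iy, it, u, v):
    """Left-hand side of Ix*u + Iy*v + It = 0 for a single pixel."""
    return ix * u + iy * v + it


# Hypothetical brightness derivatives at one pixel.
ix, iy, it = 2.0, 1.0, -3.0

# One flow that satisfies the constraint.
u0, v0 = 1.0, 1.0

# Adding any multiple of (iy, -ix) moves the flow perpendicular to the
# gradient (ix, iy), so the constraint is still satisfied.
candidates = [(u0 + s * iy, v0 - s * ix) for s in (-1.0, 0.0, 2.5)]
residuals = [constraint_residual(ix, iy, it, u, v) for u, v in candidates]

for (u, v), r in zip(candidates, residuals):
    print(f"u={u:5.2f}  v={v:5.2f}  residual={r:.1f}")
```

All three candidates give a zero residual, which is why both methods add a further assumption (global smoothness or local constancy) to select one flow.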
By assuming that the optical flow is smooth over the entire image, the Horn-Schunck method computes an estimate of the velocity field, $${\left[\begin{array}{cc}u& v\end{array}\right]}^{T}$$, that minimizes this equation:
$$E=\iint {({I}_{x}u+{I}_{y}v+{I}_{t})}^{2}\,dx\,dy+\alpha \iint \left\{{\left(\frac{\partial u}{\partial x}\right)}^{2}+{\left(\frac{\partial u}{\partial y}\right)}^{2}+{\left(\frac{\partial v}{\partial x}\right)}^{2}+{\left(\frac{\partial v}{\partial y}\right)}^{2}\right\}\,dx\,dy$$
In this equation, $$\frac{\partial u}{\partial x}$$ and $$\frac{\partial u}{\partial y}$$ are the spatial derivatives of the optical velocity component u, and $$\alpha $$ scales the global smoothness term. The Horn-Schunck method minimizes the previous equation to obtain the velocity field, [u v], for each pixel in the image, which is given by the following equations:
$$\begin{array}{l}{u}_{x,y}^{k+1}={\overline{u}}_{x,y}^{k}-\frac{{I}_{x}[{I}_{x}{\overline{u}}^{k}{}_{x,y}+{I}_{y}{\overline{v}}^{k}{}_{x,y}+{I}_{t}]}{{\alpha}^{2}+{I}_{x}^{2}+{I}_{y}^{2}}\\ {v}_{x,y}^{k+1}={\overline{v}}_{x,y}^{k}-\frac{{I}_{y}[{I}_{x}{\overline{u}}^{k}{}_{x,y}+{I}_{y}{\overline{v}}^{k}{}_{x,y}+{I}_{t}]}{{\alpha}^{2}+{I}_{x}^{2}+{I}_{y}^{2}}\end{array}$$
In these equations, $$\left[\begin{array}{cc}{u}_{x,y}^{k}& {v}_{x,y}^{k}\end{array}\right]$$ is the velocity estimate for the pixel at (x,y) at iteration k, and $$\left[\begin{array}{cc}{\overline{u}}_{x,y}^{k}& {\overline{v}}_{x,y}^{k}\end{array}\right]$$ is the neighborhood average of $$\left[\begin{array}{cc}{u}_{x,y}^{k}& {v}_{x,y}^{k}\end{array}\right]$$. For k=0, the initial velocity is 0.
When you choose the Horn-Schunck method, u and v are solved as follows:
Compute $${I}_{x}$$ and $${I}_{y}$$ using the Sobel convolution kernel $$\left[\begin{array}{ccc}-1& -2& -1\\ 0& 0& 0\\ 1& 2& 1\end{array}\right]$$ and its transposed form for each pixel in the first image.
Compute $${I}_{t}$$ between images 1 and 2 using the $$\left[\begin{array}{cc}-1& 1\end{array}\right]$$ kernel.
Assume the previous velocity to be 0, and compute the average velocity for each pixel using $$\left[\begin{array}{ccc}0& 1& 0\\ 1& 0& 1\\ 0& 1& 0\end{array}\right]$$ as a convolution kernel.
Iteratively solve for u and v.
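The steps above can be sketched in plain Python as follows. This is an illustrative sketch, not the block's implementation. The following choices are assumptions that the text does not specify:

- Borders are replicated.
- The averaging kernel is scaled by 1/4.
- The transposed Sobel kernel is taken as the horizontal derivative.
- The loop runs for a fixed iteration count.

The names `correlate3` and `horn_schunck` are hypothetical.

```python
def correlate3(img, k):
    """Apply a 3x3 kernel by correlation; borders are replicated (assumption)."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            s = 0.0
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    rr = min(max(r + dr, 0), h - 1)
                    cc = min(max(c + dc, 0), w - 1)
                    s += k[dr + 1][dc + 1] * img[rr][cc]
            out[r][c] = s
    return out


SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]       # kernel as printed
SOBEL_X = [list(col) for col in zip(*SOBEL_Y)]       # its transposed form
# Neighborhood average; the 1/4 scaling is an assumption.
AVG = [[0, 0.25, 0], [0.25, 0, 0.25], [0, 0.25, 0]]


def horn_schunck(img1, img2, alpha=1.0, iterations=20):
    h, w = len(img1), len(img1[0])
    ix = correlate3(img1, SOBEL_X)                   # spatial derivatives, image 1
    iy = correlate3(img1, SOBEL_Y)
    # Temporal derivative with the [-1 1] kernel: image 2 minus image 1.
    it = [[img2[r][c] - img1[r][c] for c in range(w)] for r in range(h)]
    u = [[0.0] * w for _ in range(h)]                # initial velocity is 0
    v = [[0.0] * w for _ in range(h)]
    for _ in range(iterations):
        ub = correlate3(u, AVG)                      # neighborhood averages
        vb = correlate3(v, AVG)
        for r in range(h):
            for c in range(w):
                gx, gy = ix[r][c], iy[r][c]
                t = (gx * ub[r][c] + gy * vb[r][c] + it[r][c]) / (
                    alpha ** 2 + gx ** 2 + gy ** 2)
                u[r][c] = ub[r][c] - gx * t
                v[r][c] = vb[r][c] - gy * t
    return u, v


# A horizontal intensity ramp shifted one pixel to the right.
ramp = [[float(c) for c in range(6)] for _ in range(6)]
shifted = [[c - 1.0 for c in range(6)] for _ in range(6)]
u, v = horn_schunck(ramp, shifted)
print(round(u[3][3], 3), v[3][3])
```

In this sketch the Sobel kernel is applied as printed, without normalization, so a unit ramp has a gradient of 8 and the one-pixel shift yields a horizontal flow of roughly 1/8 rather than 1. Whether the block rescales the gradients is not stated here.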
To solve the optical flow constraint equation for u and v, the Lucas-Kanade method divides the original image into smaller sections and assumes a constant velocity in each section. Then, it performs a weighted least-squares fit of the optical flow constraint equation to a constant model for $${\left[\begin{array}{cc}u& v\end{array}\right]}^{T}$$ in each section, $$\Omega $$, by minimizing the following equation:
$$\sum _{x\in \Omega}{W}^{2}{[{I}_{x}u+{I}_{y}v+{I}_{t}]}^{2}$$
Here, W is a window function that emphasizes the constraints at the center of each section. The solution to the minimization problem is given by the following equation:
$$\left[\begin{array}{cc}{\displaystyle \sum {W}^{2}{I}_{x}^{2}}& {\displaystyle \sum {W}^{2}{I}_{x}{I}_{y}}\\ {\displaystyle \sum {W}^{2}{I}_{y}{I}_{x}}& {\displaystyle \sum {W}^{2}{I}_{y}^{2}}\end{array}\right]\left[\begin{array}{c}u\\ v\end{array}\right]=-\left[\begin{array}{c}{\displaystyle \sum {W}^{2}{I}_{x}{I}_{t}}\\ {\displaystyle \sum {W}^{2}{I}_{y}{I}_{t}}\end{array}\right]$$
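As a minimal sketch of how the sums in this system are formed, the following code accumulates the five weighted sums over one section and checks that a known flow satisfies the resulting equations. The sample gradients, the weights, and the function name `normal_equation_sums` are hypothetical.

```python
def normal_equation_sums(ix, iy, it, w):
    """Return (a, b, c, d, e) for one section, where
    a = sum W^2 Ix^2, b = sum W^2 Ix Iy, c = sum W^2 Iy^2,
    d = sum W^2 Ix It, e = sum W^2 Iy It."""
    a = sum(wi * wi * x * x for wi, x in zip(w, ix))
    b = sum(wi * wi * x * y for wi, x, y in zip(w, ix, iy))
    c = sum(wi * wi * y * y for wi, y in zip(w, iy))
    d = sum(wi * wi * x * t for wi, x, t in zip(w, ix, it))
    e = sum(wi * wi * y * t for wi, y, t in zip(w, iy, it))
    return a, b, c, d, e


# Three pixels in a section, with a window that emphasizes the center pixel.
ix = [1.0, 0.0, 2.0]
iy = [0.0, 1.0, 1.0]
w = [1.0, 2.0, 1.0]

# Generate It from a known flow so that every pixel satisfies the constraint.
true_u, true_v = 0.5, -0.25
it = [-(x * true_u + y * true_v) for x, y in zip(ix, iy)]

a, b, c, d, e = normal_equation_sums(ix, iy, it, w)

# The known flow satisfies A [u v]^T = -[d e]^T.
print(a * true_u + b * true_v, -d)
print(b * true_u + c * true_v, -e)
```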
When you choose the Lucas-Kanade method, $${I}_{t}$$ is computed using a difference filter or a derivative of a Gaussian filter.
The two following sections explain how $${I}_{x}$$, $${I}_{y}$$, $${I}_{t}$$, and then u and v are computed.
When you set the Temporal gradient filter to Difference filter [-1 1], u and v are solved as follows:
Compute $${I}_{x}$$ and $${I}_{y}$$ using the kernel $$\left[\begin{array}{ccccc}-1& 8& 0& -8& 1\end{array}\right]/12$$ and its transposed form.
If you are working with fixed-point data types, the kernel values are signed fixed-point values with word length equal to 16 and fraction length equal to 15.
Compute $${I}_{t}$$ between images 1 and 2 using the $$\left[\begin{array}{cc}-1& 1\end{array}\right]$$ kernel.
Smooth the gradient components, $${I}_{x}$$, $${I}_{y}$$, and $${I}_{t}$$, using a separable and isotropic 5-by-5 element kernel whose effective 1-D coefficients are $$\left[\begin{array}{ccccc}1& 4& 6& 4& 1\end{array}\right]/16$$. If you are working with fixed-point data types, the kernel values are unsigned fixed-point values with word length equal to 8 and fraction length equal to 7.
Solve the 2-by-2 linear equations for each pixel using the following method:
If $$A=\left[\begin{array}{cc}a& b\\ b& c\end{array}\right]=\left[\begin{array}{cc}{\displaystyle \sum {W}^{2}{I}_{x}^{2}}& {\displaystyle \sum {W}^{2}{I}_{x}{I}_{y}}\\ {\displaystyle \sum {W}^{2}{I}_{y}{I}_{x}}& {\displaystyle \sum {W}^{2}{I}_{y}^{2}}\end{array}\right]$$
Then the eigenvalues of A are $${\lambda}_{i}=\frac{a+c}{2}\pm \frac{\sqrt{4{b}^{2}+{(a-c)}^{2}}}{2};i=1,2$$
In the fixed-point diagrams, $$P=\frac{a+c}{2},Q=\frac{\sqrt{4{b}^{2}+{(a-c)}^{2}}}{2}$$
The eigenvalues are compared to the threshold, $$\tau $$, that corresponds to the value you enter for the Threshold for noise reduction parameter. The results fall into one of the following cases:
Case 1: $${\lambda}_{1}\ge \tau $$ and $${\lambda}_{2}\ge \tau $$
A is nonsingular, so the system of equations is solved using Cramer's rule.
Case 2: $${\lambda}_{1}\ge \tau $$ and $${\lambda}_{2}<\tau $$
A is singular (noninvertible), so the gradient flow is normalized to calculate u and v.
Case 3: $${\lambda}_{1}<\tau $$ and $${\lambda}_{2}<\tau $$
The optical flow, u and v, is 0.
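The eigenvalue test and the three cases can be sketched as follows. The eigenvalue formula and Cramer's rule follow the text directly. The Case 2 branch is an assumption: the text says only that the gradient flow is normalized, and this sketch reads that as the minimum-norm (normal-flow) solution, which for a rank-one A is the right-hand side divided by the trace a + c. The function names are hypothetical, and the threshold is assumed to be positive.

```python
import math


def eigenvalues_2x2(a, b, c):
    """Eigenvalues of [[a, b], [b, c]], larger one first."""
    p = (a + c) / 2.0
    q = math.sqrt(4.0 * b * b + (a - c) ** 2) / 2.0
    return p + q, p - q


def solve_flow(a, b, c, d, e, tau):
    """Solve [[a, b], [b, c]] [u v]^T = -[d e]^T with the eigenvalue test.

    d and e are the sums of W^2*Ix*It and W^2*Iy*It. tau must be positive.
    """
    lam1, lam2 = eigenvalues_2x2(a, b, c)
    if lam2 >= tau:
        # Case 1: both eigenvalues pass, so use Cramer's rule.
        det = a * c - b * b
        return (-d * c + b * e) / det, (-a * e + b * d) / det
    if lam1 >= tau:
        # Case 2 (assumed interpretation): flow along the gradient only.
        trace = a + c
        return -d / trace, -e / trace
    # Case 3: both eigenvalues fall below the threshold.
    return 0.0, 0.0


print(eigenvalues_2x2(2.0, 1.0, 2.0))
print(solve_flow(4.0, 0.0, 4.0, -4.0, 0.0, tau=1.0))   # Case 1
print(solve_flow(4.0, 0.0, 0.0, -4.0, 0.0, tau=1.0))   # Case 2
print(solve_flow(4.0, 0.0, 0.0, -4.0, 0.0, tau=10.0))  # Case 3
```

Because the larger eigenvalue is listed first, testing the smaller one against the threshold is enough to detect Case 1.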
If you set the Temporal gradient filter parameter to Derivative of Gaussian, u and v are solved using the following steps. You can see the flow chart for this process at the end of this section.
Compute $${I}_{x}$$ and $${I}_{y}$$ using the following steps:
Use a Gaussian filter to perform temporal filtering. Specify the temporal filter characteristics such as the standard deviation and number of filter coefficients using the Number of frames to buffer for temporal smoothing parameter.
Use a Gaussian filter and the derivative of a Gaussian filter to smooth the image using spatial filtering. Specify the standard deviation and length of the image smoothing filter using the Standard deviation for image smoothing filter parameter.
Compute $${I}_{t}$$ between images 1 and 2 using the following steps:
Use the derivative of a Gaussian filter to perform temporal filtering. Specify the temporal filter characteristics such as the standard deviation and number of filter coefficients using the Number of frames to buffer for temporal smoothing parameter.
Use the spatial filter described above for computing $${I}_{x}$$ and $${I}_{y}$$ to perform spatial filtering on the output of the temporal filter.
Smooth the gradient components, $${I}_{x}$$, $${I}_{y}$$, and $${I}_{t}$$, using a gradient smoothing filter. Use the Standard deviation for gradient smoothing filter parameter to specify the standard deviation and the number of filter coefficients for the gradient smoothing filter.
Solve the 2-by-2 linear equations for each pixel using the following method:
If $$A=\left[\begin{array}{cc}a& b\\ b& c\end{array}\right]=\left[\begin{array}{cc}{\displaystyle \sum {W}^{2}{I}_{x}^{2}}& {\displaystyle \sum {W}^{2}{I}_{x}{I}_{y}}\\ {\displaystyle \sum {W}^{2}{I}_{y}{I}_{x}}& {\displaystyle \sum {W}^{2}{I}_{y}^{2}}\end{array}\right]$$
Then the eigenvalues of A are $${\lambda}_{i}=\frac{a+c}{2}\pm \frac{\sqrt{4{b}^{2}+{(a-c)}^{2}}}{2};i=1,2$$
When the block finds the eigenvalues, it compares them to the threshold, $$\tau $$, that corresponds to the value you enter for the Threshold for noise reduction parameter. The results fall into one of the following cases:
Case 1: $${\lambda}_{1}\ge \tau $$ and $${\lambda}_{2}\ge \tau $$
A is nonsingular, so the block solves the system of equations using Cramer's rule.
Case 2: $${\lambda}_{1}\ge \tau $$ and $${\lambda}_{2}<\tau $$
A is singular (noninvertible), so the block normalizes the gradient flow to calculate u and v.
Case 3: $${\lambda}_{1}<\tau $$ and $${\lambda}_{2}<\tau $$
The optical flow, u and v, is 0.
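The Gaussian and derivative-of-Gaussian filters used in the filtering steps of this variant can be sketched in 1-D as follows. How the block derives the standard deviation and the number of coefficients from its parameters is not described here, so `sigma` and `length` are passed explicitly, and all function names are hypothetical.

```python
import math


def gaussian_kernel(sigma, length):
    """Sampled Gaussian, centered and normalized to sum to 1."""
    half = (length - 1) / 2.0
    raw = [math.exp(-((i - half) ** 2) / (2.0 * sigma ** 2)) for i in range(length)]
    total = sum(raw)
    return [r / total for r in raw]


def gaussian_derivative_kernel(sigma, length):
    """Derivative of the sampled Gaussian: -(x / sigma^2) * g(x)."""
    half = (length - 1) / 2.0
    g = gaussian_kernel(sigma, length)
    return [-(i - half) / sigma ** 2 * gi for i, gi in enumerate(g)]


def convolve_valid(signal, kernel):
    """1-D convolution, keeping only the fully overlapping samples."""
    n, m = len(signal), len(kernel)
    return [sum(kernel[j] * signal[i + m - 1 - j] for j in range(m))
            for i in range(n - m + 1)]


g = gaussian_kernel(sigma=1.0, length=7)
dg = gaussian_derivative_kernel(sigma=1.0, length=7)

ramp = [float(x) for x in range(12)]
smoothed_constant = convolve_valid([5.0] * 12, g)   # smoothing keeps a constant
ramp_slope = convolve_valid(ramp, dg)               # close to the true slope of 1

print(round(sum(g), 6), round(sum(dg), 6))
print([round(s, 3) for s in ramp_slope])
```

The derivative response to a unit ramp is slightly below 1 because the kernel is truncated to a finite number of coefficients.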
The following diagrams show the data types used in the Optical Flow block for fixed-point signals. The block supports fixed-point data types only when the Method parameter is set to Lucas-Kanade.
You can set the product output, accumulator, gradients, threshold, and output data types in the block mask.
The Main pane of the Optical Flow dialog box appears as shown in the following figure.
Select the method the block uses to calculate the optical flow. Your choices are Horn-Schunck or Lucas-Kanade.
Select Two images to compute the optical flow between two images. Select Current frame and N-th frame back to compute the optical flow between two video frames that are N frames apart.
This parameter is visible if you set the Method parameter to Horn-Schunck or you set the Method parameter to Lucas-Kanade and the Temporal gradient filter to Difference filter [-1 1].
Enter a scalar value that represents the number of frames between the reference frame and the current frame. This parameter becomes available if, for the Compute optical flow between parameter, you select Current frame and N-th frame back.
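As an illustration of pairing the current frame with the frame N frames back, the following sketch keeps the last N+1 frames in a buffer. The class name `FrameDelay` is hypothetical, and returning `None` until enough frames have arrived is an assumption; the text does not say what the block outputs during that start-up period.

```python
from collections import deque


class FrameDelay:
    """Pair each incoming frame with the frame received n frames earlier."""

    def __init__(self, n):
        if n < 1:
            raise ValueError("n must be a positive integer")
        self._frames = deque(maxlen=n + 1)

    def push(self, frame):
        """Add a frame; return (reference, current) or None while filling."""
        self._frames.append(frame)
        if len(self._frames) < self._frames.maxlen:
            return None
        return self._frames[0], self._frames[-1]


delay = FrameDelay(n=2)
pairs = [delay.push(f"frame{i}") for i in range(5)]
print(pairs)
```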
If the relative motion between the two images or video frames is large, enter a large positive scalar value. If the relative motion is small, enter a small positive scalar value. This parameter becomes available if you set the Method parameter to Horn-Schunck.
Use this parameter to control when the block's iterative solution process stops. If you want it to stop when the velocity difference is below a certain threshold value, select When velocity difference falls below threshold. If you want it to stop after a certain number of iterations, choose When maximum number of iterations is reached. You can also select Whichever comes first. This parameter becomes available if you set the Method parameter to Horn-Schunck.
Enter a scalar value that represents the maximum number of iterations you want the block to perform. This parameter is only visible if, for the Stop iterative solution parameter, you select When maximum number of iterations is reached or Whichever comes first. This parameter becomes available if you set the Method parameter to Horn-Schunck.
Enter a scalar threshold value. This parameter is only visible if, for the Stop iterative solution parameter, you select When velocity difference falls below threshold or Whichever comes first. This parameter becomes available if you set the Method parameter to Horn-Schunck.
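The three stopping options can be sketched as a generic iteration loop. The mode names, the function name `iterate`, and the definition of the velocity difference as the largest absolute change between successive estimates are assumptions made for illustration only.

```python
def iterate(step, u0, mode, max_iterations=10, threshold=1e-3):
    """Apply `step` repeatedly until the chosen stopping rule is met.

    mode is "threshold", "max_iterations", or "either".
    Returns the final estimate and the number of iterations performed.
    """
    if mode not in ("threshold", "max_iterations", "either"):
        raise ValueError("unknown mode: " + mode)
    u, k = list(u0), 0
    while True:
        u_next = step(u)
        k += 1
        # Velocity difference between successive estimates (assumed metric).
        diff = max(abs(a - b) for a, b in zip(u_next, u))
        u = u_next
        below = diff < threshold
        reached = k >= max_iterations
        if mode == "threshold" and below:
            break
        if mode == "max_iterations" and reached:
            break
        if mode == "either" and (below or reached):
            break
    return u, k


# A toy update that halves the distance to 1.0 on every iteration.
def halve(u):
    return [0.5 * x + 0.5 for x in u]


print(iterate(halve, [0.0], "threshold", threshold=0.1))
print(iterate(halve, [0.0], "max_iterations", max_iterations=3))
print(iterate(halve, [0.0], "either", max_iterations=3, threshold=0.1))
```

With the threshold option alone, the loop has no upper bound on the iteration count, which is one reason to prefer stopping on whichever condition comes first when convergence is uncertain.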
If you select Magnitude-squared, the block outputs the optical flow matrix where each element is of the form $${u}^{2}+{v}^{2}$$. If you select Horizontal and vertical components in complex form, the block outputs the optical flow matrix where each element is of the form $$u+jv$$.
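The two output forms can be sketched from the same pair of component matrices. The function names and the sample values are hypothetical.

```python
def magnitude_squared(u, v):
    """Each element is u^2 + v^2."""
    return [[a * a + b * b for a, b in zip(ru, rv)] for ru, rv in zip(u, v)]


def complex_form(u, v):
    """Each element is u + jv."""
    return [[complex(a, b) for a, b in zip(ru, rv)] for ru, rv in zip(u, v)]


u = [[3.0, 0.0]]
v = [[4.0, -1.0]]

print(magnitude_squared(u, v))
print(complex_form(u, v))
```

The complex form keeps the direction of motion, while the magnitude-squared form keeps only the squared speed.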
Specify whether the block solves for u and v using a difference filter or a derivative of a Gaussian filter. This parameter becomes available if you set the Method parameter to Lucas-Kanade.
Use this parameter to specify the temporal filter characteristics such as the standard deviation and number of filter coefficients. This parameter becomes available if you set the Temporal gradient filter parameter to Derivative of Gaussian.
Specify the standard deviation for the image smoothing filter. This parameter becomes available if you set the Temporal gradient filter parameter to Derivative of Gaussian.
Specify the standard deviation for the gradient smoothing filter. This parameter becomes available if you set the Temporal gradient filter parameter to Derivative of Gaussian.
Select this check box if you want the block to set the motion vector to zero when the optical flow constraint equation is ill-conditioned. This parameter becomes available if you set the Temporal gradient filter parameter to Derivative of Gaussian.
Select this check box if you want the block to output the image that corresponds to the motion vector being output by the block. This parameter becomes available if you set the Temporal gradient filter parameter to Derivative of Gaussian.
Enter a scalar value that determines the motion threshold between each image or video frame. The higher the number, the less small movements impact the optical flow calculation. This parameter becomes available if you set the Method parameter to Lucas-Kanade.
The Data Types pane of the Optical Flow dialog box appears as shown in the following figure. The parameters on this pane become visible only when the Lucas-Kanade method is selected.
Select the rounding mode for fixed-point operations.
Select the overflow mode for fixed-point operations.
Use this parameter to specify how to designate the product output word and fraction lengths.
When you select Binary point scaling, you can enter the word length and the fraction length of the product output in bits.
When you select Slope and bias scaling, you can enter the word length in bits and the slope of the product output. The bias of all signals in the Computer Vision System Toolbox™ blocks is 0.
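To make word length and fraction length concrete, the following sketch quantizes a real value to a binary-point-scaled fixed-point number, which corresponds to a slope of 2^(-fraction length) and a bias of 0. Rounding to nearest and saturating on overflow are assumptions chosen for the example; the block's behavior depends on the rounding and overflow modes you select. The function name `quantize` is hypothetical.

```python
import math


def quantize(value, word_length, fraction_length, signed=True):
    """Return (stored_integer, represented_value) for binary point scaling."""
    scale = 2 ** fraction_length                 # slope is 1 / scale, bias is 0
    stored = math.floor(value * scale + 0.5)     # round to nearest (assumption)
    if signed:
        lo, hi = -(2 ** (word_length - 1)), 2 ** (word_length - 1) - 1
    else:
        lo, hi = 0, 2 ** word_length - 1
    stored = min(max(stored, lo), hi)            # saturate on overflow (assumption)
    return stored, stored / scale


# A coefficient of 8/12 stored as signed, word length 16, fraction length 15.
print(quantize(8 / 12, 16, 15, signed=True))
# A coefficient of 6/16 stored as unsigned, word length 8, fraction length 7.
print(quantize(6 / 16, 8, 7, signed=False))
# A value of 1.0 does not fit in signed 16/15 and saturates just below 1.
print(quantize(1.0, 16, 15, signed=True))
```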
Use this parameter to specify how to designate the accumulator word and fraction lengths.
When you select Same as product output, these characteristics match those of the product output.
When you select Binary point scaling, you can enter the word length and the fraction length of the accumulator in bits.
When you select Slope and bias scaling, you can enter the word length in bits and the slope of the accumulator. The bias of all signals in the Computer Vision System Toolbox blocks is 0.
Choose how to specify the word length and fraction length of the gradients data type:
When you select Same as accumulator, these characteristics match those of the accumulator.
When you select Same as product output, these characteristics match those of the product output.
When you select Binary point scaling, you can enter the word length and the fraction length of the gradients, in bits.
When you select Slope and bias scaling, you can enter the word length in bits and the slope of the gradients. The bias of all signals in the Computer Vision System Toolbox blocks is 0.
Choose how to specify the word length and fraction length of the threshold data type:
When you select Same word length as first input, the threshold word length matches that of the first input.
When you select Specify word length, enter the word length of the threshold data type.
When you select Binary point scaling, you can enter the word length and the fraction length of the threshold, in bits.
When you select Slope and bias scaling, you can enter the word length in bits and the slope of the threshold. The bias of all signals in the Computer Vision System Toolbox blocks is 0.
Choose how to specify the word length and fraction length of the output data type:
When you select Binary point scaling, you can enter the word length and the fraction length of the output, in bits.
When you select Slope and bias scaling, you can enter the word length in bits and the slope of the output. The bias of all signals in the Computer Vision System Toolbox blocks is 0.
Select this parameter to prevent the fixed-point tools from overriding the data types you specify on the block mask. For more information, see fxptdlg, a reference page on the Fixed-Point Tool in the Simulink® documentation.