Estimate object velocities

Analysis & Enhancement

`visionanalysis`

The Optical Flow block estimates the direction and speed of object motion from one image to another or from one video frame to another using either the Horn-Schunck or the Lucas-Kanade method.

Port | Input/Output | Supported Data Types | Complex Values Supported |
---|---|---|---|
I/I1 | Scalar, vector, or matrix of intensity values | Double-precision floating point; single-precision floating point; fixed point (supported when the **Method** parameter is set to `Lucas-Kanade`) | No |
I2 | Scalar, vector, or matrix of intensity values | Same as I port | No |
\|V\|^2 | Matrix of velocity magnitudes | Same as I port | No |
V | Matrix of velocity components in complex form | Same as I port | Yes |

To compute the optical flow between two images, you must solve the following optical flow constraint equation:

$${I}_{x}u+{I}_{y}v+{I}_{t}=0$$

$${I}_{x}$$, $${I}_{y}$$, and $${I}_{t}$$ are the spatiotemporal image brightness derivatives.

*u* is the horizontal optical flow. *v* is the vertical optical flow.
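As a quick check of the constraint equation, the sketch below builds a linear brightness ramp that translates with a known velocity and verifies that the derivatives satisfy the equation. The ramp coefficients, the velocity, and the helper names `brightness` and `derivatives` are illustrative assumptions, not part of the block.

```python
import math

def brightness(x, y, t, a=2.0, b=1.0, u=0.5, v=-0.25):
    """Linear brightness ramp translating with velocity (u, v)."""
    return a * (x - u * t) + b * (y - v * t)

def derivatives(x, y, t, h=1e-3):
    """Central-difference estimates of I_x, I_y, and I_t."""
    ix = (brightness(x + h, y, t) - brightness(x - h, y, t)) / (2 * h)
    iy = (brightness(x, y + h, t) - brightness(x, y - h, t)) / (2 * h)
    it = (brightness(x, y, t + h) - brightness(x, y, t - h)) / (2 * h)
    return ix, iy, it

ix, iy, it = derivatives(3.0, 4.0, 1.0)
u_true, v_true = 0.5, -0.25

# Left-hand side of the constraint equation; it should vanish.
residual = ix * u_true + iy * v_true + it

# One equation cannot fix two unknowns: only the flow component along
# the brightness gradient (the normal flow) is determined.
normal_speed = -it / math.hypot(ix, iy)
print(residual, normal_speed)
```

Because a single pixel supplies only one equation for two unknowns, both methods described next add an extra assumption (global smoothness or a locally constant velocity) to recover *u* and *v*.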

By assuming that the optical flow is smooth over the entire image, the Horn-Schunck method computes an estimate of the velocity field, $${\left[\begin{array}{cc}u& v\end{array}\right]}^{T}$$, that minimizes this equation:

$$E=\iint {\left({I}_{x}u+{I}_{y}v+{I}_{t}\right)}^{2}dxdy+\alpha \iint \left\{{\left(\frac{\partial u}{\partial x}\right)}^{2}+{\left(\frac{\partial u}{\partial y}\right)}^{2}+{\left(\frac{\partial v}{\partial x}\right)}^{2}+{\left(\frac{\partial v}{\partial y}\right)}^{2}\right\}dxdy$$

In this equation, $$\frac{\partial u}{\partial x}$$ and $$\frac{\partial u}{\partial y}$$ are the spatial derivatives of the optical velocity component, *u*, and $$\alpha $$ scales the global smoothness term. The Horn-Schunck method minimizes the previous equation to obtain the velocity field, [*u v*], for each pixel in the image. The minimization is carried out with the following iterative update equations:

$$\begin{array}{l}{u}_{x,y}^{k+1}={\overline{u}}_{x,y}^{k}-\frac{{I}_{x}[{I}_{x}{\overline{u}}^{k}{}_{x,y}+{I}_{y}{\overline{v}}^{k}{}_{x,y}+{I}_{t}]}{{\alpha}^{2}+{I}_{x}^{2}+{I}_{y}^{2}}\\ {v}_{x,y}^{k+1}={\overline{v}}_{x,y}^{k}-\frac{{I}_{y}[{I}_{x}{\overline{u}}^{k}{}_{x,y}+{I}_{y}{\overline{v}}^{k}{}_{x,y}+{I}_{t}]}{{\alpha}^{2}+{I}_{x}^{2}+{I}_{y}^{2}}\end{array}$$


In these equations, $$\left[\begin{array}{cc}{u}_{x,y}^{k}& {v}_{x,y}^{k}\end{array}\right]$$ is
the velocity estimate for the pixel at (*x,y*),
and $$\left[\begin{array}{cc}{\overline{u}}_{x,y}^{k}& {\overline{v}}_{x,y}^{k}\end{array}\right]$$ is
the neighborhood average of $$\left[\begin{array}{cc}{u}_{x,y}^{k}& {v}_{x,y}^{k}\end{array}\right]$$.
For *k = 0*, the initial velocity is 0.
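The update equations can be written for a single pixel as in the following minimal sketch. The function name `horn_schunck_update` and the sample derivative values are assumptions chosen for illustration.

```python
def horn_schunck_update(u_avg, v_avg, ix, iy, it, alpha):
    """One Horn-Schunck update for a single pixel.

    u_avg, v_avg: neighborhood averages of the current velocity estimate.
    ix, iy, it:   brightness derivatives at the pixel.
    alpha:        smoothness weight.
    """
    common = (ix * u_avg + iy * v_avg + it) / (alpha**2 + ix**2 + iy**2)
    return u_avg - ix * common, v_avg - iy * common

# First iteration (k = 0): the initial velocity, and so its average, is 0.
u1, v1 = horn_schunck_update(0.0, 0.0, ix=2.0, iy=1.0, it=-0.75, alpha=1.0)
print(u1, v1)
```

Both components share the same bracketed term and denominator, so the update moves the averaged velocity along the brightness gradient by an amount that shrinks as `alpha` grows.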

To solve for *u* and *v* using the Horn-Schunck method:

1. Compute $${I}_{x}$$ and $${I}_{y}$$ using the Sobel convolution kernel, $$\left[\begin{array}{ccc}-1& -2& -1\\ 0& 0& 0\\ 1& 2& 1\end{array}\right]$$, and its transposed form, for each pixel in the first image.
2. Compute $${I}_{t}$$ between images 1 and 2 using the $$\left[\begin{array}{cc}-1& 1\end{array}\right]$$ kernel.
3. Assume the previous velocity to be 0, and compute the average velocity for each pixel using $$\left[\begin{array}{ccc}0& 1& 0\\ 1& 0& 1\\ 0& 1& 0\end{array}\right]$$ as a convolution kernel.
4. Iteratively solve for *u* and *v*.
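The four steps above can be sketched in plain Python as follows. This is an illustration, not the block's implementation. It makes several assumptions: the kernels are applied by correlation with replicated borders, the Sobel response is divided by 8, the averaging kernel is divided by 4, the temporal difference is taken as image 2 minus image 1, and the iteration count is fixed. The names `filter2d` and `horn_schunck` are hypothetical.

```python
def filter2d(img, kernel):
    """Correlate img with a 3-by-3 kernel, replicating border pixels."""
    rows, cols = len(img), len(img[0])
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            acc = 0.0
            for i in range(3):
                for j in range(3):
                    rr = min(max(r + i - 1, 0), rows - 1)
                    cc = min(max(c + j - 1, 0), cols - 1)
                    acc += kernel[i][j] * img[rr][cc]
            out[r][c] = acc
    return out

SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]           # kernel from the text
SOBEL_X = [list(col) for col in zip(*SOBEL_Y)]            # its transposed form
AVERAGE = [[0, 0.25, 0], [0.25, 0, 0.25], [0, 0.25, 0]]   # text kernel / 4 (assumed)

def horn_schunck(img1, img2, alpha=1.0, iterations=20):
    rows, cols = len(img1), len(img1[0])
    # Step 1: spatial derivatives of the first image (Sobel / 8, assumed scaling).
    ix = [[g / 8 for g in row] for row in filter2d(img1, SOBEL_X)]
    iy = [[g / 8 for g in row] for row in filter2d(img1, SOBEL_Y)]
    # Step 2: temporal derivative with the [-1 1] kernel, taken as img2 - img1.
    it = [[b - a for a, b in zip(r1, r2)] for r1, r2 in zip(img1, img2)]
    # Step 3: start from zero velocity.
    u = [[0.0] * cols for _ in range(rows)]
    v = [[0.0] * cols for _ in range(rows)]
    # Step 4: iterate the update equations.
    for _ in range(iterations):
        u_avg, v_avg = filter2d(u, AVERAGE), filter2d(v, AVERAGE)
        for r in range(rows):
            for c in range(cols):
                gx, gy, gt = ix[r][c], iy[r][c], it[r][c]
                common = (gx * u_avg[r][c] + gy * v_avg[r][c] + gt) / (
                    alpha**2 + gx**2 + gy**2)
                u[r][c] = u_avg[r][c] - gx * common
                v[r][c] = v_avg[r][c] - gy * common
    return u, v

# A horizontal brightness ramp that moves one pixel to the right.
img1 = [[float(c) for c in range(8)] for _ in range(8)]
img2 = [[float(c - 1) for c in range(8)] for _ in range(8)]
u, v = horn_schunck(img1, img2)
print(round(u[4][4], 3), round(v[4][4], 3))
```

For this ramp the vertical derivative is zero everywhere, so the vertical flow stays at 0 and the horizontal flow at the image center is positive, matching the rightward shift.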

To solve the optical flow constraint equation for *u* and *v*, the Lucas-Kanade method divides the original image into smaller sections and assumes a constant velocity in each section. Then, it performs a weighted least-squares fit of the optical flow constraint equation to a constant model for $${\left[\begin{array}{cc}u& v\end{array}\right]}^{T}$$ in each section $$\Omega $$. The method achieves this fit by minimizing the following equation:

$$\sum _{x\in \Omega}{W}^{2}{[{I}_{x}u+{I}_{y}v+{I}_{t}]}^{2}$$

*W* is a window function that emphasizes
the constraints at the center of each section. The solution to the
minimization problem is

$$\left[\begin{array}{cc}{\displaystyle \sum {W}^{2}{I}_{x}^{2}}& {\displaystyle \sum {W}^{2}{I}_{x}{I}_{y}}\\ {\displaystyle \sum {W}^{2}{I}_{y}{I}_{x}}& {\displaystyle \sum {W}^{2}{I}_{y}^{2}}\end{array}\right]\left[\begin{array}{c}u\\ v\end{array}\right]=-\left[\begin{array}{c}{\displaystyle \sum {W}^{2}{I}_{x}{I}_{t}}\\ {\displaystyle \sum {W}^{2}{I}_{y}{I}_{t}}\end{array}\right]$$

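The following sketch forms the sums in the 2-by-2 system for one section and solves it directly. The function name `lucas_kanade_section`, the three-pixel section, and the weights are assumptions made for illustration.

```python
def lucas_kanade_section(ix, iy, it, w):
    """Solve the weighted least-squares system for one section.

    ix, iy, it, w are equal-length sequences holding the derivatives and
    the window weights W for every pixel in the section.
    """
    a = sum(wi**2 * gx * gx for wi, gx in zip(w, ix))
    b = sum(wi**2 * gx * gy for wi, gx, gy in zip(w, ix, iy))
    c = sum(wi**2 * gy * gy for wi, gy in zip(w, iy))
    rx = -sum(wi**2 * gx * gt for wi, gx, gt in zip(w, ix, it))
    ry = -sum(wi**2 * gy * gt for wi, gy, gt in zip(w, iy, it))
    det = a * c - b * b
    if det == 0:
        raise ValueError("singular system: gradients share one direction")
    # Cramer's rule for [[a, b], [b, c]] [u, v]^T = [rx, ry]^T.
    return (rx * c - b * ry) / det, (a * ry - b * rx) / det

# Three pixels whose derivatives are consistent with (u, v) = (1.0, -0.5).
ix = [1.0, 0.0, 2.0]
iy = [0.0, 1.0, 1.0]
it = [-(gx * 1.0 + gy * -0.5) for gx, gy in zip(ix, iy)]
w = [0.5, 1.0, 0.5]
u, v = lucas_kanade_section(ix, iy, it, w)
print(u, v)
```

The system has a unique solution only when the gradients in the section point in more than one direction, which is why the block checks the eigenvalues of the matrix before solving.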

When you set the **Temporal gradient filter** parameter to `Difference filter [-1 1]`, *u* and *v* are solved as follows:

1. Compute $${I}_{x}$$ and $${I}_{y}$$ using the kernel $$\left[\begin{array}{ccccc}-1& 8& 0& -8& 1\end{array}\right]/12$$ and its transposed form.

   If you are working with fixed-point data types, the kernel values are signed fixed-point values with word length equal to 16 and fraction length equal to 15.

2. Compute $${I}_{t}$$ between images 1 and 2 using the $$\left[\begin{array}{cc}-1& 1\end{array}\right]$$ kernel.

3. Smooth the gradient components, $${I}_{x}$$, $${I}_{y}$$, and $${I}_{t}$$, using a separable and isotropic 5-by-5 element kernel whose effective 1-D coefficients are $$\left[\begin{array}{ccccc}1& 4& 6& 4& 1\end{array}\right]/16$$. If you are working with fixed-point data types, the kernel values are unsigned fixed-point values with word length equal to 8 and fraction length equal to 7.

4. Solve the 2-by-2 linear equations for each pixel using the following method:

   If $$A=\left[\begin{array}{cc}a& b\\ b& c\end{array}\right]=\left[\begin{array}{cc}{\displaystyle \sum {W}^{2}{I}_{x}^{2}}& {\displaystyle \sum {W}^{2}{I}_{x}{I}_{y}}\\ {\displaystyle \sum {W}^{2}{I}_{y}{I}_{x}}& {\displaystyle \sum {W}^{2}{I}_{y}^{2}}\end{array}\right]$$,

   then the eigenvalues of A are $${\lambda}_{i}=\frac{a+c}{2}\pm \frac{\sqrt{4{b}^{2}+{(a-c)}^{2}}}{2};i=1,2$$

   In the fixed-point diagrams, $$P=\frac{a+c}{2},Q=\frac{\sqrt{4{b}^{2}+{(a-c)}^{2}}}{2}$$

   The eigenvalues are compared to the threshold, $$\tau $$, that corresponds to the value you enter for the **Threshold for noise reduction** parameter. The results fall into one of the following cases:

   Case 1: $${\lambda}_{1}\ge \tau $$ and $${\lambda}_{2}\ge \tau $$

   A is nonsingular, so the system of equations is solved using Cramer's rule.

   Case 2: $${\lambda}_{1}\ge \tau $$ and $${\lambda}_{2}<\tau $$

   A is singular (noninvertible), so the gradient flow is normalized to calculate *u* and *v*.

   Case 3: $${\lambda}_{1}<\tau $$ and $${\lambda}_{2}<\tau $$

   The optical flow, *u* and *v*, is 0.
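To illustrate the two kernels from steps 1 and 3, the sketch below applies them along one image row. It assumes true convolution (the kernel is flipped) with zeros beyond the ends of the row; the block's border handling is not described here. The helper name `convolve_same` is hypothetical.

```python
def convolve_same(signal, kernel):
    """1-D convolution, same length as the input, zero beyond the ends."""
    half = len(kernel) // 2
    out = []
    for n in range(len(signal)):
        acc = 0.0
        for k, h in enumerate(kernel):
            m = n - (k - half)          # convolution: the kernel is flipped
            if 0 <= m < len(signal):
                acc += h * signal[m]
        out.append(acc)
    return out

GRADIENT = [c / 12 for c in (-1, 8, 0, -8, 1)]   # derivative kernel from step 1
SMOOTH = [c / 16 for c in (1, 4, 6, 4, 1)]       # 1-D smoothing coefficients, step 3

row = [float(x * x) for x in range(10)]          # brightness x^2 along one row
dx = convolve_same(row, GRADIENT)                # close to 2x away from the ends
smoothed = convolve_same(dx, SMOOTH)
print(dx[4], smoothed[4])
```

Away from the ends of the row, the gradient kernel reproduces the derivative of the quadratic, and the smoothing kernel leaves that linear result unchanged because its coefficients are symmetric and sum to 1. Applying the 1-D smoothing coefficients along rows and then along columns gives the separable 5-by-5 kernel.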

If you set the **Temporal gradient filter** parameter to `Derivative of Gaussian`, *u* and *v* are solved using the following steps. You can see the flow chart for this process at the end of this section.

1. Compute $${I}_{x}$$ and $${I}_{y}$$ using the following steps:

   a. Use a Gaussian filter to perform temporal filtering. Specify the temporal filter characteristics such as the standard deviation and number of filter coefficients using the **Number of frames to buffer for temporal smoothing** parameter.

   b. Use a Gaussian filter and the derivative of a Gaussian filter to smooth the image using spatial filtering. Specify the standard deviation and length of the image smoothing filter using the **Standard deviation for image smoothing filter** parameter.

2. Compute $${I}_{t}$$ between images 1 and 2 using the following steps:

   a. Use the derivative of a Gaussian filter to perform temporal filtering. Specify the temporal filter characteristics such as the standard deviation and number of filter coefficients using the **Number of frames to buffer for temporal smoothing** parameter.

   b. Use the filter described in step 1b to perform spatial filtering on the output of the temporal filter.

3. Smooth the gradient components, $${I}_{x}$$, $${I}_{y}$$, and $${I}_{t}$$, using a gradient smoothing filter. Use the **Standard deviation for gradient smoothing filter** parameter to specify the standard deviation and the number of filter coefficients for the gradient smoothing filter.

4. Solve the 2-by-2 linear equations for each pixel using the following method:

   a. If $$A=\left[\begin{array}{cc}a& b\\ b& c\end{array}\right]=\left[\begin{array}{cc}{\displaystyle \sum {W}^{2}{I}_{x}^{2}}& {\displaystyle \sum {W}^{2}{I}_{x}{I}_{y}}\\ {\displaystyle \sum {W}^{2}{I}_{y}{I}_{x}}& {\displaystyle \sum {W}^{2}{I}_{y}^{2}}\end{array}\right]$$,

      then the eigenvalues of A are $${\lambda}_{i}=\frac{a+c}{2}\pm \frac{\sqrt{4{b}^{2}+{(a-c)}^{2}}}{2};i=1,2$$

   b. When the block finds the eigenvalues, it compares them to the threshold, $$\tau $$, that corresponds to the value you enter for the **Threshold for noise reduction** parameter. The results fall into one of the following cases:

      Case 1: $${\lambda}_{1}\ge \tau $$ and $${\lambda}_{2}\ge \tau $$

      A is nonsingular, so the block solves the system of equations using Cramer's rule.

      Case 2: $${\lambda}_{1}\ge \tau $$ and $${\lambda}_{2}<\tau $$

      A is singular (noninvertible), so the block normalizes the gradient flow to calculate *u* and *v*.

      Case 3: $${\lambda}_{1}<\tau $$ and $${\lambda}_{2}<\tau $$

      The optical flow, *u* and *v*, is 0.
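The eigenvalue test and the three cases can be sketched as follows. The text does not spell out how the gradient flow is normalized in Case 2, so this sketch assumes one plausible choice: keep only the flow component along the dominant eigenvector of A. The function name `solve_flow` and the sample numbers are also assumptions, and the threshold is assumed to be positive.

```python
import math

def solve_flow(a, b, c, rx, ry, tau):
    """Classify A = [[a, b], [b, c]] by its eigenvalues and solve for (u, v).

    (rx, ry) is the right-hand side of A [u, v]^T = [rx, ry]^T, and tau is
    a positive threshold. Returns (case_number, (u, v)).
    """
    p = (a + c) / 2
    q = math.sqrt(4 * b * b + (a - c) ** 2) / 2
    lam1, lam2 = p + q, p - q                  # lam1 >= lam2
    if lam2 >= tau:
        # Case 1: both eigenvalues pass, so use Cramer's rule.
        det = a * c - b * b
        return 1, ((rx * c - b * ry) / det, (a * ry - b * rx) / det)
    if lam1 >= tau:
        # Case 2 (assumed normalization): project the flow onto the
        # unit eigenvector (ex, ey) that belongs to lam1.
        if b != 0:
            ex, ey = b, lam1 - a
        elif a >= c:
            ex, ey = 1.0, 0.0
        else:
            ex, ey = 0.0, 1.0
        norm = math.hypot(ex, ey)
        ex, ey = ex / norm, ey / norm
        scale = (ex * rx + ey * ry) / lam1
        return 2, (scale * ex, scale * ey)
    # Case 3: neither eigenvalue passes, so the flow is 0.
    return 3, (0.0, 0.0)

case1 = solve_flow(1.25, 0.5, 1.25, 1.0, -0.125, tau=0.1)
case2 = solve_flow(4.0, 0.0, 0.0, 2.0, 0.0, tau=0.1)
case3 = solve_flow(0.01, 0.0, 0.01, 0.5, 0.5, tau=0.1)
print(case1, case2, case3)
```

In the Case 2 example all gradient energy lies along the horizontal axis, so only the horizontal component of the flow can be recovered. Raising `tau` moves more pixels into Cases 2 and 3, which suppresses flow estimates in weakly textured or noisy regions.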

The following diagrams show the data types used in the Optical Flow block for fixed-point signals. The block supports fixed-point data types only when the **Method** parameter is set to `Lucas-Kanade`.

You can set the product output, accumulator, gradients, threshold, and output data types in the block mask.

**Method**

Select the method the block uses to calculate the optical flow. Your choices are `Horn-Schunck` or `Lucas-Kanade`.

**Compute optical flow between**

Select `Two images` to compute the optical flow between two images. Select `Current frame and N-th frame back` to compute the optical flow between two video frames that are N frames apart.

This parameter is visible if you set the **Method** parameter to `Horn-Schunck`, or if you set the **Method** parameter to `Lucas-Kanade` and the **Temporal gradient filter** parameter to `Difference filter [-1 1]`.

**N**

Enter a scalar value that represents the number of frames between the reference frame and the current frame. This parameter becomes available if you set the **Compute optical flow between** parameter to `Current frame and N-th frame back`.

**Smoothness factor**

If the relative motion between the two images or video frames is large, enter a large positive scalar value. If the relative motion is small, enter a small positive scalar value. This parameter becomes available if you set the **Method** parameter to `Horn-Schunck`.

**Stop iterative solution**

Use this parameter to control when the block's iterative solution process stops. If you want it to stop when the velocity difference is below a certain threshold value, select `When velocity difference falls below threshold`. If you want it to stop after a certain number of iterations, choose `When maximum number of iterations is reached`. You can also select `Whichever comes first`. This parameter becomes available if you set the **Method** parameter to `Horn-Schunck`.

**Maximum number of iterations**

Enter a scalar value that represents the maximum number of iterations you want the block to perform. This parameter is visible only if you set the **Method** parameter to `Horn-Schunck` and, for the **Stop iterative solution** parameter, you select `When maximum number of iterations is reached` or `Whichever comes first`.

**Velocity difference threshold**

Enter a scalar threshold value. This parameter is visible only if you set the **Method** parameter to `Horn-Schunck` and, for the **Stop iterative solution** parameter, you select `When velocity difference falls below threshold` or `Whichever comes first`.

**Velocity output**

If you select `Magnitude-squared`, the block outputs the optical flow matrix where each element is of the form $${u}^{2}+{v}^{2}$$. If you select `Horizontal and vertical components in complex form`, the block outputs the optical flow matrix where each element is of the form $$u+jv$$.

**Temporal gradient filter**

Specify whether the block solves for *u* and *v* using a difference filter or a derivative of a Gaussian filter. This parameter becomes available if you set the **Method** parameter to `Lucas-Kanade`.

**Number of frames to buffer for temporal smoothing**

Use this parameter to specify the temporal filter characteristics such as the standard deviation and number of filter coefficients. This parameter becomes available if you set the **Temporal gradient filter** parameter to `Derivative of Gaussian`.

**Standard deviation for image smoothing filter**

Specify the standard deviation for the image smoothing filter. This parameter becomes available if you set the **Temporal gradient filter** parameter to `Derivative of Gaussian`.

**Standard deviation for gradient smoothing filter**

Specify the standard deviation for the gradient smoothing filter. This parameter becomes available if you set the **Temporal gradient filter** parameter to `Derivative of Gaussian`.

**Discard normal flow estimates when constraint equation is ill-conditioned**

Select this check box if you want the block to set the motion vector to zero when the optical flow constraint equation is ill-conditioned. This parameter becomes available if you set the **Temporal gradient filter** parameter to `Derivative of Gaussian`.

**Output image corresponding to motion vectors (accounts for block delay)**

Select this check box if you want the block to output the image that corresponds to the motion vector being output by the block. This parameter becomes available if you set the **Temporal gradient filter** parameter to `Derivative of Gaussian`.

**Threshold for noise reduction**

Enter a scalar value that determines the motion threshold between each image or video frame. The higher the number, the less small movements impact the optical flow calculation. This parameter becomes available if you set the **Method** parameter to `Lucas-Kanade`.
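The two **Velocity output** forms carry the same flow in different shapes. The sketch below builds both from small *u* and *v* matrices; the function name `velocity_outputs` and the sample values are assumptions for illustration.

```python
def velocity_outputs(u, v):
    """Return both output forms for matrices u and v (lists of rows)."""
    complex_form = [[complex(a, b) for a, b in zip(ur, vr)]
                    for ur, vr in zip(u, v)]
    magnitude_squared = [[a * a + b * b for a, b in zip(ur, vr)]
                         for ur, vr in zip(u, v)]
    return complex_form, magnitude_squared

u = [[3.0, 0.0], [1.0, -2.0]]
v = [[4.0, 0.0], [1.0, 0.5]]
V, V2 = velocity_outputs(u, v)
print(V[0][0], V2[0][0])
```

The complex form keeps the direction of motion, with *u* in the real part and *v* in the imaginary part. The magnitude-squared form keeps only the squared speed, which is enough for thresholding moving regions.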

**Rounding mode**

Select the rounding mode for fixed-point operations.

**Overflow mode**

Select the overflow mode for fixed-point operations.

**Product output**

Use this parameter to specify how to designate the product output word and fraction lengths.

- When you select `Binary point scaling`, you can enter the word length and the fraction length of the product output in bits.
- When you select `Slope and bias scaling`, you can enter the word length in bits and the slope of the product output. The bias of all signals in the Computer Vision Toolbox™ blocks is 0.

**Accumulator**

Use this parameter to specify how to designate the accumulator word and fraction lengths.

- When you select `Same as product output`, these characteristics match those of the product output.
- When you select `Binary point scaling`, you can enter the word length and the fraction length of the accumulator in bits.
- When you select `Slope and bias scaling`, you can enter the word length in bits and the slope of the accumulator. The bias of all signals in the Computer Vision Toolbox blocks is 0.

**Gradients**

Choose how to specify the word length and fraction length of the gradients data type:

- When you select `Same as accumulator`, these characteristics match those of the accumulator.
- When you select `Same as product output`, these characteristics match those of the product output.
- When you select `Binary point scaling`, you can enter the word length and the fraction length of the gradients, in bits.
- When you select `Slope and bias scaling`, you can enter the word length in bits and the slope of the gradients. The bias of all signals in the Computer Vision Toolbox blocks is 0.

**Threshold**

Choose how to specify the word length and fraction length of the threshold data type:

- When you select `Same word length as first input`, the threshold word length matches that of the first input.
- When you select `Specify word length`, enter the word length of the threshold data type.
- When you select `Binary point scaling`, you can enter the word length and the fraction length of the threshold, in bits.
- When you select `Slope and bias scaling`, you can enter the word length in bits and the slope of the threshold. The bias of all signals in the Computer Vision Toolbox blocks is 0.

**Output**

Choose how to specify the word length and fraction length of the output data type:

- When you select `Binary point scaling`, you can enter the word length and the fraction length of the output, in bits.
- When you select `Slope and bias scaling`, you can enter the word length in bits and the slope of the output. The bias of all signals in the Computer Vision Toolbox blocks is 0.

**Lock data type settings against change by the fixed-point tools**

Select this parameter to prevent the fixed-point tools from overriding the data types you specify on the block mask. For more information, see `fxptdlg`, a reference page on the Fixed-Point Tool in the Simulink® documentation.
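As a rough illustration of what the word length and fraction length settings mean, the sketch below quantizes two of the kernel coefficients mentioned earlier using binary point scaling, where the represented value is the stored integer times 2 to the power of minus the fraction length (a slope with bias 0). The function name `quantize` is hypothetical, rounding uses Python's `round` (ties go to the even integer), and the two overflow behaviors shown are saturation and wrapping. None of this is taken from the block's implementation.

```python
def quantize(value, word_length, fraction_length, signed=True, saturate=True):
    """Quantize a real value to binary-point-scaled fixed point.

    Returns (stored_integer, represented_value). On overflow the stored
    integer either saturates at the range limits or wraps around.
    """
    slope = 2.0 ** -fraction_length          # binary point scaling, bias 0
    stored = round(value / slope)
    lo = -(1 << (word_length - 1)) if signed else 0
    hi = (1 << (word_length - 1)) - 1 if signed else (1 << word_length) - 1
    if saturate:
        stored = min(max(stored, lo), hi)
    else:
        stored = (stored - lo) % (1 << word_length) + lo
    return stored, stored * slope

# Gradient kernel coefficient 8/12: signed, word length 16, fraction length 15.
g_int, g_val = quantize(8 / 12, 16, 15)
# Smoothing coefficient 6/16: unsigned, word length 8, fraction length 7.
s_int, s_val = quantize(6 / 16, 8, 7, signed=False)
# 1.0 does not fit in a signed 16-bit value with 15 fraction bits.
sat_int, sat_val = quantize(1.0, 16, 15)
wrap_int, wrap_val = quantize(1.0, 16, 15, saturate=False)
print(g_int, g_val, s_int, s_val, sat_val, wrap_val)
```

The smoothing coefficient is represented exactly, while the gradient coefficient picks up a small rounding error. The last two calls show why the overflow mode matters: the same out-of-range value lands just below 1 when saturating and at -1 when wrapping.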

