# normalize

Normalize data

## Description

N = normalize(A) returns the vectorwise z-score of the data in A with center 0 and standard deviation 1.

• If A is a vector, then normalize operates on the entire vector.

• If A is a matrix, table, or timetable, then normalize operates on each column of data separately.

• If A is a multidimensional array, then normalize operates along the first array dimension whose size does not equal 1.
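The dimension rule in the last bullet can be checked with a small sketch. The array below is an arbitrary illustration that does not come from the original page: it has size 1-by-3-by-2, so the first dimension whose size is not 1 is dimension 2, and the default call should agree with an explicit dim of 2.

```matlab
% Hypothetical 1-by-3-by-2 array: dimension 1 has size 1, so normalize
% is expected to operate along dimension 2 by default.
A = cat(3, [1 2 3], [10 20 40]);

Ndefault  = normalize(A);      % dimension chosen automatically
Nexplicit = normalize(A, 2);   % dimension 2 requested explicitly

% Both calls should produce the same array.
sameResult = isequal(Ndefault, Nexplicit)
```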

N = normalize(A,dim) returns the z-score along dimension dim. For example, normalize(A,2) normalizes each row.

N = normalize(___,method) specifies a normalization method for either of the previous syntaxes. For example, normalize(A,'norm') normalizes the data in A by the Euclidean norm (2-norm).

N = normalize(___,method,methodtype) specifies the type of normalization for the given method. For example, normalize(A,'norm',Inf) normalizes the data in A using the infinity norm.

N = normalize(___,'DataVariables',datavars) specifies variables to operate on when the input data is in a table or timetable.

## Examples

Normalize data in a vector and matrix by computing the z-score.

Create a vector v and compute the z-score, normalizing the data to have mean 0 and standard deviation 1.

```matlab
v = 1:5;
N = normalize(v)
```

```
N = 1×5

   -1.2649   -0.6325         0    0.6325    1.2649
```

Create a matrix B and compute the z-score for each column. Then, normalize each row.

```matlab
B = magic(3)
```

```
B = 3×3

     8     1     6
     3     5     7
     4     9     2
```

```matlab
N1 = normalize(B)
```

```
N1 = 3×3

    1.1339   -1.0000    0.3780
   -0.7559         0    0.7559
   -0.3780    1.0000   -1.1339
```

```matlab
N2 = normalize(B,2)
```

```
N2 = 3×3

    0.8321   -1.1094    0.2774
   -1.0000         0    1.0000
   -0.2774    1.1094   -0.8321
```
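As a rough cross-check of the dim argument, the column-wise and row-wise results can be reproduced with mean and std directly. This is a sketch of the equivalent arithmetic, not necessarily how normalize is implemented, and it relies on implicit expansion (available since R2016b).

```matlab
B = magic(3);

% Column-wise z-score (dimension 1): one mean and one std per column.
N1manual = (B - mean(B, 1)) ./ std(B, 0, 1);

% Row-wise z-score (dimension 2): one mean and one std per row.
N2manual = (B - mean(B, 2)) ./ std(B, 0, 2);

% Largest deviation from normalize; expected to be at round-off level.
colDiff = max(abs(N1manual - normalize(B)), [], 'all')
rowDiff = max(abs(N2manual - normalize(B, 2)), [], 'all')
```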

Scale a vector A by its standard deviation.

```matlab
A = 1:5;
Ns = normalize(A,'scale')
```

```
Ns = 1×5

    0.6325    1.2649    1.8974    2.5298    3.1623
```

Scale A so that its range is in the interval [0,1].

```matlab
Nr = normalize(A,'range')
```

```
Nr = 1×5

         0    0.2500    0.5000    0.7500    1.0000
```
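The two results can be reproduced by hand, which also shows how a non-default interval might be handled. The sketch below is the conventional arithmetic for these scalings; the interval [-1 1] is an arbitrary choice for illustration and does not appear in the original page.

```matlab
A = 1:5;

% 'scale' with the default 'std' type: divide by the standard deviation.
NsManual = A ./ std(A);

% 'range' with the default [0 1] interval: shift by the minimum, then
% divide by the spread of the data.
NrManual = (A - min(A)) ./ (max(A) - min(A));

% Rescaling to another interval [a b] stretches and shifts that result.
a = -1;
b = 1;
NabManual = a + NrManual .* (b - a);
Nab = normalize(A, 'range', [a b]);
```

Here NabManual is [-1 -0.5 0 0.5 1], and Nab is expected to match it.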

Create a vector A and normalize it by its 1-norm.

```matlab
A = 1:5;
Np = normalize(A,'norm',1)
```

```
Np = 1×5

    0.0667    0.1333    0.2000    0.2667    0.3333
```

Center the data in A so that it has mean 0.

```matlab
Nc = normalize(A,'center','mean')
```

```
Nc = 1×5

    -2    -1     0     1     2
```
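For reference, both operations reduce to one line of arithmetic each. The variable names below are illustrative only.

```matlab
A = 1:5;

% Normalizing by the 1-norm divides by the sum of absolute values (15 here).
NpManual = A ./ norm(A, 1);

% Centering on the mean subtracts the mean (3 here) from every element.
NcManual = A - mean(A);
```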

Create a table containing height information for five people.

```matlab
LastName = {'Sanchez';'Johnson';'Lee';'Diaz';'Brown'};
Height = [71;69;64;67;64];
T = table(LastName,Height)
```

```
T=5×2 table
    LastName     Height
    _________    ______

    'Sanchez'      71
    'Johnson'      69
    'Lee'          64
    'Diaz'         67
    'Brown'        64
```

Normalize the height data by the maximum height.

```matlab
N = normalize(T,'norm',Inf,'DataVariables','Height')
```

```
N=5×2 table
    LastName     Height
    _________    _______

    'Sanchez'          1
    'Johnson'    0.97183
    'Lee'        0.90141
    'Diaz'       0.94366
    'Brown'      0.90141
```
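Because the infinity norm of a vector is its largest absolute value, the normalized column should equal each height divided by the tallest height (71). A minimal sketch of that check, using the same table:

```matlab
LastName = {'Sanchez';'Johnson';'Lee';'Diaz';'Brown'};
Height = [71;69;64;67;64];
T = table(LastName, Height);

N = normalize(T, 'norm', Inf, 'DataVariables', 'Height');

% Manual equivalent: divide by the largest absolute value in the column.
HeightManual = T.Height ./ max(abs(T.Height));

% Expected to be at round-off level.
maxDiff = max(abs(N.Height - HeightManual))
```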

## Input Arguments

Input data, specified as a scalar, vector, matrix, multidimensional array, table, or timetable.

If A is a numeric array and has type single, then the output also has type single. Otherwise, the output has type double.

normalize ignores NaN values in A.

Data Types: double | single | table | timetable
Complex Number Support: Yes
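The NaN behavior can be illustrated with a short sketch. It assumes that "ignores" means the mean and standard deviation are computed with NaN values omitted while the NaN entries themselves remain in the output; the data values are arbitrary.

```matlab
A = [1 2 NaN 4 5];

N = normalize(A);

% Manual equivalent under the assumption above: statistics skip NaN,
% and the NaN entry propagates to the same position in the result.
mu    = mean(A, 'omitnan');
sigma = std(A, 0, 'omitnan');
Nmanual = (A - mu) ./ sigma;
```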

Dimension to operate along, specified as a positive integer scalar.

Data Types: double | single | int8 | int16 | int32 | int64 | uint8 | uint16 | uint32 | uint64

Normalization method, specified as one of the following options:

| Method | Description |
| --- | --- |
| 'zscore' | z-score with mean 0 and standard deviation 1 |
| 'norm' | 2-norm |
| 'scale' | Scale by standard deviation |
| 'range' | Scale range of data to [0,1] |
| 'center' | Center data to have mean 0 |

Method type, specified as a scalar, a 2-element row vector, or a character vector, depending on the specified method:

| Method | Method Type Options | Description |
| --- | --- | --- |
| 'zscore' | 'std' (default) | Center and scale to have mean 0 and standard deviation 1 |
| 'zscore' | 'robust' | Center and scale to have median 0 and median absolute deviation 1 |
| 'norm' | Positive numeric scalar (default is 2) | p-norm |
| 'norm' | Inf | Infinity norm |
| 'scale' | 'std' (default) | Scale by standard deviation |
| 'scale' | 'mad' | Scale by median absolute deviation |
| 'scale' | 'first' | Scale by first element of data |
| 'scale' | Numeric scalar | Scale data by numeric value |
| 'range' | 2-element row vector (default is [0 1]) | Interval of the form [a b] where a < b |
| 'center' | 'mean' | Center to have mean 0 |
| 'center' | 'median' | Center to have median 0 |
| 'center' | Numeric scalar | Shift center by numeric value |
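To make two of the less common method types concrete, the sketch below computes a robust z-score and a first-element scaling by hand. A robust z-score is conventionally built from the median and the median absolute deviation; whether normalize applies exactly this unscaled form is an assumption here, and the data vector is hypothetical.

```matlab
x = [2 4 6 8 200];   % hypothetical data with one outlier

% Robust z-score: center on the median, scale by the median absolute
% deviation (MAD), computed without any consistency constant.
med    = median(x);
madVal = median(abs(x - med));
Zrobust = (x - med) ./ madVal;

% Difference from the built-in robust option; expected to be near zero
% if the assumption about the MAD scaling holds.
robustDiff = max(abs(Zrobust - normalize(x, 'zscore', 'robust')))

% 'first' scales every element by the first element of the data.
Nfirst = normalize(x, 'scale', 'first');
```

With this data the outlier barely moves the median (6) or the MAD (2), so the first four points map to -2, -1, 0, and 1, while the outlier maps to 97.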

Table variables, specified as the comma-separated pair consisting of 'DataVariables' and a scalar, vector, cell array, or function handle. The 'DataVariables' value indicates which columns of the input table to operate on, and can be one of the following:

• A character vector or scalar string specifying a single table variable name

• A cell array of character vectors or string array where each element is a table variable name

• A vector of table variable indices

• A logical vector whose elements each correspond to a table variable, where true includes the corresponding variable and false excludes it

• A function handle that takes a table variable as input and returns a logical scalar; the function is applied to each variable in turn

Example: 'Age'

Example: {'Height','Weight'}

Example: @isnumeric

Data Types: char | string | cell | double | single | logical | function_handle
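The different ways of selecting variables can be compared side by side. In this sketch the Weight column is hypothetical data added only so that more than one numeric variable is available; all four calls are expected to select the same two variables and leave LastName untouched.

```matlab
LastName = {'Sanchez';'Johnson';'Lee';'Diaz';'Brown'};
Height = [71;69;64;67;64];
Weight = [176;163;131;133;119];   % hypothetical values for illustration
T = table(LastName, Height, Weight);

% Four equivalent selections of the Height and Weight variables.
Nnames  = normalize(T, 'DataVariables', {'Height','Weight'});
Nindex  = normalize(T, 'DataVariables', [2 3]);
Nlogic  = normalize(T, 'DataVariables', [false true true]);
Nhandle = normalize(T, 'DataVariables', @isnumeric);

allSame = isequal(Nnames, Nindex, Nlogic, Nhandle)
```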

## More About

### Z-Score

For a random variable X with mean μ and standard deviation σ, the z-score of a value x is

$z=\frac{x-\mu }{\sigma }.$

For sample data with mean $\overline{X}$ and standard deviation S, the z-score of a data point x is

$z=\frac{x-\overline{X}}{S}.$

z-scores measure the distance of a data point from the mean in terms of the standard deviation. The standardized data set has mean 0 and standard deviation 1, and retains the shape properties of the original data set (same skewness and kurtosis).
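The sample form of the definition can be applied directly to confirm that the standardized data has mean 0 and standard deviation 1. The data vector is arbitrary and the variable names are illustrative.

```matlab
x = [2 4 4 4 5 5 7 9];   % arbitrary sample data

xbar = mean(x);          % sample mean (5 for this data)
S    = std(x);           % sample standard deviation (N-1 normalization)

z = (x - xbar) ./ S;     % z-score of every data point

% Both checks should hold up to floating-point round-off.
zMean = mean(z)
zStd  = std(z)
```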

### P-Norm

The general definition for the p-norm of a vector v that has N elements is

$\|v\|_{p}=\left[\sum_{k=1}^{N}|v_{k}|^{p}\right]^{1/p},$

where p is any positive real value, Inf, or -Inf. Some common values of p are:

• If p is 1, then the resulting 1-norm is the sum of the absolute values of the vector elements.

• If p is 2, then the resulting 2-norm gives the vector magnitude or Euclidean length of the vector.

• If p is Inf, then $\|v\|_{\infty}=\max_{k}|v_{k}|$.
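The definition can be evaluated term by term and compared with the built-in norm function. The vector and the choice p = 3 below are arbitrary and serve only as an illustration.

```matlab
v = [3 -4 12];   % arbitrary example vector
p = 3;

% General p-norm, written exactly as in the definition.
pnormManual  = sum(abs(v).^p)^(1/p);
pnormBuiltin = norm(v, p);

% Special cases listed above.
oneNorm = sum(abs(v));           % 3 + 4 + 12 = 19
twoNorm = sqrt(sum(abs(v).^2));  % sqrt(9 + 16 + 144) = 13
infNorm = max(abs(v));           % largest absolute value = 12
```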