Group data into bins or categories

`Y = discretize(X,edges)`

`[Y,E] = discretize(X,N)`

`[Y,E] = discretize(X,dur)`

`[___] = discretize(___,values)`

`[___] = discretize(___,'categorical')`

`[___] = discretize(___,'categorical',displayFormat)`

`[___] = discretize(___,'categorical',categoryNames)`

`[___] = discretize(___,'IncludedEdge',side)`

`Y = discretize(X,edges)` returns the indices of the bins that contain the elements of `X`. The `j`th bin contains element `X(i)` if `edges(j) <= X(i) < edges(j+1)` for `1 <= j < N`, where `N` is the number of bins and `length(edges) = N+1`. The last bin contains both edges such that `edges(N) <= X(i) <= edges(N+1)`.
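The bin rule above can be checked with a small sketch. The data and edge values here are made up for illustration; the `NaN` result for the out-of-range element reflects how `discretize` treats values that fall in no bin.

```matlab
% Illustrative data: two bins, [0,5) and [5,10]
X = [1 3 5 7 10 12];
edges = [0 5 10];

Y = discretize(X,edges);
% 5 sits on an interior edge, so it goes to bin 2.
% 10 is the last edge, which the last bin includes.
% 12 lies outside all bins.
% Y = [1 1 2 2 2 NaN]
```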

`[___] = discretize(___,values)` returns the corresponding element in `values` rather than the bin number, using any of the previous input or output argument combinations. For example, if `X(1)` is in bin 5, then `Y(1)` is `values(5)` rather than `5`. `values` must be a vector with length equal to the number of bins.
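As a minimal sketch of the `values` syntax, the following maps each bin number to an arbitrary label value. The numbers are illustrative only.

```matlab
% Illustrative data: two bins, so values needs two elements
X = [1 3 5 7 10];
edges = [0 5 10];
values = [100 200];

Y = discretize(X,edges,values);
% Elements in bin 1 become values(1); elements in bin 2 become values(2).
% Y = [100 100 200 200 200]
```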

`[___] = discretize(___,'categorical')` creates a categorical array where each bin is a category. In most cases, the default category names are of the form “`[A,B)`” (or “`[A,B]`” for the last bin), where `A` and `B` are consecutive bin edges. If you specify `dur` as a character vector, then the default category names might have special formats. See `Y` for a listing of the display formats.

`[___] = discretize(___,'categorical',displayFormat)`, for datetime or duration array inputs, uses the specified datetime or duration display format in the category names of the output.

`[___] = discretize(___,'categorical',categoryNames)` also names the categories in `Y` using the cell array of character vectors, `categoryNames`. The length of `categoryNames` must be equal to the number of bins.
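A short sketch of the categorical output with custom names follows. The names `'low'` and `'high'` are hypothetical labels chosen for this example.

```matlab
% Illustrative data: two bins, so two category names are required
X = [1 3 5 7 10];
edges = [0 5 10];
names = {'low','high'};   % hypothetical labels, one per bin

Y = discretize(X,edges,'categorical',names);
% Y is a 1-by-5 categorical array: low low high high high

% Without names, the categories use the default edge-based names;
% categories(Ydefault) lists them.
Ydefault = discretize(X,edges,'categorical');
```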

`[___] = discretize(___,'IncludedEdge',side)`, where `side` is `'left'` or `'right'`, specifies whether each bin includes its right or left bin edge. For example, if `side` is `'right'`, then each bin includes the right bin edge, except for the *first* bin which includes both edges. In this case, the `j`th bin contains an element `X(i)` if `edges(j) < X(i) <= edges(j+1)`, where `1 < j <= N` and `N` is the number of bins. The first bin includes the left edge such that it contains `edges(1) <= X(i) <= edges(2)`. The default for `side` is `'left'`.
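The effect of `side` is easiest to see on values that sit exactly on the bin edges. This sketch uses illustrative data in which every element equals an edge.

```matlab
% Every element lies exactly on a bin edge
X = [0 5 10];
edges = [0 5 10];

Yleft  = discretize(X,edges);                          % default 'left'
Yright = discretize(X,edges,'IncludedEdge','right');

% 'left':  5 belongs to bin 2, and the last bin also includes 10.
% Yleft  = [1 2 2]
% 'right': 5 belongs to bin 1, and the first bin also includes 0.
% Yright = [1 1 2]
```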

The behavior of `discretize` is similar to that of the `histcounts` function. Use `histcounts` to find the number of elements in each bin. On the other hand, use `discretize` to find which bin each element belongs to (without counting).
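To make the contrast concrete, this sketch runs both functions on the same illustrative data and edges. The `accumarray` step is only a cross-check added for this example; it assumes every element falls inside a bin, so there are no `NaN` indices.

```matlab
X = [1 3 5 7 10];
edges = [0 5 10];

counts = histcounts(X,edges);   % number of elements per bin
bins   = discretize(X,edges);   % bin index of each element
% counts = [2 3]
% bins   = [1 1 2 2 2]

% Tallying the bin indices reproduces the counts.
tally = accumarray(bins(:),1).';
% tally = [2 3]
```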
