# groupcounts

Number of group elements

## Syntax

``G = groupcounts(T,groupvars)``
``G = groupcounts(T,groupvars,groupbins)``
``G = groupcounts(___,Name,Value)``
``B = groupcounts(A)``
``B = groupcounts(A,groupbins)``
``B = groupcounts(___,Name,Value)``
``[B,BG,BP] = groupcounts(A,___)``

## Description

example

````G = groupcounts(T,groupvars)` computes the number of elements in each group of data in a table or timetable, and returns a table containing the groups, their counts, and the percentage (0 to 100) each count represents. Each group is defined by a unique combination of grouping variables in `groupvars`. For example, ```G = groupcounts(T,'Gender')``` returns a table showing the number of `Male` elements, the number of `Female` elements, and so on for any other categories in the variable `Gender`.```

example

````G = groupcounts(T,groupvars,groupbins)` specifies how to bin the data in `groupvars`. For example, ```G = groupcounts(T,'SaleDate','year')``` gives the yearly sales counts and percentages according to `SaleDate`.```

example

````G = groupcounts(___,Name,Value)` specifies additional grouping properties using one or more name-value pairs for any of the previous syntaxes. For example, ```G = groupcounts(T,'Category1','IncludeMissingGroups',false)``` excludes the group made from missing `categorical` data indicated by `<undefined>`.```
````B = groupcounts(A)` returns a vector containing the group counts for each unique combination of grouping vectors represented in a column vector, matrix, or cell array of column vectors `A`.```
````B = groupcounts(A,groupbins)` bins the data according to `groupbins`.```
````B = groupcounts(___,Name,Value)` specifies additional grouping properties using one or more name-value pairs.```

example

````[B,BG,BP] = groupcounts(A,___)` also returns the groups `BG` and percentages `BP` corresponding to the counts in `B`.```

## Examples

collapse all

Compute the number of group elements from table data.

Create a table containing information about five individuals.

```Gender = ["male";"female";"male";"female";"male"]; Smoker = logical([1;0;1;0;1]); Weight = [176;163;131;133;119]; T = table(Gender,Smoker,Weight)```
```T=5×3 table Gender Smoker Weight ________ ______ ______ "male" true 176 "female" false 163 "male" true 131 "female" false 133 "male" true 119 ```

Count the number of elements in each group by gender.

`G1 = groupcounts(T,'Gender')`
```G1=2×3 table Gender GroupCount Percent ________ __________ _______ "female" 2 40 "male" 3 60 ```

Count the number of elements in each group by gender and smoker status. By default, `groupcounts` suppresses groups with zero elements, so no groups are returned for female smokers or male nonsmokers.

`G2 = groupcounts(T,{'Gender','Smoker'})`
```G2=2×4 table Gender Smoker GroupCount Percent ________ ______ __________ _______ "female" false 2 40 "male" true 3 60 ```

To count all groups, including those with zero elements, specify the `'IncludeEmptyGroups'` parameter with value `true`.

`G3 = groupcounts(T,{'Gender','Smoker'},'IncludeEmptyGroups',true)`
```G3=4×4 table Gender Smoker GroupCount Percent ________ ______ __________ _______ "female" false 2 40 "female" true 0 0 "male" false 0 0 "male" true 3 60 ```

Group data according to specified bins.

Create a timetable containing sales information for days within a single month.

```TimeStamps = datetime([2017 3 4; 2017 3 2; 2017 3 15; 2017 3 10;... 2017 3 14; 2017 3 31; 2017 3 25;... 2017 3 29; 2017 3 21; 2017 3 18]); Profit = [2032 3071 1185 2587 1998 2899 3112 909 2619 3085]'; TotalItemsSold = [14 13 8 5 10 16 8 6 7 11]'; TT = timetable(TimeStamps,Profit,TotalItemsSold)```
```TT=10×2 timetable TimeStamps Profit TotalItemsSold ___________ ______ ______________ 04-Mar-2017 2032 14 02-Mar-2017 3071 13 15-Mar-2017 1185 8 10-Mar-2017 2587 5 14-Mar-2017 1998 10 31-Mar-2017 2899 16 25-Mar-2017 3112 8 29-Mar-2017 909 6 21-Mar-2017 2619 7 18-Mar-2017 3085 11 ```

Compute the group counts by the total items sold, binning the groups into intervals of item numbers.

`G = groupcounts(TT,'TotalItemsSold',[0 4 8 12 16])`
```G=3×3 table disc_TotalItemsSold GroupCount Percent ___________________ __________ _______ [4, 8) 3 30 [8, 12) 4 40 [12, 16] 3 30 ```

Compute the group counts grouped by day of the week.

`G = groupcounts(TT,'TimeStamps','dayname')`
```G=5×3 table dayname_TimeStamps GroupCount Percent __________________ __________ _______ Tuesday 2 20 Wednesday 2 20 Thursday 1 10 Friday 2 20 Saturday 3 30 ```

Determine which elements in a vector appear more than once.

Create a column vector with values between 1 and 5.

`v = [1 1 2 2 3 5 3 3 1 4]';`

Use `groupcounts` to determine the unique groups in the vector and count the group members.

`[gc,grps] = groupcounts(v)`
```gc = 5×1 3 2 3 1 1 ```
```grps = 5×1 1 2 3 4 5 ```

Determine which elements in the vector appear more than once by creating a logical index for the groups with a count larger than 1. Index into the groups to return the vector elements that are duplicated.

`duplicates = grps(gc > 1)`
```duplicates = 3×1 1 2 3 ```

Compute the group counts for four groups based on their gender and smoker status.

Store patient information as three vectors of different types.

```Gender = ["male";"female";"male";"female";"male"]; Smoker = logical([1;0;1;0;1]); Weight = [176;163;131;133;119];```

Grouping by gender and smoker status, compute the group counts. Specify three outputs to also return the groups `BG` and percentages `BP`. The `B` output contains the counts for each group, and `BP` contains the percentages represented by those counts.

`[B,BG,BP] = groupcounts({Gender,Smoker},'IncludeEmptyGroups',true)`
```B = 4×1 2 0 0 3 ```
```BG=1×2 cell array {4x1 string} {4x1 logical} ```
```BP = 4×1 40 0 0 60 ```

`BG` is a cell array containing two vectors that describe the groups as you look at their elements rowwise. For instance, the first row of `BG{1}` says that the patients in the first group are female, and the first row of `BG{2}` says that they are nonsmokers. The count for that group is 2, found in the corresponding row of `B`.

`BG{1}`
```ans = 4x1 string "female" "female" "male" "male" ```
`BG{2}`
```ans = 4x1 logical array 0 1 0 1 ```

## Input Arguments

collapse all

Input data, specified as a table or timetable.

Input vectors, specified as a column vector, matrix, or cell array of column vectors representing grouping vectors. When `A` is a matrix, the grouping vectors are columnwise.

Grouping variables or vectors, specified as one of the options in this table. For table or timetable input data, `groupvars` indicates which variables to use to compute groups in the data. Other variables not specified by `groupvars` are not operated on and do not pass through to the output.

OptionDescriptionExamples
Variable name

A character vector or scalar string specifying a single table variable name

`'Var1'`

`"Var1"`

Vector of variable names

A cell array of character vectors or string array where each element is a table variable name

`{'Var1' 'Var2'}`

`["Var1" "Var2"]`

Scalar or vector of variable indices

A scalar or vector of table variable indices

`1`

`[1 3 5]`

Logical vector

A logical vector whose elements each correspond to a table variable, where `true` includes the corresponding variable and `false` excludes it

`[true false true]`

Function handle

A function handle that takes a table variable as input and returns a logical scalar

`@isnumeric`

`vartype` subscript

A table subscript generated by the `vartype` function

`vartype('numeric')`

Example: `groupcounts(T,"Var3")`

Binning scheme, specified as one of the following options:

• `'none'`, indicating the groups are returned according to the specified grouping variables only

• A list of bin edges, specified as a numeric vector for numeric grouping variables, or as a `datetime` vector for `datetime` grouping variables

• A number of bins, specified as an integer scalar

• A time duration, specified as a scalar of type `duration` or `calendarDuration`, indicating bin widths (for `datetime` or `duration` grouping variables or vectors only)

• A cell array listing binning rules for each grouping variable or vector

• A time bin for `datetime` and `duration` grouping variables or vectors only, specified as one of the following character vectors:

ValueDescriptionData Type
`'second'`

Each bin is 1 second.

`datetime` and `duration`
`'minute'`

Each bin is 1 minute.

`datetime` and `duration`
`'hour'`

Each bin is 1 hour.

`datetime` and `duration`
`'day'`

Each bin is 1 calendar day. This value accounts for Daylight Saving Time shifts.

`datetime` and `duration`
`'week'`Each bin is 1 calendar week.`datetime` only
`'month'`Each bin is 1 calendar month.`datetime` only
`'quarter'`Each bin is 1 calendar quarter.`datetime` only
`'year'`

Each bin is 1 calendar year. This value accounts for leap days.

`datetime` and `duration`
`'decade'`Each bin is 1 decade (10 calendar years).`datetime` only
`'century'`Each bin is 1 century (100 calendar years).`datetime` only
`'secondofminute'`

Bins are seconds from 0 to 59.

`datetime` only
`'minuteofhour'`

Bins are minutes from 0 to 59.

`datetime` only
`'hourofday'`

Bins are hours from 0 to 23.

`datetime` only
`'dayofweek'`

Bins are days from 1 to 7. The first day of the week is Sunday.

`datetime` only
`'dayname'`Bins are full day names such as `'Sunday'`.`datetime` only
`'dayofmonth'`Bins are days from 1 to 31.`datetime` only
`'dayofyear'`Bins are days from 1 to 366.`datetime` only
`'weekofmonth'`Bins are weeks from 1 to 6.`datetime` only
`'weekofyear'`Bins are weeks from 1 to 54.`datetime` only
`'monthname'`Bins are full month names such as `'January'`.`datetime` only
`'monthofyear'`

Bins are months from 1 to 12.

`datetime` only
`'quarterofyear'`Bins are quarters from 1 to 4.`datetime` only

When multiple grouping variables or vectors are specified, you can provide a single binning rule that is applied to all grouping variables, or a cell array containing a binning method for each grouping variable, such as ```{'none',[0 2 4 Inf]}```.

### Name-Value Arguments

Specify optional comma-separated pairs of `Name,Value` arguments. `Name` is the argument name and `Value` is the corresponding value. `Name` must appear inside quotes. You can specify several name and value pair arguments in any order as `Name1,Value1,...,NameN,ValueN`.

Example: ```G = groupcounts(T,groupvars,groupbins,'IncludedEdge','right')```

Included bin edge, specified as either `'left'` or `'right'`, indicating which end of the bin interval is inclusive.

This name-value pair can only be specified when `groupbins` is specified, and the value is applied to all binning schemes for all grouping variables or vectors.

Missing groups indicator, specified as a numeric or logical `1` (`true`) or `0` (`false`). If the parameter value is `true`, then `groupcounts` displays groups made up of missing values, such as `NaN`. If the parameter value is `false`, then `groupcounts` does not display the missing value groups.

Empty groups indicator, specified as a numeric or logical `0` (`false`) or `1` (`true`). If the parameter value is `false`, then `groupcounts` does not display groups with zero elements. If the parameter value is `true`, then `groupcounts` displays the empty groups.

## Output Arguments

collapse all

Output table, returned as a table containing the computed groups, number of elements in each group, and percentages represented by each group count. For a single grouping variable, the output groups are sorted according to the order returned by the `unique` function with the `'sorted'` option.

Group counts for non-table input data, returned as a column vector containing the number of elements in each group.

Groups for non-table input data, returned as a column vector or cell array of column vectors. For a single grouping vector, the output groups are sorted according to the order returned by the `unique` function with the `'sorted'` option.

When you provide more than one input vector, `BG` is a cell array containing column vectors of equal length. The group information can be found by looking at the elements rowwise across all vectors in `BG`. The count for each group is contained in the corresponding row of the first output argument `B`.

Group count percentages for non-table input data, returned as a column vector containing the percentage each group count in `B` represents. The percentages are in the range `[0 100]`.

## Tips

• When making many calls to `groupcounts`, consider converting grouping variables to type `categorical` or `logical` when possible for improved performance. For example, if you have a grouping variable of type `char` (such as `Gender` with elements `'Male'` and `'Female'`), you can convert it to a categorical variable using the command `categorical(Gender)`.

## Compatibility Considerations

expand all

Behavior changed in R2021a