unstack

Class: dataset

Unstack data from single variable into multiple variables

The `dataset` data type might be removed in a future release. To work with heterogeneous data, use the MATLAB® `table` data type instead. See MATLAB `table` documentation for more information.

Syntax

```
A = unstack(B,datavar,indvar)
[A,iB] = unstack(B,datavar,indvar)
A = unstack(B,datavar,indvar,'Parameter',value)
```

Description

`A = unstack(B,datavar,indvar)` unstacks a single variable in dataset array `B` into multiple variables in `A`. In general, `A` contains more variables but fewer observations than `B`.

`datavar` specifies the data variable in `B` to unstack. `indvar` specifies an indicator variable in `B` that determines which variable in `A` each value in `datavar` is unstacked into. `unstack` treats the remaining variables in `B` as grouping variables. Each unique combination of their values defines a group of observations in `B` that will be unstacked into a single observation in `A`.

`unstack` creates `m` data variables in `A`, where `m` is the number of group levels in `indvar`. The values in `indvar` indicate which of those `m` variables receive which values from `datavar`. The `j`-th data variable in `A` contains the values from `datavar` that correspond to observations whose `indvar` value was the `j`-th of the `m` possible levels. Elements of those `m` variables for which no corresponding data value in `B` exists contain a default value.
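As a minimal sketch of the reshaping described above, the following builds a small, hypothetical dataset array and unstacks `Score` by `Visit`. The variables `Subject`, `Visit`, and `Score` are invented for illustration and are not part of any shipped data set. The sketch assumes a release in which `dataset` and `nominal` are still available.

```matlab
% Hypothetical "tall" data: one observation per (Subject, Visit) pair.
Subject = [1; 1; 2; 2];
Visit   = nominal({'pre'; 'post'; 'pre'; 'post'});
Score   = [10; 12; 20; 25];
B = dataset(Subject, Visit, Score);

% Unstack Score using Visit as the indicator variable. Subject is the only
% remaining variable, so it acts as the grouping variable.
A = unstack(B, 'Score', 'Visit');

% A has one observation per subject and one data variable per level of Visit.
A.pre    % Score values recorded at the 'pre' visit:  [10; 20]
A.post   % Score values recorded at the 'post' visit: [12; 25]
```

The new variables are accessed by name here so that the example does not depend on the order in which the levels of `Visit` are listed.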

`datavar` is a positive integer, a variable name, or a logical vector containing a single true value. `indvar` is a positive integer, a variable name, or a logical vector containing a single true value.

`[A,iB] = unstack(B,datavar,indvar)` returns an index vector `iB` indicating the correspondence between observations in `A` and those in `B`. For each observation in `A`, `iB` contains the index of the first observation in the corresponding group of observations in `B`.
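To illustrate how `iB` might be used, the sketch below reuses a small, hypothetical dataset array (the variables `Subject`, `Visit`, and `Score` are assumptions made for this example) and indexes back into `B` to retrieve the first observation of each group.

```matlab
% Hypothetical data: two visits for each of two subjects.
Subject = [1; 1; 2; 2];
Visit   = nominal({'pre'; 'post'; 'pre'; 'post'});
Score   = [10; 12; 20; 25];
B = dataset(Subject, Visit, Score);

[A, iB] = unstack(B, 'Score', 'Visit');

% iB has one element per observation in A. Indexing B with it retrieves the
% first observation of each group, here the first row for each subject.
firstRows = B(iB, :);
```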

Input Arguments

`A = unstack(B,datavar,indvar,'Parameter',value)` uses the following parameter name/value pairs to control how `unstack` converts variables in `B` to variables in `A`:

`'GroupVars'`: Grouping variables in `B` that define groups of observations. The value is a positive integer, a vector of positive integers, a variable name, a cell array containing one or more variable names, or a logical vector. The default is all variables in `B` not listed in `datavar` or `indvar`.

`'NewDataVarNames'`: A cell array of character vectors containing names for the data variables `unstack` should create in `A`. The default is the group names of the grouping variable specified in `indvar`.

`'AggregationFun'`: A function handle that accepts a subset of values from `datavar` and returns a single value. `unstack` applies this function to observations from the same group that have the same value of `indvar`. The function must aggregate the data values into a single value, and in such cases it is not possible to recover `B` from `A` using `stack`. The default is `@sum` for numeric data variables. For non-numeric variables, there is no default, and you must specify `'AggregationFun'` if multiple observations in the same group have the same values of `indvar`.

`'ConstVars'`: Variables in `B` to copy to `A` without unstacking. The values for these variables in `A` are taken from the first observation in each group in `B`, so these variables should typically be constant within each group. The value is a positive integer, a vector of positive integers, a variable name, a cell array containing one or more variable names, or a logical vector. The default is no variables.
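The effect of `'AggregationFun'` can be sketched with a small, hypothetical dataset array in which one subject has two `'pre'` observations. The variable names and values are invented for illustration. With the default, the duplicates are summed. Passing `@mean` averages them instead.

```matlab
% Hypothetical data: subject 1 has two observations at the 'pre' visit.
Subject = [1; 1; 1; 2; 2];
Visit   = nominal({'pre'; 'pre'; 'post'; 'pre'; 'post'});
Score   = [10; 14; 12; 20; 25];
B = dataset(Subject, Visit, Score);

% Default for numeric data: duplicates within a group are combined with @sum.
Asum = unstack(B, 'Score', 'Visit');

% Combine duplicates with their mean instead.
Amean = unstack(B, 'Score', 'Visit', 'AggregationFun', @mean);

Asum.pre    % [24; 20]
Amean.pre   % [12; 20]
```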

You can also specify more than one data variable in `B`, each of which becomes a set of `m` variables in `A`. In this case, specify `datavar` as a vector of positive integers, a cell array containing variable names, or a logical vector. You may specify only one variable with `indvar`. The names of each set of data variables in `A` are the name of the corresponding data variable in `B` concatenated with the names specified in `'NewDataVarNames'`. The function specified in `'AggregationFun'` must return a value with a single row.
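A minimal sketch of unstacking two data variables at once follows. The dataset array and its variables (`Subject`, `Visit`, `Score`, `Weight`) are hypothetical. Because the exact form of the generated names is not spelled out here, the example reads them back with `get` rather than assuming them.

```matlab
% Hypothetical data with two data variables, Score and Weight.
Subject = [1; 1; 2; 2];
Visit   = nominal({'pre'; 'post'; 'pre'; 'post'});
Score   = [10; 12; 20; 25];
Weight  = [70; 71; 82; 80];
B = dataset(Subject, Visit, Score, Weight);

% Unstack both data variables by the single indicator variable Visit.
A = unstack(B, {'Score', 'Weight'}, 'Visit');

% Inspect the generated variable names instead of assuming their form.
names = get(A, 'VarNames');

% One grouping variable plus 2 levels x 2 data variables gives 5 variables.
size(A)
```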

Examples

Combine several variables for estimated influenza rates into a single variable. Then unstack the estimated influenza rates by date.

```
load flu

% FLU has a 'Date' variable, and 10 variables for estimated influenza rates
% (in 9 different regions, estimated from Google searches, plus a
% nationwide estimate from the CDC). Combine those 10 variables into an
% array that has a single data variable, 'FluRate', and an indicator
% variable, 'Region', that says which region each estimate is from.
[flu2,iflu] = stack(flu, 2:11, 'NewDataVarName','FluRate', ...
    'IndVarName','Region')

% The second observation in FLU is for 10/16/2005. Find the observations
% in FLU2 that correspond to that date.
flu(2,:)
flu2(iflu==2,:)

% Use the 'Date' variable from that array to split 'FluRate' into 52
% separate variables, each containing the estimated influenza rates for
% each unique date. The new array has one observation for each region. In
% effect, this is the original array FLU "on its side".
dateNames = cellstr(datestr(flu.Date,'mmm_DD_YYYY'));
[flu3,iflu2] = unstack(flu2, 'FluRate', 'Date', ...
    'NewDataVarNames',dateNames)

% Since observations in FLU3 represent regions, IFLU2 indicates the first
% occurrence in FLU2 of each region.
flu2(iflu2,:)
```