Add and Delete Variables

This example shows how to add and delete variables in a dataset array. You can also edit dataset arrays using the Variables editor.

Load sample data.

Navigate to the folder containing sample data.

cd(matlabroot)
cd('help/toolbox/stats/examples')

Import the data from the first worksheet in hospitalSmall.xlsx into a dataset array.

ds = dataset('XLSFile','hospitalSmall.xlsx');
size(ds)
ans =

    14     6

The dataset array, ds, has 14 observations (rows) and 6 variables (columns).

Add variables by concatenating dataset arrays.

The worksheet Heights in hospitalSmall.xlsx has heights for the patients on the first worksheet. Concatenate the data in this spreadsheet with ds.

ds2 = dataset('XLSFile','hospitalSmall.xlsx','Sheet','Heights');
ds = [ds ds2];
size(ds)
ans =

    14     7

The dataset array now has seven variables. You can horizontally concatenate dataset arrays only if they have the same number of observations, matched either by position or by identical observation names.

ds.Properties.VarNames{end}
ans =

hgt

The name of the last variable in ds is hgt, which dataset read from the first row of the imported spreadsheet.
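The concatenation rule can be tried without the spreadsheet. The following is a minimal sketch using small, hypothetical dataset arrays (the values and the observation names p1 to p3 are made up, not taken from hospitalSmall.xlsx). It assumes the behavior described in the horzcat documentation for dataset arrays: when both inputs have observation names, rows are matched by name rather than by position.

```matlab
% Hypothetical data, not the contents of hospitalSmall.xlsx.
left  = dataset({[38; 43; 40],'age'}, {[176; 163; 131],'wgt'}, ...
                'ObsNames', {'p1'; 'p2'; 'p3'});

% Same observation names as left, listed in a different order.
right = dataset({[64; 71; 69],'hgt'}, ...
                'ObsNames', {'p3'; 'p1'; 'p2'});

% Square brackets call horzcat; rows of right are matched to left by name.
both = [left right];

size(both)                  % number of observations and variables
both.Properties.VarNames    % variable names after concatenation
```

If the two arrays had different observation names, or different numbers of rows, the concatenation would fail with an error instead of silently misaligning the data.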

Delete variables by variable name.

First, specify the unique identifiers in the variable id as observation names. Then, delete the variable id from the dataset array.

ds.Properties.ObsNames = ds.id;
ds.id = [];
size(ds)
ans =

    14     6

The dataset array now has six variables. List the variable names.

ds.Properties.VarNames(:)
ans = 

    'name'
    'sex'
    'age'
    'wgt'
    'smoke'
    'hgt'

There is no longer a variable called id.
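As a self-contained illustration of deleting by name, the sketch below builds a small hypothetical dataset array (the identifiers A-001 to A-003 and all values are invented for the example). It also shows one possible way to delete several variables at once by passing a cell array of names.

```matlab
% Hypothetical data; the identifiers and values are made up.
ds = dataset({{'A-001'; 'A-002'; 'A-003'},'id'}, ...
             {[38; 43; 38],'age'}, ...
             {[1; 0; 0],'smoke'}, ...
             {[176; 163; 131],'wgt'});

ds.Properties.ObsNames = ds.id;   % keep the identifiers as observation names
ds.id = [];                       % delete one variable by name

ds(:,{'age','smoke'}) = [];       % delete several variables by name

ds.Properties.VarNames            % list the remaining variable names
```

Copying id into the observation names before deleting it keeps each row identifiable after the variable is gone.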

Add a new variable by name.

Add a new variable, bmi, containing the body mass index (BMI) of each patient, to the dataset array. BMI is a function of height and weight. Display the last name, gender, and BMI for each patient.

ds.bmi = ds.wgt*703./ds.hgt.^2;
ds(:,{'name','sex','bmi'})
ans = 

               name              sex        bmi   
    YPL-320    'SMITH'           'm'        24.544
    GLI-532    'JOHNSON'         'm'        24.068
    PNI-258    'WILLIAMS'        'f'        23.958
    MIJ-579    'JONES'           'f'        25.127
    XLK-030    'BROWN'           'f'        21.078
    TFP-518    'DAVIS'           'f'        27.729
    LPD-746    'MILLER'          'f'        26.828
    ATA-945    'WILSON'          'm'         24.41
    VNL-702    'MOORE'           'm'        27.822
    LQW-768    'TAYLOR'          'f'        22.655
    QFY-472    'ANDERSON'        'f'        23.409
    UJG-627    'THOMAS'          'f'        25.883
    XUE-826    'JACKSON'         'm'        24.265
    TRW-072    'WHITE'           'm'        29.827

The operators ./ and .^ in the calculation of BMI indicate element-wise division and exponentiation, respectively.
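To see what the element-wise operators do, the sketch below compares the vectorized formula with an explicit loop over observations. The weights and heights are hypothetical. The factor 703 is the usual conversion for computing BMI from pounds and inches; treating wgt and hgt as pounds and inches is an assumption, since the units are not stated here.

```matlab
wgt = [176; 163; 131];    % weight, assumed to be in pounds (hypothetical values)
hgt = [71; 69; 64];       % height, assumed to be in inches (hypothetical values)
ds = dataset(wgt, hgt);   % variable names are taken from the workspace names

% Vectorized: one BMI value per observation.
ds.bmi = ds.wgt*703./ds.hgt.^2;

% Equivalent computation, one observation at a time.
bmiLoop = zeros(size(ds,1),1);
for k = 1:size(ds,1)
    bmiLoop(k) = ds.wgt(k)*703/ds.hgt(k)^2;
end

max(abs(ds.bmi - bmiLoop))   % largest difference between the two results
```

Using / or ^ without the leading dot on the full columns would request matrix division and matrix power, which is not what the BMI formula intends.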

Delete variables by variable number.

Delete the variable wgt, the fourth variable in the dataset array.

ds(:,4) = [];
ds.Properties.VarNames(:)
ans = 

    'name'
    'sex'
    'age'
    'smoke'
    'hgt'
    'bmi'

The variable wgt is deleted from the dataset array.
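Deleting by position depends on the current column order, which changes whenever variables are added or removed. One way to avoid a hard-coded number is to look up the position from the variable name first. The following sketch uses a small hypothetical dataset array to illustrate this.

```matlab
% Hypothetical data; the values are made up.
ds = dataset({[38; 43],'age'}, {[176; 163],'wgt'}, {[71; 69],'hgt'});

% Find the column number of the variable named wgt.
idx = find(strcmp(ds.Properties.VarNames, 'wgt'));

ds(:,idx) = [];              % delete the variable at that position

ds.Properties.VarNames       % list the remaining variable names
```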
