# Access Data in Dataset Array Variables

This example shows how to work with dataset array variables and their data.

## Access variables by name.

You can access variable data, or select a subset of variables, by using variable (column) names and dot indexing. Load a sample dataset array. Display the names of the variables in hospital.

```
load hospital
hospital.Properties.VarNames(:)
```
```
ans = 7x1 cell array
    {'LastName'     }
    {'Sex'          }
    {'Age'          }
    {'Weight'       }
    {'Smoker'       }
    {'BloodPressure'}
    {'Trials'       }
```

The dataset array has 7 variables (columns) and 100 observations (rows). You can double-click hospital in the Workspace window to view the dataset array in the Variables editor.
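The dimensions can also be checked programmatically. The following is a minimal sketch that assumes the two-output form of `size` for dataset arrays; the names `nObs` and `nVars` are illustrative and not part of the original example.

```matlab
% Load the sample data and query its dimensions.
load hospital

% For a dataset array, size returns the number of observations (rows)
% and the number of variables (columns).
[nObs, nVars] = size(hospital);
fprintf('%d observations, %d variables\n', nObs, nVars)
```

For the unmodified sample data, this should report the 100 observations and 7 variables described above.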

## Plot histogram.

Plot a histogram of the data in the variable Weight.

```
figure
histogram(hospital.Weight)
```

The histogram shows that the weight distribution is bimodal.

## Plot data grouped by category.

Draw box plots of Weight grouped by the values in Sex (Male and Female). That is, use the variable Sex as a grouping variable.

```
figure
boxplot(hospital.Weight,hospital.Sex)
```

The box plot suggests that sex accounts for the bimodality in weight.
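One way to back up the visual impression with numbers is to compare the group means directly. This sketch uses logical indexing on the grouping variable and assumes that Sex has exactly the two levels Male and Female with no undefined values; the names `isMale`, `meanMale`, and `meanFemale` are illustrative.

```matlab
% Load the sample data.
load hospital

% Logical mask that is true for observations labeled 'Male'.
isMale = hospital.Sex == 'Male';

% Mean weight within each group.
meanMale   = mean(hospital.Weight(isMale));
meanFemale = mean(hospital.Weight(~isMale));

fprintf('Mean weight (Male):   %.1f\n', meanMale)
fprintf('Mean weight (Female): %.1f\n', meanFemale)
```

A clear gap between the two means would be consistent with the two peaks in the histogram.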

## Select a subset of variables.

Create a new dataset array with only the variables LastName, Sex, and Weight. You can access the variables by name or column number.

```
ds1 = hospital(:,{'LastName','Sex','Weight'});
ds2 = hospital(:,[1,2,4]);
```

The dataset arrays ds1 and ds2 are equivalent. Use parentheses ( ) when indexing dataset arrays to preserve the data type; that is, to create a dataset array from a subset of a dataset array. You can also use the Variables editor to create a new dataset array from a subset of variables and observations.
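To illustrate the difference between the two indexing styles, the sketch below compares the class of the result from parentheses with the class of the result from dot indexing. The names `sub` and `w` are illustrative.

```matlab
% Load the sample data.
load hospital

% Parentheses return a dataset array, even for a single variable.
sub = hospital(:,'Weight');

% Dot indexing extracts the data stored in the variable itself.
w = hospital.Weight;

class(sub)   % the container type is preserved
class(w)     % the type of the underlying data
```

Use dot indexing when you need the raw values for computation or plotting, and parentheses when the result should remain a dataset array.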

## Convert the variable data type.

Convert the data type of the variable Smoker from logical to nominal with labels No and Yes.

```
hospital.Smoker = nominal(hospital.Smoker,{'No','Yes'});
class(hospital.Smoker)
```
```ans = 'nominal' ```

## Explore data.

Display the first 10 elements of Smoker.

```
hospital.Smoker(1:10)
```
```
ans = 10x1 nominal array
     Yes
     No
     No
     No
     No
     No
     Yes
     No
     No
     No
```

If you want to change the level labels in a nominal array, use setlabels.
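As a small illustration of relabeling, the sketch below builds a stand-alone nominal array rather than modifying hospital. The array `s` and the labels Nonsmoker and Smoker are made up for this example.

```matlab
% Build a small nominal array from logical data.
% The levels are ordered false, true, so they receive the labels No, Yes.
s = nominal([true; false; false],{'No','Yes'});

% Rename the levels in order: No -> Nonsmoker, Yes -> Smoker.
s = setlabels(s,{'Nonsmoker','Smoker'});

% List the current level labels.
getlabels(s)
```

The data values are unchanged; only the labels attached to the levels are replaced.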

The variable BloodPressure is a 100-by-2 array. The first column corresponds to systolic blood pressure, and the second column to diastolic blood pressure. Separate this array into two new variables, SysPressure and DiaPressure.

```
hospital.SysPressure = hospital.BloodPressure(:,1);
hospital.DiaPressure = hospital.BloodPressure(:,2);
hospital.Properties.VarNames(:)
```
```
ans = 9x1 cell array
    {'LastName'     }
    {'Sex'          }
    {'Age'          }
    {'Weight'       }
    {'Smoker'       }
    {'BloodPressure'}
    {'Trials'       }
    {'SysPressure'  }
    {'DiaPressure'  }
```

The dataset array, hospital, has two new variables.

## Search for variables by name.

Use regexp to find variables in hospital with 'Pressure' in their name. Create a new dataset array containing only these variables.

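```
% regexp returns one cell per variable name; the cell is empty when
% the name does not contain 'Pressure'.
bp = regexp(hospital.Properties.VarNames,'Pressure');
% bpIdx is true for names that do NOT match, so negate it when indexing.
bpIdx = cellfun(@isempty,bp);
bpData = hospital(:,~bpIdx);
bpData.Properties.VarNames(:)
```
```
ans = 3x1 cell array
    {'BloodPressure'}
    {'SysPressure'  }
    {'DiaPressure'  }
```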

The new dataset array, bpData, contains only the blood pressure variables.
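For a plain substring match like this one, `contains` offers a shorter alternative to `regexp` plus `cellfun`. This is a sketch that assumes MATLAB R2016b or later, where `contains` is available. To keep it self-contained, it works on a literal copy of the variable names listed above; `names` and `hasPressure` are illustrative names.

```matlab
% Variable names as they appear after adding SysPressure and DiaPressure.
names = {'LastName','Sex','Age','Weight','Smoker', ...
         'BloodPressure','Trials','SysPressure','DiaPressure'};

% contains returns a logical array that is true where the name
% includes the substring 'Pressure'.
hasPressure = contains(names,'Pressure');

% Show the matching names.
names(hasPressure)
```

With the real data, the same logical index could be built from `hospital.Properties.VarNames` and used as `hospital(:,hasPressure)`.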

## Delete variables.

Delete the variable BloodPressure from the dataset array, hospital.

```
hospital.BloodPressure = [];
hospital.Properties.VarNames(:)
```
```
ans = 8x1 cell array
    {'LastName'   }
    {'Sex'        }
    {'Age'        }
    {'Weight'     }
    {'Smoker'     }
    {'Trials'     }
    {'SysPressure'}
    {'DiaPressure'}
```

The variable BloodPressure is no longer in the dataset array.