Class: dataset
(Not Recommended) Print summary of dataset array
The dataset data type is not recommended. To work with heterogeneous data,
use the MATLAB® table data type instead. See the MATLAB table documentation
for more information.
summary(A)
s = summary(A)
summary(A) prints a summary of a dataset array and
the variables that it contains.
s = summary(A) returns a scalar structure s that
contains a summary of the dataset A and the variables that
A contains. For more information on the fields in s,
see Outputs.
Summary information depends on the type of the variables in the dataset array:
For numeric variables, summary computes a five-number summary
of the data: the minimum, the first quartile, the median, the third quartile, and
the maximum.
For logical variables, summary counts the number of
true and false values in the data.
For categorical variables, summary counts the number of observations at
each level.
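The three cases above can be seen side by side in a minimal sketch. The variable names x, flag, and grp and their values are made up for illustration, and the code assumes Statistics and Machine Learning Toolbox is installed so that dataset and nominal are available:

```matlab
% Hypothetical toy data: one numeric, one logical, one categorical variable
x    = [1; 2; 3; 4; NaN];                    % numeric, with one missing value
flag = [true; false; true; true; false];     % logical
grp  = nominal({'a'; 'b'; 'a'; 'c'; 'a'});   % categorical (nominal) with 3 levels

% Variable names are taken from the workspace names of the inputs
ds = dataset(x, flag, grp);

% Prints a five-number summary for x, true/false counts for flag,
% and per-level counts for grp
summary(ds)
```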
The following list describes the fields in the structure s:
Description — A character array containing the dataset
description.
Variables — A structure array with one element for each
dataset variable in A. Each element has the following fields:
Name — A character vector containing the name of the
variable.
Description — A character vector containing the
variable's description.
Units — A character vector containing the variable's
units.
Size — A numeric vector containing the size of the
variable.
Class — A character vector containing the class of
the variable.
Data — A scalar structure containing the following
fields.
For numeric variables:
Probabilities — A numeric vector containing
the probabilities [0.0 .25 .50 .75 1.0] and NaN (if any are present in the
corresponding dataset variable).
Quantiles — A numeric vector containing the
values that correspond to 'Probabilities' for the corresponding dataset
variable, and a count of NaNs (if any are present).
For logical variables:
Values — The logical vector [true
false].
Counts — A numeric vector of counts for each
logical value.
For categorical variables:
Levels — A cell array containing the labels
for each level of the corresponding dataset variable.
Counts — A numeric vector of counts for each
level.
'Data' is empty if the variable is not numeric, categorical, or
logical. If a dataset variable has more than one column, then the corresponding
'Quantiles' or 'Counts' field is a matrix
or an array.
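As a sketch of how the fields listed above might be read programmatically, the following code uses the hospital sample data. The lookup by name with strcmp is one possible approach, not the only one, and the local names (names, age, smoker, sex, bp) are illustrative:

```matlab
load hospital                          % sample dataset array named hospital
s = summary(hospital);                 % return the summary as a structure

names = {s.Variables.Name};            % variable names, in dataset order

% Numeric variable: quantiles that correspond to the listed probabilities
age = s.Variables(strcmp(names, 'Age'));
disp(age.Data.Probabilities)
disp(age.Data.Quantiles)

% Logical variable: one count per logical value in Values
smoker = s.Variables(strcmp(names, 'Smoker'));
disp(smoker.Data.Values)
disp(smoker.Data.Counts)

% Categorical variable: one count per level
sex = s.Variables(strcmp(names, 'Sex'));
disp(sex.Data.Levels)
disp(sex.Data.Counts)

% Two-column numeric variable: Quantiles is a matrix rather than a vector
bp = s.Variables(strcmp(names, 'BloodPressure'));
disp(size(bp.Data.Quantiles))
```

Looking variables up by name avoids relying on their position in the dataset array, which can change if variables are added or reordered.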
Summarize Fisher's iris data:
load fisheriris
species = nominal(species);
data = dataset(species,meas);
summary(data)
species: [150x1 nominal]
setosa versicolor virginica
50 50 50
meas: [150x4 double]
min 4.3000 2 1 0.1000
1st Q 5.1000 2.8000 1.6000 0.3000
median 5.8000 3 4.3500 1.3000
3rd Q 6.4000 3.3000 5.1000 1.8000
max 7.9000 4.4000 6.9000 2.5000
Summarize the data in hospital.mat:
load hospital
summary(hospital)
Dataset array created from the data file hospital.dat.
The first column of the file ("id") is used for observation
names. Other columns ("sex" and "smoke") have been
converted from their original coded values into categorical
and logical variables. Two sets of columns ("sys" and
"dia", "trial1" through "trial4") have been combined into
single variables with multivariate observations. Column
headers have been replaced with more descriptive variable
names. Units have been added where appropriate.
LastName: [100x1 cell array of character vectors]
Sex: [100x1 nominal]
Female Male
53 47
Age: [100x1 double, Units = Yrs]
min 1st Q median 3rd Q max
25 32 39 44 50
Weight: [100x1 double, Units = Lbs]
min 1st Q median 3rd Q max
111 130.5000 142.5000 180.5000 202
Smoker: [100x1 logical]
true false
34 66
BloodPressure: [100x2 double, Units = mm Hg]
Systolic/Diastolic
min 109 68
1st Q 117.5000 77.5000
median 122 81.5000
3rd Q 127.5000 89
max 138 99
Trials: [100x1 cell, Units = Counts]
From zero to four measurement trials performed
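Because the dataset data type is not recommended, the same data can also be summarized as a table. The following is a minimal sketch, assuming Statistics and Machine Learning Toolbox is installed so that dataset2table is available. The printed layout of the table summary differs from the dataset output shown above:

```matlab
load hospital                  % sample dataset array named hospital
T = dataset2table(hospital);   % convert the dataset array to a table
summary(T)                     % table version of summary; prints per-variable information
```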