The dataset data type might be removed in
a future release. To work with heterogeneous data, use the MATLAB® table data
type instead. See MATLAB table documentation
for more information.

Syntax

C = union(A,B)
C = union(A,B,vars)
C = union(A,B,vars,setOrder)
[C,iA,iB] = union(___)

Description

C = union(A,B) for dataset arrays A and B returns
the combined set of observations from the two arrays, with repetitions
removed. The observations in the dataset array C are
sorted.

C = union(A,B,vars) returns
the combined set of observations from the two arrays, treating observations
as repeated when they have the same combination of values in the variables
specified in vars, and keeping one observation per unique combination.
The observations in the dataset array C are sorted
by those variables.

The values for variables not specified in vars for
each observation in C are taken from the corresponding
observation in A or B, or from A if
there are common observations in both A and B.
If there are multiple observations in A or B that
correspond to an observation in C, those values
are taken from the first occurrence.
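
The following is a minimal sketch of the vars behavior described above, not an excerpt from the toolbox documentation. The variable names id and score and all data values are made up for illustration, and the dataset constructor call assumes the 'VarNames' name-value syntax.

```matlab
% Hypothetical data: id is the key variable, score is a non-key variable.
A = dataset([3;1;2],[30;10;20],'VarNames',{'id','score'});
B = dataset([2;4],[99;40],'VarNames',{'id','score'});

% Remove repetitions based on id only. The observation with id = 2
% occurs in both A and B with different score values.
C = union(A,B,{'id'});

% Inspect the result: one observation per unique id, sorted by id.
ids = C.id;
scores = C.score;
```

According to the description above, the score for id = 2 should come from A (20) rather than from B (99), because the observation is common to both arrays.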

C = union(A,B,vars,setOrder) returns
the observations in C in the order specified
by setOrder.
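
As a small illustration of setOrder, the sketch below compares the default 'sorted' order with 'stable'. The data and the variable names id and score are hypothetical.

```matlab
% Hypothetical data with ids deliberately out of order in A.
A = dataset([3;1;2],[30;10;20],'VarNames',{'id','score'});
B = dataset([2;4],[99;40],'VarNames',{'id','score'});

% Default behavior: observations sorted by the variables in vars.
Csorted = union(A,B,{'id'},'sorted');

% 'stable': observations keep the order in which they appear in A, then B.
Cstable = union(A,B,{'id'},'stable');

sortedIds = Csorted.id;
stableIds = Cstable.id;
```

Both results should contain the same four observations; only their order is expected to differ.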

[C,iA,iB] = union(___) also returns index vectors iA and iB such
that C is a sorted combination of the values A(iA,:) and B(iB,:).
If there are common observations in A and B,
then union returns only the index from A,
in iA. If there are repeated observations in A or B,
then the index of the first occurrence is returned. You can use any
of the previous input arguments.
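
The index outputs can be explored with a sketch like the one below. The data are hypothetical; A deliberately contains a repeated id (rows 2 and 4) and shares id = 2 with B, so both rules described above come into play.

```matlab
% Hypothetical data: id = 1 is repeated in A, id = 2 is common to A and B.
A = dataset([3;1;2;1],[30;10;20;11],'VarNames',{'id','score'});
B = dataset([2;4],[99;40],'VarNames',{'id','score'});

[C,iA,iB] = union(A,B,{'id'});

% Rebuild the contributing observations from the index vectors.
% D should hold the same observations as C, possibly in a different order.
D = [A(iA,:); B(iB,:)];
rebuiltIds = sort(D.id);
unionIds = C.id;
```

Based on the description, iA should refer to rows 1, 2, and 3 of A (row 4 repeats id = 1), and iB should refer only to row 2 of B, because id = 2 is already contributed by A.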

Input Arguments

A,B

Input dataset arrays.

vars

Cell array of strings containing variable names or a vector
of integers containing variable column numbers, indicating the variables
that union uses to identify repeated observations. Observations with the
same combination of values in these variables count as repetitions.

Specify vars as [] to
use its default value of all variables.
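
The sketch below shows the forms of vars described above: a cell array of names, a column number, and [] as a placeholder so that setOrder can be supplied while still comparing all variables. The data and variable names are hypothetical.

```matlab
% Hypothetical data: the observation (2,20) appears in both A and B.
A = dataset([3;1;2],[30;10;20],'VarNames',{'id','score'});
B = dataset([2;4],[20;40],'VarNames',{'id','score'});

% Variable name and column number should select the same variable.
Cname = union(A,B,{'id'});
Ccol = union(A,B,1);

% [] uses all variables, which leaves room to pass setOrder.
Call = union(A,B,[],'stable');

nameIds = Cname.id;
colIds = Ccol.id;
allIds = Call.id;
```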

setOrder

Flag indicating the sorting order for the observations in C.
The possible values of setOrder are:

'sorted'

Observations in C are in sorted order
(default).

'stable'

Observations in C are in the same order
that they appear in A, then B.

Output Arguments

C

Dataset array with the combined observations of A and B,
with repetitions removed. C is in sorted order
(by default), or the order specified by setOrder.

iA

Index vector, indicating the observations in A that
contribute to the union. iA contains the index
to the first occurrence of any repeated observations in A.

iB

Index vector, indicating the observations in B that
contribute to the union. If there are common observations in A and B,
then union returns only the index from A,
in iA. iB contains the index
to the first occurrence of any repeated observations in B.

Examples

Navigate to the folder containing sample data, and load
sample data.

cd(matlabroot)
cd('help/toolbox/stats/examples')
A = dataset('XLSFile','hospitalSmall.xlsx');
B = dataset('XLSFile','hospitalSmall.xlsx','Sheet',2);
[length(A) length(B)]

ans =
14 8

The first dataset array, A, has 14 observations.
The second dataset array, B, has 8 observations.

Return the union.

C = union(A,B);
length(C)

ans =
21

The union of the two dataset arrays has 21 observations, one fewer than
the 14 + 8 = 22 observations in the inputs, indicating
that one observation appears in both A and B.