Note: This page has been translated by MathWorks. Please click here

To view all translated materals including this page, select Japan from the country navigator on the bottom of this page.

To view all translated materals including this page, select Japan from the country navigator on the bottom of this page.

**The dataset data type might be removed in
a future release. To work with heterogeneous data, use the MATLAB ^{®} table data
type instead. See MATLAB table documentation
for more information.**

`C = join(A,B)`

C = join(A,B,keys)

C = join(A,B,* param1*,

`val1`

`param2`

`val2`

[C,IB] = join(...)

C = join(A,B,'Type',TYPE,...)

C = join(A,B,'Type',TYPE,'MergeKeys',true,...)

[C,IA,IB] = join(A,B,'Type',TYPE,...)

`C = join(A,B)`

creates a
dataset array `C`

by merging observations from the
two dataset arrays `A`

and `B`

. `join`

performs
the merge by first finding *key variables*, that
is, pairs of dataset variables, one in `A`

and one
in `B`

, that share the same name. Each observation
in `B`

must contain a unique combination of values
in the key variables, and must contain all combinations of values
that are present in the keys from `A`

. `join`

then
uses these key variables to define a many-to-one correspondence between
observations in `A`

and those in `B`

. `join`

uses
this correspondence to replicate the observations in `B`

and
combine them with the observations in `A`

to create `C`

.

`C = join(A,B,keys)`

performs
the merge using the variables specified by `keys`

as
the key variables in both `A`

and `B`

. `keys`

is
a positive integer, a vector of positive integers, a variable name,
a cell array of variable names, or a logical vector.

`C`

contains one observation for each observation
in `A`

. Variables in `C`

include
all of the variables from `A`

, as well as one variable
corresponding to each variable in `B`

(except for
the keys from `B`

). If `A`

and `B`

contain
variables with identical names, `join`

adds the suffix `'_left'`

and `'_right'`

to
the corresponding variables in `C`

.

`C = join(A,B,`

specifies
optional parameter name/value pairs to control how the dataset variables
in * param1*,

`val1`

`param2`

`val2`

`A`

and `B`

are used in the merge.
Parameters are:`'Keys'`

— Specifies the variables to use as keys in both`A`

and`B`

.`'LeftKeys'`

— Specifies the variables to use as keys in`A`

.`'RightKeys'`

— Specifies the variables to use as keys in`B`

.

You may provide either the `'Keys'`

parameter,
or both the `'LeftKeys'`

and `'RightKeys'`

parameters.
The value for these parameters is a positive integer, a vector of
positive integers, a variable name, a cell array containing variable
names, or a logical vector. `'LeftKeys'`

or `'RightKeys'`

must
both specify the same number of key variables, and `join`

pairs
the left and right keys in the order specified.

`'LeftVars'`

— Specifies which variables from`A`

to include in`C`

. By default,`join`

includes all variables from`A`

.`'RightVars'`

— Specifies which variables from`B`

to include in`C`

. By default,`join`

includes all variables from`B`

except the key variables.

You can use `'LeftVars'`

or `'RightVars'`

to
include or exclude key variables as well as data variables. The value
for these parameters is a positive integer, a vector of positive integers,
a variable name, a cell array containing one or more variable names,
or a logical vector.

`[C,IB] = join(...)`

returns
an index vector `IB`

, where `join`

constructs `C`

by
horizontally concatenating `A(:,LeftVars)`

and `B(IB,RightVars)`

. `join`

can
also perform more complicated inner and outer join operations that
allow a many-to-many correspondence between `A`

and `B`

,
and allow unmatched observations in either `A`

or `B`

.

`C = join(A,B,'Type',TYPE,...)`

performs
the join operation specified by `TYPE`

. `TYPE`

is
one of `'inner'`

, `'leftouter'`

, `'rightouter'`

, `'fullouter'`

,
or `'outer'`

(which is a synonym for `'fullouter'`

).
For an inner join, `C`

only contains observations
corresponding to a combination of key values that occurred in both `A`

and `B`

.
For a left (or right) outer join, `C`

also contains
observations corresponding to keys in `A`

(or `B`

)
that did not match any in `B`

(or `A`

).
Variables in `C`

taken from `A`

(or `B`

)
contain null values in those observations. A full outer join is equivalent
to a left and right outer join. `C`

contains variables
corresponding to the key variables from both `A`

and `B`

,
and `join`

sorts the observations in `C`

by
the key values.

For inner and outer joins, `C`

contains variables
corresponding to the key variables from both `A`

and `B`

by
default, as well as all the remaining variables. `join`

sorts
the observations in the result `C`

by the key values.

`C = join(A,B,'Type',TYPE,'MergeKeys',true,...)`

includes
a single variable in `C`

for each key variable pair
from `A`

and `B`

, rather than including
two separate variables. For outer joins, `join`

creates
the single variable by merging the key values from `A`

and `B`

,
taking values from `A`

where a corresponding observation
exists in `A`

, and from `B`

otherwise.
Setting the `'MergeKeys'`

parameter to `true`

overrides
inclusion or exclusion of any key variables specified via the `'LeftVars'`

or `'RightVars'`

parameter.
Setting the `'MergeKeys'`

parameter to `false`

is
equivalent to not passing in the `'MergeKeys'`

parameter.

`[C,IA,IB] = join(A,B,'Type',TYPE,...)`

returns
index vectors `IA`

and `IB`

indicating
the correspondence between observations in `C`

and
those in `A`

and `B`

. For an inner
join, `join`

constructs `C`

by horizontally
concatenating `A(IA,LeftVars)`

and `B(IB,RightVars)`

.
For an outer join, `IA`

or `IB`

may
also contain zeros, indicating the observations in `C`

that
do not correspond to observations in `A`

or `B`

,
respectively.

Create a dataset array from Fisher's iris data:

load fisheriris NumObs = size(meas,1); NameObs = strcat({'Obs'},num2str((1:NumObs)','%-d')); iris = dataset({nominal(species),'species'},... {meas,'SL','SW','PL','PW'},... 'ObsNames',NameObs);

Create a separate dataset array with the diploid chromosome counts for each species of iris:

snames = nominal({'setosa';'versicolor';'virginica'}); CC = dataset({snames,'species'},{[38;108;70],'cc'}) CC = species cc setosa 38 versicolor 108 virginica 70

Broadcast the data in `CC`

to the rows of `iris`

using
the key variable `species`

in each dataset:

iris2 = join(iris,CC); iris2([1 2 51 52 101 102],:) ans = species SL SW PL PW cc Obs1 setosa 5.1 3.5 1.4 0.2 38 Obs2 setosa 4.9 3 1.4 0.2 38 Obs51 versicolor 7 3.2 4.7 1.4 108 Obs52 versicolor 6.4 3.2 4.5 1.5 108 Obs101 virginica 6.3 3.3 6 2.5 70 Obs102 virginica 5.8 2.7 5.1 1.9 70

Create two datasets and join them using the `'MergeKeys'`

flag:

% Create two data sets that both contain the key variable % 'Key1'. The two arrays contain observations with common % values of Key1, but each array also contains observations % with values of Key1 not present in the other. a = dataset({'a' 'b' 'c' 'e' 'h'}',[1 2 3 11 17]',... 'VarNames',{'Key1' 'Var1'}) b = dataset({'a' 'b' 'd' 'e'}',[4 5 6 7]',... 'VarNames',{'Key1' 'Var2'}) % Combine a and b with an outer join, which matches up % observations with common key values, but also retains % observations whose key values don't have a match. % Keep the key values as separate variables in the result. couter = join(a,b,'key','Key1','Type','outer') % Join a and b, merging the key values as a single variable % in the result. coutermerge = join(a,b,'key','Key1','Type','outer',... 'MergeKeys',true) % Join a and b, retaining only observations whose key % values match. cinner = join(a,b,'key','Key1','Type','inner',... 'MergeKeys',true) a = Key1 Var1 'a' 1 'b' 2 'c' 3 'e' 11 'h' 17 b = Key1 Var2 'a' 4 'b' 5 'd' 6 'e' 7 couter = Key1_left Var1 Key1_right Var2 'a' 1 'a' 4 'b' 2 'b' 5 'c' 3 '' NaN '' NaN 'd' 6 'e' 11 'e' 7 'h' 17 '' NaN coutermerge = Key1 Var1 Var2 'a' 1 4 'b' 2 5 'c' 3 NaN 'd' NaN 6 'e' 11 7 'h' 17 NaN cinner = Key1 Var1 Var2 'a' 1 4 'b' 2 5 'e' 11 7

Was this topic helpful?