intersect

Class: dataset

Set intersection for dataset array observations

The dataset data type might be removed in a future release. To work with heterogeneous data, use the MATLAB® table data type instead. See MATLAB table documentation for more information.

Syntax

C = intersect(A,B)
C = intersect(A,B,vars)
C = intersect(A,B,vars,setOrder)
[C,iA,iB] = intersect(___)

Description

C = intersect(A,B) for dataset arrays A and B returns the common set of observations from the two arrays, with repetitions removed. The observations in the dataset array C are in sorted order.
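As a small illustration of this syntax, the following sketch builds two dataset arrays from made-up values (the variable names id and name and all data are hypothetical, not part of the shipped sample files) and intersects them. It assumes the Statistics Toolbox dataset class is available.

```matlab
% Hypothetical sample data: both arrays have the same variables.
A = dataset({[3;1;2;1],'id'},{{'c';'a';'b';'a'},'name'});
B = dataset({[1;3;4],'id'},{{'a';'c';'d'},'name'});

% Common observations, repetitions removed, in sorted order.
C = intersect(A,B);

% The observations (1,'a') and (3,'c') appear in both arrays,
% so C is expected to contain those two rows, sorted by id.
disp(C)
```

The observation (1,'a') occurs twice in A but should appear only once in C.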

C = intersect(A,B,vars) returns the set of common observations from the two arrays, considering only the variables specified in vars, with repetitions removed. The observations in the dataset array C are sorted by those variables.

The values for variables not specified in vars for each observation in C are taken from the corresponding observations in A. If there are multiple observations in A that correspond to an observation in C, then those values are taken from the first occurrence.
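The following sketch, using made-up data, shows how the variables outside vars are filled in. The names id and score are hypothetical, and the expected values in the comments follow from the first-occurrence rule described above.

```matlab
% Hypothetical data: id 2 appears twice in A with different scores.
A = dataset({[2;1;2;3],'id'},{[20;10;25;30],'score'});
B = dataset({[2;3;5],'id'},{[99;98;97],'score'});

% Compare observations on id only.
C = intersect(A,B,{'id'});

% ids 2 and 3 are common. The score values come from A, and for
% id 2 from its first occurrence in A (score 20, not 25).
disp(C)
```

None of the score values from B are expected to appear in C, because score is not listed in vars.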

C = intersect(A,B,vars,setOrder) returns the observations in C in the order specified by setOrder.

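[C,iA,iB] = intersect(___) also returns index vectors iA and iB such that C = A(iA,:) and C = B(iB,:), using any of the input arguments in the previous syntaxes. If there are repeated observations in A or B, then intersect returns the index of the first occurrence. When you specify vars, C matches B(iB,:) only in the variables listed in vars, because the values of the remaining variables are taken from A.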
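A minimal sketch of the index outputs, using hypothetical data with repeated observations in both arrays (the variable names id and grp are made up for illustration):

```matlab
% Hypothetical data: (5,1) is repeated in both A and B.
A = dataset({[5;4;5;6],'id'},{[1;2;1;3],'grp'});
B = dataset({[6;5;5],'id'},{[3;1;1],'grp'});

[C,iA,iB] = intersect(A,B);

% Per the first-occurrence rule, iA should point at rows 1 and 4 of A,
% and iB at rows 2 and 1 of B.
disp(iA')
disp(iB')

% Indexing the inputs with iA and iB should reproduce C.
sameAsA = isequal(C.id,A.id(iA)) && isequal(C.grp,A.grp(iA));
sameAsB = isequal(C.id,B.id(iB)) && isequal(C.grp,B.grp(iB));
```

Both sameAsA and sameAsB are expected to be true here because all variables take part in the comparison.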

Input Arguments

A,B

Input dataset arrays.

vars

Cell array of strings containing variable names or a vector of integers containing variable column numbers, indicating the variables in A and B that intersect considers.

Specify vars as [] to use its default value of all variables.

setOrder

Flag indicating the sorting order for the observations in C. The possible values of setOrder are:

'sorted'    Observations in C are in sorted order (default).
'stable'    Observations in C are in the same order that they appear in A.
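To compare the two flags side by side, this sketch runs intersect twice on the same made-up inputs, passing [] for vars so that all variables are considered. The variable names and values are hypothetical.

```matlab
% Hypothetical data: the common ids 3 and 2 appear in that order in A.
A = dataset({[3;1;2],'id'},{[30;10;20],'score'});
B = dataset({[2;3;7],'id'},{[20;30;70],'score'});

% [] selects the default for vars (all variables).
Csorted = intersect(A,B,[],'sorted');   % expected id order: 2, 3
Cstable = intersect(A,B,[],'stable');   % expected id order: 3, 2 (as in A)

disp(Csorted)
disp(Cstable)
```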

Output Arguments

C

Dataset array with the common set of observations in A and B, with repetitions removed. C is in sorted order (by default), or the order specified by setOrder.

iA

Index vector, indicating the observations in A that are common to B. The vector iA contains the index to the first occurrence of any repeated observations in A.

iB

Index vector, indicating the observations in B that are common to A. The vector iB contains the index to the first occurrence of any repeated observations in B.

Examples

Intersection of Two Dataset Arrays

Navigate to the folder containing sample data, and load sample data.

cd(matlabroot)
cd('help/toolbox/stats/examples')

A = dataset('XLSFile','hospitalSmall.xlsx');
B = dataset('XLSFile','hospitalSmall.xlsx','Sheet',2); 

Return the intersection and index vectors.

[C,iA,iB] = intersect(A,B);
C
C = 

    id               name           sex        age    wgt    smoke
    'TRW-072'        'WHITE'        'm'        39     202    1    

There is one observation in common between A and B.

Find the observation in the original dataset arrays.

A(iA,:)
ans = 

    id               name           sex        age    wgt    smoke
    'TRW-072'        'WHITE'        'm'        39     202    1    

B(iB,:)
ans = 

    id               name           sex        age    wgt    smoke
    'TRW-072'        'WHITE'        'm'        39     202    1    
