Documentation Center

  • Trial Software
  • Product Updates

join

Merge two tables by matching up rows using key variables

Syntax

Description

example

C = join(A,B) merges tables A and B by matching up rows using all the variables with the same name as key variables. The key values must be common to both A and B, except for order.

join retains all the variables from A and appends the corresponding contents from the nonkey variables of B.

example

C = join(A,B,Name,Value) joins the tables with additional options specified by one or more Name,Value pair arguments.

For example, you can specify the variables to use as key variables.

example

[C,ib] = join(___) also returns an index vector, ib, such that each element of ib identifies the row in B that corresponds to that row in the output table, C. You can use this syntax with any of the input arguments of the previous syntaxes.

Examples

expand all

Append Values from One Table to Another

Create a table, A.

A = table({'John','Jane','Jim','Jerry','Jill'}',[1;2;1;2;1],...
    'VariableNames',{'Employee' 'Department'})
A = 

    Employee    Department
    ________    __________

    'John'      1         
    'Jane'      2         
    'Jim'       1         
    'Jerry'     2         
    'Jill'      1         

Create a table, B, with a variable in common with A.

B = table([1 2]',{'Mary' 'Mike'}',...
    'VariableNames',{'Department' 'Manager'})
B = 

    Department    Manager
    __________    _______

    1             'Mary' 
    2             'Mike' 

Create a new table, C, containing data from tables A and B. Use the join function to repeat and append Manager data from table B to the data from table A based on the key variable, Department.

C = join(A,B)
C = 

    Employee    Department    Manager
    ________    __________    _______

    'John'      1             'Mary' 
    'Jane'      2             'Mike' 
    'Jim'       1             'Mary' 
    'Jerry'     2             'Mike' 
    'Jill'      1             'Mary' 

Merge Tables with One Variable in Common

Create a table, A.

A = table([5;12;23;2;6],...
    {'cereal';'pizza';'salmon';'cookies';'pizza'},...
    'VariableNames',{'Age','FavoriteFood'},...
    'RowNames',{'Amy','Bobby','Holly','Harry','Sally'})
A = 

             Age    FavoriteFood
             ___    ____________

    Amy       5     'cereal'    
    Bobby    12     'pizza'     
    Holly    23     'salmon'    
    Harry     2     'cookies'   
    Sally     6     'pizza'     

Create a table, B, with one variable in common with A.

B = table({'cereal';'cookies';'pizza';'salmon';'cake'},...
    [110;160;140;367;243],...
    {'B';'D';'B-';'A';'C-'},...
    'VariableNames',{'FavoriteFood','Calories','NutritionGrade'})
B = 

    FavoriteFood    Calories    NutritionGrade
    ____________    ________    ______________

    'cereal'        110         'B'           
    'cookies'       160         'D'           
    'pizza'         140         'B-'          
    'salmon'        367         'A'           
    'cake'          243         'C-'          

Create a new table, C, with data from tables A and B. Use FavoriteFood as a key variable to the join function.

C = join(A,B)
C = 

             Age    FavoriteFood    Calories    NutritionGrade
             ___    ____________    ________    ______________

    Amy       5     'cereal'        110         'B'           
    Bobby    12     'pizza'         140         'B-'          
    Holly    23     'salmon'        367         'A'           
    Harry     2     'cookies'       160         'D'           
    Sally     6     'pizza'         140         'B-'          

Table C does not include information from the last row of table B about 'cake' since there is no corresponding entry in table A.

Merge Tables by Specifying One Key Variable

Create a table, A.

A = table([10;4;2;3;7],[5;4;9;6;1],[10;3;8;8;4])
A = 

    Var1    Var2    Var3
    ____    ____    ____

    10      5       10  
     4      4        3  
     2      9        8  
     3      6        8  
     7      1        4  

Create a table, B, giving Var2 the same contents as Var2 from table A.

B = table([6;1;1;6;8],[5;4;9;6;1])
B = 

    Var1    Var2
    ____    ____

    6       5   
    1       4   
    1       9   
    6       6   
    8       1   

Create a new table, C, containing data from tables A and B. Use Var1 in table A and Var2 in table B as key variables to the join function.

C = join(A,B,'Keys','Var2')
C = 

    Var1_A    Var2    Var3    Var1_B
    ______    ____    ____    ______

    10        5       10      6     
     4        4        3      1     
     2        9        8      1     
     3        6        8      6     
     7        1        4      8     

join adds a unique suffix to the nonkey variable Var1 to distinguish the data from tables A and B.

Keep One Copy of Nonkey Variables

Create a new table with data from tables A and B. If any nonkey variables have the same name in both tables, keep only the copy from table A.

Create a table, A.

A = table([10;4;2;3;7],[5;4;9;6;1])
A = 

    Var1    Var2
    ____    ____

    10      5   
     4      4   
     2      9   
     3      6   
     7      1   

Create a table, B, giving Var2 the same contents as Var2 from table A.

B = table([6;1;1;6;8],[5;4;9;6;1],[10;3;8;8;4])
B = 

    Var1    Var2    Var3
    ____    ____    ____

    6       5       10  
    1       4        3  
    1       9        8  
    6       6        8  
    8       1        4  

Create a new table, C, with data from tables A and B. Use Var2 as a key variable to the join function and keep only the copy of Var1 from table A.

C = join(A,B,'Keys','Var2','KeepOneCopy','Var1')
C = 

    Var1    Var2    Var3
    ____    ____    ____

    10      5       10  
     4      4        3  
     2      9        8  
     3      6        8  
     7      1        4  

C does not contain the Var1 data from table B.

Merge Tables Using Row Names as Keys

Create a table, A.

A = table(['M';'M';'F';'F';'F'],[38;43;38;40;49],...
    'VariableNames',{'Gender' 'Age'},...
    'RowNames',{'Smith' 'Johnson' 'Williams' 'Jones' 'Brown'})
A = 

                Gender    Age
                ______    ___

    Smith       M         38 
    Johnson     M         43 
    Williams    F         38 
    Jones       F         40 
    Brown       F         49 

Create a table, B, such that there is a one-to-one correspondence between the rows of A and the rows of B.

B = table([64;69;67;71;64],...
    [119;163;133;176;131],...
    [122 80; 109 77; 117 75; 124 93; 125 83],...
    'VariableNames',{'Height' 'Weight' 'BloodPressure'},...
    'RowNames',{'Brown' 'Johnson' 'Jones' 'Smith' 'Williams'})
B = 

                Height    Weight     BloodPressure 
                ______    ______    _______________

    Brown       64        119       122          80
    Johnson     69        163       109          77
    Jones       67        133       117          75
    Smith       71        176       124          93
    Williams    64        131       125          83

Create a new table, C, with data from tables A and B. Use the row names as keys to the join function.

C = join(A,B,'Keys','RowNames')
C = 

                Gender    Age    Height    Weight     BloodPressure 
                ______    ___    ______    ______    _______________

    Smith       M         38     71        176       124          93
    Johnson     M         43     69        163       109          77
    Williams    F         38     64        131       125          83
    Jones       F         40     67        133       117          75
    Brown       F         49     64        119       122          80

The rows of C are in the same order as A.

Merge Tables Using Left and Right Keys

Create a table, A.

A = table([10;4;2;3;7],[5;4;9;6;1],[10;3;8;8;4])
A = 

    Var1    Var2    Var3
    ____    ____    ____

    10      5       10  
     4      4        3  
     2      9        8  
     3      6        8  
     7      1        4  

Create a table, B, giving Var2 the same contents as Var1 from table A, except for order.

B = table([6;1;1;6;8],[2;3;4;7;10])
B = 

    Var1    Var2
    ____    ____

    6        2  
    1        3  
    1        4  
    6        7  
    8       10  

Create a new table, C, containing data from tables A and B. Use Var1 from table A with Var2 from table B as key variables to the join function.

[C,ib] = join(A,B,'LeftKeys',1,'RightKeys',2)
C = 

    Var1_A    Var2    Var3    Var1_B
    ______    ____    ____    ______

    10        5       10      8     
     4        4        3      1     
     2        9        8      6     
     3        6        8      1     
     7        1        4      6     


ib =

     5
     3
     1
     2
     4

C is the horizontal concatenation of A and B(ib,2).

Input Arguments

expand all

A,B — Input tablestables

Input tables, specified as tables. For all key variables, each row of A must match exactly one row in B.

Name-Value Pair Arguments

Specify optional comma-separated pairs of Name,Value arguments. Name is the argument name and Value is the corresponding value. Name must appear inside single quotes (' '). You can specify several name and value pair arguments in any order as Name1,Value1,...,NameN,ValueN.

Example: 'Keys',2 uses the second variable in A and the second variable in B as key variables.

'Keys' — Variables to use as keyspositive integer | vector of positive integers | variable name | cell array of variable names | logical vector | 'RowNames'

Variables to use as keys, specified as the comma-separated pair consisting of 'Keys' and a positive integer, vector of positive integers, variable name, cell array of variable names, logical vector, or 'RowNames'.

If you specify the string 'RowNames', then join uses the row names of A and row names of B as keys. In this case, there must be a row in B for every row in A.

You cannot use the 'Keys' name-value pair argument with the 'LeftKeys' and 'RightKeys' name-value pair arguments.

Example: 'Keys',[1 3] uses the first and third variables from A and B as a key variables.

'LeftKeys' — Variables to use as keys in Apositive integer | vector of positive integers | variable name | cell array of variable names | logical vector

Variables to use as keys in A, specified as the comma-separated pair consisting of 'LeftKeys' and a positive integer, vector of positive integers, variable name, cell array of variable names, or logical vector.

You must use the 'LeftKeys' name-value pair argument in conjunction with the 'RightKeys' name-value pair argument. 'LeftKeys' and 'RightKeys' both must specify the same number of key variables. join pairs key values based on their order.

Example: 'LeftKeys',1 uses only the first variable in A as a key variable.

'RightKeys' — Variables to use as keys in Bpositive integer | vector of positive integers | variable name | cell array of variable names | logical vector

Variables to use as keys in B, specified as the comma-separated pair consisting of 'RightKeys' and a positive integer, vector of positive integers, variable name, cell array of variable names, or logical vector.

You must use the 'RightKeys' name-value pair argument in conjunction with the 'LeftKeys' name-value pair argument. 'LeftKeys' and 'RightKeys' both must specify the same number of key variables. join pairs values in A and B based on their order.

Example: 'RightKeys',3 uses only the third variable in B as a key variable.

'LeftVariables' — Variables from A to include in Cpositive integer | vector of positive integers | variable name | cell array containing one or more variable names | logical vector

Variables from A to include in C, specified as the comma-separated pair consisting of 'LeftVariables' and a positive integer, vector of positive integers, variable name, cell array of variable names, or logical vector.

You can use 'LeftVariables'to include or exclude key variables, as well as nonkey variables from the output, C.

By default, join includes all variables from A.

'RightVariables' — Variables from B to include in Cpositive integer | vector of positive integers | variable name | cell array containing one or more variable names | logical vector

Variables from B to include in C, specified as the comma-separated pair consisting of 'RightVariables' and a positive integer, vector of positive integers, variable name, cell array of variable names, or logical vector.

You can use 'RightVariables' to include or exclude key variables, as well as nonkey variables from the output, C.

By default, join includes all the variables from B except the key variables.

'KeepOneCopy' — Variables for which join retains only the copy from Avariable name | cell array containing one or more variable names

Variables for which join retains only the copy from A, specified as the comma-separated pair consisting of 'KeepOneCopy' and a variable name or a cell array containing one or more variable names.

Key variables appear once in C, but if nonkey variables with identical names occur in A and B, then join retains both copies in C by default. You must use the 'KeepOneCopy' name-value pair to retain only the copy from A.

Example: 'KeepOneCopy',Var2 keeps only the copy from A of the nonkey variable Var2.

Output Arguments

expand all

C — Merged data from A and Btable

Merged data from A and B, returned as a table. The table, C, contains one row for each row in A, appearing in the same order. If A and B contain nonkey variables with the same name, join adds a unique suffix to the corresponding variable names in C, unless you specify the 'KeepOneCopy' name-value pair argument.

join creates C by horizontally concatenating A(:,LeftVars) and B(ib,RightVars). By default, LeftVars is all the variables of A, and RightVars is all the nonkey variables from B. Otherwise, LefttVars consists of the variables specified by the 'LeftVariables' name-value pair argument, and RightVars consists of the variables specified by the 'RightVariables' name-value pair argument.

You can store additional metadata such as descriptions, variable units, variable names, and row names in the output table, C. For more information, see Table Properties.

ib — Index to Bcolumn vector

Index to B, returned as a column vector. Each element of ib identifies the row in B that corresponds to that row in the output table, C.

More About

expand all

Key Variable

Variable used to match and combine data between the input tables, A and B.

Key Value

Entry in a key variable of A.

Algorithms

The join function first finds one or more key variables. Then, join uses the key variables to find the row in input table B that matches each row in input table A, and combines those rows to create a row in output table C.

  • If there is a one-to-one mapping between key values in A and B, then join sorts the data in B and appends it to table A.

  • If there is a many-to-one mapping between key values in A and B, then join sorts and repeats the data in B before appending it to table A.

  • If there is data in a key variable of B that does not map to a key value in A, then join does not include that data in the output table, C.

See Also

|

Was this topic helpful?