MATLAB Answers

Robert
0

How to delete/retain variables (columns) from a table based on sections of the variables name

Asked by Robert
on 12 Jan 2018
Latest activity Commented on by Robert
on 12 Jan 2018
Hello,
Wondering if anyone knows how to delete/retain variables (columns) from a table based on portions of the variables name?
This is a simple example, real data has more than 30000 variables (columns) with combinations of names and other information.
AA_001_AMER_USA AA_002_AMER_USA AA_001_AMER_CAN BB_001_AMER_USA BB_002_AMER_CAN
How to delete only variables with name section 'USA'
How to delete only variables with name section '001' & 'USA'
How to retain only variables with name section= 'USA'
How to retain only variables with name section= '001' & 'USA'
Thanks!

  0 Comments

Sign in to comment.

1 Answer

Answer by per isakson
on 12 Jan 2018
Edited by per isakson
on 12 Jan 2018
 Accepted Answer

One way
%%Sample table
tbl = array2table( magic(5) );
tbl.Properties.VariableNames = ...
{ 'A_001_AMER_USA' , 'AA_002_AMER_USA' ...
, 'AA_001_AMER_CAN', 'BB_001_AMER_USA' ...
, 'BB_002_AMER_CAN' };
How to retain only variables with name section 'USA'
cac = regexp( tbl.Properties.VariableNames, '^.+_USA$', 'match' );
T01 = tbl( :, cat(2,cac{:}) );
%%How to retain only variables with name section= '001' & 'USA'
cac = regexp( tbl.Properties.VariableNames, '^[^_]+_001_[^_]+_USA$', 'match' );
T02 = tbl( :, cat(2,cac{:}) );
How to delete only variables with name section 'USA'
cac = regexp( tbl.Properties.VariableNames, '^.+_USA$', 'match' );
T03 = tbl;
T03( :, cat(2,cac{:}) ) = [];
%%How to delete only variables with name section '001' & 'USA'
cac = regexp( tbl.Properties.VariableNames, '^[^_]+_001_[^_]+_USA$', 'match' );
T04 = tbl;
T04( :, cat(2,cac{:}) ) = [];
and a variant, which I guess is faster
How to retain only variables with name section 'USA'
cac = regexp( tbl.Properties.VariableNames, '^.+_USA$' );
T01 = tbl( :, not( cellfun( @isempty, cac ) ) );
%%How to retain only variables with name section= '001' & 'USA'
cac = regexp( tbl.Properties.VariableNames, '^[^_]+_001_[^_]+_USA$' );
T02 = tbl( :, not( cellfun( @isempty, cac ) ) );
How to delete only variables with name section 'USA'
cac = regexp( tbl.Properties.VariableNames, '^.+_USA$' );
T03 = tbl( :, cellfun( @isempty, cac ) );
%%How to delete only variables with name section '001' & 'USA'
cac = regexp( tbl.Properties.VariableNames, '^[^_]+_001_[^_]+_USA$' );
T04 = tbl( :, cellfun( @isempty, cac ) );

  1 Comment

This is great, thank you so much, very appreciated!

Sign in to comment.