Large Cell Array Data Query (USDA FIA data)
Hi, I have two large cell-array data sets (USDA FIA data). I am trying to join them (Data B to Data A) using TRE_CN, the tree number stored as a string (e.g. '212152293031', '212152393031', ...).
I tried two options.
1. for loop and strcmp
Fnl_mat=cell(rows_dataA,6);
Fnl_mat(:,1:5)=dataA;
for i=1:rows_dataA
Qry_mat=strcmp([dataB{:,1}]',dataA{i,1}{1,1});
Fnl_mat(i,6)=dataB(Qry_mat,2);
end
save(filename,'Fnl_mat');
2. getnameidx
idx=getnameidx([dataB{:,1}],[dataA{:,1}]);
Fnl_mat=cell(rows_dataA,6);
Fnl_mat(:,1:5)=dataA;
Fnl_mat(:,6)=dataB(idx,2);
save(filename,'Fnl_mat');
However, both options take far too long (about 10,000 seconds) because of the large number of rows (>30,000 for dataA and >600,000 for dataB). How can I speed this up?
Dataset A
% TRE_CN PLT_CN INVYR SUBP HT
% String String Number Number Number
'291024' '12312' 2009 1 60
'291124' '12312' 2009 1 38
...
...
over 30000 rows
Dataset B
% TRE_CN BIOMASS
% String Number
'220324' 800
'220424' 345
...
...
'291024' 580
'291124' 304
...
...
over 600000 rows
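One possible way to avoid the row-by-row strcmp is a single vectorized ismember call on the two key columns. The following is only a minimal sketch, not a tested solution for the full FIA data: the small dataA and dataB arrays are stand-ins built from the sample rows above (the '999999' row is a hypothetical unmatched tree added for illustration), and the keys are assumed to be plain char vectors in a flat cell array. If each key is instead wrapped in its own 1x1 cell, as dataA{i,1}{1,1} in the loop suggests, the key columns would first need flattening, e.g. with [dataA{:,1}]' as in the original code.

```matlab
% Small stand-ins for the two data sets (values from the sample rows above;
% the '999999' row is hypothetical and has no match in dataB).
dataA = {'291024' '12312' 2009 1 60;
         '291124' '12312' 2009 1 38;
         '999999' '12312' 2009 2 41};
dataB = {'220324' 800;
         '220424' 345;
         '291024' 580;
         '291124' 304};

keysA = dataA(:,1);   % cell array of char vectors (TRE_CN in dataA)
keysB = dataB(:,1);   % cell array of char vectors (TRE_CN in dataB)

% One lookup for all rows: found(i) is true when keysA{i} occurs in keysB,
% and loc(i) is a matching row index into dataB (0 when there is no match).
[found, loc] = ismember(keysA, keysB);

% Build the result: columns 1-5 from dataA, column 6 is BIOMASS from dataB.
Fnl_mat = cell(size(dataA,1), 6);
Fnl_mat(:,1:5) = dataA;
Fnl_mat(found,6) = dataB(loc(found),2);   % unmatched rows keep [] in column 6
```

Unlike the loop, this does not error when a TRE_CN from dataA is missing in dataB; those rows are simply left empty in column 6, and any(~found) can be used to check for them.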
Answers (1)
SUNGHO
on 25 Sep 2012