Is there any way to speed this code up?
I have a large data set with only 4 columns, currently stored in a dataset array, and I want to separate the observations. I have indexed the different groups of observations by 1, 2, 3, ..., 12984. What I want is a numeric matrix that holds each of the 12984 groups in its own set of columns, with that group's observations as rows, since I can operate on a numeric matrix faster and more efficiently and it can hold repeated observations. The problem is that the groups differ in size: some are (616,3), others (500,3), and so on.
This is the code I have so far:
mat = ones(200,3);
i = 1;
while i <= 10
    c = ds(ds.newid == i, {'permno','monthlycumlnret','dates',});
    d = cat(1, [ double(c) ]);
    if length(d) == length(mat);
        mat = [mat d];
        if length(d) > length(mat)
            z = ones(length(d)-length(mat),3);
            mat = [vertcat(1,z,mat) d];
        else
            z = zeros(length(mat)-length(d),3);
            mat = [mat vertcat(1,z,d)];
        end
        i = i + 1;
    end
end
I haven't even run it up to 100 and it is already taking forever. Can someone please help me? I am a bit new to MATLAB, so any help is appreciated.
5 Comments
Guillaume
on 15 Jan 2016
In addition, do not use length on 2D arrays. If an array has fewer rows than columns, length will return the number of columns, which is not what you want in your code above.
Always be explicit. You want the number of rows, so use
size(mat, 1)
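A small sketch of the difference, using a made-up 2-by-5 matrix (the variable names are only for illustration):

```matlab
% Hypothetical 2-by-5 matrix: fewer rows than columns
A = zeros(2, 5);

nLongest = length(A);   % largest dimension, here the number of columns
nRows    = size(A, 1);  % always the number of rows
nCols    = size(A, 2);  % always the number of columns

fprintf('length: %d, rows: %d, columns: %d\n', nLongest, nRows, nCols);
```

Here length(A) agrees with size(A, 2), not with the row count, which is why comparing length(d) with length(mat) can silently compare the wrong dimensions.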
Accepted Answer
Guillaume
on 15 Jan 2016
Edited: Guillaume
on 15 Jan 2016
Your code has bugs (the vertcat calls all have an invalid 1 as first argument, and you're using length with arbitrarily sized matrices) and sometimes doesn't make much sense (d = cat(1, [ double(c) ]) is the same as d = double(c)). In any case, the way you're approaching the problem is very inefficient.
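To illustrate the two points about cat and vertcat, here is a minimal sketch with made-up stand-ins for d and z (magic(3) is only a placeholder for double(c), not your data):

```matlab
% Made-up stand-ins for the variables in the question
d = magic(3);       % pretend this is double(c), a 3-by-3 block
z = zeros(2, 3);    % two padding rows with the same 3 columns

% cat with a single input simply returns that input
sameAsD = isequal(cat(1, [d]), d);

% vertcat needs every input to have the same number of columns,
% so a leading scalar 1 cannot be stacked on top of 3-column blocks
try
    bad = vertcat(1, z, d);
    failed = false;
catch err
    failed = true;
    disp(err.message);
end

% Without the stray 1, the padding stacks as intended
padded = vertcat(z, d);   % 5-by-3
```

Dropping the 1 is enough to make the padding itself valid, although growing mat inside the loop remains slow.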
I'm not familiar with dataset or nominal arrays (both have been deprecated), but the following should work for you:
If you have R2015b, there are two new functions that make it fairly easy to do what you want, findgroups and splitapply:
groupeddata = splitapply(@(subd) {subd}, double(ds(:, {'permno','monthlycumlnret','dates'})), findgroups(ds.newid)) %creates a cell array with one cell per id
maxheight = max(cellfun(@(m) size(m, 1), groupeddata)); %get height of output matrix
groupeddata = cellfun(@(m) [m; ones(maxheight - size(m, 1), size(m, 2))], groupeddata, 'UniformOutput', false); %pad every group with rows of ones up to maxheight
groupeddata = [groupeddata{:}] %and concatenate
If you don't have R2015b, then the splitapply step can be replaced by:
groupeddata = arrayfun(@(id) double(ds(ds.newid == id, {'permno','monthlycumlnret','dates'})), unique(ds.newid), 'UniformOutput', false)
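As a rough check of the split, pad and concatenate steps above, here is a sketch on toy numeric data. The ids and data variables are assumptions standing in for ds.newid and the three numeric columns; they are not taken from your dataset array.

```matlab
% Toy stand-ins (assumptions, not the asker's data):
% ids plays the role of ds.newid, data the role of the three numeric columns
ids  = [1; 1; 2; 3; 3; 3];
data = [10 0.1 1; 11 0.2 2; 20 0.3 1; 30 0.4 1; 31 0.5 2; 32 0.6 3];

% One cell per id, each holding that group's rows
groupeddata = arrayfun(@(id) data(ids == id, :), unique(ids), ...
                       'UniformOutput', false);

% Pad every group with rows of ones up to the height of the tallest group
maxheight = max(cellfun(@(m) size(m, 1), groupeddata));
padded = cellfun(@(m) [m; ones(maxheight - size(m, 1), size(m, 2))], ...
                 groupeddata, 'UniformOutput', false);

% Place the groups side by side: 3 columns per id
wide = [padded{:}];
disp(size(wide));   % 3 rows (tallest group) by 9 columns (3 ids x 3 columns)
```

With this layout, the block for group k sits in wide(:, 3*k-2 : 3*k). Padding with ones follows the code above; if ones could be mistaken for real values in your data, NaN might be a safer filler, but that is a design choice rather than part of the original suggestion.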
3 Comments
Guillaume
on 15 Jan 2016