increase the speed of a certain code

Jwana on 19 Dec 2012
Closed: MATLAB Answer Bot on 20 Aug 2021
Hi,
I have code that calls a function to determine the Type of each gene in a system (I can find it by comparing the order of each gene with its children and parents). The code works fine with small cell arrays, but when the arrays grow to thousands of entries it takes hours. The code is:
Types=[];
type1=level1_root; % it is fixed value (GO:0008150)
for k=1:100
    type{k}=type_fc(p1,c1,type1);
    type1=type{k}';
    temp1=num2cell(repmat(k+1,length(type1),1));
    type1=[type1 temp1];
    Types=[Types; type1];
end
Types

function type=type_fc(p1,c1,type1)
type=[];
for j=1:length(type1)
    for i=1:length(p1)
        a=[p1(i),c1(i)];
        if isequal(a(1), type1(j))
            type=[type a(2)];
        end
    end
end
For 13 genes I have this sample data:
p1' = % parent genes
'GO:0008150'
'GO:0016740'
'GO:0016787'
'GO:0008150'
'GO:0016740'
'GO:0016740'
'GO:0016787'
'GO:0016787'
'GO:0016787'
'GO:0006810'
'GO:0006412'
'GO:0004672'
c1' = % child genes
'GO:0016740'
'GO:0016787'
'GO:0006810'
'GO:0006412'
'GO:0004672'
'GO:0016779'
'GO:0004386'
'GO:0003774'
'GO:0016298'
'GO:0016192'
'GO:0005215'
'GO:0030533'
and the result will be:
Types =
'GO:0016740' [2]
'GO:0006412' [2]
'GO:0016787' [3]
'GO:0004672' [3]
'GO:0016779' [3]
'GO:0005215' [3]
'GO:0006810' [4]
'GO:0004386' [4]
'GO:0003774' [4]
'GO:0016298' [4]
'GO:0030533' [4]
'GO:0016192' [5]
So, do you have any idea how to increase the speed of this code?
Thanks.

Answers (2)

Jan on 19 Dec 2012
Edited: Jan on 19 Dec 2012
Letting an array grow inside a loop is a bad idea, because MATLAB has to allocate memory for the new array and copy the old values on every iteration. When the result has 1000 elements, MATLAB has to allocate and copy a total of sum(1:1000) = 500'500 elements, which is 4'004'000 bytes when the data have the type double at 8 bytes per element.
Look for the term "preallocation" or "pre-allocation" to find more explanations and examples.
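As a rough illustration of the point above, the following sketch builds the same vector twice, once by growing it and once with pre-allocation, and works out the allocation count from the text. The variable names (`grown`, `filled`, `copiedElements`, `copiedBytes`) are made up for this example and do not come from the question.

```matlab
n = 1000;

% Growing: every append allocates a new, larger array and copies the old one.
grown = [];
for k = 1:n
    grown = [grown, k];   %#ok<AGROW> deliberately the slow pattern
end

% Pre-allocated: one allocation up front, then in-place writes.
filled = zeros(1, n);
for k = 1:n
    filled(k) = k;
end

% Elements allocated in total by the growing version, and the
% corresponding size for doubles (8 bytes per element).
copiedElements = sum(1:n);
copiedBytes    = copiedElements * 8;
fprintf('%d elements, %d bytes\n', copiedElements, copiedBytes);
```

Both loops produce the same vector; only the number of allocations differs. Wrapping each loop in tic/toc with a larger `n` should make the difference visible, although the exact timings depend on the MATLAB release.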
Another point: avoid all unnecessary work. The temporary a = [p1(i),c1(i)] is not needed, so do not create it; index p1 and c1 directly:
function type = type_fc(p1,c1,type1)
type=[];
for j = 1:length(type1)
    for i = 1:length(p1)
        if isequal(p1(i), type1(j))
            type = [type, c1(i)];
        end
    end
end
And with a pre-allocation:
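function type = type_fc(p1, c1, type1)
% c1 is a cell array, so pre-allocate a cell (not zeros) of the
% maximum possible size: every parent could match every entry.
type = cell(1, length(type1) * length(p1));
count = 0;
for j = 1:length(type1)
    for i = 1:length(p1)
        if isequal(p1(i), type1(j))
            count = count + 1;
            type(count) = c1(i);
        end
    end
end
type = type(1:count);  % drop the unused pre-allocated slots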
An equivalent method is required in the main function also.
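One way the main loop could avoid growing `Types` is sketched below: each level is stored in a pre-allocated cell and everything is concatenated once at the end. This is only a sketch under some assumptions: `levels`, `nLevels`, `current` and `maxLevels` are names invented here, the loop stops at the first empty level (the original always runs 100 iterations), and `type_fc` is written as a local function at the end of a script, which needs R2016b or later.

```matlab
% Sample parent/child pairs from the question.
p1 = {'GO:0008150','GO:0016740','GO:0016787','GO:0008150', ...
      'GO:0016740','GO:0016740','GO:0016787','GO:0016787', ...
      'GO:0016787','GO:0006810','GO:0006412','GO:0004672'};
c1 = {'GO:0016740','GO:0016787','GO:0006810','GO:0006412', ...
      'GO:0004672','GO:0016779','GO:0004386','GO:0003774', ...
      'GO:0016298','GO:0016192','GO:0005215','GO:0030533'};

current   = {'GO:0008150'};   % root gene, level 1
maxLevels = 100;

levels  = cell(maxLevels, 1); % one slot per level, allocated once
nLevels = 0;
for k = 1:maxLevels
    children = type_fc(p1, c1, current);
    if isempty(children)
        break;                % assumption: no children means no deeper level
    end
    nLevels = nLevels + 1;
    % N-by-2 cell: gene name in column 1, level number in column 2.
    levels{nLevels} = [children(:), ...
        num2cell(repmat(k + 1, numel(children), 1))];
    current = children;       % the next level starts from these genes
end
Types = vertcat(levels{1:nLevels});   % single concatenation at the end

function type = type_fc(p1, c1, type1)
% Collect the children of every gene in type1, in type1 order.
type  = cell(1, numel(type1) * numel(p1));
count = 0;
for j = 1:numel(type1)
    for i = 1:numel(p1)
        if isequal(p1(i), type1(j))
            count = count + 1;
            type(count) = c1(i);
        end
    end
end
type = type(1:count);         % drop the unused slots
end
```

With the sample data this yields the same 12-by-2 `Types` cell as listed in the question, from 'GO:0016740' at level 2 down to 'GO:0016192' at level 5.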

Matt J on 19 Dec 2012
In addition to what Jan said, avoid unnecessary repeated memory allocations inside for-loops, like
a=[p1(i),c1(i)];
and repeated table-lookups like type1(j). The following should be a better version of type_fc.
function type=type_fc(p1,c1,type1)
type=cell(1,length(type1)*length(p1));
k=0;
for j=1:length(type1)
    tj=type1(j);
    for i=1:length(p1)
        k=k+1;
        if isequal(p1(i), tj)
            type{k}= c1(i);
        end
    end
end
type=[type{:}];
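A different technique from the loop versions in this thread is to replace the inner comparison loop with a single `ismember` call. The sketch below is not taken from either answer. It assumes that `p1`, `c1` and `type1` are cell arrays of strings and that `type1` contains no duplicates; the names `isParent`, `loc`, `idx` and `order` are invented for the example.

```matlab
% Sample parent/child pairs from the question.
p1 = {'GO:0008150','GO:0016740','GO:0016787','GO:0008150', ...
      'GO:0016740','GO:0016740','GO:0016787','GO:0016787', ...
      'GO:0016787','GO:0006810','GO:0006412','GO:0004672'};
c1 = {'GO:0016740','GO:0016787','GO:0006810','GO:0006412', ...
      'GO:0004672','GO:0016779','GO:0004386','GO:0003774', ...
      'GO:0016298','GO:0016192','GO:0005215','GO:0030533'};

type1 = {'GO:0016740','GO:0006412'};   % the level-2 genes

% loc(i) is the position of p1{i} within type1, or 0 if it is absent.
[isParent, loc] = ismember(p1, type1);
idx = find(isParent);

% sort keeps equal elements in their original order, so the children
% end up grouped by parent in type1 order, as in the nested loops.
[~, order] = sort(loc(idx));
type = c1(idx(order));

disp(type');
```

For the level-2 genes this returns the four level-3 genes in the same order as the question's result. Whether it is actually faster on thousands of entries would need to be measured on the real data.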
