Script optimization: for loop, if statement, large dataset

Hello everyone,
Below is the part of my script that I would like to optimize. Any tips would be greatly appreciated.
Note: A = [1200x1200]
Pm = 0.2;
N = length(A);
for Node_i = 1 : N
    if (Int(Node_i) > 0) && (rand <= Pm)
        k_Node_i_in = 0;
        k_Node_i_out = 0;
        for Node_j = 1 : N
            if (Node_i ~= Node_j) && (Child.CID(Node_j) == Child.CID(Node_i))
                k_Node_i_in = k_Node_i_in + A(Node_i, Node_j);
            end
            if (Node_i ~= Node_j) && (Child.CID(Node_j) ~= Child.CID(Node_i))
                k_Node_i_out = k_Node_i_out + A(Node_i, Node_j);
            end
        end
    end
end

11 Comments

Please provide some input data. What is Child.CID?
Hello Jan, thank you for your reply,
Child is a struct; its field Child.CID is a 1x1200 vector containing values from 0 to 60, distributed randomly.
Example: Child.CID(1) = 4
Child.CID(2) = 16
Child.CID(3) = 44
...
Child.CID(1200) = 55
Int is a 1x1200 double array containing values from 0 to 20, distributed randomly.
(Int(Node_i) > 0)
What is Int()?
Also, if you define
k_Node_i_in = 0;
k_Node_i_out = 0;
inside your for loop, they will be rewritten at every iteration.
Thank you Joshi for your comment,
Int may contain zeros. However, I need to optimize this code and find an alternative to the for loops and if statements, because the script takes a long time to run.
The cheapest acceleration is to avoid the repeated addressing in the struct Child. Copy the contents of Child.CID to the variable CID.
I asked you to provide input data because it is very hard to optimize code without running it. It is inefficient if every reader has to write code to create some data. In addition, they could implement wrong assumptions, and their work would be wasted.
I still do not understand what Int is. A function or an array?
Again: Please provide some meaningful input data, e.g. created by rand or randi, such that we can run your code.
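For readers who want to run the snippet without the attached files, a minimal sketch of such input data could look like the following. It only follows the sizes and value ranges given in the comments above; the 0/1 content of A and the fixed seed are assumptions, not taken from the original MAT file.

```matlab
% Hypothetical test data matching the sizes described in this thread.
rng(0);                             % fixed seed so that runs are repeatable
N = 1200;
A = double(rand(N) < 0.05);         % assumption: sparse 0/1 adjacency matrix
Child.CID = randi([0, 60], 1, N);   % one ID per node, values 0..60
Int = randi([0, 20], 1, N);         % values 0..20, zeros are possible
Pm = 0.2;
```

With these variables in the workspace, the loop from the question runs as posted.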
Hello Jan, Thanks again for your reply. I attach two files (the first one includes the parameters, the second one is the script).
Thanks
Please clarify what exactly Int() is.
@Dyuman Joshi: Look at the data and the provided script: "Int" is "NumInteractionProtein" in the MAT file.

 Accepted Answer

As I had already guessed, replacing the repeated accesses to Child.CmplxID with a copy of the field reduces the run time remarkably:
Simplify:
k_Node_i_in = 0;
k_Node_i_out = 0;
for Node_j = 1 : N
    if (Node_i ~= Node_j) && (Child.CID(Node_j) == Child.CID(Node_i))
        k_Node_i_in = k_Node_i_in + A(Node_i, Node_j);
    end
    if (Node_i ~= Node_j) && (Child.CID(Node_j) ~= Child.CID(Node_i))
        k_Node_i_out = k_Node_i_out + A(Node_i, Node_j);
    end
end
to
CID = Child.CID; % Then replace all occurrences of Child.CID inside the code
...
k_Node_i_in = 0;
k_Node_i_out = 0;
for Node_j = 1 : N
    if (Node_i ~= Node_j)
        if (CID(Node_j) == CID(Node_i))
            k_Node_i_in = k_Node_i_in + A(Node_i, Node_j);
        else
            k_Node_i_out = k_Node_i_out + A(Node_i, Node_j);
        end
    end
end
The same in the second part:
k_Node_i_in = 0;
k_Node_i_out = 0;
for Node_j = 1 : N
    if (Node_i ~= Node_j)
        if CID(Node_j) == i
            k_Node_i_in = k_Node_i_in + A(Node_i, Node_j);
        else
            k_Node_i_out = k_Node_i_out + A(Node_i, Node_j);
        end
    end
end
This reduces the runtime from 1.4 to 0.09 sec on my i7/Win10/R2018b.
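As a further idea that is not part of the answer above, the inner loop over Node_j could probably be replaced by logical indexing. The sketch below uses a small hypothetical data set (a random 0/1 matrix A and random IDs in CID) and compares the result with the loop version. It has not been timed on the original data, so any speed gain is an assumption.

```matlab
% Sketch: logical indexing instead of the inner loop (hypothetical test data).
rng(0);
N = 200;                            % small example size
A = double(rand(N) < 0.05);         % assumption: 0/1 adjacency matrix
CID = randi([0, 60], 1, N);         % stands in for the copy of Child.CID
Node_i = 7;                         % any node index

% Loop version, for reference
k_in_loop = 0;
k_out_loop = 0;
for Node_j = 1 : N
    if (Node_i ~= Node_j)
        if (CID(Node_j) == CID(Node_i))
            k_in_loop = k_in_loop + A(Node_i, Node_j);
        else
            k_out_loop = k_out_loop + A(Node_i, Node_j);
        end
    end
end

% Vectorized version
same = (CID == CID(Node_i));        % nodes with the same ID as Node_i
same(Node_i) = false;               % exclude the node itself
other = (CID ~= CID(Node_i));       % nodes with a different ID
k_Node_i_in = sum(A(Node_i, same));
k_Node_i_out = sum(A(Node_i, other));

% Both versions should agree for integer-valued A
assert(k_Node_i_in == k_in_loop);
assert(k_Node_i_out == k_out_loop);
```

If A contains non-integer values, the two versions may differ by rounding errors, because the order of the summation changes.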

1 Comment

Thank you very much Jan, I will do that. Very glad for your cooperation.

Asked:

on 8 Jun 2022

Commented:

on 10 Jun 2022