How can I improve Matlab perfomance for this code?

Question

Helen on 25 Aug 2014

0
Link

Direct link to this question

https://www.mathworks.com/matlabcentral/answers/152205-how-can-i-improve-matlab-perfomance-for-this-code

Edited: Matz Johansson Bergström on 25 Aug 2014

Accepted Answer: Matz Johansson Bergström

Adn and custo are same size vectors, with a length of 10252 elements (length(Adn)=10252). Every row value in custo corresponds to the cost of the same row value in Adn . I'm trying to get the sum of all the possible combinations for the values in Adn and analyze the cost of that combination. From the Adn combinations, I am just interested in the ones that sum to a specific interval (which is lower than sum(A) and greater than sum(A)*0.9). A is previously written in the code, so it doesn't matter now. Then I want to store the combinations and their equivalent costs ( custoAll ) in two new vectors.

nA=numel(Adn);
for k1=1:nA
   B=combntns(Adn,k1);
   C=sum(B,2);          
   E=combntns(custo,k1);
   F=sum(E,2);
   D=find(((C<(sum(A)))&(C>(sum(A)*0.9))));
   if ~isempty(D)
       combinations=[combinations; B(D,:)];
       custoAll=[custoAll; F(D)];
   end          
end

This code is taking very long though. I know I should preallocate the variables combinations and custoAll otherwise they will change size in every iteration and it'll take longer to run, but I haven't been able to find a good way to do it, since Adn is already so huge. Does anybody have an insight on this?

0 Comments
Show -2 older commentsHide -2 older comments

Sign in to comment.

Sign in to answer this question.

Answer 1

Matz Johansson Bergström on 25 Aug 2014

0
Link

Direct link to this answer

https://www.mathworks.com/matlabcentral/answers/152205-how-can-i-improve-matlab-perfomance-for-this-code#answer_149725

Edited: Matz Johansson Bergström on 25 Aug 2014

Open in MATLAB Online

It seems to me that combntns (which is the same as nchoosek) is taking a long time to finish. So, the appending of the combinations is not the bottleneck in this case.

Consider your example with 10252 elements. Then during the first loop nchoosek(10252, 1) = 10252. For the next loop we have nchoosek(10252,2)= 52546626. This will be the number of rows of B.

So B will then become nchoosek(10252,2)*8*2 bytes, because we use double precision, which is nchoosek(10252,2)*8*2/2^20 = 801 MB.

Ok, just for the fun of it, for the next loop B will become nchoosek(10252,3)*3*8/2^20 which is 4 TB. This is just to store the answer to B, we are not even considering the computational time.

So, you will never be able to compute this using matrices in the current way because you cannot possibly store them. I would suggest to use another approach like looping all combinations in some way. I will have to think about it.

The computation "N choose k" will increase and become smaller, sort of like a parabola. The largest number of combinations you will get from

nchoosek(10252, 10252/2)

In which we get from Matlab

Warning: Result may not be exact. Coefficient is greater than 9.007199e+15 and is only accurate to 15 digits

Remember, this is just the number of rows in B.

So I would suggest to use something else to solve the problem. I'm thinking about binary numbers, but I'm not sure. I will have to get back to this question later today.

1 Comment
Show -1 older commentsHide -1 older comments

Matz Johansson Bergström on 25 Aug 2014

Edited: Matz Johansson Bergström on 25 Aug 2014

I believe this problem is NP-complete. It can be seen as a relaxed version of the "Subset sum" problem, but they are really the same, see http://en.wikipedia.org/wiki/Subset_sum_problem.

This really means that this kind of problem does not have a program that can compute the answer in a quick way. However, this does not mean it is not solvable, you just have to find another way.

Even if you managed to speed up the code by creating a mask of binary numbers for instance, you will have to find all combinations and there are many iterations.

I would look at the subset sum and see if your question can be modified to that problem and use that code.

Sign in to comment.

How can I improve Matlab perfomance for this code?

0 Comments
Show -2 older commentsHide -2 older comments

Accepted Answer

1 Comment
Show -1 older commentsHide -1 older comments

More Answers (0)

See Also

Categories

Tags

Community Treasure Hunt

How can I improve Matlab perfomance for this code?

0 Comments Show -2 older commentsHide -2 older comments

Accepted Answer

1 Comment Show -1 older commentsHide -1 older comments

More Answers (0)

See Also

Categories

Tags

Community Treasure Hunt

0 Comments
Show -2 older commentsHide -2 older comments

1 Comment
Show -1 older commentsHide -1 older comments