How to use spmd with heavy input data?

Alexandre Kazantsev on 27 Feb 2017
Answered: Edric Ellis on 28 Feb 2017
Hello everyone,
I use spmd blocks to process a big matrix block by block. Since spmd copies the workspace to all 16 workers, it rapidly saturates my system's memory. I tried creating sub-blocks of limited size and passing them one at a time to an spmd block in a separate script, but this did not change the memory demand. Does spmd copy all local workspaces to the workers, no matter which workspace it is launched from? Do you know of any solution other than writing files to disk and clearing variables?
Thank you for reading!
Alex

Answers (2)

Edric Ellis on 28 Feb 2017
When using spmd, your best bet is to create the arrays within the spmd block itself, rather than transferring them from the client. For example:
spmd
    % Each worker builds its own 1000-by-1000 array; nothing is sent from the client.
    myData = labindex * rand(1000);
end
In this way, each worker has its own copy of myData. Also note that at the client, myData is a Composite - this is a special type that simply refers to the data on the workers, and nothing is transferred (unless you explicitly request it using indexing).
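To make the Composite behaviour concrete, here is a minimal sketch. It assumes Parallel Computing Toolbox with a pool available; the variable names localSum, isComposite and firstSum are only illustrative. The idea is to reduce the data on the workers and fetch just the small result.

```matlab
spmd
    myData = labindex * rand(1000);   % built on each worker, never sent from the client
    localSum = sum(myData(:));        % reduce to a scalar on the worker
end

% On the client, myData and localSum are Composite objects that refer to worker data.
isComposite = isa(myData, 'Composite');

% Brace indexing transfers only the requested worker's value to the client.
firstSum = localSum{1};
```

Fetching localSum{1} moves a single scalar, whereas myData{1} would move the full 1000-by-1000 array of worker 1 to the client.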
You might also be interested in distributed arrays. These are arrays that are designed to keep data on the workers, and work with spmd.
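As a rough illustration of the distributed-array route, the sketch below creates the array directly on the workers and only gathers a small result. The size 2000 and the names D, localPart, localMax and colMeans are assumptions made for the example, not taken from the question.

```matlab
% Create the array directly on the workers; the client never holds the full matrix.
D = rand(2000, 'distributed');

spmd
    % Inside spmd, D is seen as a codistributed array; each worker owns one piece.
    localPart = getLocalPart(D);
    localMax = max(localPart(:));
end

% Many functions work on distributed arrays directly; gather brings the
% (small) result back to the client.
colMeans = gather(mean(D, 1));
```

Each worker stores only its own part of D, so the total memory use is roughly one copy of the matrix rather than one copy per worker.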

Walter Roberson on 27 Feb 2017
If you are not altering the large matrix, then possibly it would help to use parallel.pool.Constant.
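A possible way to try this is sketched below. The matrix size and the names bigMatrix, C, cols, partialSum, total and totalOnClient are placeholders for illustration. Note that each worker still holds a full copy of the data; what the Constant avoids is sending it again for every spmd block.

```matlab
bigMatrix = rand(2000);                  % stand-in for the large read-only matrix
C = parallel.pool.Constant(bigMatrix);   % copied to each worker once

spmd
    % Each worker handles its own set of columns, read through C.Value.
    cols = labindex:numlabs:size(C.Value, 2);
    partialSum = sum(sum(C.Value(:, cols)));
    total = gplus(partialSum);           % combine the partial results across workers
end

totalOnClient = total{1};                % only a scalar comes back to the client
```

If the data can be generated or loaded on the workers, parallel.pool.Constant also accepts a function handle, for example parallel.pool.Constant(@() rand(2000)), so that nothing is transferred from the client at all.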
  1 Comment
Alexandre Kazantsev on 27 Feb 2017
Thank you very much Walter, I will test this week and give feedback.
