Multicore - Parallel processing on multiple cores

version 1.39 (44.6 KB) by

This package provides parallel processing on multiple cores/machines.

4.75806
72 Ratings

42 Downloads

This package provides parallel processing on multiple cores on a single machine or on multiple machines that have access to a common directory.
If you have multiple function calls that are independent of each other, and you can reformulate your code as

for k = 1:numel(parameterCell)
  resultCell{k} = myfun(parameterCell{k});
end

then, replacing the loop by

resultCell = startmulticoremaster(@myfun, parameterCell);

allows you to evaluate your loop in parallel. All you need to do is to start as many additional Matlab sessions/processes as you want slaves to work, and to run

startmulticoreslave

in those additional Matlab sessions.
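
As a minimal sketch of that reformulation, assuming the package is on the Matlab path and at least one slave session is running: the workload (sqrt over eight scalars) is purely illustrative, and the fallback branch is an addition that simply runs the original loop when startmulticoremaster cannot be found.

```matlab
% Illustrative workload: one scalar parameter per function call.
parameterCell = num2cell(1:8);

if exist('startmulticoremaster', 'file') == 2
    % Package found: the master distributes the calls; slaves pick up work.
    resultCell = startmulticoremaster(@sqrt, parameterCell);
else
    % Package not found: plain serial loop, same result layout.
    resultCell = cell(size(parameterCell));
    for k = 1:numel(parameterCell)
        resultCell{k} = sqrt(parameterCell{k});
    end
end
```

Either branch leaves one result per parameter in resultCell, so the surrounding code does not need to know which one ran.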

Everything is programmed in plain and platform-independent Matlab - no toolboxes are used, no compilation of mex-files is necessary.

Please get started with 1. the documentation in file multicore.html, 2. the help lines of function startmulticoremaster.m and 3. the demo function multicoredemo.m.

Discuss with other users here: http://groups.yahoo.com/group/multicore_for_matlab

I have spent many hours to develop this package. If you would like to let me know that you appreciate my work, you can do so by leaving a donation: https://www.paypal.com/cgi-bin/webscr?cmd=_s-xclick&hosted_button_id=GPUZTN4K63NRY

Keywords: Parallel processing, distributed computing, multiple cores.

Comments and Ratings (88)

grega

Michèle

Thanks, it works very well. One additional question -- any suggestion how to tell the workers that their process is done and they don't have to wait for any more evaluation demands?

Namit Sharma

I am sorry about my previous comment.
I meant to post it under the Differential Evolution submission.

ul1693

Hello,
how can I best use your script to parallelize two calls to a "sim" command within a single function?

Many thanks in advance

James Amis

Liqun

How can I use this package on a Linux-based cluster? I have access to a cluster, but I have to submit jobs to it from within Matlab on my local laptop or desktop, using job = batch(......)

How can I open multiple Matlab sessions on the cluster using job = batch(......)?

I hope to hear some feedback!

Thanks!

Hiroyuki

Scratch my first suggestion on my last post. I just read David's post below and was not aware of the -nodisplay flag.

Rami

Excellent function. It was really helpful for me in the case of multi-objective optimization

Markus Buehren

Hi Theo,

the different Matlab processes communicate via the file system in the multicore package. All function input and output arguments are saved to/read from the file system. In your example, the multicore master process will save 100 files of about 8 MB (1 million doubles) to the disk which are read by the slaves. The overhead is in this case clearly larger than the benefit of the parallel processing.

For further discussion, please use the Yahoo group: http://groups.yahoo.com/group/multicore_for_matlab

Yours
Markus
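
A rough way to check this for your own workload is to compare the time spent writing and reading one parameter set with the time spent computing on it. The sketch below uses Theo's sizes (100 matrices of 1000x1000 doubles); the variable names and the use of tempname are illustrative, and the actual file layout used by the package may differ.

```matlab
nJobs = 100;
n = 1000;
bytesPerMatrix = n * n * 8;                 % one double occupies 8 bytes
totalMB = nJobs * bytesPerMatrix / 1e6;     % input data the master has to write

A = rand(n);
tmpFile = [tempname '.mat'];

% Time one round trip through the file system.
tic;
save(tmpFile, 'A');
S = load(tmpFile);
tFileIO = toc;
delete(tmpFile);

% Time the actual computation on the same matrix.
tic;
B = inv(A);
tCompute = toc;

fprintf('Input data for all jobs: %.0f MB\n', totalMB);
fprintf('Save + load of one matrix: %.3f s, inv of one matrix: %.3f s\n', ...
    tFileIO, tCompute);
```

The results travel back through files as well, so the traffic is roughly twice the input volume when the outputs are as large as the inputs.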

Theo

When I try to run the following piece of code:

clear all;
for ii=1:100
heavy_cell{ii}=rand(1000,1000);
end
myfun=@inv;
resultCell = cell(size(heavy_cell));
tic
for k=1:numel(heavy_cell)
  resultCell{k} = myfun(heavy_cell{k});
end
toc
clear resultCell;
tic
resultCell = startmulticoremaster(@inv, heavy_cell);
toc

I end up with :

Elapsed time is 26.412324 seconds.
Elapsed time is 111.729816 seconds.

The latter is the result of the parallel processing. I take it I must be missing something here. Could someone elaborate, please?

Kilian Thomas

Very nice package.
Works perfectly for me.

Moreover, author answers your questions fast and well.

Good job thank you

David

I added this bit of code to startmulticoremaster so that it automatically starts the appropriate number of slaves (make sure your path is set up properly so that Matlab can find startmulticoreslave.m upon startup):

% Start slaves:
max_instances = 4;
[status,result] = system('tasklist /FI "imagename eq matlab.exe" /fo table /nh');
currently_running = length(strfind(result,'MATLAB.exe'));

for i = 1:(max_instances-currently_running)
    !"C:\Program Files\MATLAB\R2013b\bin\matlab.exe" -nodisplay -nosplash -nodesktop -r "run('startmulticoreslave.m');exit;"
end
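
A variation on the same idea that avoids the hard-coded Windows path is sketched below. It assumes that the matlab executable is on the system path and that startmulticoreslave.m is on the Matlab path at startup; nSlaves and launchNow are illustrative names. The trailing & asks system to return without waiting for the new session.

```matlab
nSlaves = 3;          % assumed number of extra sessions to start
launchNow = false;    % set to true to actually start the sessions

% -nodesktop, -nosplash and -r are standard Matlab startup options.
cmd = 'matlab -nodesktop -nosplash -r "startmulticoreslave" &';

for k = 1:nSlaves
    if launchNow
        system(cmd);
    else
        disp(cmd);    % dry run: only show what would be executed
    end
end
```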

Erd

The package is quite useful. If it transferred variables through direct memory instead of files, it would be much more useful. Currently it takes a while for the parallel processes to start because of the large overhead.

David T_

Yulin

I tested this code and compared it with parfor:

Elapsed time running STARTMULTICOREMASTER: 4.80 seconds.
Elapsed time running STARTMULTICOREMASTER: 4.72 seconds.
Elapsed time running STARTMULTICOREMASTER: 4.70 seconds.
Elapsed time running STARTMULTICOREMASTER: 4.69 seconds.
Elapsed time running STARTMULTICOREMASTER: 4.70 seconds.
Elapsed time without slave support: 20.84 seconds.

and with parfor only
Elapsed time running TESTFUN with parfor only: 3.21 seconds.

It seems that parfor is more powerful and simpler to use.

Alan Mackay

Very useful tool for running parallel sessions. With a few minor tweaks to the code, primarily additional 'set up' and 'clean up' functions to handle opening and closing minimalist slave sessions, I have run this across multiple cores on the same machine and across multiple machines too.

Currently running this quite happily with ~23/24 sessions.

Alan Mackay

Tim

Great, works exactly as described! It's a bit ridiculous that this functionality isn't included natively in Matlab considering how much we pay for the software.

haidi

Hi, has anyone been successful using this package on a larger scale? In my case, it does not work very well when running on 100 cores. For some reason the slaves get stuck in a loop with the message "Ignoring old semaphore of file".

haidi

Great work!!
Is there a way to use this package without having to start Matlab manually?
The reason I ask is as follows. I use Torque to submit to a cluster of 16 nodes, each node having 64 cores. However, I cannot start multiple Matlab instances on a single node (even though it has 64 cores). Your help is greatly appreciated!!! Many thanks.

Pink_panther

I activated only 3 slaves. Demo works great on my 8 core laptop! Be sure each session has path set to see the multicore folder.

>> multicoredemo
Elapsed time running STARTMULTICOREMASTER: 7.04 seconds.
Elapsed time running STARTMULTICOREMASTER: 7.39 seconds.
Elapsed time running STARTMULTICOREMASTER: 7.35 seconds.
Elapsed time running STARTMULTICOREMASTER: 6.74 seconds.
Elapsed time running STARTMULTICOREMASTER: 6.77 seconds.
Elapsed time without slave support: 20.86 seconds.
Elapsed time running TESTFUN directly: 19.84 seconds.

Pink_panther

Pink_panther

Works very well, thanks

Rohit Verma

I tried running my function simultaneously on 3 different datasets, but the slave processes don't do anything and the run time is the same as without the package.

Rohit Verma

Can I use it to run multiple processes that call the same function with different parameter cells? Please note that the function takes an image as input in each call. Will that be a problem?

Thanks, very nice code.

Zhanhong

Great Package! Definitely useful for multicore CPUs. I have a 6 core AMD. It's such a pain if MATLAB can't run parallel.
I have a demo tip for newbies:
If you want to see the effect, make sure your function has to run several times over the input cell. If your input cell contains only one entry, all the slave sessions will be doing nothing because the function is already running on the master. You'd probably want to split your input into segments so that all the slaves run simultaneously.
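
A small sketch of that segmentation, using a made-up data vector and @sum as a stand-in for the real function. The partial results are computed locally here so the snippet runs without the package; with the package installed, the same parameterCell could be passed to startmulticoremaster instead.

```matlab
data = 1:100;        % illustrative input
nChunks = 4;         % e.g. one chunk per Matlab session

% Chunk boundaries: [0 25 50 75 100] for this example.
edges = round(linspace(0, numel(data), nChunks + 1));

parameterCell = cell(1, nChunks);
for k = 1:nChunks
    parameterCell{k} = data(edges(k) + 1 : edges(k + 1));
end

% Local stand-in for: resultCell = startmulticoremaster(@sum, parameterCell);
partialSums = cellfun(@sum, parameterCell);
total = sum(partialSums);
```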

laoya

The tool is really powerful. I want to know if we can develop a similar tool in Fortran or C. There are two reasons:
1) since every master and slave program needs a Matlab window, it is too expensive to buy multiple Matlab licenses to run on different machines;
2) the slave program is launched by Matlab, which needs more than 100 MB of memory, although the slave program may only need to launch another external executable with different parameters. Writing a standalone master/slave program would decrease the memory usage greatly.

Thanks,
Zhanhgong Tang

Jason D

Simply wonderful!
I have an optimization problem, including several launches of a Simulink model, that runs on a single core in a little more than 30 hours. By reformulating the problem as required by Markus's scripts (half a day's work), I'm now using five Windows PCs with a total of 14 cores and carrying out the simulation in 3.1 hours.
Thank you!!!

Xinghua Lou

Hi Markus,

Great work!

I have one suggestion: the file I/O becomes a bottleneck in my application, since saving the meta-data of a task (large image sequences) costs almost as much time as processing the task. It might help a lot if the file I/O could be replaced by shared memory functions, and there is a new Matlab library for that: http://www.mathworks.com/matlabcentral/fileexchange/28572-sharedmatrix. I think it is a perfect complement to your library.

Best,
Xinghua

Darin McCoy

Never mind my previous comment... I didn't read the instructions :)

5 stars for the m file and 6 stars for customer service. Thanks Markus!

Darin McCoy

No improvement running the multicoredemo.m file

Elapsed time running STARTMULTICOREMASTER: 21.53 seconds.
Elapsed time running STARTMULTICOREMASTER: 21.67 seconds.
Elapsed time running STARTMULTICOREMASTER: 21.31 seconds.
Elapsed time running STARTMULTICOREMASTER: 21.31 seconds.
Elapsed time running STARTMULTICOREMASTER: 21.30 seconds.
Elapsed time without slave support: 21.14 seconds.
Elapsed time running TESTFUN directly: 20.00 seconds.

Johnny Ta

Awesome. You're my savior. The code works like a charm!

Torfinn

Thank you for these tools, they are vastly useful and will save me much time.

Hamid Badi

Great

Hamid Badi

Robert Stead

I'm having problems with the lasterror function in this code. There appear to be several instances where the lasterror function is passed a string 'reset' as an input argument, but the function lasterror is only defined for inputs of structure type. This causes errors at several points in the code, and I am unable to run the multicoredemo routine. I'm sure this is something I'm doing wrong, but I'd be grateful if someone could help me!

Karl

Hi Markus,

first of all -- you've developed a great tool that helps me a lot with my simulations in the field of audio signal processing. Seeing my dual quad core at 100% (instead of 13%) load warms my heart :).

Well, I'm not sure whether I'm getting something wrong here, but I had some problems when the function that is executed by the multicoremaster() (and the slaves) has more than one return value. It seems that in this case all but the first return values get lost. I applied a little trick that I found out about some time ago to solve this problem:
Every time feval() is called (in startmulticoremaster() and startmulticoreslave()), I added something like this:

N_returnValues = nargout(functionHandleCell{k});
clear('returnValue'); % this is dirty! The next call doesn't work
     % without clearing "returnValue" beforehand if
     % N_returnValues==1. If it's greater->no
     % problem (even without clearing
     % "returnValue")
[returnValue{1:N_returnValues}] = feval(functionHandleCell{k}, parameterCell{k}{:}); % ha!
resultCell{k} = returnValue;

This way, each entry of the resultCell that is returned from startmulticoremaster() is always a cell, even if the called function has only one return value...

I hope this is of any value to anybody and that I'm not causing trouble by posting this hack :). I'm always open to learn a better solution....

Cheers, and thanks again for Multicore!

    Karl
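
One way to sidestep the clear() call is to preallocate the output cell with the desired length before the call. The sketch uses the built-in max with a fixed output count of two in place of nargout(functionHandleCell{k}); it illustrates the output-capturing pattern only and is not the package's code.

```matlab
nOut = 2;                         % stand-in for nargout(functionHandleCell{k})
returnValue = cell(1, nOut);      % preallocation fixes the number of outputs

% Each requested output of the call lands in one cell element.
[returnValue{:}] = feval(@max, [3 9 2]);

maxValue = returnValue{1};        % largest element
maxIndex = returnValue{2};        % its position in the input
```

Because returnValue is re-created with the right length on every iteration, a leftover value from a previous call cannot change how many outputs are requested.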

Karl

DAdler

Thank you very much for this great tool! I started using your code a few of months ago, and I must say it saved me lots of hours of work.

Awesome tool, I use it for fitting a computationally expensive financial model and it works just great. Thanks a lot for publishing it!! Johannes

Amir

I have a bit of a problem running the master and slave processes over the network. I am not sure how to share the folder over a Linux network!
Can anybody help?

German

Sorry, I just forgot to start a second Matlab session with "startmulticoreslave".
Great code.

German

Hello, when I run multicoredemo, only the master is working, but not the slave. What am I doing wrong?
Multithreading is disabled (set to one core).
I use a dual-core processor with Windows Vista and Matlab 7.8.0 R2009a 64-bit.
Thank you very much.

Nir

Great Work !
It speeds up my work by more than a factor of two on a quad-core computer. Not many changes were needed in order to use it.
Thanks
Nir

Thomas

Great tool!

dpb10

Richard

Brilliant tool! Great being able to sit back and watch a progress bar sliding along as a room full of computers gets to work doing your simulation.

I have a question though - what dimensions are typically used for parameterCell? I've tried doing some large multidimensional runs (e.g. 4x150x10) and things seemed to grind to a halt - I'm still looking into it, just wondered if the dimensions I'm using are typical or too large.

Cheers. Great work,
Richard.

Richard Crozier

Fantastic program, and particularly suited to my work with genetic algorithms. There is one minor error I've noticed though. In startmulticoreslave, if you activate debug mode you reach the following line (77):

fileNr = str2double(regexptokens(parameterFileName,...
'parameters_\d+_(\d+)\.mat'));

But there is no regexptokens function, at least not in R2007a, R2007b or R2008a. My solution is to replace this with the following lines (although I'm sure someone could come up with something more robust and/or elegant).

fileNrCell = regexp(parameterFileName,'parameters_\d+_(\d+)\.mat', 'tokens');

fileNr = str2double(fileNrCell{1});

The program is excellent though, thanks again!
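
The replacement can be tried on its own. The file name below is made up for illustration (the real parameter file names are generated by the package), and the 'once' option and the empty check are additions that guard against names that do not match.

```matlab
% Hypothetical file name following the pattern in the regular expression.
parameterFileName = 'parameters_20080101120000_0017.mat';

% With 'tokens' and 'once', regexp returns a cell array holding the
% captured digit string, or an empty result if nothing matches.
tok = regexp(parameterFileName, 'parameters_\d+_(\d+)\.mat', 'tokens', 'once');

if isempty(tok)
    fileNr = NaN;                 % name does not follow the expected pattern
else
    fileNr = str2double(tok{1});  % numeric file index
end
```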

Markus Buehren

I have opened a discussion group for the Multicore package on Yahoo. Please join and discuss with other users!

http://groups.yahoo.com/group/multicore_for_matlab/

Moody

I've been using multicore for a while now and it's absolutely excellent. I'm running on 5 dual Xeon X5460 machines as well as a couple of quad-core boxes.

I was wondering if anyone compared performance of this toolbox with the parfor parallel computing matlab toolbox. Are they comparable?

I believe I'm bottlenecked now due to the hard disk I/O, so I was looking at the in memory possibilities of this or potentially upgrading my hd's to solid state to reduce the overhead.

BTW, I also tried precreating all the mat files once instead of doing multiple loops to reduce the I/O. Unfortunately, that didn't help as much as I hoped.

Don't get me wrong though, this is much, much, much faster than single threading, but as always we need to keep pushing :).

Andrew Scott

Jordi Arnabat

Thanks for this great contribution, it's very useful.

Correction:
I've used it on a grid of computers running different operating systems: GNU/Linux, Mac and Windows (XP). When the shared folder is on a network computer (not mapped to a local drive, for example \\servername\sharedfolder), Windows systems fail when trying to delete the semaphores, causing the master process to run forever.

The solution I found is to slightly modify chompsep.m and concatpath.m as follows:

_______________________________________________________
function str = chompsep(str)
unix_sep = '/';
pc_sep = '\';

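% Convert separators to the local convention (guarded against empty input).
if ~isempty(str) && isunix && str(1)==pc_sep
    str = strrep(str, pc_sep, unix_sep);
elseif ~isempty(str) && ispc && str(1)==unix_sep
    str = strrep(str, unix_sep, pc_sep);
end

% Remove a trailing separator, if any.
if ~isempty(str) && (str(end) == unix_sep || str(end) == pc_sep)
    str(end) = '';
end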

_______________________________________________________
function str = concatpath(varargin)
unix_sep = '/';
pc_sep = '\';

str = '';
for n=1:nargin
    curStr = varargin{n};
    str = fullfile(str, chompsep(curStr));
end

if isunix && str(1)==pc_sep
    str = strrep(str, pc_sep, unix_sep);
elseif ispc && str(1)==unix_sep
    str = strrep(str, unix_sep, pc_sep);
end

Thank you! This works just fine and now I have 100% CPU usage on all 4 cores! By measuring the elapsed time it seems that it runs about 3.4 times faster than with a single matlab doing all the work, so that's about 85% efficient on my quad-core

now, if only this could run on a single multi-core machine with only one matlab instance running

Arturo Serrano

Arturo Serrano

I got the same problem as Rohaly. When the master finishes its computation while a slave is still working, the master computes the remaining job again, yielding an extra iteration.
The solution is to set MAXMASTEREVALUATIONS, with all its drawbacks, since, as I understand it, this is not a bug but a consequence of the master not knowing whether the slave was interrupted.

BTW, it works like a charm.

Marcio Sales

I tested the functions on two dual-core machines. I had a great gain when parallelizing the processing between the processors of one dual-core machine, or between the two machines using a single processor in each. However, I had no significant gain when trying to use both processors on both machines. Is that because the gain from using more processors is being reduced by the increased load of data recording when you add more processors/machines? Also, there are times when I get an error in which one of the workers can't delete the temporary files, and this seems to happen more frequently when you increase the number of workers. Is anyone having the same issues? My two machines run 64-bit Vista.

Janos Rohaly

It seems that master and slave can end up evaluating the same set of parameters simultaneously. For example, if the slave starts evaluating a slow process, the master can catch up, and there is nothing to prevent it from starting the very same computation since the slave hasn't generated the result file yet. There is also a bug in setfilesemaphore.m: dirStruct(k).datenum in line 78 should be datenum(dirStruct(k).date).

Bruno Cordani

Great!!!

M H

It's absolutely excellent. I am using it now on my quad-core machine and am probably going to buy another quad core just to see my models run so quickly. :)

Robert Turner

Brilliant library. Works like a charm

uju jbl

Jun Kim

This process works very well. For my model estimation, I was able to cut the execution time in a drastic manner. Also Markus was kind and was very responsive to my question.

Markus Buehren

> An option for avoiding the use of the
> master core is desirable

This option already exists: you can use the input parameter MAXMASTEREVALUATIONS and set it to zero.

Igor S

An option for avoiding the use of the master core is desirable.
Indeed, if one slave process terminates or crashes, the entire process continues, but if the iteration that crashes happens to be on the master core, the whole process is compromised.

igor scardanzan

Great, just some difficulties killing the slave: the process persists and Ctrl-C does not work. One has to wait for the execution to end.

A. S.

Thanks, very useful. Synchronization is not optimal (for example, the master shouldn't start working on a task if a slave is already working on it), but still a great program.

Andrea Soldan

Very useful, and it works excellently.
Much more so than the Distributed Computing Toolbox provided by Matlab (which is very hard to use).
My OS is Linux, and I'm working with 4 workers (2 dual-core processors).

David Brown

Works excellently. It would be nice to see this turned into a fully-guided setup to use this.

Huy Bui

Multicore works quite well overall. I hit a bug, though: I made a mistake in parameterCell, and all the slave processes died because of it. When I exited all slave processes and started them again, the same error that killed the slave processes appeared again and again.

Kevin Thom

Awesome program ... I have run it successfully using anywhere from 4 processors on one machine to 10 processors on 5 machines ... it makes a whole range of computationally expensive projects feasible in MATLAB.

happy matlabuser

Ran it with 8 processors across 6 machines, and it works beautifully. Unfortunately, if you kill one or more processors, the master processor MUST do that job. Since the jobs are long for my problem, it's better to kill everything in that situation and finish everything at the command line. The master starts at the top of the list, and the slaves start at the bottom. When they meet somewhere in the middle, the master will always redo one of the jobs. I don't think the code will be efficient for a very large number of very small jobs (correct me if I'm wrong); so, I recommend making jobs medium length, (total run time)/(10*(# of processors)) for example. For my problem, these are fairly minor concerns. The author did a great job with this code -- easily saving me hours of work -- thank you!
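
The rule of thumb works out as follows. The 30 hours and 14 processors are example figures only; substitute your own.

```matlab
totalRunTimeSeconds = 30 * 3600;   % example: 30 hours of serial run time
nProcessors = 14;                  % example: number of participating cores

% Rule of thumb from the comment: (total run time) / (10 * # of processors).
targetJobSeconds = totalRunTimeSeconds / (10 * nProcessors);
nJobs = 10 * nProcessors;          % number of jobs of that length

fprintf('Aim for about %d jobs of roughly %.0f s each\n', nJobs, targetJobSeconds);
```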

Markus Buehren

Yes and no: the slave process should create the directory if it does not exist (I have updated that). However, you can start the slave processes whenever you like! You can also interrupt and restart them while the master is running.

Darren 3M

Brilliant use of the filesystem to share the load. It's not quite a 2x increase, but on a quick cluster test I got an increase of 4x with 5 CPUs, so 80% efficient.
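
The efficiency figure is simply the measured speedup divided by the number of CPUs, as this short calculation with the numbers from the comment shows (variable names are illustrative):

```matlab
speedup = 4;      % measured: 4x faster than a single session
nCPUs = 5;        % CPUs used in the cluster test

% Parallel efficiency: fraction of the ideal speedup actually achieved.
efficiency = speedup / nCPUs;

fprintf('Efficiency: %.0f%%\n', 100 * efficiency);
```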

schena gianni

On dual-processor Intel Xeon based machines it halves the calculation time, i.e. it cuts it by a factor of 2!

Michal Kvasnicka

Wow!!! After a few hours of hard experimentation with MULTICORE I finally learned how to use this code for parallel runs on my four-core PC. This is a really good and useful tool for distributed computing!!!

1. The informational text in the demo must be extended with a detailed description of how to run "testfun" in parallel mode.
2. Some work must be done to minimize interprocess communication overhead, which may be very intensive (25% of the overall load) in some cases.

Good work!!!

Michal Kvasnicka

I am not able to reproduce any improvement from the sequential to the parallel (multicore) realization of the testing demo code.

I would go as far as to say that the parallel (multi-core) realization is slower than the sequential one in some cases.

William Renaud

Would it be possible to make this compatible with Octave? Due to licensing restrictions it is difficult for many people to have numerous Matlab instances available.

Zhijun Wang

Corrections:

I tested the program to see whether it can improve parallel processing on a single machine, using the following code:

______________________
N=20;
for m = 1:10
    tic
    for z = 1:N
        testfun(z)
    end
    toc
end
__________________________

The results show that the program does not improve anything compared with my code on a single machine!

So I rate again

Zhijun Wang

Parallel processing on multiple cores in a single machine is very useful!

lucia Come

Long overdue as a standard Matlab capability.
Does it also work with scripts, or only with functions?

Michal Kvasnicka

The demo should be more self-descriptive. A comparison with a single-core run of the demo is very important for evaluating the impact of parallelization.

Updates

1.39

Performance improvement especially for projects using many slave processes.

1.37

Added parfor-loop to demo.

1.35

Performance improvement - especially when using a large number of slaves

1.33

New features:
1. Slave settings can be set via command line argument.
2. Slave Matlab process can be quit after a given time in seconds.

1.32

Bugfix: calling startmulticoremaster.m without settings works now.

1.31

Typo fixed.

1.30

Typo fixed.

1.29

Overhead resulting from expanding the function handle cell reduced.

1.28

Description changed again.

1.27

Links in help lines corrected.

1.26

Description modified to make it more concise.

1.24

Only E-mail changed in html document.

1.23

Bugfix.

1.22

Small changes to documentation and gethostname.m

1.21

Structure being passed to post-processing function changed (still undocumented feature)

1.20

Estimation of time left changed, post-processing function introduced.

1.19

Bug fixed.

1.18

File displayerrorstruct.m was missing.

1.16

Call to "clear functions" now in master and slaves, bug fixed.

1.15

In each multicore run, "clear functions" is now called once to ensure that changes to m-files take effect.

1.14

Two bugs fixed, one regarding the waitbar, one regarding the semaphore mechanism.

1.13

Using system-dependent file separators in paths again. Waitbar shows progress during parameter file generation now.

1.12

Added estimation of time left in waitbar.

1.11

Added an optional waitbar.

1.10

If a slave is killed while working on a job, the master will now generate the parameter file of that job again instead of working on the file itself. This will increase performance in certain situations.

1.9

File regexptokens.m added.
Discussion group created: http://groups.yahoo.com/group/multicore_for_matlab

1.8

Another change to the semaphore mechanism.

1.7

I have nearly re-written both master and slave in order to make the package even more robust and to reduce the overhead for inter-process communication.

1.5

Introduced parameter EVALSATONCE, which causes the multicore package to do several function evaluations in a row before saving/loading and thus reduces the communication overhead. Demo function MULTICOREDEMO heavily commented.

1.4

Semaphore mechanism improved.

1.3

Forgot to include file chompsep.m

1.2

Semaphore stuff improved.

1.1

Subfunction datenum2 was not needed.

Old e-mail address removed from help comments of m-files.

Update of contact information in documentation.

A file was missing - sorry.

Improved support for small numbers of very long function evaluations.

Yet another small update.

There was a subfunction missing (which is only executed after a write error).

Slave process will now create the temporary directory if it does not exist.

Update of documentation contained in zip file, no changes to source code.

Another note about multithreading

Updated info due to new Matlab multicore functionality.

Informational text added to demo, improvements in file access organization.

bad line breaks in description...

MATLAB Release
MATLAB 7.5 (R2007b)

Acknowledgements

Inspired: Batch Job
