4.6 | 32 ratings | 210 downloads (last 30 days) | File Size: 41.98 KB | File ID: #13775

Multicore - Parallel processing on multiple cores

by Markus Buehren

 

26 Jan 2007 (Updated 19 Jun 2009)

BSD License  

This package provides some functions realizing parallel processing on multiple cores/machines.

File Information
Description

The latest Matlab releases (starting with R2007a) include support for multiple cores. However, Matlab only takes advantage of multiple cores when evaluating certain computations, such as an FFT or an FIR filtering operation. Matlab will never be able to determine if, for example, consecutive function calls in a for-loop are independent of each other.

With this package, I provide some MATLAB functions that implement parallel processing on multiple cores of a single machine, or on multiple machines that have access to a common directory.
 
If you have multiple function calls that are independent of each other, and you can reformulate your code as  
 
resultCell = cell(size(parameterCell));  
for k=1:numel(parameterCell)  
  resultCell{k} = myfun(parameterCell{k});  
end  
 
then, replacing the loop by  
 
resultCell = startmulticoremaster(@myfun, parameterCell);  
 
allows you to evaluate your loop in parallel in several processes. All you need to do in the other Matlab processes is to run  
 
startmulticoreslave;  
 
No special toolboxes are used and no compilation of mex-files is necessary; everything is programmed in plain, platform-independent Matlab. If one of your slave processes dies, it does not matter: the master process will go on working on the given task.
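As an illustration of the reformulation described above, here is a minimal sketch that packs two loop variables into one cell per function call. The function `myfun`, the field names `a` and `b`, and all values are hypothetical; the loop at the end is the sequential reference form that the call to startmulticoremaster is meant to replace.

```matlab
% Hypothetical example function taking one struct argument per call.
% Assumption: in practice myfun would be a function file that is on the
% path of every participating Matlab process.
myfun = @(p) p.a * p.b;

% Pack every independent parameter combination into one cell each.
aValues = [1 2 3];
bValues = [10 20];
parameterCell = cell(numel(aValues), numel(bValues));
for i = 1:numel(aValues)
  for j = 1:numel(bValues)
    parameterCell{i, j} = struct('a', aValues(i), 'b', bValues(j));
  end
end

% Sequential reference loop, in the form given in the description.
resultCell = cell(size(parameterCell));
for k = 1:numel(parameterCell)
  resultCell{k} = myfun(parameterCell{k});
end

% With the package on the path and slave processes running, the loop
% above would be replaced by:
%   resultCell = startmulticoremaster(@myfun, parameterCell);
```

Because the loop uses the linear index k, resultCell keeps the shape of parameterCell, so results can still be addressed as resultCell{i, j}.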
 
Please consider that the communication between the processes, which is done by saving and loading temporary files, causes some overhead. Thus, if your function calls only take fractions of a second, the overhead will cancel out the advantage of the parallelization or can even increase the computation time. However, the multicore package can do several function evaluations in succession in every process before communicating again, which lowers the overhead percentage. See the help of function startmulticoremaster.m for more details.
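The effect of bundling evaluations can be estimated with a rough model. The sketch below is only an illustration: the timings are assumed numbers, not measurements of the package, and the model ignores everything except one fixed save/load cost per communication.

```matlab
tEval = 0.05;                % assumed time per function evaluation (s)
tComm = 0.20;                % assumed save/load cost per communication (s)
nEvalsAtOnce = [1 10 100];   % evaluations done between two communications

% Fraction of the elapsed time that is spent on communication.
overheadFraction = tComm ./ (nEvalsAtOnce * tEval + tComm);

for k = 1:numel(nEvalsAtOnce)
  fprintf('%4d evaluations at once: %5.1f%% overhead\n', ...
          nEvalsAtOnce(k), 100 * overheadFraction(k));
end
```

Under these assumptions the overhead share falls from 80% with one evaluation per communication to below 4% with 100 evaluations per communication.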
 
Note: The Matlab multithreading capability (R2007a and higher) might cancel out all the advantage gained by using the multicore package. So make sure that you UNcheck "Enable multithreaded computation" under File/Preferences/General/Multithreading in all involved Matlab sessions.
 
Discuss with other users here: http://groups.yahoo.com/group/multicore_for_matlab
 
Keywords: Parallel processing, distributed computing, multiple core.

MATLAB release MATLAB 7.5 (R2007b)
Zip File Content  
HTML Files Multicore
Other Files
chompsep.m,
concatpath.m,
datenum2.m,
deletewithsemaphores.m,
displayerrorstruct.m,
existfile.c,
existfile.m,
findfiles.m,
generateemptyfile.m,
getfiledate.m,
gethostname.m,
getusername.m,
license.txt,
mbdelete.m,
mbload.m,
mbtime.m,
multicoredemo.m,
postprocessdemo.m,
regexptokens.m,
removefilesemaphore.m,
setfilesemaphore.m,
startmulticoremaster.m,
startmulticoreslave.m,
tempdir2.m,
testfun.m,
textwrap2.m,
translatedatestr.m,
userfeedback.m
Comments and Ratings (37)
28 Jan 2007 Michal Kvasnicka The demo should be more self-descriptive. A comparison with a single-core run of the demo is very important for evaluating the impact of the parallelization.
28 Jan 2007 lucia Long overdue as a standard Matlab capability.
Does it also work with scripts or only with functions?
28 Jan 2007 Zhijun Wang Parallel processing on multiple cores in a single machine is very useful!
29 Jan 2007 Zhijun Wang Corrections:  
 
I tested the program to see whether it can improve parallel processing on a single machine, using the following code:
 
______________________  
N=20;  
for m = 1:10  
tic  
  for z = 1:N  
      testfun(z)  
  end  
 toc  
end  
__________________________  
 
The results show that the program does not improve anything compared with my code on a single machine!
 
So I rate again
29 Jan 2007 William Renaud Would it be possible to make this compatible with Octave? Due to licensing restrictions it is difficult for many people to have numerous Matlab instances available.
30 Jan 2007 Michal Kvasnicka I am not able to reproduce any improvement from the sequential to the parallel (multicore) realization of the testing demo code.

I would go as far as to say that the parallel (multi-core) realization is slower than the sequential one in some cases.
30 Jan 2007 Michal Kvasnicka Wow!!! After a few hours of hard experimentation with MULTICORE I finally learned how to use this code for parallel running on my four-core PC. This is a really good and useful tool for distributed computing!!!
 
1. The informational text in the demo must be extended into a detailed description of how to run "testfun" in parallel.
2. Some work must be done to minimize the interprocess communication overhead, which may be very intensive (25% of the overall load) in some cases.
 
Good work!!!
02 Mar 2007 schena gianni On dual-processor Intel Xeon based machines it halves the calculation time, i.e. it cuts it by a factor of 2!
21 May 2007 Darren 3M Brilliant use of the filesystem to share the load. It's not quite a 2x increase, but on a quick cluster test I got an increase of 4x with 5 CPUs, so 80% efficient.
14 Jun 2007 Markus Buehren Yes and no: The slave process should create the directory if it does not exist (I have updated that). However, you can start the slave processes whenever you like! You can also interrupt and restart them while the master is running.
26 Aug 2007 happy matlabuser Ran it with 8 processors across 6 machines, and it works beautifully. Unfortunately, if you kill one or more processors, the master processor MUST do that job. Since the jobs are long for my problem, it's better to kill everything in that situation and finish everything at the command line. The master starts at the top of the list, and the slaves start at the bottom. When they meet somewhere in the middle, the master will always redo one of the jobs. I don't think the code will be efficient for a very large number of very small jobs (correct me if I'm wrong); so, I recommend making jobs medium length, (total run time)/(10*(# of processors)) for example. For my problem, these are fairly minor concerns. The author did a great job with this code -- easily saving me hours of work -- thank you!
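The rule of thumb in the comment above, (total run time)/(10*(# of processors)), can be turned into a number of function evaluations per job. The sketch below uses assumed timings and hypothetical variable names; it is not part of the package.

```matlab
totalRunTime = 3600;   % assumed total sequential run time (s)
nProcessors  = 8;      % assumed number of participating processes
tEval        = 1.5;    % assumed time per single function evaluation (s)

% Rule of thumb from the comment: job length = total / (10 * processors).
targetJobLength = totalRunTime / (10 * nProcessors);

% Number of function evaluations to bundle into one job (at least one).
evalsPerJob = max(1, round(targetJobLength / tEval));

fprintf('Target job length: %g s, evaluations per job: %d\n', ...
        targetJobLength, evalsPerJob);
```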
27 Aug 2007 Kevin Thom Awesome program ... I have run it successfully using anywhere from 4 processors on one machine to 10 processors on 5 machines ... it makes a whole range of computationally expensive projects feasible in MATLAB.
05 Oct 2007 Huy Bui Multicore works quite well overall. I hit a bug, though: I made a mistake in parameterCell, and all slave processes died because of it. When I exited all slave processes and started them again, the same error that had killed the slave processes appeared again and again.
21 Oct 2007 David Brown Works excellently. It would be nice to see this turned into a fully-guided setup to use this.
07 Dec 2007 Andrea Soldan  
 
Very useful, and it works excellently,
much more so than the Distributed Computing Toolbox provided by Matlab (which is very hard to use).
My OS is Linux, and I'm working with 4 workers (2 dual-core processors).
 
18 Dec 2007 A. S. Thanks, very useful. Synchronization is not optimal (for example, the master shouldn't start working on a task if a slave is already working on it), but still a great program.
14 Jan 2008 igor scardanzan Great, just some difficulty killing the slave: the process persists and Ctrl-C does not work. One has to wait for the execution to end.
04 Feb 2008 Igor S An option for avoiding the use of the master core is desirable.
Indeed, if one slave process terminates or crashes, the entire process continues, but if the iteration that crashes happens to be on the master core, the whole process is compromised.
08 Feb 2008 Markus Buehren > An option for avoiding the use of the  
> master core is desirable  
 
The option already exists: You can use the input parameter MAXMASTEREVALUATIONS and set it to zero.
 
25 May 2008 Jun Kim This process works very well. For my model estimation, I was able to cut the execution time in a drastic manner. Also Markus was kind and was very responsive to my question.
30 Jun 2008 uju jbl  
14 Jul 2008 Robert Turner Brilliant library. Works like a charm
03 Sep 2008 M H It's absolutely excellent. Am using it now on my quad core machine and am probably going to buy another quad core just to see my models run so quickly. :)
15 Sep 2008 Bruno Cordani Great!!!
19 Dec 2008 Janos Rohaly It seems that master and slave can simultaneously evaluate the same set of parameters. For example, if a slave starts evaluating a slow function call, the master can catch up, and there is nothing to prevent it from starting the very same computation, since the slave hasn't generated the result file yet. There is also a bug in setfilesemaphore.m: dirStruct(k).datenum in line 78 should be datenum(dirStruct(k).date).
21 Dec 2008 Marcio Sales I tested the functions on two dual core machines. I had a great gain when parallelizing the processing between the processors of one dual core machine, or between the two machines using a single processor in each. However, I had no significant gain when trying to use both processors on both machines. Is that because the gain of using more processors is reduced by the increased load of data recording when you add more processors/machines? Also, there are times when I get an error in which one of the workers can't delete the temporary files, and this seems to happen more frequently when you increase the number of workers. Is anyone having the same issues? My two machines run Vista 64-bit.
07 Jan 2009 Arturo Serrano I got the same problem as Rohaly. When the master ends its computation and there is a slave still working, the master computes the remaining job again, yielding an extra iteration.
The solution is to set MAXMASTEREVALUATIONS, with all its drawbacks, since I understand that this is not a bug but rather a consequence of the master not knowing whether the slave got interrupted.
 
BTW, it works like a charm.
12 Jan 2009 Arturo Serrano  
24 Jan 2009 Vasilis Kapetanidis Thank you! This works just fine and now I have 100% CPU usage on all 4 cores! By measuring the elapsed time it seems that it runs about 3.4 times faster than with a single matlab doing all the work, so that's about 85% efficient on my quad-core 
 
now, if only this could run on a single multi-core machine with only one matlab instance running
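The efficiency figures quoted in the comments (4x on 5 CPUs, 3.4x on a quad-core) follow from dividing the speedup by the number of processors. A minimal sketch of that arithmetic, using only the numbers reported above:

```matlab
% Reported figures from the comments: speedup and number of processors.
speedup     = [4.0 3.4];
nProcessors = [5 4];

% Parallel efficiency is the speedup divided by the processor count.
efficiency = speedup ./ nProcessors;

% fprintf repeats the format once per vector element.
fprintf('Efficiency: %.0f%%\n', 100 * efficiency);
```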
25 Jan 2009 Jordi Arnabat Thanks for this great contribution, it's very useful. 
 
Correction: 
I've used it on a grid of computers running different OSes: GNU/Linux, Mac and Windows (XP). When the shared folder is on a network computer (not mapped to a local drive, for example: \\servername\sharedfolder), Windows systems fail when trying to delete the semaphores, causing the master process to run forever.

The solution I found is to slightly modify chompsep.m and concatpath.m as follows:
 
_______________________________________________________ 
function str = chompsep(str)
% Normalize file separators for the current OS and strip a trailing one.
unix_sep = '/';
pc_sep = '\';

if isempty(str)
    return  % nothing to do; also avoids indexing str(1) below
end

if isunix && str(1)==pc_sep
    str = strrep(str, pc_sep, unix_sep);
elseif ispc && str(1)==unix_sep
    str = strrep(str, unix_sep, pc_sep);
end

if str(end) == unix_sep || str(end) == pc_sep
    str(end) = [];
end
 
_______________________________________________________ 
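function str = concatpath(varargin)
% Join path parts with fullfile, then normalize separators for the current OS.
unix_sep = '/';
pc_sep = '\';

str = '';
for n=1:nargin
    curStr = varargin{n};
    str = fullfile(str, chompsep(curStr));
end

if isempty(str)
    return  % no input parts; avoids indexing str(1) below
end

if isunix && str(1)==pc_sep
    str = strrep(str, pc_sep, unix_sep);
elseif ispc && str(1)==unix_sep
    str = strrep(str, unix_sep, pc_sep);
end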
30 Jan 2009 Andrew Scott  
05 Feb 2009 Moody I've been using multicore for a while now and it's absolutely excellent. I'm running on 5 dual Xeon X5460 machines as well as a couple of quad core boxes.
 
I was wondering if anyone compared performance of this toolbox with the parfor parallel computing matlab toolbox. Are they comparable?  
 
I believe I'm bottlenecked now due to the hard disk I/O, so I was looking at the in memory possibilities of this or potentially upgrading my hd's to solid state to reduce the overhead.  
 
BTW, I also tried precreating all the mat files once instead of doing multiple loops to reduce the I/O. Unfortunately, that didn't help as much as I hoped.  
 
Don't get me wrong though, this is much, much, much faster than single threading, but as always we need to keep pushing :).
10 Feb 2009 Markus Buehren I have opened a discussion group for the Multicore package on Yahoo. Please join and discuss with other users! 
 
http://groups.yahoo.com/group/multicore_for_matlab/
18 Feb 2009 Richard Crozier Fantastic program, and particularly suited to my work with genetic algorithms. There is one minor error I've noticed though. In startmulticoreslave, if you activate debug mode you reach the following line (77):
 
fileNr = str2double(regexptokens(parameterFileName,...  
'parameters_\d+_(\d+)\.mat'));  
 
But there is no regexptokens function, at least not in R2007a, R2007b or R2008a. My solution is to replace this with the following lines (although I'm sure someone could come up with something more robust and/or elegant).
 
fileNrCell = regexp(parameterFileName,'parameters_\d+_(\d+)\.mat', 'tokens');  
 
fileNr = str2double(fileNrCell{1});  
 
The program is excellent though, thanks again!
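The replacement suggested in the comment above can be tried in isolation. The sketch below is a slightly more defensive variant that also handles a non-matching name; the file name is a hypothetical example that merely follows the pattern quoted in the comment.

```matlab
% Hypothetical parameter file name following the quoted pattern.
parameterFileName = 'parameters_20090218_12.mat';

% With the 'tokens' option, regexp returns one cell per match, and each
% of those cells holds the captured groups of that match.
fileNrCell = regexp(parameterFileName, ...
                    'parameters_\d+_(\d+)\.mat', 'tokens');

if isempty(fileNrCell)
  fileNr = NaN;                            % name did not match the pattern
else
  fileNr = str2double(fileNrCell{1}{1});   % first match, first token
end
```

Indexing with {1}{1} passes a plain string to str2double instead of a one-element cell, and the isempty check avoids an indexing error when the name does not match.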
02 Apr 2009 Richard Brilliant tool! Great being able to sit back and watch a progress bar sliding along as a room full of computers gets to work doing your simulation. 
 
I have a question though - what dimensions are typically used for parameterCell? I've tried doing some large multidimensional runs (e.g. 4x150x10) and things seemed to grind to a halt - I'm still looking into it, just wondered if the dimensions I'm using are typical or too large. 
 
Cheers. Great work, 
Richard.
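On the question of dimensions: the loop in the package description runs over the linear index k = 1:numel(parameterCell), so a 4x150x10 cell array corresponds to 6000 function calls. The sketch below, using only built-in Matlab functions, shows the job count and how a linear index maps back to the three subscripts; the index value is an arbitrary example.

```matlab
parameterCell = cell(4, 150, 10);   % dimensions from the comment above
nJobs = numel(parameterCell);       % one function call per cell

% ind2sub recovers the three subscripts that belong to a linear index.
k = 4321;
[i1, i2, i3] = ind2sub(size(parameterCell), k);

% sub2ind goes the other way and reproduces k.
kBack = sub2ind(size(parameterCell), i1, i2, i3);

fprintf('%d jobs; k = %d is element (%d, %d, %d)\n', nJobs, k, i1, i2, i3);
```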
02 Jun 2009 dpb10  
03 Jun 2009 Thomas Great tool!
Updates
26 Jan 2007 bad line breaks in description...
29 Jan 2007 Informational text added to demo, improvements in file access organization.
21 May 2007 Updated info due to new Matlab multicore functionality.
29 May 2007 Another note about multithreading
30 May 2007 Update of documentation contained in zip file, no changes to source code.
15 Jun 2007 Slave process will now create the temporary directory if it does not exist.
22 Jun 2007 There was a subfunction missing (which is only executed after a write error).
22 Jun 2007 Yet another small update.
25 Sep 2007 Improved support for small numbers of very long function evaluations.
12 Oct 2007 A file was missing - sorry.
14 Nov 2007 Update of contact information in documentation.
14 Nov 2007 Old e-mail address removed from help comments of m-files.
07 Dec 2007 A subfunction that is only executed on certain systems was missing.
03 Nov 2008 Subfunction datenum2 was not needed.
15 Dec 2008 Semaphore stuff improved.
17 Dec 2008 Forgot to include file chompsep.m
21 Dec 2008 Semaphore mechanism improved.
07 Jan 2009 Introduced parameter EVALSATONCE which causes the multicore package to do several function evaluations after each other before saving/loading and thus reducing the communication overhead. Demo function MULTICOREDEMO heavily commented.
18 Jan 2009 I have nearly re-written both master and slave in order to make the package even more robust and to reduce the overhead for inter-process communication.
27 Jan 2009 Another change to the semaphore mechanism.
22 Feb 2009 File regexptokens.m added.  
Discussion group created: http://groups.yahoo.com/group/multicore_for_matlab
09 Mar 2009 If a slave is killed while working on a job, the master will now generate the parameter file of that job again instead of working on the file itself. This will increase performance in certain situations.
17 Mar 2009 Added an optional waitbar.
20 Mar 2009 Added estimation of time left in waitbar.
05 Apr 2009 Using system-dependent file separators in paths again. Waitbar shows progress during parameter file generation now.
07 Apr 2009 Two bugs fixed, one regarding the waitbar, one regarding the semaphore mechanism.
10 Apr 2009 In each multicore run, "clear functions" is now called once to ensure that changes to m-files take effect.
12 Apr 2009 Call to "clear functions" now in master and slaves, bug fixed.
13 Apr 2009 File displayerrorstruct.m was missing.
15 Apr 2009 Bug fixed.
17 Jun 2009 Estimation of time left changed, post-processing function introduced.
19 Jun 2009 Structure being passed to post-processing function changed (still undocumented feature)
Tag Activity for this File
Tag Applied By Date/Time
distributed processing Markus Buehren 22 Oct 2008 08:58:23
parallel computing Markus Buehren 22 Oct 2008 08:58:23
parallel processing Markus Buehren 22 Oct 2008 08:58:23
distributed computing Markus Buehren 22 Oct 2008 08:58:23
multiple core Markus Buehren 22 Oct 2008 08:58:23
distributed computing Zinedine 09 Nov 2008 08:33:33
parallel computing Zinedine 09 Nov 2008 08:35:55
multicore processing Marcio Sales 21 Dec 2008 02:19:34
multicore processing Douglas 14 Feb 2009 12:01:24
