I've found several references which state that Matlab's filter has been multi-threaded since R2007a, e.g. MathWorks:Solution 1-4PG4AN.
x = rand(1e6, 2);
x1 = x(:,1);
x2 = x(:,2);

% [B, A] = butter(3, 0.2, 'low');  % Butterworth 3rd order lowpass filter
B = [0.01809893300751, 0.05429679902254, 0.05429679902254, 0.01809893300751];
A = [1, -1.760041880343, 1.182893262037, -0.2780599176345];

tic;
for i=1:100
   y = filter(B, A, x);   % Matrix
   % clear('y');   % Avoid smart JIT interferences => same effects!
end
toc

tic;
for i=1:100
   y1 = filter(B, A, x1);   % Two vectors
   y2 = filter(B, A, x2);
   % clear('y1', 'y2');   % No qualitative changes
end
toc
[EDITED, 12-Dec-2012 22:38 UTC]: Explicit A and B instead of calling butter from the Signal Processing Toolbox (SPT)
Results on a Windows7/64 Core2Duo:
Matlab 2009b/64: 5.34 sec (matrix) 5.22 sec (two vectors)
Matlab 2009b/64 started with -singleCompThread: 5.23 sec (matrix) 5.24 sec (two vectors)
Matlab 2011b: 4.75 sec (matrix) 4.99 sec (two vectors)
My expectations:
1. For an IIR filter like the Butterworth, the value of the filtered signal at a specific time depends on the complete history. Therefore filtering a single vector cannot take advantage of multi-threading (is this correct?).
2. In contrast, filtering an [n x 2] signal should occupy two cores, so a multi-threaded filter should need approximately the same time as for an [n x 1] signal (is this correct?).
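To make the first expectation concrete, here is a minimal sketch of the difference equation written out as explicit loops. It is only an illustration and not how the built-in filter is implemented internally; the variable names (acc, col, maxErr) and the small test size are my own assumptions. Each output sample needs the previous output samples of the same column, while the columns never touch each other.

```matlab
% Illustration only: a hand-written recursion, not MATLAB's internal code.
B = [0.01809893300751, 0.05429679902254, 0.05429679902254, 0.01809893300751];
A = [1, -1.760041880343, 1.182893262037, -0.2780599176345];
x = rand(1000, 2);               % small size (assumed) so the loops stay fast

y = zeros(size(x));
for col = 1:size(x, 2)           % columns are independent of each other
    for n = 1:size(x, 1)         % samples within one column are not
        acc = 0;
        for k = 1:numel(B)       % feed-forward part: current and past inputs
            if n - k + 1 >= 1
                acc = acc + B(k) * x(n - k + 1, col);
            end
        end
        for k = 2:numel(A)       % feedback part: past outputs of this column
            if n - k + 1 >= 1
                acc = acc - A(k) * y(n - k + 1, col);
            end
        end
        y(n, col) = acc / A(1);
    end
end

% Compare against the built-in; only rounding differences are expected.
maxErr = max(max(abs(y - filter(B, A, x))));
```

The feedback loop reads y(n-1), y(n-2), ... of the same column, which is what makes the straightforward recursion serial in n, whereas the outer loop over col could in principle be distributed.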
But my dual-core processor has a load of 57% during the calculations, and the filtering needs nearly the same time when I start Matlab with the -singleCompThread flag.
My conclusion: It looks like filter is not multi-threaded. Can somebody confirm this impression for 4 or 8 cores, using "x = rand(1e6, 8)" and "x1" to "x8"? I get equivalent results for FIR filter parameters with A=1. For a 12th order Butterworth the matrix method gains an advantage of 10%.
I ran this on my 8-core machine using R2009a:
function myFilterTest
x = rand(1e6, 8);
x1 = x(:,1);
x2 = x(:,2);
x3 = x(:,3);
x4 = x(:,4);
x5 = x(:,5);
x6 = x(:,6);
x7 = x(:,7);
x8 = x(:,8);
[B, A] = butter(3, 0.2, 'low');

tic;
for i=1:100
   y = filter(B, A, x);   % Matrix
   % clear('y');   % Avoid smart JIT interferences => same effects!
end
toc

tic;
for i=1:100
   y1 = filter(B, A, x1);   % Eight vectors
   y2 = filter(B, A, x2);
   y3 = filter(B, A, x3);
   y4 = filter(B, A, x4);
   y5 = filter(B, A, x5);
   y6 = filter(B, A, x6);
   y7 = filter(B, A, x7);
   y8 = filter(B, A, x8);
   % clear('y1', 'y2');   % No qualitative changes
end
toc
And got this:
Elapsed time is 16.865596 seconds.
Elapsed time is 16.117599 seconds.
Only one core was active during each test.
I ran this on my 8-core machine using R2011a and got:
Elapsed time is 12.542615 seconds.
Elapsed time is 16.268821 seconds.
All eight cores were active for the first test (on the matrix) and only a single core for the second test (on individual vectors).
I added this to the bottom of the test:
y_par = zeros(size(x));
matlabpool(8);
tic;
parfor j = 1:8
   for i=1:100
      y_par(:,j) = filter(B, A, x(:,j));
   end
   % clear('y_par');   % No qualitative changes
end
toc;
matlabpool close;
And got this when using R2011a:
Elapsed time is 13.305009 seconds.
Elapsed time is 16.398203 seconds.
Starting matlabpool using the 'local' configuration ... connected to 8 labs.
Elapsed time is 3.542021 seconds.
Sending a stop signal to all the labs ... stopped.
I gave it a try on my quad-core laptop. With your test code, running MATLAB single-threaded or multi-threaded indeed makes nearly no difference: in the multi-threaded case the code runs at about 15% CPU, in contrast to the usual 12.5% (because of hyperthreading) that I see for single-threaded code.
But if I increase the number of columns, I do see a benefit. I changed x to be rand(5e5, 20) and added a loop for the calls to filter on the individual columns. The comparison probably isn't that fair anymore, but at least the CPU runs at about 50% ...
I contacted our development team for clarification. My personal impression so far: yes, filter is multithreaded, but it does not benefit as strongly as other functions do ...
As a summary answer:
The MATLAB documentation says FILTER is multi-threaded. As of R2011b, it is multi-threaded for neither Nx1 nor NxM arrays. For NxM arrays a parfor loop allows for considerable speedups.
Testing without butter()
R2008b on Linux 64, 8 Xeon E5410 processors.
Default (maxNumCompThreads is 8)
Elapsed time is 9.219355 seconds.
Elapsed time is 9.215591 seconds.
maxNumCompThreads = 1
Elapsed time is 9.215270 seconds.
Elapsed time is 9.226053 seconds.
Really though, the differences in timing are within the margin of error: my various runs had more variance than the difference between the figures I posted above.
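One way to judge whether a difference is larger than the run-to-run variance is to repeat each measurement and look at the spread. The following is a rough sketch under my own assumptions (the trial count, the reduced array size, and the variable names are hypothetical). It uses maxNumCompThreads to switch the thread limit and restores the original setting at the end.

```matlab
% Sketch: repeat the timing several times per thread setting to see the spread.
B = [0.01809893300751, 0.05429679902254, 0.05429679902254, 0.01809893300751];
A = [1, -1.760041880343, 1.182893262037, -0.2780599176345];
x = rand(1e5, 2);                         % smaller than the original test (assumed)

nTrials = 5;                              % assumed number of repetitions
nDefault = maxNumCompThreads;             % remember the current setting
threadCounts = unique([1, nDefault]);     % single-threaded vs. default
t = zeros(nTrials, numel(threadCounts));

for c = 1:numel(threadCounts)
    maxNumCompThreads(threadCounts(c));
    for k = 1:nTrials
        tic;
        y = filter(B, A, x);
        t(k, c) = toc;
    end
    fprintf('%d thread(s): min %.4f s, max %.4f s\n', ...
            threadCounts(c), min(t(:, c)), max(t(:, c)));
end

maxNumCompThreads(nDefault);              % restore the original setting
```

If the min-to-max range within one setting is wider than the gap between the settings, the timings do not support any conclusion about threading.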
For me, running R2011b on a dual-core MacBook Pro (i5), multi-threading kicks in only if variable x is at least 8 columns wide. Like most other reports in this post, my machine takes ~2.5 seconds per column when the column count is low. If I kick the column count up to 16, the time gets down to under 1.5 seconds per column.
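A small sweep over the column count makes this kind of threshold easier to spot. This is only a sketch: the list of column counts, the row count, and the repetition count are assumptions I picked to keep the run short, and the threshold (if any) will depend on the release and the machine.

```matlab
% Sketch: time per column as a function of the number of columns.
B = [0.01809893300751, 0.05429679902254, 0.05429679902254, 0.01809893300751];
A = [1, -1.760041880343, 1.182893262037, -0.2780599176345];

nRows = 1e5;                         % assumed, smaller than the original 1e6
colCounts = [1, 2, 4, 8, 16];        % assumed sweep values
nReps = 10;                          % assumed repetitions per measurement
tPerCol = zeros(size(colCounts));

for c = 1:numel(colCounts)
    x = rand(nRows, colCounts(c));
    tic;
    for i = 1:nReps
        y = filter(B, A, x);         % one call on the whole matrix
    end
    tPerCol(c) = toc / colCounts(c); % normalize by the number of columns
    fprintf('%2d columns: %.4f s per column\n', colCounts(c), tPerCol(c));
end
```

A clear drop in the time per column between two neighboring column counts would hint at the point where multi-threading starts to be used.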
I believe that some functions require large enough sizes (and possibly lengths equal to powers of 2) before they can benefit from multi-threading. As for the complete history part, you can implement filtering with convolution. Convolution is multiplication in the frequency domain. This means you can implement filtering based on the FFT. There are multi-threaded versions of the FFT. I believe these implementations are quite complicated and require message passing and substantial overhead. I believe that the FFT of a column vector is also multi-threaded in MATLAB.
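As a minimal sketch of the FFT idea, here is the FIR case (A = 1), where the impulse response is just the coefficient vector. The coefficients and signal length are hypothetical values chosen for illustration. For an IIR filter such as the Butterworth above, the impulse response is infinitely long, so this approach would need a truncated impulse response and would only be an approximation.

```matlab
% Sketch: FIR filtering via multiplication in the frequency domain.
b = [0.25, 0.5, 0.25];                 % hypothetical FIR coefficients (A = 1)
x = rand(1024, 1);                     % hypothetical test signal

% Zero-pad to at least numel(x) + numel(b) - 1 samples so that the circular
% convolution computed by the FFT equals the linear convolution.
nfft = 2^nextpow2(numel(x) + numel(b) - 1);
yFull = ifft(fft(x, nfft) .* fft(b(:), nfft));
yFft = real(yFull(1:numel(x)));        % keep the samples that filter returns

yRef = filter(b, 1, x);                % time-domain reference
maxErr = max(abs(yFft - yRef));        % expected to be at rounding level
```

Whether this is faster than filter in practice depends on the filter length and the signal size; for a short filter like this one the time-domain version is likely hard to beat.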