Path: news.mathworks.com!not-for-mail
From: <HIDDEN>
Newsgroups: comp.soft-sys.matlab
Subject: Re: Multithreading: Negative returns?? Puzzling benchmarks
Date: Wed, 18 Mar 2009 00:46:01 +0000 (UTC)
Organization: University College Dublin
Lines: 43
Message-ID: <gppg89$6v$1@fred.mathworks.com>
References: <3b948ba5-b0c4-4b53-8352-de5641ddd313@o2g2000prl.googlegroups.com>
Reply-To: <HIDDEN>


Michael Johnston <mkjohnst@gmail.com> wrote in message <3b948ba5-b0c4-4b53-8352-de5641ddd313@o2g2000prl.googlegroups.com>...
> I just got a new computer with dual Xeon 5450s at 3 GHz (8 cores
> total). I decided to see how well the multi-threading in BLAS and
> LAPACK would work, so I ran a simple test: Multiply two matrices
> together a bunch of times, and then do the same thing with the
> division operator. Perform this test for a number of threads up to the
> number of CPU cores. Then plot the percentage change in the execution
> time relative to the single threaded case.
> 
> The result is strange. I certainly expected diminishing returns
> to multi-threading as the number of threads increased. But I
> never expected to see *negative* returns. While smaller
> matrices perform relatively worse, presumably as a result of overhead
> from thread creation, my benchmarks indicate that even for reasonably
> sized matrices (e.g., 500-by-500) the returns to multi-threading
> become negative surprisingly quickly.
> 
> I'm very surprised to see this on a new shared-memory system. Has
> anyone else gotten benchmarks like this? I have posted a graph of the
> plot, as well as the benchmark code I wrote, on my web site with more
> information: http://michaelkjohnston.com/perm/mt8bench/
> 
> Any ideas?? Anecdotes? Theories?
> 
> Best regards,
> 
> Michael




Dear Michael,

The matrices used in your test above are tiny: 14x14 and 200x200.

Take a look at these test results on a Dell Precision 690 with dual Xeon 5345s at 2.3 GHz and 8 GB of RAM:
http://www.derekroconnor.net/Software/Benchmarks.htm

These tests show substantial multicore speedups for matrix multiplication and LU decomposition, but very little speedup for SVD or EIG.
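
If you want to repeat your test at larger sizes, something along the following lines might do. This is only a rough sketch, not the code behind either set of results: the matrix sizes, the repetition count, and the variable names are my own assumptions, and it relies on maxNumCompThreads to set the thread count.

```matlab
% Sketch: time A*B and A\B for several thread counts and matrix sizes.
% The sizes, thread range, and repetition count are illustrative assumptions.
sizes    = [200 1000 2000];      % matrix orders to test (assumed values)
nThreads = 1:8;                  % thread counts, assuming an 8-core machine
nReps    = 5;                    % timed repetitions per measurement (assumed)

tMul = zeros(numel(sizes), numel(nThreads));   % mean seconds per A*B
tDiv = zeros(numel(sizes), numel(nThreads));   % mean seconds per A\B

oldN = maxNumCompThreads;        % remember the current thread setting
for i = 1:numel(sizes)
    n = sizes(i);
    A = rand(n);
    B = rand(n);
    for j = 1:numel(nThreads)
        maxNumCompThreads(nThreads(j));
        C = A*B;                 % warm-up run, not timed

        tic;
        for r = 1:nReps
            C = A*B;
        end
        tMul(i,j) = toc/nReps;

        tic;
        for r = 1:nReps
            X = A\B;
        end
        tDiv(i,j) = toc/nReps;
    end
end
maxNumCompThreads(oldN);         % restore the original setting

% Percentage change in time relative to the single-threaded column.
% Negative values mean the multi-threaded run was faster.
baseMul = repmat(tMul(:,1), 1, numel(nThreads));
baseDiv = repmat(tDiv(:,1), 1, numel(nThreads));
pctMul  = 100*(tMul - baseMul)./baseMul;
pctDiv  = 100*(tDiv - baseDiv)./baseDiv;

% One curve per matrix size
plot(nThreads, pctMul');
xlabel('Number of threads');
ylabel('% change in time vs. 1 thread (A*B)');
```

If the small sizes show little or no gain while the large ones do, that would point to threading overhead rather than a problem with the machine.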

Regards,

Derek O'Connor