I am solving a stochastic process model of electron flow in a 3D grid of paths. During the simulation flow shuts-off sequentially in one after another of paths until about 40% are off. In the simulation 90% of the computational time is spent solving a sparse linear equation set, A x =b, over and over and over again.

A is symmetric , banded (7), very sparse and has a real, positive diagonal. The vector b is all zeros except for the b(1) element which never changes. Mldivide works very well on my workstation giving highly accurate solutions in 4 minutes for an A matrix 10^6 by 10^6. At a size of 1.1x10^6 by of 1.1x10^6 the solution bogs down because all RAM is in use and the hard disk is employed. No parallel processing tool box is employed.

spparms('spumoni',2) gives:

sp\: bandwidth = 10000+1+10000.

sp\: is A diagonal? no.

sp\: is band density (0.00) > bandden (0.50) to try banded solver? no.

sp\: is A triangular? no.

sp\: is A morally triangular? no.

sp\: is A a candidate for Cholesky (symmetric, real positive diagonal)? yes.

sp\: is CHOLMOD's symbolic Cholesky factorization (with automatic reordering) successful? yes.

sp\: is CHOLMOD's numeric Cholesky factorization successful? yes.

sp\: is CHOLMOD's triangular solve successful? yes.

Cholesky flop count: 2.128e+013

Nonzeros in L: 1.5914e+009

memory blocks in use: 15

memory in use (MB): 16016.3

peak memory usage (MB): 19036.5

maxrank: update/downdate rank: 8

supernodal control: 1 40 (supernodal if flops/lnz >= 40)

nmethods=0: default strategy: Try user permutation if given. Try AMD.

Select best ordering tried.

method 0: user permutation (if given)

method 1: AMD (or COLAMD if factorizing AA')

prune_dense: for pruning dense nodes: 10

a dense node has degree >= max(16,(10)*sqrt(n))

flop count: 2.128e+013

nnz(L): 1.5914e+009

I am running Matlab 2011b on dual processor hardware (each with 6 cores) and 32 GB ram.

During a single experiment after an initial pre-event solution of the matrix equation, a sequence of 3.6x10^5 events occur. For each event, only four elements in the A matrix change. They become near zero (10^-09 times smaller). These elements consist of only two elements each on two separate rows of the A matrix which are within 1 to 10^4 rows of each other. That is rows that are adjacent or within 10,000 rows of each other. Local values of x in the immediate vicinity of the event change in value the most significantly. The location of the next event in the sequence depends upon a random event, and very importantly the past history and the current, updated values of the x vector. Therefore, x =A\b needs to be recalculated after each event.

THE QUESTIONS:

1. Since 4 minutes times 3.6x10^5 events equals 65.7 years with my primitive program using mldivide, are there ways to speed the solution of x =A\b . Are there special iterative methods, matrix partition methods, symbolic factorization methods or etc. that can produce speed-ups of 100 to 1000 times for this repetitive solution problem. Note that I see ways to reduce the run time spent doing everything but the mldivide by a factor of 1000 (vectorization, allocation, optimization, etc.).

2. Can parallel processing software and hardware (Tesla gpu’s) speed this model by a factor of 1000? Hardware and software costs are a constraint. I cannot go wild here.

3. Are the Matlab software on a Windows workstation the tools I should be working with?

Looking forward to your suggestions.

Thank you,

Bill

"William " <afloemans@comcast.net> wrote in message

news:lkj2d3$ope$1@newscl01ah.mathworks.com...

> Hi all,

> I am solving a stochastic process model of electron flow in a 3D grid of

> paths. During the simulation flow shuts-off sequentially in one after

> another of paths until about 40% are off. In the simulation 90% of the

> computational time is spent solving a sparse linear equation set, A x =b,

> over and over and over again.

> A is symmetric , banded (7), very sparse and has a real, positive

> diagonal. The vector b is all zeros except for the b(1) element which

> never changes. Mldivide works very well on my workstation giving highly

> accurate solutions in 4 minutes for an A matrix 10^6 by 10^6. At a size

> of 1.1x10^6 by of 1.1x10^6 the solution bogs down because all RAM is in

> use and the hard disk is employed. No parallel processing tool box is

> employed.

> spparms('spumoni',2) gives:

*snip*

> I am running Matlab 2011b on dual processor hardware (each with 6 cores)

> and 32 GB ram.

>

> During a single experiment after an initial pre-event solution of the

> matrix equation, a sequence of 3.6x10^5 events occur. For each event,

> only four elements in the A matrix change. They become near zero (10^-09

> times smaller). These elements consist of only two elements each on two

> separate rows of the A matrix which are within 1 to 10^4 rows of each

> other. That is rows that are adjacent or within 10,000 rows of each

> other. Local values of x in the immediate vicinity of the event change

> in value the most significantly. The location of the next event in the

> sequence depends upon a random event, and very importantly the past

> history and the current, updated values of the x vector. Therefore, x

> =A\b needs to be recalculated after each event.

> THE QUESTIONS: 1. Since 4 minutes times 3.6x10^5 events equals 65.7

> years with my primitive program using mldivide, are there ways to speed

> the solution of x =A\b . Are there special iterative methods, matrix

> partition methods, symbolic factorization methods or etc. that can

> produce speed-ups of 100 to 1000 times for this repetitive solution

> problem. Note that I see ways to reduce the run time spent doing

> everything but the mldivide by a factor of 1000 (vectorization,

> allocation, optimization, etc.).

I don't know what kind of speed up you'll find, but perhaps the sparse

iterative solvers can give you a couple of benefits.

http://www.mathworks.com/help/matlab/linear-equations-iterative-methods.html

1) If you can compute A*x and possibly A'*x for a solver-specified x without

actually constructing A most (or all) of these solvers will let you specify

a function handle that evaluates those operations. Even though the matrices

are sparse, they're large enough that they will still consume a good chunk

of memory. That's likely your bottleneck.

2) If solving the system "close enough" is sufficient for your simulation,

try specifying a termination tolerance.

3) Using a preconditioner may help.

http://en.wikipedia.org/wiki/Preconditioner

4) If you're only making minor changes to the coefficient matrix, then

hopefully the solution at the previous iteration is close to the solution

for the current iteration. If that's the case, you could specify the

previous iteration's solution as the initial guess for the current

iteration.

> 2. Can parallel processing software and hardware (Tesla gpus) speed this

> model by a factor of 1000? Hardware and software costs are a constraint.

> I cannot go wild here.

I don't know, but I suspect the answer is no.

--

Steve Lord

slord@mathworks.com

To contact Technical Support use the Contact Us link on

http://www.mathworks.com

You can think of your watch list as threads that you have bookmarked.

You can add tags, authors, threads, and even search results to your watch list. This way you can easily keep track of topics that you're interested in. To view your watch list, click on the "My Newsreader" link.

To add items to your watch list, click the "add to watch list" link at the bottom of any page.

To add search criteria to your watch list, search for the desired term in the search box. Click on the "Add this search to my watch list" link on the search results page.

You can also add a tag to your watch list by searching for the tag with the directive "tag:tag_name" where tag_name is the name of the tag you would like to watch.

To add an author to your watch list, go to the author's profile page and click on the "Add this author to my watch list" link at the top of the page. You can also add an author to your watch list by going to a thread that the author has posted to and clicking on the "Add this author to my watch list" link. You will be notified whenever the author makes a post.

To add a thread to your watch list, go to the thread page and click the "Add this thread to my watch list" link at the top of the page.

A tag is like a keyword or category label associated with each thread. Tags make it easier for you to find threads of interest.

Anyone can tag a thread. Tags are public and visible to everyone.

Got questions?

Get answers.

MATLAB and Simulink resources for Arduino, LEGO, and Raspberry Pi test

Learn moreDiscover what MATLAB ^{®} can do for your career.

Opportunities for recent engineering grads.

Apply TodayThe newsgroups are a worldwide forum that is open to everyone. Newsgroups are used to discuss a huge range of topics, make announcements, and trade files.

Discussions are threaded, or grouped in a way that allows you to read a posted message and all of its replies in chronological order. This makes it easy to follow the thread of the conversation, and to see what’s already been said before you post your own reply or make a new posting.

Newsgroup content is distributed by servers hosted by various organizations on the Internet. Messages are exchanged and managed using open-standard protocols. No single entity “owns” the newsgroups.

There are thousands of newsgroups, each addressing a single topic or area of interest. The MATLAB Central Newsreader posts and displays messages in the comp.soft-sys.matlab newsgroup.

**MATLAB Central**

You can use the integrated newsreader at the MATLAB Central website to read and post messages in this newsgroup. MATLAB Central is hosted by MathWorks.

Messages posted through the MATLAB Central Newsreader are seen by everyone using the newsgroups, regardless of how they access the newsgroups. There are several advantages to using MATLAB Central.

**One Account**

Your MATLAB Central account is tied to your MathWorks Account for easy access.

**Use the Email Address of Your Choice**

The MATLAB Central Newsreader allows you to define an alternative email address as your posting address, avoiding clutter in your primary mailbox and reducing spam.

**Spam Control**

Most newsgroup spam is filtered out by the MATLAB Central Newsreader.

**Tagging**

Messages can be tagged with a relevant label by any signed-in user. Tags can be used as keywords to find particular files of interest, or as a way to categorize your bookmarked postings. You may choose to allow others to view your tags, and you can view or search others’ tags as well as those of the community at large. Tagging provides a way to see both the big trends and the smaller, more obscure ideas and applications.

**Watch lists**

Setting up watch lists allows you to be notified of updates made to postings selected by author, thread, or any search variable. Your watch list notifications can be sent by email (daily digest or immediate), displayed in My Newsreader, or sent via RSS feed.

- Use a newsreader through your school, employer, or internet service provider
- Pay for newsgroup access from a commercial provider
- Use Google Groups
- Mathforum.org provides a newsreader with access to the comp.soft sys.matlab newsgroup
- Run your own server. For typical instructions, see: http://www.slyck.com/ng.php?page=2

You can also select a location from the following list:

- Canada (English)
- United States (English)

- Belgium (English)
- Denmark (English)
- Deutschland (Deutsch)
- España (Español)
- Finland (English)
- France (Français)
- Ireland (English)
- Italia (Italiano)
- Luxembourg (English)

- Netherlands (English)
- Norway (English)
- Österreich (Deutsch)
- Portugal (English)
- Sweden (English)
- Switzerland
- United Kingdom (English)