Why does matlab stop "executing" code after a long time?
I am using MATLAB on a Windows 10 machine with 8 GB of RAM.
The system I am simulating has a degree of randomness in it, which is why I need to make several runs of the same code in a loop and average over all of them. An outline of my code looks like this:
% create and initialise variables 'mat' and others
for run = 1:20
    currMatForRun = includeRandomness(mat);
    % processing with currMatForRun
end
One run takes around 2 hours.
The memory consumed seems to be constant at around 70% (I checked with Resource Monitor) and I don't think there is a memory leak. I am comfortably able to use my computer for browsing etc. even when the code is running.
But after running for about 14-15 hours, the code execution sort of freezes up and I get nothing on the console anymore. (Normally I get an update roughly every two minutes.) I tried waiting for 8 more hours but still got nothing.
The memory and cpu usage in the resource monitor go down almost to zero and I get no error in the matlab console.
I tried the entire process 5 times, the time of freeze seemed to be similar for each time.
I am using a standalone version of MATLAB that I can run without the internet. (So I don't think there is a license polling problem, though I am not completely sure.)
This is my first post, so not sure if I have given enough information, please tell me if you need anything else.
Edit:- I did another run and it stopped at a similar time. I did monitor resource consumption this time. Attached image. Legend is as follows
Blue is the processor use, yellow is the working set, red is the private working set, green is page file bytes (virtual memory), and pink/magenta is IO data in bytes per second.
I waited for a while before stopping the recording, and the time where the process stops can be seen clearly from the processor curve dip. If you think monitoring any other parameter might be useful, please tell me.
Things seem to be as I expected them (at least on the surface), so I am still not sure what's wrong. Ideas/thoughts appreciated!
Edit 2: I did two more things to check.
- I took out the random part of the code and ran only the basic part, meaning I was pretty much running the same code over and over again with no changes. Things were completely deterministic. The program still got stuck after a similar amount of time.
- I ran a 10000-variable equation solve (as suggested by John in the comments) 5400 times, which is ~15 hours, and this ran without a hitch.
Overall the confusion has deepened, but hopefully the night is darkest before dawn.
15 Comments
John D'Errico
on 10 Jun 2017
We don't have your code, and have no clue what you are doing inside there. The problem could be multiple things. I might guess at virtual memory usage, so disk thrashing, or some sort of memory leak. Maybe it might be graphics related. Does your code involve user written Mex files?
When your code runs, check a monitor. Is it using all 8 GB of RAM? How much free disk space do you have?
When this happens, you might check to see if there is actual activity, in the form of disk thrashing, even though it seems nothing is happening. But really, you would need to provide some useful information, and even if you did provide code to test, someone might need to spend a day or so testing your code.
C V Ambarish
on 10 Jun 2017
dpb
on 10 Jun 2017
Is there an internal iteration that could be triggering an infinite loop with just the right combination of values, perhaps...
What about, for debugging purposes,
- instrument the routines more thoroughly to see where it is in code execution, and
- use a consistent RNG seed so you can reproduce runs when trying to track down what's causing the (apparent) hangup.
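Both suggestions can be combined in one loop. The following is only a minimal sketch under assumptions: the seed value, the log file name, and the stand-in computation are made up for illustration and are not the OP's code. Each message is appended to a file that is opened and closed right away, so the last line written shows how far the run got even if the session later has to be killed.

```matlab
% Sketch only: the seed, log path, and stand-in computation are assumptions.
nRuns    = 3;                                   % the thread's loop uses 20
baseSeed = 12345;                               % arbitrary fixed seed
logFile  = fullfile(tempdir, 'run_progress.log');
results  = zeros(nRuns, 1);

for run = 1:nRuns
    rng(baseSeed + run);                        % known, repeatable seed per run
    tStart = tic;

    x = rand(100);                              % stand-in for includeRandomness(mat)
    results(run) = mean(x(:));                  % stand-in for the real processing

    msg = sprintf('%s  run %d finished in %.3f s', ...
        datestr(now, 'yyyy-mm-dd HH:MM:SS'), run, toc(tStart));
    disp(msg);                                  % progress in the command window

    fid = fopen(logFile, 'a');                  % append, then close immediately
    if fid ~= -1
        fprintf(fid, '%s\n', msg);
        fclose(fid);
    end
end
```

Because run k always starts from seed baseSeed + k, a run that hangs can be repeated on its own without redoing the earlier ones.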
John D'Errico
on 10 Jun 2017
Nothing stands out in my mind. Beyond watching the state of the system using a monitor, I'd want to add enough writes to the command window that you can identify where it is hanging. Not so much information that it slows things to a crawl, but enough that you can see what exactly it may be doing when it decides to freeze.
The consistent time to failure suggests to me that you are running out of some resource. But what is running dry is not obvious from what you have said.
Since there is randomness in each iteration, suppose you FIXED the random seed before each iteration, so that it is doing exactly the same computations in each iteration? If it still fails at the same point, then you know for sure there is some resource that has run dry.
It really sucks that this takes 15 hours just to see it go dead. Perhaps my best suggestion is to give up after 14 hours. Save everything. Then restart MATLAB, and restart the computation, getting another 14 hours. That is the give up option of course. :)
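The "save everything and restart" option could be sketched roughly as below. This is a hedged illustration, not the OP's code: the checkpoint file name, the variable names, and the stand-in computation are all assumptions. The idea is that each finished run is written to a MAT-file, and a fresh MATLAB session picks up after the last completed run.

```matlab
% Sketch only: file name, variable names, and the computation are assumptions.
ckptFile = fullfile(tempdir, 'sim_checkpoint.mat');  % hypothetical checkpoint file
nRuns    = 5;                                        % the thread's loop uses 20

if exist(ckptFile, 'file') == 2
    s       = load(ckptFile, 'results', 'lastRun');  % resume a previous session
    results = s.results;
    lastRun = s.lastRun;
else
    results = nan(nRuns, 1);                         % fresh start
    lastRun = 0;
end

for run = (lastRun + 1):nRuns
    rng(run);                                        % repeatable seed per run
    results(run) = sum(rand(50, 1));                 % stand-in for the 2-hour run
    lastRun = run;
    save(ckptFile, 'results', 'lastRun');            % checkpoint after every run
end

avgResult = mean(results);                           % average over all runs
```

Delete the checkpoint file once the averaged result is safely stored, otherwise the next launch will find all runs already done and skip the loop.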
C V Ambarish
on 12 Jun 2017
dpb
on 12 Jun 2017
Where in the trace did the apparent freeze show up?
I'm thinking more in terms of some internal-to-Matlab issue like are you using higher-level data constructions like structures, table, even cell arrays that could be causing nesting issues in iterative loops or somesuch.
Is there any iteration of any sort in the computations or an ode solver or the like? If it were all "just" straight matrix algebra/computation it doesn't seem possible, but with the abstract data structures there's a lot of behind the scenes activity besides just algebra.
I'd've put my money on graphics handles or the like but you say not using graphics--by any chance using one of the routines that has optional graphics that has differing behavior with/without return values? Mayhaps one of those is generating some internal graphics handles even tho no visible plot owing to coding error? Grasping at anything can think of here, obviously...
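One cheap way to test the hidden-graphics idea is to count figure objects before and after a suspect call. This is just a sketch: the invisible figure below stands in for whatever routine might be creating graphics behind the scenes, and `groot` assumes R2014b or later.

```matlab
% Sketch only: the invisible figure simulates a routine that creates graphics silently.
nBefore = numel(findall(groot, 'Type', 'figure'));   % findall also sees hidden handles

h = figure('Visible', 'off');                        % stand-in for the suspect call

nAfter = numel(findall(groot, 'Type', 'figure'));
fprintf('figure objects before: %d, after: %d\n', nBefore, nAfter);

close(h);                                            % clean up the test figure
```

If the count keeps growing from one iteration to the next in the real code, that would point at leaked graphics objects.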
The instrumented run to try to track down just where it is when it "dies" will probably be the trick...
I don't suppose if you try to profile it that it will run fast enough to be practical at all...
Is there any way you can factor the app and call sub-pieces of the calculation to check on their performance with high numbers of calls before putting the whole thing together so can eliminate pieces?
Or, conversely, can you comment out calls to lower levels and see if can find one that will make the symptom disappear? The latter would seem to necessitate having a second test machine as that time would be lost for any useful output whereas the complete case at least does generate something useful each test.
John D'Errico
on 12 Jun 2017
This is becoming frustrating. For you too I suppose. Is it possible that after 15 hours or so, if the sound is turned on, you hear the voice of either Cleve Moler or Bill Gates saying, "I'm getting bored. Let's go do something else." :)
You have not mentioned any user written MEX code, or something that uses graphics heavily. Those are the things I'd suspect first.
It feels like something hardware related. I recall a utility that would allow you to monitor the temperature of all system components. We used it on a Mac laptop that was getting a bit overheated with heavy use. Can you find something like that for Windows? Is it possible that your machine is getting overheated with heavy use? If that were so, then you would have problems browsing when MATLAB is frozen.
I tried this, watching the memory required on my machine. It goes to about 2.3GB, and stays there until the solve is done.
clear
A = rand(10000);
b = rand(10000,1);
tic,c = A\b;toc
Elapsed time is 18.660448 seconds.
15*60*60/18
ans =
3000
For example, I tried the above on my machine. MATLAB itself uses 0.8 GB for me, with a clear workspace. Solving a 10Kx10K linear system requires about 18 seconds for me, with 4 processors on the job. 3000 such solves should take about 15 hours. (And it would have my system fan running hard for that time.)
So you might run a test like this:
for i = 1:3000,c = A\b;end
Does a similar problem happen? If so, then it suggests a hardware problem.
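A variant of that loop which records the time of every solve makes it easier to see whether things slow down gradually or stop dead. This is only a sketch: the sizes are deliberately tiny so it finishes in seconds, and the reporting interval is an arbitrary choice. Set n = 10000 and nSolves = 3000 to reproduce the roughly 15 hour test.

```matlab
% Sketch only: sizes are scaled down; the thread's test uses n = 10000, 3000 solves.
n       = 500;
nSolves = 20;
rng(0);                                  % fixed seed so every session solves the same system
A = rand(n);
b = rand(n, 1);
elapsed = zeros(nSolves, 1);

for k = 1:nSolves
    t = tic;
    c = A \ b;                           % the same solve as in the loop above
    elapsed(k) = toc(t);
    if mod(k, 5) == 0                    % report every 5th solve
        fprintf('solve %d of %d: %.4f s, residual %.2e\n', ...
            k, nSolves, elapsed(k), norm(A*c - b));
    end
end
```

Plotting or inspecting elapsed afterwards shows whether the per-solve time drifts upward before any freeze.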
Or, suppose you run your code on a different machine? Does it crap out in the same way?
C V Ambarish
on 14 Jun 2017
John D'Errico
on 14 Jun 2017
Why do I feel like I'm in a game of "20 questions from hell", where the rules are not only don't we know what question to ask, but nobody knows the answer? :)
C V Ambarish
on 14 Jun 2017
John D'Errico
on 14 Jun 2017
Edited: John D'Errico
on 14 Jun 2017
Ok, so it looks like the problem is not your CPU. If the problem happens on another machine, then I'd bet the issue is in some code, perhaps containers.Map, or something that is written as Mex. My conjecture is that the issue is a bug in some C code, where something is not getting done right. If you can, your best bet might be to send it in to technical support, since this feels to me like a bug in their code.
(Oh, the worst part of 20 questions from hell, is the moderator is Calvin from the cartoon strip Calvin & Hobbes, who reserves the right to change the rules at a whim.)
Tristan Graham
on 21 May 2018
I'm having very similar issues with my script ... did you end up finding any solutions?
Jan
on 21 May 2018
@Tristan Graham: The description of the problem was not exact enough to locate its source. So how can you know that you have similar issues? Please open a new thread and post more details. Use e.g. some output to the command window to narrow down where the problem occurs. Store intermediate states of the variables, so that you can start debugging without needing to wait 15 hours.
KiranKumar Makam
on 18 May 2022
I too am facing the same issue. I have a deadline to meet, and the variables I calculated are in MATLAB, which is not responding. I have been waiting for an hour after pressing the pause button, which is not what I expected. Is there a place where MATLAB saves all the variables, which we could access and save before forcefully shutting MATLAB down? I am using macOS Big Sur.
dpb
on 18 May 2022
Well, MATLAB has memory allocated for everything, yes, but it is not user-accessible from the outside by anything except a system monitor and all that will be is a memory dump...which won't be useful for your purpose.
As the OP of this thread noted, in his case he could write the results of each iteration to an output file during execution(*) and so retrieve what results were available; I'd suggest that would be your option as well -- save what you are computing and need/want periodically to disk.
(*) Which raises a question I don't recall being asked: did he monitor that file? Could he have neglected to close the file handle on it, so that was the resource limit, or somesuch?
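For what it's worth, a per-iteration write that cannot leak file handles might look like the sketch below. The output file name and the computed value are assumptions for illustration. `fopen('all')` lists the identifiers of files currently open in the session, so comparing its length before and after the loop shows whether any handles were left behind.

```matlab
% Sketch only: the output path and the computed value are assumptions.
outFile    = fullfile(tempdir, 'iteration_results.txt');  % hypothetical output file
openBefore = numel(fopen('all'));                         % open file IDs before the loop

for iter = 1:10
    value = iter^2;                       % stand-in for the result of one iteration
    fid = fopen(outFile, 'a');            % open in append mode
    if fid == -1
        error('Could not open %s for writing.', outFile);
    end
    fprintf(fid, '%d,%g\n', iter, value); % one line per iteration
    fclose(fid);                          % close right away so handles never pile up
end

openAfter = numel(fopen('all'));
fprintf('open files before: %d, after: %d\n', openBefore, openAfter);
```

If the "after" count grows with the number of iterations in the real code, an unclosed file is a likely suspect for the resource that runs dry.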
Answers (0)