Matlab slow when too many files in directory

I have a large number of data files in my working directory, about 10,000-30,000. Whenever the number of files gets too large (a few thousand seems to be enough), MATLAB becomes very slow to do anything. Is there a way to prevent this? Making sub-directories is not an option. The files are on a local disk on the computer, not on a network disk. I only load or save one file at a time.

9 Comments

Is that data directory on your MATLAB path?
How are you loading and saving files?
Which operating system are you using?
I'm working in this directory. It's not on my path. I'm using the 'load' and 'save' commands in MATLAB. I'm using Mac OS X 10.8.5.
Are you calling dir()? Are you loading the filenames into a listbox or anything? Try "Run and time" to see what line of code is eating up all the time.
No, I'm not executing any command. As soon as I cd to the directory, MATLAB takes forever to execute ANY command. It's as if it is trying to collect information on each file in the directory, and since there are a lot of them, it takes forever.
So if you have a simple 2-line script:
cd('c:\massive folder');
fprintf('Now in massive folder\n');
it will take forever to print that out.
I have the same problem. I have more than 180,000 files in my data folder. MATLAB becomes very slow after I change the current folder. It is really slow, even just clicking a sub-window title.
Same problem with MATLAB R2013a and Ubuntu 14.04. I tried running with the -nodesktop option; it helps, but not for long. If I downgrade (R2012b, R2011b, R2010a or R2009a), will I still get the same problem?
I have encountered the same problem with R2013a under Ubuntu 12.04, but it is easy to resolve: don't run MATLAB in a directory containing thousands of files. Just place all of the files in a data subdirectory and give the path. Say I want all files of the form SPEC.* from the subdirectory MyData. I would set STEM='MyData' and then
d=dir([STEM,'/','SPEC.*']);
fname=[STEM,'/',d(1).name];
a=load(fname);
etc...
Make STEM a varargin, with a default of '.', for flexibility.
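That last suggestion can be sketched as a small helper (the function name is illustrative; a nargin check gives the same default behaviour as varargin here):

```matlab
function a = loadSpec(k, STEM)
% Load the k-th file matching SPEC.* from folder STEM.
% STEM defaults to the current folder if omitted.
if nargin < 2
    STEM = '.';
end
d = dir(fullfile(STEM, 'SPEC.*'));
a = load(fullfile(STEM, d(k).name));
end
```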
"As soon as I cd to the directory..."
And that is the problem right there. Using cd in order to access data files slows MATLAB down, makes debugging harder, and is totally unnecessary. All MATLAB functions that read/write data files accept absolute/relative filenames, so you can simply avoid this whole problem by writing better code:
  1. keep data and code in separate folders,
  2. use absolute/relative filenames instead of cd.
Note that it is recommended to use fullfile to create filenames, rather than string concatenation.
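Putting those two points together, a minimal sketch (the folder and file names here are hypothetical):

```matlab
% Code lives in one folder; data lives in another. No cd needed.
dataDir = '/Users/me/experiment/data';       % hypothetical data folder

s = load(fullfile(dataDir, 'run_001.mat'));  % read by full filename
% ... process s ...
save(fullfile(dataDir, 'run_001_out.mat'), '-struct', 's');  % write the same way
```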


Answers (2)

Matt J
Matt J on 12 Jan 2014
Edited: Matt J on 12 Jan 2014
Do you have the Current Folder window open? Do things improve if you close it (before CD-ing to the directory)?
Every time you change the current directory, MATLAB has to examine each and every file in the new directory to see if there are any m-files, mex-files, etc. If so, these will shadow any other m-files or mex-files etc that are on the MATLAB path. This can take some time for MATLAB to determine. Bottom line is don't have a huge number of files in these directories (e.g., see the last comment above).
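You can see this shadowing rule directly with the `which` command (assuming, for illustration, that the big folder happens to contain a file called mean.m):

```matlab
cd('/path/to/massive/folder');  % suppose this folder contains mean.m
which mean   % reports the mean.m in the current folder (it shadows the built-in)
cd('~');
which mean   % now reports MATLAB's built-in mean
```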

1 Comment

Yes, I thought this is what was happening. One additional way to handle this would be to remove the current directory from the search path, but this does not seem to be possible. The current directory appears to be hard-wired, perhaps because of naming issues (i.e. m-files in the current directory take precedence over other m-files with the same name). The bottom line is that this behaviour by MATLAB actually serves to enforce good programming practice: don't mix your source code in with your data.


Asked on 11 Jan 2014
Last commented on 13 Jan 2019
