parfor (file reading)

Question

AP on 10 Nov 2011

https://www.mathworks.com/matlabcentral/answers/20863-parfor-file-reading

Hi all,

I am trying to use parfor to speed up reading 1000 ASCII files. Each file has the following format:

The first 10 lines are a header describing the data.
The remaining lines are in the format '%f %f %f %f' and contain the values of the variables x, y, z1, and z2. Each file has up to 10000 such data lines.

x and y describe the rectangular domain over which z1 and z2 were measured, so the domain is the same in all 1000 files. I want to use parfor and end up with one 10000×1 vector for x, one 10000×1 vector for y, one 10000×1000 array for z1, and one 10000×1000 array for z2.

I used the following pseudocode:

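parfor i = 1:1000
   fname = sprintf('file%04d.txt', i);   % placeholder: the real file naming is not shown in this post
   fid = fopen(fname, 'r');
   data = textscan(fid, '%f %f %f %f', 'HeaderLines', 10);
   fclose(fid);                          % close each file after reading
   x = data{1};
   y = data{2};
   z1(:, i) = data{3};
   z2(:, i) = data{4};
end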

I get the error "The variable z1 in a parfor cannot be classified". I suspect it comes from the restrictions that parfor places on how variables can be indexed inside the loop.

Is there a better way for reading these 1000 files in parallel?

Thanks.

1 Comment

Edric Ellis on 10 Nov 2011

That code should work - in your real code, are you using 'z1' in some other way within the loop?
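
For reference, here is a minimal sketch of a version that parfor should be able to classify. It is not the asker's real code: the function name readAllFiles, the helper readOneFile, the data_0001.txt naming scheme, and the assumption that every file has exactly nRows data lines are all made up for illustration. The key point is that z1 and z2 are indexed only as (:, i) inside the loop; using them in any other way in the loop body (for example z1(:, i+1), or reading the whole array) is a typical cause of the "cannot be classified" error.

```matlab
% Sketch: read nFiles ASCII files in parallel into sliced output arrays.
% ASSUMPTIONS (not from the original post): files are named
% data_0001.txt, data_0002.txt, ... inside dataDir, and every file
% holds exactly nRows data lines after its 10-line header.
function [x, y, z1, z2] = readAllFiles(dataDir, nFiles, nRows)
    z1 = zeros(nRows, nFiles);   % preallocate the sliced outputs
    z2 = zeros(nRows, nFiles);

    parfor i = 1:nFiles
        fname = fullfile(dataDir, sprintf('data_%04d.txt', i));  % hypothetical naming
        data = readOneFile(fname);
        % z1 and z2 appear ONLY in the form (:, i) inside the loop,
        % so parfor can treat them as sliced variables.
        z1(:, i) = data{3};
        z2(:, i) = data{4};
    end

    % x and y are identical in every file, so read them once, outside the loop.
    first = readOneFile(fullfile(dataDir, sprintf('data_%04d.txt', 1)));
    x = first{1};
    y = first{2};
end

function data = readOneFile(fname)
    % Open one file, skip the 10 header lines, and return the four columns.
    fid = fopen(fname, 'r');
    if fid == -1
        error('readAllFiles:open', 'Cannot open %s', fname);
    end
    cleanup = onCleanup(@() fclose(fid));  % close the file even if textscan fails
    data = textscan(fid, '%f %f %f %f', 'HeaderLines', 10);
end
```

Reading x and y outside the loop also avoids a second problem with the posted code: variables assigned as a whole inside parfor are temporaries and are not available after the loop ends.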

Answer 1

Daniel Shub on 10 Nov 2011

https://www.mathworks.com/matlabcentral/answers/20863-parfor-file-reading#answer_27524

I am not sure exactly how MATLAB handles file reading or how hard drives handle multiple read requests, but my guess is that distributing an I/O-limited job across multiple processors will not speed it up.
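
Whether the job really is I/O-limited is easy to measure rather than guess. The following is a rough sketch, not a rigorous benchmark: compareReadTimes, readBody, and fileList (a cell array of full file paths) are hypothetical names chosen for this example, and the file format is the one described in the question.

```matlab
% Sketch: time a serial loop against parfor for the same set of file reads.
% ASSUMPTION: fileList is a cell array of full paths to files with a
% 10-line header followed by four numeric columns.
function [tSerial, tParallel] = compareReadTimes(fileList)
    n = numel(fileList);
    out = cell(n, 1);

    tic;
    for i = 1:n
        out{i} = readBody(fileList{i});
    end
    tSerial = toc;      % elapsed seconds for the serial pass

    tic;
    parfor i = 1:n
        out{i} = readBody(fileList{i});
    end
    tParallel = toc;    % elapsed seconds for the parfor pass
end

function data = readBody(fname)
    % Read the four numeric columns of one file, skipping the header.
    fid = fopen(fname, 'r');
    assert(fid ~= -1, 'Cannot open %s', fname);
    data = textscan(fid, '%f %f %f %f', 'HeaderLines', 10);
    fclose(fid);
end
```

Two caveats when interpreting the numbers: start the worker pool before timing, otherwise pool startup is counted in tParallel, and be aware that the second pass may be served from the operating system's file cache, which favors whichever loop runs last.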

1 Comment

Walter Roberson on 10 Nov 2011

Surprisingly, you can get better performance with parallel reads -- at least if you are using SCSI drives with ENQ (enqueue) turned on which allows the drive to re-order read requests according to which destination is "closest" to where it currently is. In common situations, the performance increases up to four parallel reads; in some data access patterns, the performance can continue to climb beyond four parallel reads, but the performance improvement past 4 is not wonderful (but if you have terabytes to get through, you'll take whatever performance increase you can get.)

It also helps if the file you are reading is not compressed and you use scatter/gather I/O.

I do not have any information on drive queue management in the newer PC drives.
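
To see where the gain levels off on a particular machine, the number of parallel readers can be capped with the optional second argument of parfor, which sets the maximum number of workers used for that loop. A minimal sketch follows; timeReadsByWorkers, readBody, and fileList are hypothetical names for this example, and the results depend entirely on the drive and file system in use.

```matlab
% Sketch: time the same file reads with different caps on the worker count.
% ASSUMPTION: fileList is a cell array of full paths to files with a
% 10-line header followed by four numeric columns.
function t = timeReadsByWorkers(fileList, workerCounts)
    n = numel(fileList);
    t = zeros(size(workerCounts));
    for k = 1:numel(workerCounts)
        m = workerCounts(k);
        out = cell(n, 1);
        tic;
        parfor (i = 1:n, m)          % use at most m workers for this loop
            out{i} = readBody(fileList{i});
        end
        t(k) = toc;                  % elapsed seconds with the cap m
    end
end

function data = readBody(fname)
    % Read the four numeric columns of one file, skipping the header.
    fid = fopen(fname, 'r');
    assert(fid ~= -1, 'Cannot open %s', fname);
    data = textscan(fid, '%f %f %f %f', 'HeaderLines', 10);
    fclose(fid);
end
```

For example, t = timeReadsByWorkers(fileList, [1 2 4 8]) gives one timing per cap, which shows whether going beyond four parallel reads helps at all. Repeated passes over the same files may be served from the file cache, so the later timings can look better than they would on a cold read.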
