Can we manipulate a file without opening it?

This question was flagged by Cris LaPierre
I have a question, which I explain below. Consider the following loop:
for i=1:10^6
    A = <read a csv file>;
    A = <perform some operations on A>;
    <save the result of the operations back to the file>;
end
Apparently, the most time-consuming part is reading the file. If I use A=csvread(); then this is very slow. If I use fopen and related functions, it is computationally cheaper but still time-consuming.
Do you have any ideas for reducing the computation time of what I intend to do?
I hope there is a way to do the above operations without actually opening any file (updating an existing file and saving the updates to the same file without opening it).
Any idea?
Thanks in advance!
Stephen23 on 25 Nov 2022
Don't read and write the file on every iteration. Just use an array and indexing.


Answers (2)

Matt J on 25 Nov 2022
Edited: Matt J on 25 Nov 2022
If you have a single file, the reading and saving of the file should probably happen outside the loop. Use a parfor loop to iterate over sections of the data and keep them in MATLAB memory until you are ready to save all of the results.
A = <read the csv file>;
parfor i=1:10^6
    A(i,:) = <perform some operations on A(i,:)>;
end
<save A to the file>;
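As a rough illustration of the read-once, write-once pattern described above, here is a minimal runnable sketch. The temporary file, the `magic(4)` test data, and the doubling operation are placeholders that are not from the original question; `readmatrix` and `writematrix` need R2019a or later, and without Parallel Computing Toolbox the `parfor` loop simply runs serially.

```matlab
% Hypothetical file and data, created here only so the sketch is self-contained.
fileName = [tempname '.csv'];
writematrix(magic(4), fileName);

A = readmatrix(fileName);        % read the file once, before the loop
n = size(A, 1);
parfor i = 1:n
    A(i,:) = 2 * A(i,:);         % placeholder for the real per-row operation
end
writematrix(A, fileName);        % write all results once, after the loop

B = readmatrix(fileName);        % read back to check what was saved
delete(fileName);                % remove the temporary file
```

The file is touched only twice in total, regardless of how many iterations the loop performs.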
Matt J on 25 Nov 2022
Edited: Matt J on 25 Nov 2022
I don't think you should be using files to store and retrieve optimization results. I would structure the loops like this,
for i=1:I %Loop over batches
parfor j=1:J %Do a batch of optimizations in parallel
[x,fval,exitflag]=Run the optimizer
if exitflag<0 %optimization failed
if minf<bestVal
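The outline above is cut off, so the following is only one possible way to complete the batch structure, not necessarily what was intended. In this sketch `fminsearch` stands in for the real optimizer, and the objective, the batch sizes, and the starting points are all hypothetical. Results are kept in memory, and the best value so far is printed after every batch, so no file is needed to monitor progress.

```matlab
% Hypothetical objective and batch sizes, chosen only for illustration.
objective  = @(x) sum((x - [1 2]).^2);
numBatches = 3;                        % "I" in the outline
batchSize  = 4;                        % "J" in the outline

bestVal = Inf;
bestX   = [];
rng(0);
for i = 1:numBatches                   % loop over batches
    X0    = 10 * randn(batchSize, 2);  % one random starting point per run
    xs    = zeros(batchSize, 2);
    fvals = Inf(batchSize, 1);         % Inf marks runs that did not converge
    parfor j = 1:batchSize             % one batch of optimizations in parallel
        [x, fval, exitflag] = fminsearch(objective, X0(j,:));
        if exitflag > 0                % fminsearch returns 1 on convergence
            xs(j,:)  = x;
            fvals(j) = fval;
        end
    end
    [minf, k] = min(fvals);            % best run of this batch
    if minf < bestVal
        bestVal = minf;
        bestX   = xs(k,:);
    end
    fprintf('Batch %d: best value so far = %g\n', i, bestVal);
end
```

Smaller batches give more frequent progress reports at the cost of more idle time while waiting for the slowest run in each batch.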


Walter Roberson on 25 Nov 2022
I suggest that you switch to using parfeval(). Approximately:
while you haven't gotten tired of it all
    while number of active workers is less than number of cores
        use parfeval() to create a new worker, passing in a different initial condition
    end
    wait for a worker to finish, using a timeout
    if any worker has been active longer than you want, cancel() the worker, end
    if any workers have finished, fetch their results and update the notion of best, end
end
when you get tired of it all, cancel all remaining workers
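A simplified sketch of this outline follows; it requires Parallel Computing Toolbox and is not a complete implementation. It assumes a hypothetical objective, uses `fminsearch` as a stand-in optimizer, submits all runs up front (parfeval queues whatever exceeds the number of workers), and applies a single overall time budget instead of a per-worker one.

```matlab
% Hypothetical objective, number of starts, and time budget.
objective  = @(x) sum((x - [1 2]).^2);
numStarts  = 8;
maxSeconds = 60;

pool = gcp();                            % current pool (started if none exists)
rng(0);
futures(1:numStarts) = parallel.FevalFuture;
for k = 1:numStarts
    x0 = 10 * randn(1, 2);               % a different initial condition per run
    futures(k) = parfeval(pool, @fminsearch, 2, objective, x0);
end

bestVal = Inf;
bestX   = [];
done    = false(1, numStarts);
tStart  = tic;
while ~all(done)
    pause(0.1);                          % poll periodically
    for k = find(~done)
        f = futures(k);
        if strcmp(f.State, 'finished')
            if isempty(f.Error)          % skip runs that errored or were cancelled
                [x, fval] = fetchOutputs(f);
                if fval < bestVal
                    bestVal = fval;
                    bestX   = x;
                    fprintf('New best value: %g\n', bestVal);
                end
            end
            done(k) = true;
        elseif toc(tStart) > maxSeconds
            cancel(f);                   % out of time: stop this run
            done(k) = true;
        end
    end
end
```

Because the client polls the futures instead of blocking on them, the best value found so far is printed in the command window while other runs are still in progress.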
  1 Comment
Mohammad Shojaei Arani on 26 Nov 2022
Hi Walter,
This is a nice approach, indeed! (I did not know about things like parfeval before.)
However, the problem is that in my optimization problem I do not set any termination criterion (it runs forever). The reason is that my problem is very complex, and it is difficult to know how much time the optimization solver needs; such things are problem-specific. Therefore, these are things that a 'human' should check rather than a 'machine' (I am not saying this is impossible to code, but I think that at the current state of knowledge about meta-heuristic search algorithms it would be difficult). Sometimes I get a solution rather quickly (if stagnation does not occur); sometimes it is the opposite. It is difficult to tell a code to stop when it stagnates, as the concepts of 'slow' and 'fast' are relative (for a single optimization problem I can come up with an approximate measure of slowness or fastness, but my code should solve any generic problem).
So, for me, if there is no way to see the best result (of all workers) in the command window, this is not useful. All the solutions proposed so far assume that there is a termination criterion for a worker, and this is the bottleneck that prevents me from observing the best outcome in the command window while all workers are still working (actually, they never finish).
I think at this point I have to admit that I cannot solve my problem using a single csv file. Therefore, I will use several csv files (one per worker).
Thanks a lot!
