fopen / fprintf: Write to same file from independent processes
I have several MATLAB instances running, all of which should write to the same log file (text, line by line). Unfortunately, I have not yet found a way to do that without data loss.
My assumption was that fopen would throw an error when opening a file that another MATLAB instance already has open, or at least that fprintf would throw an error when writing to a file that another MATLAB instance is currently writing to.
Unfortunately this is not the case. Even the nbytes return value of fprintf cannot be used to detect a failed write, because it always reports the number of bytes even if they have not been physically written.
So is there any simple solution for that?
I did not mention that I also tried ferror, but even that does not seem to help. Below is the test code I use. I call the function from one MATLAB instance with no offset. Undisturbed, it writes the numbers 1 to 200000 to the text file, which takes about 10 seconds. But if I call the function from a second instance (using an offset of e.g. 1000000) while the first call is still running, the file contains alternating numbers from both processes (which can be distinguished by the offset). This would be perfectly fine if no data were missing. But instead of 400000 lines I get only 394942 lines, 2450 of which are blank. That leaves only 392492 data lines, which means that 7508 data lines are missing (across both instances). In fact, I even have 32 duplicated lines for some reason (all from one instance). So the complete process is crap in the current implementation.
function sError = FileTest(sFileName_, iOffset)
    [fID, e] = fopen(sFileName_, 'at'); % try to open text file for appending
    sError = e;
    if fID == -1 % fopen failed; sError holds the reason
        return
    end % if
    if ~exist('iOffset', 'var')
        iOffset = 0;
    end % if
    for i = 1:200000
        fprintf(fID, '%s\n', num2str(i + iOffset));
        [m, e] = ferror(fID);
        if e ~= 0 % is never true although there is data which is not successfully written
            disp([num2str(i + iOffset) ': ' m]);
        end % if
    end % for
    fclose(fID); % close the file so the handle is not leaked
end % FileTest
Guillaume on 28 Nov 2018
Your problem is not restricted to MATLAB. You would have the same problem in any other programming language. You basically have several processes writing to the same resource at the same time. Even in a multithreaded context this is not trivial to manage; in a multiprocess context it's a nightmare. You need to introduce concepts such as locks, mutexes, and semaphores and, for multiple processes, interprocess communication to manage that properly. With file I/O, you also have to contend with the fact that the OS will do some buffering, probably with one buffer per process, and you don't really have much control over when that buffer actually gets written to the file (at least in MATLAB).
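To illustrate what a lock means here, the following is a minimal sketch only, not a recommendation. It assumes MATLAB was started with its Java interface available, that every instance writes exclusively through this function, and that the file system honours OS-level file locks (network shares may not). The function name appendLineLocked is hypothetical. It bypasses fopen/fprintf and uses java.io.RandomAccessFile with FileChannel.lock so that only one process appends at a time.

```matlab
function appendLineLocked(fileName, line)
% Append one line to fileName while holding an exclusive OS-level file lock.
% Assumption: fileName is an absolute path, because Java's working directory
% can differ from MATLAB's current folder.
raf = java.io.RandomAccessFile(fileName, 'rw');  % creates the file if missing
closer = onCleanup(@() raf.close());             % closing also releases the lock
channel = raf.getChannel();
fileLock = channel.lock();                       % blocks until the lock is granted
raf.seek(raf.length());                          % jump to the current end of file
raf.writeBytes(sprintf('%s\n', line));           % write the line while locked
fileLock.release();                              % let the next process in
end
```

Each call opens, locks, appends, and closes, so this is much slower than a plain fprintf loop, and it only protects against processes that take the same lock.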
You need to review your design. The simplest is to have one log file per process (matlab instance). Easy to code, no interprocess communication to manage.
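A rough sketch of the one-log-per-process option could look like the following. The function name logToOwnFile and the parameters baseName and instanceId are hypothetical; the sketch assumes each MATLAB instance is started with its own unique instanceId (the offset from the test code in the question could serve as one).

```matlab
function logName = logToOwnFile(baseName, instanceId, msg)
% Append one line to a log file that belongs to a single MATLAB instance.
% Assumption: instanceId is unique per instance, so no two processes
% ever open the same file.
logName = sprintf('%s_%d.log', baseName, instanceId);
fID = fopen(logName, 'at');
if fID == -1
    error('logToOwnFile:openFailed', 'Could not open %s', logName);
end
closer = onCleanup(@() fclose(fID));  % close the file even if fprintf errors
fprintf(fID, '%s\n', msg);
end
```

For example, logToOwnFile('C:\logs\run', 2, 'started') would append to C:\logs\run_2.log. If a combined view is needed, the per-instance files can be merged after all instances have finished.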
If you do need one common log file, then you must have a single MATLAB instance whose job is to write to the log file. For simplicity, that should be its only job. All the other MATLAB processes then communicate with that writer process to tell it what to write, so you need some sort of interprocess communication. I've never done that in MATLAB, and there aren't many tools available in MATLAB to develop it efficiently. You could possibly do the interprocess communication via UDP or TCP over the loopback adapter, or you could use memory-mapped files (memmapfile) to emulate shared memory (see for example Share Memory Between Applications). Whatever method you use, it's going to be a lot of work, so I'd go with option 1: one log per process.