fgetl drops part of line

Todd on 26 Mar 2013
I've been having an issue with the fgetl function where it occasionally drops part of a line. I'm reading a 10-15 MB ASCII text file across a LAN. The file is opened in binary read mode. Each line in the text file has a variable length and is composed of both numeric and character data. There are approximately 100K lines per file. Each line has a character string associated with it. Based on this string, there's a check to make sure the line has the correct number of fields. If it doesn't, an error is generated. I get this error on approximately 5% of the reads of one of these files. If I read the file in again, it works fine. It appears completely random. I don't get any errors from the fgetl function when this happens.
Has anyone ever seen something like this? I don't know if this is a MATLAB problem, a Windows problem (it's XP Pro), or a LAN/server problem.
I thought it might be related to LAN loading, but it happens in the middle of the night as well as during the day. I'm in the process of copying the files to a local drive before reading them to see if this makes any difference. Should I be opening the file using the 'rt' option?
Any help would be greatly appreciated. Thanks.
  1 Comment
Wouter on 26 Mar 2013
Strange problem. You could try first copying the entire file using movefile and then reading it using your function.
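A minimal sketch of that copy-then-read idea, under some assumptions: the function name `countLinesViaLocalCopy` is hypothetical, the loop body only counts lines where the real parsing would go, and `copyfile` is used in place of `movefile` so the original stays on the server.

```matlab
function nLines = countLinesViaLocalCopy(remoteFile)
% Copy remoteFile to the local temp folder, then read the copy line by line.
% Hypothetical helper for illustration; replace the loop body with real parsing.
    localFile = [tempname, '.txt'];            % unique name in the temp folder
    [ok, msg] = copyfile(remoteFile, localFile);
    if ~ok
        error('copyThenRead:copyFailed', 'Copy failed: %s', msg);
    end

    fid = fopen(localFile, 'r');               % binary read mode, as in the question
    if fid < 0
        delete(localFile);
        error('copyThenRead:openFailed', 'Could not open %s', localFile);
    end

    nLines = 0;
    line = fgetl(fid);
    while ischar(line)                         % fgetl returns -1 at end of file
        nLines = nLines + 1;
        % ... parse and validate the line here ...
        line = fgetl(fid);
    end

    fclose(fid);
    delete(localFile);                         % remove the temporary copy
end
```

Calling `countLinesViaLocalCopy('\\server\share\data.txt')` (a made-up path) would then read only from the local disk once the copy has finished.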


Answers (7)

Image Analyst on 26 Mar 2013
Why are you not opening your ASCII text file in text mode instead of binary mode? I'd open it in text mode if you know it's all text. If the problem remains, upload your file and code.
  1 Comment
Jan on 26 Mar 2013
Opening a text file in binary mode is less susceptible to the side effects of non-printable characters. E.g. it is much easier to control the line breaks and to avoid stopping at ^Z characters, and the i-th file position is the i-th byte, without CHAR([13,10]) being counted as 1 character, etc. Therefore accessing files in binary mode is faster.



Todd on 26 Mar 2013
I tried using fopen with the 'rt' option. Still get the same problem. I've always used fopen with the default mode for text files for years without any apparent issues. But, they were usually smaller files and didn't have any error checking. So, if there was a problem, I probably wouldn't have known.
Doing some runs where we move the file from the LAN drive to a local drive before reading it. We'll see how that goes after a few hours.

Jan on 26 Mar 2013
The strangeness of the problem implies that there is a deterministic reason. Most (to be exact, all) magic problems I've seen had such a reason.
E.g. does another process write to the files while you are reading them?
Or do you use MEX-files, which can corrupt the memory manager? You can provoke any kind of non-reproducible errors with C-MEX functions...
5% of all files are concerned, which is a high value. You can check the network connection by calculating the MD5 of some large (GB) files repeatedly.
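One possible way to run such a repeated checksum test from within MATLAB is sketched below. It assumes MATLAB's JVM is running so that Java's `java.security.MessageDigest` can be called; the function names `repeatMD5` and `fileMD5` are hypothetical. The sketch loads the whole file into memory, so for GB-sized files it would need to be changed to feed the digest in chunks.

```matlab
function nMismatch = repeatMD5(filename, nRuns)
% Hash the same file nRuns times and count how often the result
% differs from the first run. Hypothetical helper for illustration.
    reference = fileMD5(filename);
    nMismatch = 0;
    for k = 2:nRuns
        if ~strcmp(fileMD5(filename), reference)
            nMismatch = nMismatch + 1;
            fprintf('Run %d: checksum differs from the first run\n', k);
        end
    end
end

function hex = fileMD5(filename)
% MD5 of the file contents as a lowercase hex string (uses Java).
    fid = fopen(filename, 'r');
    if fid < 0
        error('fileMD5:openFailed', 'Could not open %s', filename);
    end
    data = fread(fid, Inf, '*uint8');          % whole file as bytes
    fclose(fid);

    md = java.security.MessageDigest.getInstance('MD5');
    if ~isempty(data)
        md.update(data);
    end
    digest = typecast(md.digest(), 'uint8');   % Java bytes arrive as int8
    hex = lower(reshape(dec2hex(double(digest), 2).', 1, []));
end
```

A non-zero result from `repeatMD5` on a file that nobody is writing to would point at the transport rather than at the reading code.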
When the reading fails, print out the read line, go back the required number of bytes, read the line again and print it also. Then post the difference between these lines. Does this reveal any pattern? E.g. does it depend on the line number or the contents? Do you get any entries in the error log of the operating system?
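The re-read test could look roughly like the sketch below. The field check here (a fixed count of whitespace-separated fields, passed as `expectedFields`) is only a stand-in for the real per-record check, and the function name `rereadOnFailure` is hypothetical. The file is opened in binary mode so that the position from `ftell` is a plain byte offset.

```matlab
function nBad = rereadOnFailure(filename, expectedFields)
% Read a file line by line; when a line fails the field-count check,
% seek back to its start, read it again and print both versions.
% expectedFields is a hypothetical stand-in for the real validity check.
    fid = fopen(filename, 'r');
    if fid < 0
        error('reread:openFailed', 'Could not open %s', filename);
    end
    closer = onCleanup(@() fclose(fid));       % close the file on exit or error

    nBad = 0;
    lineNo = 0;
    while true
        startPos = ftell(fid);                 % byte offset of the line start
        first = fgetl(fid);
        if ~ischar(first)                      % -1 signals end of file
            break;
        end
        lineNo = lineNo + 1;

        fields = regexp(strtrim(first), '\s+', 'split');
        if numel(fields) ~= expectedFields
            nBad = nBad + 1;
            fseek(fid, startPos, 'bof');       % go back to the line start
            second = fgetl(fid);               % read the same line again
            fprintf('Line %d (offset %d)\n  first : %s\n  second: %s\n', ...
                    lineNo, startPos, first, second);
        end
    end
end
```

If the two printed versions differ, the file content is fine and the first read was damaged in transit; if they are identical, the line itself is malformed.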

Cedric on 26 Mar 2013
Edited: Cedric on 26 Mar 2013
Could you copy/paste here the first 10-20 lines of one of these files? Have you tried to read the full file in one shot and then process it line by line, instead of reading it line by line?
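Reading in one shot could be sketched as follows, assuming the file fits in memory (10-15 MB should) and that lines end in LF or CR+LF. The function name `readAllLines` is hypothetical.

```matlab
function lines = readAllLines(filename)
% Read the whole file with a single fread call, then split it into lines.
% Hypothetical helper for illustration.
    fid = fopen(filename, 'r');
    if fid < 0
        error('readAll:openFailed', 'Could not open %s', filename);
    end
    raw = fread(fid, Inf, 'uint8=>char').';    % all bytes as one row of chars
    fclose(fid);

    lines = regexp(raw, '\r?\n', 'split');     % handles LF and CR+LF endings
    if ~isempty(lines) && isempty(lines{end})
        lines(end) = [];                       % drop the empty piece after the last newline
    end
end
```

The per-line field check can then loop over the cell array `lines` without touching the network again.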
The fact that it works with the same file if you re-launch the import after a failure seems to indicate that it's not a problem of file content (e.g. partly binary data that confuses FGETL), and my first reaction would be to check that there is no process that updates these files in the background while you are trying to read them (you'd have to manage this with a lock/semaphore mechanism).

Todd on 26 Mar 2013
No other processes or users are accessing the files while I'm reading them.
I do have a .dll that is used, but not until after the file is read in. That's not to say something bad isn't happening with it.
The 5% is over all the files. It's more like 1 line will have the problem out of 1-2 million lines read.
When it happens, I've been able to go back the required number of bytes and re-read the line with fgetl and the line is read correctly. It happens on different lines and at different points within the line.
  1 Comment
Cedric on 26 Mar 2013
Edited: Cedric on 26 Mar 2013
Looks like some network issue; did you try to copy the files to your hard drive and read them from there? If you have one file that is fine and another that is not, you can repeat their processing e.g. 1000 times and see whether the outcome is always fine on the good file and always wrong on the bad one. If that is the case, then either something is updating these files in the background (which is not the case according to what you said), or there is something wrong with your network. If you find no bad file after copying them to your hard drive (which should not happen if you have a network issue), or significantly fewer of them than when you slowly read them line by line, I would still think that some process is accessing the remote files while you are reading them.
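A small harness for that repetition test might look like the sketch below. It assumes the existing import routine throws an error when the field check fails and can be passed as a function handle; `countFailures` and the handle name in the usage note are hypothetical.

```matlab
function nFail = countFailures(readFcn, filename, nRuns)
% Call readFcn(filename) nRuns times and count how many calls throw an error.
% readFcn is a handle to the (hypothetical) import routine under test.
    nFail = 0;
    for k = 1:nRuns
        try
            readFcn(filename);
        catch err
            nFail = nFail + 1;
            fprintf('Run %d failed: %s\n', k, err.message);
        end
    end
end
```

For example, `countFailures(@myImport, 'good.txt', 1000)` and the same call on the suspect file can be compared, once against the network path and once against a local copy.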



Todd on 27 Mar 2013
Made the change to copy the text files from the LAN drive to a local drive before reading them. Read over 3M lines overnight without any errors. Typically, we'd hit the error within 1M lines. So, that looks promising.
We are creating a simple test that we can run on two different LANs to see if it's LAN specific, or something related to MATLAB reading the file across a LAN. Hopefully, we'll have some results in a couple of days.
  1 Comment
Cedric on 27 Mar 2013
Well, good luck; these are definitely complicated and annoying issues!



Todd on 29 Mar 2013
We were able to run 10M lines on two different LANs without any issues. So, it's something on the LAN we've been working on. Moved the data files from the server to another PC and read the files from there. Still had the problem. So, it's not the server. Then, we disabled the Windows firewall on the PC that had the data files. We haven't been able to repeat the problem. So, it looks like it has something to do with the Windows firewall. Don't know if it's an underlying timing/latency issue with fgetl (really fgets) and the Windows firewall or what. The other LANs we ran on weren't using the Windows firewall, but McAfee HIPS.
The next test will be to disable the Windows firewall on the server. But, finding time to take the server down is always a challenge.
