Why is the fopen function needed? (I want to know the way the programming language works.)

6 views (last 30 days)
I am decently experienced with Matlab (I am not a novice), but the way I learned it was pretty haphazard and mostly self-taught, rather than following some kind of logical education path. I also took a C++ class, but I don't remember this point being explained. My question is the following.
I don't understand why the fopen function needs to be used first if I want to use certain file reading functions (and then followed by the fclose command after finishing extracting the info). Other commands in Matlab like xlsread extract text and numbers from excel files directly without needing to bother with fopen first. I was watching a Matlab tutorial, and the guy said that "All functions that can read part of a file must be preceded by the fopen function to return the file identifier." And then the file identifier is used in whatever function to actually extract the data. But, like I said, the xlsread function doesn't need this; you just specify the file path. And it certainly does read a part of the file. Why can't all the other various Matlab functions that extract text/numbers from files also do it directly like that without using fopen first? Or can they? Is this just an example of one of multiple ways of accomplishing the same thing? Are there any situations where there is no way to get around using fopen?

Answers (1)

Walter Roberson
Walter Roberson on 17 May 2012
xlsread() and similar routines deal with a complete file at a time. The internal code for those routines does the equivalent of fopen() before dealing with the file, and fclose() afterwards.
In the simpler case, fopen() is an optimization: it keeps the operating system from having to repeatedly go right through the path parts and checking permissions and locating the file headers on disk and interpreting them. fopen() causes the operating system to create a block of data in the OS that remembers information about the file so that the program can access it quickly.
In more complex situations, the permissions might be changed while the file is in use. If the operating system had to start from scratch for each I/O operation, it might find that there is no longer permission to do something that had already been started. Instead, when fopen() is used, the permitted access types are recorded at the time of fopen(), and the routine is allowed to continue to use the original permissions until the file is fclose()'d. This can turn out to be important in security.
As well as the permissions perhaps being changed, the directories or file could get renamed, possibly even with a different file (with a different owner even!) being moved into place. It would be a PITA for the program to have to check those things for every read or write operation, to make sure that the file that is there now is the same file as was there one instruction before. Instead, when fopen() is used, the operating system data block that is created has information about the file no matter what its current name or location is, so that it is no longer necessary to check each time. This can be quite important for security.
When fopen is used, the operating system keeps track of where in the file the access is occurring. If the file access had to be started from scratch each time, then the program would have to remember the position information and pass it in to every read and write operation. Getting it wrong would lead to file corruption. It would also be necessary for every write operation to be completed before the next instruction could happen (because you are postulating that no fclose() would b provided to tidy the file up). This turns out to be a substantial efficiency issue: when the operating system gets to remember a bunch of output and write it out when convenient (or when the file is closed), the "buffering" strategy decreases the disc access by an amazing amount. There is a similar efficiency gain when the operating system is permitted to read-ahead in the file, to "buffer" the input. That efficiency gain would not be possible if every read or write operation had to be independent, because it would have to be assumed that some other program might have written into the next portion of the file. The fopen() / fclose() design allows operating system rules to be established that allow reasonable efficiency and reduce the problem of multiple programs accessing the same file.
  2 Comments
Alex
Alex on 18 May 2012
Awesome, thank you. My C++ class had very little of this kind of background. Where/how do you learn these details?
Walter Roberson
Walter Roberson on 18 May 2012
"One has to know these things if one is to be King."
I have been programming for 35 years. I've picked up a few things here and there :-)

Sign in to comment.

Tags

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!