No BSD License  

Rating: 5.0 (3 ratings) | Downloads (last 30 days): 4 | File Size: 7.86 KB | File ID: #6090

Robust Data File Reading utility (RDFREAD)

by Michael Boldin

 

20 Oct 2004 (Updated 01 Nov 2004)

Imports data from a comma- or tab-delimited file.

File Information
Description

RDFREAD handles comma-delimited and tab-delimited files. It expects, but does not require, the variable (column) names to be in the first line. Data may have missing values and character values mixed into the numeric data columns. All non-numeric entries in data lines are changed to NaN. This function does not properly read character variable columns; it instead turns such data into an entire column of missing values.

If the first line has no numerics, it is assumed to be a header line with variable names. In this case a structure is created with the fields name.varnames and name.data. If there is no header line, a simple data matrix is created.
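The header rule described above can be illustrated with a short sketch. This is not the RDFREAD source; it is a minimal approximation that assumes fields are split on commas or tabs and tested with str2double. The variable names (firstLine, result, hasHeader) are hypothetical.

```matlab
% Hypothetical sketch (not the RDFREAD source): decide whether the first
% line of a delimited file is a header by checking for numeric fields.
firstLine = 'SECID,DATE,PRC,X1';                % first line of the sample file

fields = regexp(firstLine, '[,\t]', 'split');   % split on commas or tabs
values = str2double(fields);                    % non-numeric fields become NaN

if all(isnan(values))
    % No numerics found: treat the line as variable names.
    result.varnames = strtrim(fields);
    result.data = [];                           % data rows would be filled in later
    hasHeader = true;
else
    % At least one numeric field: treat the line as the first data row.
    result = values;
    hasHeader = false;
end
```

With this rule, a first line such as '999,2,3,6' would be kept as data and the result would be a plain matrix rather than a structure.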

If there is no output argument (nargout=0), the results are placed in the base workspace in a variable named after the file.
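As a rough illustration of the no-output-argument behavior, the sketch below derives a variable name from a file name and assigns a result into the base workspace with assignin. The file name prices.csv and the results matrix are made up for the example, and RDFREAD itself may derive the name differently.

```matlab
% Hypothetical sketch: place results in the base workspace under a
% variable named after the file (file name and data are made up).
fileName = 'prices.csv';
[~, varName] = fileparts(fileName);             % strip folder and extension
varName = matlab.lang.makeValidName(varName);   % ensure a legal variable name

results = [1 2; 3 4];                           % stand-in for the parsed data
assignin('base', varName, results);             % creates "prices" in the base workspace
```

Inside a function, this step would typically be guarded by a check such as nargout == 0.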

       examples
         RDFREAD -- no input args, uses GUI file selection (uigetfiles)
         RDFREAD(1) -- command line version
         RDFREAD('file_name','C:\data_path\') -- works (path is optional)
         xdat= RDFREAD('file_name') -- results placed in xdat

        Below is an example of a messy data file that this utility function can read.

          SECID,DATE,PRC,X1
          10001,19970131,8.625,1
          10002,1997215,13,
          10003,,-15,4
          1099a,o.0,.T,NaN
          xd,,.,1.111
          999,2,3,6

       In this case, the RDFREAD 'data' results are

        10001 19970131 8.625 1
        10002 1997215 13 NaN
        10003 NaN -15 4
          NaN NaN NaN NaN
          NaN NaN NaN 1.111
          999 2 3 6

       and filename.varnames=

       'SECID' 'DATE' 'PRC' 'X1'
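The conversion shown in this example can be approximated in a few lines. The sketch below is not the author's implementation; it assumes comma splitting with regexp and numeric conversion with str2double, which returns NaN for empty or non-numeric fields. The variable names rawLines, nCols, and data are hypothetical.

```matlab
% Hypothetical sketch (not the RDFREAD source): convert the data lines of
% the sample file to a numeric matrix, with NaN for non-numeric fields.
rawLines = { ...
    '10001,19970131,8.625,1', ...
    '10002,1997215,13,', ...
    '10003,,-15,4', ...
    '1099a,o.0,.T,NaN', ...
    'xd,,.,1.111', ...
    '999,2,3,6'};

nCols = 4;                                   % number of columns in the header
data = NaN(numel(rawLines), nCols);          % missing fields stay NaN

for k = 1:numel(rawLines)
    fields = regexp(rawLines{k}, ',', 'split');   % keeps empty fields
    n = min(numel(fields), nCols);                % ignore any extra fields
    data(k, 1:n) = str2double(fields(1:n));       % non-numerics become NaN
end
```

Splitting with regexp keeps the empty fields between consecutive commas, so a missing value stays in its own column instead of shifting the rest of the row.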

       Note that this function is considerably slower than i/o routines
       such as LOAD and DLMREAD because it reads and parses one line
       at a time, but it works in a much more robust manner as long as you
       do not need to read in 'character' columns.

10/25/2004 -- added block pre-allocation step to speed processing of large files
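The block pre-allocation idea mentioned in this note can be sketched as follows. This is only an illustration of the general technique, not the code in RDFREAD; the block size of 1000 rows and the generated sample lines are assumptions made for the example.

```matlab
% Hypothetical sketch of block pre-allocation while parsing line by line.
blockSize = 1000;                        % assumed block size, for illustration
nCols = 4;
data = NaN(blockSize, nCols);            % pre-allocate the first block
nRows = 0;

% Stand-in for lines that would normally be read one at a time with fgetl.
sampleLines = repmat({'999,2,3,6'}, 1, 2500);

for k = 1:numel(sampleLines)
    if nRows == size(data, 1)
        data = [data; NaN(blockSize, nCols)];   % grow by a whole block
    end
    nRows = nRows + 1;
    fields = regexp(sampleLines{k}, ',', 'split');
    data(nRows, :) = str2double(fields);
end

data = data(1:nRows, :);                 % trim the unused pre-allocated rows
```

Growing the matrix one block at a time avoids reallocating it on every line, which is where line-by-line readers tend to lose time on large files.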

MATLAB release: MATLAB 7.0.1 (R14SP1)
Comments and Ratings (4)
01 Mar 2005 Francis Pieraut

It works pretty well, except that string values are replaced by NaN, since it returns an array of doubles.

20 May 2005 Kat F

It makes Matlab newbie's lives (like myself) a lot easier.

03 Dec 2007 Tom Stafford

Worked great for me! Neat

25 Jun 2008 Andreas Dein

Awesome! Worked very well! And very robust, too!

Updates
26 Oct 2004

10/25/2004 -- added block pre-allocation step to speed processing of large files

01 Nov 2004

Added block pre-allocation step to speed processing of large files. M-file updated.

Tag Activity for this File
Tags (all applied by Michael Boldin, 22 Oct 2008 07:33:29): data import, data export, import, data, comma, tab, reading, utility, robust
