Does IMPORTDATA or any other File I/O function allow me to read in an indeterminate size data into a single cell array?

I have a comma-delimited file of indeterminate size, where some of the entries are numbers and some are text, and I don't know in advance which are which.
How can I pull that into a single cell array that combines text and numbers?

Accepted Answer

MathWorks Support Team on 11 Nov 2015
The ability to import a file of indeterminate size and mixed data types into a single cell array is not available through IMPORTDATA in MATLAB 7.8 (R2009a).
There are a few workarounds, depending on the structure of the file.
1) If the number of columns is known, the TEXTSCAN function can read each individual element as a string. For example:
fid = fopen('data.txt');
A = textscan(fid,'%s%s%s%s%s%s','Delimiter',',');  % assumes the file has six columns
fclose(fid);
Regardless of the data in the file, this reads every field as a string. A is a 1-by-6 cell array with one cell per column, and each cell holds a column cell array of strings. The strings can then be converted to numeric values after being imported into MATLAB, for example with STR2DOUBLE. For more information about TEXTSCAN, please refer to the documentation at:
<http://www.mathworks.com/access/helpdesk/help/techdoc/ref/textscan.html>
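The conversion step mentioned above could look roughly like the following. This is only a sketch: the variable A here is a small hand-built stand-in for the TEXTSCAN result, and the names raw, nums, isNum and mixed are made up for illustration. It assumes every column has the same number of rows (a ragged last line would make the concatenation fail), and text that literally reads 'NaN' is left as a string.

```matlab
% Hypothetical stand-in for the textscan result: a 1-by-N cell array,
% holding one column cell array of strings per field.
A = {{'1.5'; 'abc'}, {'x'; '42'}};

% Concatenate the columns into one rows-by-columns cell array of strings.
raw = [A{:}];

% Attempt a numeric conversion of every entry; str2double gives NaN on failure.
nums = str2double(raw);

% Start from the strings and overwrite the entries that converted cleanly.
mixed = raw;
isNum = ~isnan(nums);
mixed(isNum) = num2cell(nums(isNum));
```

After this, mixed is a single cell array in which numeric-looking fields are doubles and everything else is still text.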
2) If the number of columns is not known, or a column contains mixed data types, one workaround is to use FGETL to read the file one line at a time and parse each line with additional MATLAB code. This option gives you detailed control, but it requires more programming and may execute more slowly, depending on how much parsing is done. For more information about FGETL, please refer to the documentation at:
<http://www.mathworks.com/access/helpdesk/help/techdoc/ref/fgetl.html>
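A line-by-line loop of the kind described in option 2 might be sketched as follows. The sample file contents and the names fname, rows, fields, nums and ok are assumptions made for illustration, and splitting on every comma means quoted fields containing commas are not handled.

```matlab
% Write a small sample file so the sketch is self-contained (hypothetical data).
fname = [tempname '.txt'];
fid = fopen(fname, 'w');
fprintf(fid, '1,abc,3.5\nx,,7,extra\n');
fclose(fid);

fid = fopen(fname, 'r');
rows = {};                                  % one nested cell array per line
tline = fgetl(fid);
while ischar(tline)                         % fgetl returns -1 at end of file
    fields = regexp(tline, ',', 'split');   % split the line on commas
    nums = str2double(fields);              % NaN where conversion fails
    ok = ~isnan(nums);
    fields(ok) = num2cell(nums(ok));        % store the numbers as doubles
    rows{end+1, 1} = fields;                %#ok<AGROW>
    tline = fgetl(fid);
end
fclose(fid);
delete(fname);
```

Each element of rows is a cell row vector for one line, so lines with different numbers of fields are not a problem.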

More Answers (1)

Walter Roberson on 12 Nov 2015
To read a CSV file with a variable format and convert to numeric whatever can be converted:
filecontents = fileread('NameOfYourFileGoesHere.txt');
filelines = regexp(filecontents, '\r?\n', 'split');
splitfile = regexp(filelines(:), ',', 'split');
output = arrayfun(@(K) cellfun(@numeric_if_can, splitfile{K}, 'Uniform', 0), (1:length(splitfile)).', 'Uniform', 0);
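% Save as numeric_if_can.m, or place at the end of the script in R2016b or later.
function r = numeric_if_can(S)
  % We expect empty or a string (char row vector).
  if isempty(S)
    r = [];
  elseif ~ischar(S) || ndims(S) ~= 2 || size(S,1) ~= 1
    r = S;                    % not a char row vector: return unchanged
  else
    r = str2double(S);        % NaN when S is not a valid number
    if ~isnumeric(r) || isnan(r)
      r = S;                  % conversion failed: keep the original text
    end
  end
end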
The "output" of this will be a cell column array, each entry of which will be a cell array that represents one line. This second-level cell array will have one entry for each item on the line. An entry will have been converted to numeric form if the source text could be interpreted as numeric; otherwise it will be left as a string. Places that had adjacent commas (that is, no data was present) or a comma at the beginning or end of a line will show up as [] (the empty numeric array), but places that had only whitespace such as blanks will be left as strings (emptiness successfully converts to numeric emptiness, but whitespace fails numeric conversion).
Note: in this version, a numeric value immediately followed by 'i' or 'j' is interpreted as the imaginary part of a complex number. For example, '-1-3i' will become complex(-1,-3).
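If a single rectangular cell array is wanted rather than the nested one, the per-line cells can be copied into a padded array. The following is only a sketch of one way to do that: the variable output is a hand-built example of the nested result described above, and the names nrows, ncols and flat are hypothetical.

```matlab
% Hypothetical nested result: one cell row vector per line of the file.
output = {{1, 'abc', 3.5}; {'x', [], 7, 'extra'}};

nrows = numel(output);
ncols = max(cellfun(@numel, output));   % the widest line sets the column count

% cell(nrows, ncols) is pre-filled with [], so short lines are padded already.
flat = cell(nrows, ncols);
for k = 1:nrows
    flat(k, 1:numel(output{k})) = output{k};
end
```

Padding with [] means a missing trailing field cannot be told apart from a field that was empty in the file.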


Release

R2009a
