Convert CSV / Array Cell to Dataset

3 views (last 30 days)
Hi,
The goal is to put the attached into dataset: a) Directly from the file and b) From the cell array
For a) issue is there are commas inside one of the fields.
For b) I tried the below but this creates dataset of cell arrays instead of values directly.
I tried a step:
var{i}{j,1} = data2(j+1,i)
With:
data = dataset(var{1:end}, 'VarNames', varnames);
Full code:
function data = importfile1(workbookFile, sheetName, range)
% If no sheet is specified, read first sheet
if nargin == 1 || isempty(sheetName)
sheetName = 1;
end
% If no range is specified, read all data
if nargin <= 2 || isempty(range)
range = '';
end
%%Import the data
[~, ~, data2] = xlsread(workbookFile, sheetName, range);
data2(cellfun(@(x) ~isempty(x) && isnumeric(x) && isnan(x),data2)) = {''};
% get headers
strings = char(data2{1,:})
varnames = {}
for i=1: length(strings(:,1))
varnames{i} = strtrim(strings(i,:))
end
var = {}
for i=1: length(data2(1,:))
for j=1: length(data2(2:end,2))
var{i}{j,1} = data2(j+1,i) %  (!!!!!!) how to process this
end
end
% export into dataset
data = dataset(var{1:end}, 'VarNames', varnames);
end

Accepted Answer

Stephen23
Stephen23 on 16 Nov 2015
Edited: Stephen23 on 16 Nov 2015
Solution
Reading the file is easy using textscan (because .csv files are not Excel files, they are text files).
opt = {'HeaderLines',1, 'Delimiter',',', 'CollectOutput',true};
fid = fopen('data.csv','rt');
C = textscan(fid,'%f%s%s%f%f%f%f%f%f%q', opt{:});
fclose(fid);
Or if you really need those header strings too:
opt = {'Delimiter',',', 'CollectOutput',true};
fid = fopen('data.csv','rt');
H = textscan(fid,'%s%s%s%s%s%s%s%s%s%s', 1, opt{:});
C = textscan(fid,'%f%s%s%f%f%f%f%f%f%q', opt{:});
fclose(fid);
Output
Here are the first five rows of the data that it imports (in numeric arrays and cell arrays of strings):
>> C{1}(1:5,:)
ans =
1
2
3
4
5
>> C{2}(1:5,:)
ans =
'c' 'a'
'a' 'a'
'a' 'a'
'c' 'c'
'a' 'a'
>> C{3}(1:5,:)
ans =
1270 9 2008 0 NaN NaN
570 11 2007 1 13 2010
4130 12 2006 1 14 2011
620 9 2009 0 NaN NaN
29910 4 2015 0 NaN NaN
>> C{4}(1:5,:)
ans =
''
'Jan,Apr,Jul,Oct'
'Jan,Apr,Jul,Oct'
''
''
  3 Comments
rafal jankowski
rafal jankowski on 16 Nov 2015
In fact - it is simple enough to update the file first to replace ','s with ';' for instance and then load up into dataset. What about converting the cell array you created into a dataset though?

Sign in to comment.

More Answers (0)

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!