How can I load target data on Student Alcohol Consumption?
Answers (2)
Star Strider
on 25 Apr 2016
I didn’t, because I don’t need to. You will have to extract it yourself, then put the data in the appropriate directory so you can access it with your code.
Walter Roberson
on 25 Apr 2016
The code below requires R2013b or later, because it uses categorical arrays to transform the various strings into numeric codes. It was, however, written to avoid needing the "%q" format of textscan, which is newer than R2013b.
The data is in an odd mix of formats: double-quoted strings and plain integers, and then right near the end two columns of double-quoted integers followed by one plain integer. Typing all 33 format items by hand is too error-prone, so I wrote code to build the format specifier.
%number of columns of each format
N = [2, 1, 3, 2, 4, 3, 8, 7, 2, 1];
%the various types of format
S = '%[^;\n]'; Q = '"%[^"]"'; D = '%f'; QD = '"%f"';
%repeat a format a given number of times
R = @(F,cnt) repmat({F},1,cnt);
%create format for the data
fmt = strjoin([R(Q,N(1)), R(D,N(2)), R(Q,N(3)), R(D,N(4)), R(Q,N(5)), R(D,N(6)), R(Q,N(7)), R(D,N(8)), R(QD,N(9)), R(D,N(10))],';');
%create format for the header
hfmt = strjoin(R(S,33),';');
%read the data into cell array
fid = fopen('student-mat.csv', 'rt');
header = textscan(fid, hfmt, 1);
datacell = textscan(fid, fmt);
fclose(fid);
%convert a series of columns of strings each into numeric codes, by converting to
%categorical array and then extracting the indices; smash the whole together into array
c = @(col) cell2mat(arrayfun(@(IDX) double(categorical(datacell{IDX})), col, 'UniformOutput', false));
%convert a series of numeric columns into numeric array
d = @(col) cell2mat(datacell(col));
%convert group number into range of columns
cN = [0, cumsum(N)];
C = @(P) cN(P)+1:cN(P+1);
%build the numeric array
datan = [c(C(1)), d(C(2)), c(C(3)), d(C(4)), c(C(5)), d(C(6)), c(C(7)), d(C(8)), d(C(9)), d(C(10))];
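To see what the categorical conversion above does to a single string column, here is a minimal sketch on a toy column. The variable names and the sample values are made up for illustration and are not read from the file.

```matlab
% Hypothetical toy column of strings (not taken from student-mat.csv)
toycol = {'GP'; 'MS'; 'GP'; 'GP'};
% categorical() sorts the distinct strings; double() returns each row's
% index into that sorted category list
codes = double(categorical(toycol));
% the category names, in the order the codes refer to
names = categories(categorical(toycol));
```

Here 'GP' maps to code 1 and 'MS' to code 2, so the codes are only labels. Their numeric order carries no meaning unless the original categories were ordered.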
Now that you have the numeric array, you will probably want to extract portions of it to train on, and portions to use as the target.
You will probably need to transpose the array before passing it to the toolbox routines, since datan has one student per row. Also, if you are doing classification or pattern recognition, you will probably want to use ind2vec() on the target codes.
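The split, transpose, and ind2vec() steps might look something like the sketch below. It runs on a small stand-in matrix so it can be tried by itself; with the real data, drop the first assignment and use the datan built above. Using the last column as the target is only an assumption for illustration, so substitute whichever column you want to predict. ind2vec() needs the neural network toolbox and accepts only positive integer indices, so the target values are first remapped to 1..K with unique().

```matlab
% Hypothetical stand-in for datan: 3 samples (rows), 4 columns
datan = [1 17 3 10; 2 18 1 12; 1 16 2 10];
% Assumption: the last column is the target; change to suit
targetCol = size(datan, 2);
inputCols = setdiff(1:size(datan, 2), targetCol);
% transpose so that each column is one sample
inputs = datan(:, inputCols).';
targetCodes = datan(:, targetCol).';
% remap the target values to consecutive class indices 1..K,
% because ind2vec() rejects zeros and other non-index values
[classValues, ~, classIdx] = unique(targetCodes);
classIdx = classIdx(:).';   % force a row vector
% one-of-K target matrix: K rows, one column per sample
targets = full(ind2vec(classIdx));
```

classValues keeps the original target values, so classValues(k) recovers what class k stood for.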