Please help me load data into MATLAB

Hi all,
I am having trouble loading the file Adult.data into MATLAB. When I try:
>> load adult.data
it displays:
??? Error using ==> load Unknown text on line number 1 of ASCII file C:\Users\Documents\MATLAB\adult.data "Self-emp-not-inc".
A line in the file:
50, Self-emp-not-inc, 83311, Bachelors, 13, Married-civ-spouse, Exec-managerial, Husband, White, Male, 0, 0, 13, United-States, <=50K
I don't know why this happens. I have tried fopen and scan, but I still cannot load the file. Please help me. Thank you so much.

Accepted Answer

Matt Tearle on 22 Feb 2011
Similar to what Andrew said, but I'd go with
fid = fopen('Adult.data');
A = textscan(fid,'%f%s%f%s%f%s%s%s%s%s%f%f%f%s%s','delimiter',',');
fclose(fid);
It looks uglier, but it will import the 15 columns as separate cells, and the numeric values will be imported as numeric arrays.
That said, if you're doing any kind of statistical analysis on this kind of data, you probably want (and/or may already have) Statistics Toolbox. In that case, use dataset to import the data directly into a dataset array. This will make your life easier. In particular, you can use nominal to turn things like "bachelors", "white", and "male" into nominal arrays.
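As a rough sketch of how the cell array returned by textscan might be unpacked, the snippet below writes two of the sample lines from the question to a temporary file, reads them back with the format string above, and pulls a few columns into named variables. The variable names (age, sex, country, assets) and the column positions assigned to them are assumptions made for illustration, based on the order of the fields in the sample lines.

```matlab
% Write two sample lines (copied from the question) to a temporary file.
fname = [tempname '.data'];
fid = fopen(fname,'w');
fprintf(fid,'%s\n','39, State_gov,77516, Bachelors, 13, Never-married, Adm-clerical, Not-in-family, White, Male, 2174, 0, 40, United-States, <=50K');
fprintf(fid,'%s\n','50, Self-emp-not-inc, 83311, Bachelors, 13, Married-civ-spouse, Exec-managerial, Husband, White, Male, 0, 0, 13, United-States, <=50K');
fclose(fid);

% Read the 15 comma-separated columns: %f for numeric, %s for text.
fid = fopen(fname);
A = textscan(fid,'%f%s%f%s%f%s%s%s%s%s%f%f%f%s%s','delimiter',',');
fclose(fid);
delete(fname);

% Assumed column positions, judged from the sample lines.
age     = A{1};   % numeric column vector
sex     = A{10};  % cell array of strings
country = A{14};  % cell array of strings
assets  = A{15};  % cell array of strings, e.g. '<=50K'
```

Each cell of A holds one whole column, so the numeric columns can be used directly in comparisons and the text columns can be passed to functions such as strcmpi.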
  1 Comment
Andrew Newell on 22 Feb 2011
Your approach gets my vote - present pain for future gain!


More Answers (6)

Andrew Newell on 22 Feb 2011
You can use textscan to read in the data as a comma-delimited set of strings:
fid = fopen('Adult.data');
A = textscan(fid,'%s','delimiter',',');
fclose(fid);
This reads the data into a cell containing a one-dimensional cell array. The next command will change it to a format that is probably more useful to you (there are 15 fields on each line):
A = reshape(A{:},15,[]);
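A small sketch of what the reshaped result looks like, using two of the sample lines from this thread passed to textscan as text rather than read from a file. After the reshape, each column of A is one record and each row is one field, and every entry is still a string, so numeric fields need a conversion such as str2double. The variable names txt, age, and sex are assumptions made for illustration.

```matlab
% Two sample records joined by a newline (no trailing newline).
txt = sprintf('%s\n%s', ...
    '58, Private, 215646, HS-grad, 9, Divorced, Handlers-cleaners, Not-in-family, White, Male, 0, 0, 40, United-States, <=50K', ...
    '53, Private, 234721, 11th, 7, Married-civ-spouse, Handlers-cleaners, Husband, Black, Male, 0, 0, 40, United-States, <=50K');

% Read every field as a string, then arrange as 15 fields per record.
A = textscan(txt,'%s','delimiter',',');
A = reshape(A{:},15,[]);   % 15-by-N cell array, one column per record

% Row 1 holds the ages as strings; convert them to numbers.
age = str2double(A(1,:));
% Row 10 holds the sex field (assumed position, judged from the sample lines).
sex = A(10,:);
```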

Matt Tearle on 23 Feb 2011
BTW, best practice is to accept the answer that solved your initial question, then start a new question for the follow-up. That way, others can follow what's happening (for the benefit of others who might have similar questions). Anyway...
It depends: are you using Statistics Toolbox? If not, the base MATLAB way is
nnz((age > 50) & strcmpi(sex,'male'))
I'm assuming age is a numeric array, and sex is a cell array of strings. The > and strcmpi (case-insensitive string comparison) both create logical variables, which are combined using &. Applying the nnz function returns the number of true values.
If you have Stats TB and have the data in a dataset array, with sex as a nominal variable,
nnz((data.age > 50) & (data.sex == 'male'))
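To make the base MATLAB version concrete, here is a minimal sketch using the age and sex values from the four sample rows posted in the follow-up question. The intermediate variable names (isOver50, isMale, count) are made up for illustration.

```matlab
% Age and sex values from the four sample rows in the question.
age = [39; 50; 58; 53];
sex = {'Male'; 'Female'; 'Male'; 'Male'};

isOver50 = age > 50;              % logical: [0; 0; 1; 1]
isMale   = strcmpi(sex,'male');   % logical: [1; 0; 1; 1] (case-insensitive)

% Combine the two conditions and count the true entries.
count = nnz(isOver50 & isMale);   % 2
```

Note that the 50-year-old is excluded because the comparison is strictly greater than 50.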

Matt Tearle on 23 Feb 2011
Would you be surprised if I suggested that what you really need is Statistics Toolbox?
But, the brute-force way in MATLAB could be done something like this:
clist = unique(country);
Nctry = length(clist);
num_richbuggers = zeros(Nctry,1);
for k = 1:Nctry
    num_richbuggers(k) = nnz(strcmpi(assets,'>50K') & ...
        strcmpi(country,clist{k}));
end
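As a possible alternative to the loop, the same per-country count could be sketched with the index output of unique and accumarray. This is a swapped-in technique, not what the answer above uses, and the country and assets values below are made-up sample data for illustration only.

```matlab
% Hypothetical sample data (not taken from the real file).
country = {'United-States'; 'Cuba'; 'United-States'; 'India'; 'Cuba'};
assets  = {'>50K'; '<=50K'; '>50K'; '<=50K'; '>50K'};

% clist is the sorted list of distinct countries;
% idx(i) is the position of country{i} within clist.
[clist, ~, idx] = unique(country);

% 1 where the person has more than 50K, 0 otherwise.
isRich = double(strcmpi(assets,'>50K'));

% Sum the flags within each country group.
counts = accumarray(idx(:), isRich(:));

% Display one line per country.
for k = 1:numel(clist)
    fprintf('%s: %d\n', clist{k}, counts(k));
end
```

For this sample, clist is sorted as Cuba, India, United-States and counts is [1; 0; 2]. The loop version is arguably easier to read for a beginner; the accumarray version avoids rescanning the whole country list once per country.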

love on 23 Feb 2011
Thank you so much for your quick response. Now I need to count the number of males who are over 50. How can I do that in MATLAB?
For example:
39, State_gov,77516, Bachelors, 13, Never-married, Adm-clerical, Not-in-family, White, Male, 2174, 0, 40, United-States, <=50K
50, Self-emp-not-inc, 83311, Bachelors, 13, Married-civ-spouse, Exec-managerial, Husband, White, Female, 0, 0, 13, United-States, <=50K
58, Private, 215646, HS-grad, 9, Divorced, Handlers-cleaners, Not-in-family, White, Male, 0, 0, 40, United-States, <=50K
53, Private, 234721, 11th, 7, Married-civ-spouse, Handlers-cleaners, Husband, Black, Male, 0, 0, 40, United-States, <=50K
There are two males who are over 50, so the result is 2. Please help me. I am a beginner in MATLAB and don't know how to do this. Thank you.

love on 23 Feb 2011
Hi Matt, thanks for your answer. It works perfectly.
What about grouping and counting in MATLAB? For example, I need to group the records by country and count the people who have more than 50K in each country. Thank you so much.

love on 23 Feb 2011
Fantastic, it works, Matt. MATLAB is great, you are great. Thank you very much.
