How to impute missing values using mean for a table

10 views (last 30 days)
>> a=xlsread('pill.xls')
a =
Columns 1 through 7
3.0000 58.0000 41.0000 6.7000 14.0000 17.0000 7.0000
3.0000 58.0000 41.0000 6.7000 15.0000 17.0000 7.0000
3.0000 61.0000 43.0000 6.7100 15.0000 19.0000 8.0000
NaN 61.0000 43.0000 6.7100 14.0000 19.0000 8.0000
2.0000 62.0000 43.0000 6.7100 15.0000 18.0000 8.0000
2.0000 50.0000 39.0000 6.6700 14.0000 16.0000 8.0000
4.0000 63.0000 44.0000 6.6300 15.0000 18.0000 7.0000
8.0000 63.0000 44.0000 6.7100 15.0000 18.0000 7.0000
8.0000 65.0000 46.0000 6.6200 16.0000 20.0000 8.0000
8.0000 70.0000 49.0000 6.6500 18.0000 22.0000 9.0000
26.0000 74.0000 52.0000 6.6500 20.0000 21.0000 8.0000
Columns 8 through 14
2.0000 NaN NaN 0.2000 0.0100 2.0000 8.0000
2.0000 NaN NaN NaN NaN 2.0000 8.0000
2.0000 NaN NaN 0.2000 0.0100 2.0000 7.0000
2.0000 NaN NaN NaN NaN 2.0000 8.0000
2.0000 NaN NaN 0.2000 0.0100 2.0000 8.0000
2.0000 NaN NaN 0.2000 0.0100 2.0000 7.0000
2.0000 NaN NaN 0.2000 0.0100 2.0000 8.0000
2.0000 0.2000 NaN 0.2000 0.0100 2.0000 8.0000
2.0000 0.2000 NaN 0.2000 0.0100 2.0000 8.0000
2.0000 0.2000 NaN 0.0200 0.0100 2.0000 8.0000
2.0000 1.0000 NaN 0.2000 0.0100 2.0000 8.0000
Columns 15 through 18
0.3000 2.0000 0.2000 2.4600
0.3000 3.0000 NaN 0.4000
0.3000 2.0000 0.2000 2.4600
0.3000 2.0000 NaN 2.4400
0.3000 2.0000 NaN 0.4400
0.3000 2.0000 0.2000 0.4400
0.3000 2.0000 0.2000 2.5200
0.3000 4.0000 0.2000 2.4600
0.3000 2.0000 0.2000 2.4800
0.3000 2.0000 0.2000 2.4800
0.3000 2.0000 0.2000 0.4800
>> notnan=~isnan(a)
notnan =
Columns 1 through 12
1 1 1 1 1 1 1 1 0 0 1 1
1 1 1 1 1 1 1 1 0 0 0 0
1 1 1 1 1 1 1 1 0 0 1 1
0 1 1 1 1 1 1 1 0 0 0 0
1 1 1 1 1 1 1 1 0 0 1 1
1 1 1 1 1 1 1 1 0 0 1 1
1 1 1 1 1 1 1 1 0 0 1 1
1 1 1 1 1 1 1 1 1 0 1 1
1 1 1 1 1 1 1 1 1 0 1 1
1 1 1 1 1 1 1 1 1 0 1 1
1 1 1 1 1 1 1 1 1 0 1 1
Columns 13 through 18
1 1 1 1 1 1
1 1 1 1 0 1
1 1 1 1 1 1
1 1 1 1 0 1
1 1 1 1 0 1
1 1 1 1 1 1
1 1 1 1 1 1
1 1 1 1 1 1
1 1 1 1 1 1
1 1 1 1 1 1
1 1 1 1 1 1
>> howmany=sum(notnan)
howmany =
Columns 1 through 12
10 11 11 11 11 11 11 11 4 0 9 9
Columns 13 through 18
11 11 11 11 8 11
>> data(~notnan)=0
data =
Columns 1 through 12
0 0 0 0 0 0 0 0 0 0 0 0
Columns 13 through 24
0 0 0 0 0 0 0 0 0 0 0 0
Columns 25 through 36
0 0 0 0 0 0 0 0 0 0 0 0
Columns 37 through 48
0 0 0 0 0 0 0 0 0 0 0 0
Columns 49 through 60
0 0 0 0 0 0 0 0 0 0 0 0
Columns 61 through 72
0 0 0 0 0 0 0 0 0 0 0 0
Columns 73 through 84
0 0 0 0 0 0 0 0 0 0 0 0
Columns 85 through 96
0 0 0 0 0 0 0 0 0 0 0 0
Columns 97 through 108
0 0 0 0 0 0 0 0 0 0 0 0
Columns 109 through 120
0 0 0 0 0 0 0 0 0 0 0 0
Columns 121 through 132
0 0 0 0 0 0 0 0 0 0 0 0
Columns 133 through 144
0 0 0 0 0 0 0 0 0 0 0 0
Columns 145 through 156
0 0 0 0 0 0 0 0 0 0 0 0
Columns 157 through 168
0 0 0 0 0 0 0 0 0 0 0 0
Columns 169 through 180
0 0 0 0 0 0 0 0 0 0 0 0
Column 181
0
>> columTot=sum(a)
columTot =
Columns 1 through 7
NaN 685.0000 485.0000 73.4600 171.0000 205.0000 85.0000
Columns 8 through 14
22.0000 NaN NaN NaN NaN 22.0000 86.0000
Columns 15 through 18
3.3000 25.0000 NaN 19.0600
>> colMean=columTot ./howmany
colMean =
Columns 1 through 7
NaN 62.2727 44.0909 6.6782 15.5455 18.6364 7.7273
Columns 8 through 14
2.0000 NaN NaN NaN NaN 2.0000 7.8182
Columns 15 through 18
0.3000 2.2727 NaN 1.7327

Answers (3)

Andrei Bobrov
Andrei Bobrov on 25 Jun 2013
Edited: Andrei Bobrov on 25 Jun 2013
t = ~isnan(a);
a1 = a;
a1(~t) = 0;
s = sum(t);
s(s == 0) = 1;
out = sum(a1)./s;
or as said Iain:
out = nanmean(a);
  2 Comments
Visa
Visa on 25 Jun 2013
Edited: Visa on 25 Jun 2013
thank you sir,both u and lain is working same.
Andrei Bobrov
Andrei Bobrov on 25 Jun 2013
Edited: Image Analyst on 21 Sep 2015
Please accept either Iain's answer or mine and give credit to me ... or vice versa ;)

Sign in to comment.


Iain
Iain on 25 Jun 2013
Use nansum, or nanmean, as required.
  1 Comment
Visa
Visa on 25 Jun 2013
Thank you mam. >> y=nanmean(b)
y =
Columns 1 through 7
6.7000 62.2727 44.0909 6.6782 15.5455 18.6364 7.7273
Columns 8 through 14
2.0000 0.4000 NaN 0.1800 0.0100 2.0000 7.8182
Columns 15 through 18
0.3000 2.2727 0.2000 1.7327
>> z=nansum(b)
z =
Columns 1 through 7
67.0000 685.0000 485.0000 73.4600 171.0000 205.0000 85.0000
Columns 8 through 14
22.0000 1.6000 0 1.6200 0.0900 22.0000 86.0000
Columns 15 through 18
3.3000 25.0000 1.6000 19.0600

Sign in to comment.


Hanan
Hanan on 21 Sep 2015
Edited: Hanan on 21 Sep 2015
m = nanmean(x); % the mean
mean_value_at_nan = bsxfun(@times, isnan(x), m); % mean value just at nans, otherwise zero
xfull = x; % copy
xfull(isnan(x))=0; % turn nan to zero
xfull = xfull + mean_value_at_nan; % add the mean

Categories

Find more on Performance and Memory in Help Center and File Exchange

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!