Truncated normal distribution not giving positive values

3 views (last 30 days)
Anant
Anant on 15 Sep 2012
Hi,
I have a code in which I am sampling from a normal distribution with a mean and standard deviation I give. There are two parameters following a normal distribution: ah and muv. In order to ensure that a negative value does not occur in any of the sampled values, three loops are introduced. The first loop finds out those row numbers of the randomly generated vector that have a negative value and assign those row numbers in a column vector called tag. The first loop is given below: l=0; % to the letter small "l" number zero is assigned for k1=1:dim % if ah(k1)<0 if ah(k1) < (0.0001) % k1 % mu(k1) l=l+1; tag(l)=k1; % assigns the row index of negative values to vector tag end end
The second loop again randomly samples from the normal distribution for those row indices that have a negative value. The second loop is given below: if l>0 for k2=1:l ah(tag(k2)) = random(name5,par_ah,std_ah,1,1); % VECTOR % will the above scaler necessarily be a poistive number ? % mu(tag(k2)) end end
The third loop checks if the numbers randomly sampled for the second time are positive or not. I do not get a problem for the parameter ah However for the parameter muv, I still end up getting negative values from the truncated normal distribution. This can also be seen from the graph generated.
The script file is copied and pasted below:
{ % All values are for DDT and p=0.04 clear all Nh=33898857; Nc=21571585; H=7933615; p=0.04; % p = percent increase in natural sand fly death rate t=90;% model is run with time after spray as 90 days % Type of Analysis analysis=1; % For Uncertainity analysis % analysis=2; % For Sensitivity analysis
if analysis==1 dim=1e5; % dim=30; reali=10; end
if analysis==2 dim=1e4; reali=2; end
model='Sand flies killed'; % examine the effect of Sand flies killed with different ...
% response from my model is the optimal values of x and y
x_rlz=zeros(dim,reali); % a zero matrix called "R0 realizations" of dim rows and reali columns
y_rlz=zeros(dim,reali);
% 6 input parameters for getting the above response
beta_all=zeros(dim,reali);
ah_all=zeros(dim,reali);
muv_all=zeros(dim,reali);
Ct0_all=zeros(dim,reali); % DDT
% Ct0_all=zeros(dim,reali); % Deltamethrin
b1_all=zeros(dim,reali);
b2_all=zeros(dim,reali);
beta=zeros(dim,1);
ah=zeros(dim,1);
muv=zeros(dim,1);
Ct0=zeros(dim,1);
b1=zeros(dim,1);
b2=zeros(dim,1);
beta_vec=zeros(reali,6);
ah_vec=zeros(reali,6);
muv_vec=zeros(reali,6);
Ct0_vec=zeros(reali,6);
b1_vec=zeros(reali,6);
b2_vec=zeros(reali,6);
x_vec=zeros(reali,6);
y_vec=zeros(reali,6);
tag=zeros(dim,1);
for j=1:reali % reali=10 because we are repeatedly sampling the I/P parameters using Monte carlo simluation 10 times for robustness
j % Iteration.............
% names of distributions to be used
% for each input parameter to the model
name1='exponential';
name2='beta';
name3='gamma';
name4='uniform';
name5='normal';
% I/P parameter, ah has Normal Distribution, mean is 0.1792 and variance is 0.00396 par_ah=0.1792; % var_mu=0.00396344; std_ah=0.028; ah=random(name5,par_ah,std_ah,dim,1); % VECTOR
ah_all(:,j)=ah;
% To have all non-negative values of ah we choose again values which
% are negative
% THIS BELOW CODE MAKES IT A TRUNCATED NORMAL distribution
l=0; % to the letter small "l" number zero is assigned
for k1=1:dim
% if ah(k1)<0
if ah(k1) < (0.0001)
% k1
% mu(k1)
l=l+1;
tag(l)=k1; % assigns the row index of negative values to vector tag
end
end
if l>0
for k2=1:l
ah(tag(k2)) = random(name5,par_ah,std_ah,1,1); % VECTOR
% will the above scaler necessarily be a poistive number ?
% mu(tag(k2))
end
end
% Below Loop introduced by Kaushik on 9_14_2012 to check if negative
% values do occur in the column vector ah, even after executing the
% above two loops
for k3=1:dim
if ah(k3)<0
fprintf(2,['In row number ',num2str(k3),' ah is negative ,',num2str(ah(k3)),'\n']);
end
end
ah_vec(j,1)=mean(ah);
ah_vec(j,2)=median(ah);
ah_vec(j,3)=std(ah);
ah_vec(j,4)=var(ah);
ah_vec(j,5)=min(ah);
ah_vec(j,6)=max(ah);
% % Steady state values of state ac=1-ah; % VECTOR Q=(Nh.*ah)./(ah.*Nh+ac*Nc); % VECTOR % a=h./m2; % VECTOR
% I/P parameter muv also has a normal distribution just as ah above
par_muv = 1/(0.47*30);
% var_mu=0.00396344;
std_muv = 1/(0.42*30);
muv=random(name5,par_muv,std_muv,dim,1); % VECTOR
muv_all(:,j)=muv; % to the j th column of array muv, the column vector muv is assigneds
% To have all non-negative values of muv we choose again values which
% are negative
l=0; % to the letter small "l" number zero is assigned
for k1=1:dim
if muv(k1) < (0.0001)
% k1
% mu(k1)
l=l+1;
tag(l)=k1;
end
end
if l>0
for k2=1:l
muv(tag(k2))=random(name5,par_muv,std_muv,1,1); % VECTOR
% will the above scaler necessarily be a poistive number ?
% mu(tag(k2))
end
end
for k3=1:dim
if muv(k3)<0
fprintf(2,['In row number ',num2str(k3),' muv is negative ,',num2str(muv(k3)),'\n']);
end
end
muv_vec(j,1)=mean(muv);
muv_vec(j,2)=median(muv);
muv_vec(j,3)=std(muv);
muv_vec(j,4)=var(muv);
muv_vec(j,5)=min(muv);
muv_vec(j,6)=max(muv);
%%%%% % Uniform distribution % low_gam=0.2673; % for gamma=0.2838 % high_gam=0.3005; % for gamma=0.2838 low_Ct0= 0.4941; % for gamma=0.5 high_Ct0=0.5858; % for gamma=0.5 % low_gam=0.8053; % for gamma=0.8343 % high_gam=0.8630; % for gamma=0.8343 Ct0=random(name4,low_Ct0,high_Ct0,dim,1); % VECTOR % gamma=random(name5,0.8343,0.01,dim,1); % VECTOR par_Ct0=(low_Ct0+high_Ct0)/2; var_Ct0 = (high_Ct0-low_Ct0)^2/12; std_Ct0 = sqrt(var_Ct0);
Ct0_all(:,j)=Ct0;
Ct0_vec(j,1)=mean(Ct0);
Ct0_vec(j,2)=median(Ct0);
Ct0_vec(j,3)=std(Ct0);
Ct0_vec(j,4)=var(Ct0);
Ct0_vec(j,5)=min(Ct0);
Ct0_vec(j,6)=max(Ct0);
%%%%%%
% Uniform distribution
% low_gam=0.2673; % for gamma=0.2838
% high_gam=0.3005; % for gamma=0.2838
low_b1 = 0.0086794; % for gamma=0.5
high_b1= 0.04332; % for gamma=0.5
% low_gam=0.8053; % for gamma=0.8343
% high_gam=0.8630; % for gamma=0.8343
b1=random(name4,low_b1,high_b1,dim,1); % VECTOR
% gamma=random(name5,0.8343,0.01,dim,1); % VECTOR
par_b1=0.026; %par_b1=(low_b1+high_b1)/2;
% var_b1 = (high_b1-low_b1)^2/12;
std_b1 = 0.01; %sqrt(var_b1);
b1_all(:,j)= b1;
b1_vec(j,1)=mean(b1);
b1_vec(j,2)=median(b1);
b1_vec(j,3)=std(b1);
b1_vec(j,4)=var(b1);
b1_vec(j,5)=min(b1);
b1_vec(j,6)=max(b1);
%%%%%%
% % Steady state values of state % m1=(1-Ct0)*m; % VECTOR
%%%%%%
% Uniform distribution
% low_gam=0.2673; % for gamma=0.2838
% high_gam=0.3005; % for gamma=0.2838
low_b2 = 0.026358; % for gamma=0.5
high_b2= 0.095641; % for gamma=0.5
% low_gam=0.8053; % for gamma=0.8343
% high_gam=0.8630; % for gamma=0.8343
b2=random(name4,low_b2,high_b2,dim,1); % VECTOR
% gamma=random(name5,0.8343,0.01,dim,1); % VECTOR
par_b2=0.061; %par_b2=(low_b2+high_b2)/2;
%var_b2 = (high_b2-low_b2)^2/12;
std_b2 =0.02; %sqrt(var_b2);
b2_all(:,j)= b2;
b2_vec(j,1)=mean(b2);
b2_vec(j,2)=median(b2);
b2_vec(j,3)=std(b2);
b2_vec(j,4)=var(b2);
b2_vec(j,5)=min(b2);
b2_vec(j,6)=max(b2);
%%%%%%%%%%%%%%%%%%%%%%%%%%%%% % I/P parameter, beta has exponential Distribution, mean is 4.6 and s.d. is 2.6 par_beta=4.6; % var_mu=0.00396344; std_beta=2.6; % beta=random(name1,par_beta,std_beta,dim,1); % VECTOR beta=0.0001+random(name1,par_beta,[dim,1]); %Anuj said add 0.0001 % beta=random(name1,1/par_beta,dim,1); % beta=generate_exponential(1/par_beta);
% x = 0:0.1:10; y = exppdf(x,par_beta); plot(x,y) % beta=exprnd(par_beta,dim,1);
beta_all(:,j)=beta;
% MAKE SURE THAT NO VALUES SAMPLED FROM THE EXPONENTIAL DISTRIBUTION ARE CLOSE
% TO ZERO, JUST AS IT WAS DONE IN THE TRUNCATED NORMAL ABOVE
% To have all non-negative values of ah we choose again values which
% are negative
l=0; % to the letter small "l" number zero is assigned
for k1=1:dim
if beta(k1)<0.0001
% k1
% mu(k1)
l=l+1;
tag(l)=k1;
end
end
if l>0
for k2=1:l
beta(tag(k2))=random(name1,par_beta,1); % VECTOR
% will the above scaler necessarily be a poistive number ?
% mu(tag(k2))
end
end
beta_vec(j,1)=mean(beta);
beta_vec(j,2)=median(beta);
beta_vec(j,3)=std(beta);
beta_vec(j,4)=var(beta);
beta_vec(j,5)=min(beta);
beta_vec(j,6)=max(beta);
Z=Nc./beta; % VECTOR
L1=(Q.*Ct0).*(exp(-b1.*t));
L2=((1-Q).*Ct0).*(exp(-b2.*t));
K1=0.533*7933615; %K1=Ih*H; Iz for DDT is 0.533
K2=(0.533/2).*Z; %K2=Iz*Z;
kk1=L2./K2; kk2=L1/K1; kk3=p.*muv;
for i=1:dim
if (kk1(i) < kk2(i)) && (kk3(i) <= L1(i))
x_rlz(i)= K1*p.*muv(i)/(Nh*L1(i));
y_rlz(i)=0;
elseif ( kk1(i) < kk2(i) ) && ( L1(i) < p*muv(i) <= (L1(i)+L2(i)) )
x_rlz(i)= K1/Nh;
y_rlz(i)= K2(i)*(p.*muv(i)-L1(i))/(Nc*L2(i));
elseif ( kk2(i) < kk1(i) ) && ( kk3(i) <= L2(i) )
x_rlz(i)= 0;
y_rlz(i)= K2(i)*p*muv(i)/(Nc*L2(i));
elseif (kk2(i) < kk1(i)) && ( L2(i) < p*muv(i) <= L1(i)+L2(i) )
x_rlz(i)= K1(i)*(p*muv(i)-L2(i))/(Nh*L1(i));
y_rlz(i)= K2*(p*muv(i)-L1(i))/(Nc*L2(i));
end
end
end
% #################################################################
% Sensitivity Analysis starts here........
if analysis==2
[sorted_beta, order_beta] = sort(beta_all(:,reali));
[sorted_ah, order_ah] = sort(ah_all(:,reali));
[sorted_muv, order_muv] = sort(muv_all(:,reali));
[sorted_Ct0, order_Ct0] = sort(Ct0_all(:,reali));
[sorted_b1, order_b1] = sort(b1_all(:,reali));
[sorted_b2, order_b2] = sort(b2_all(:,reali));
% y (R0) is rank transformed
[sorted_x, order_x]=sort(x_rlz(:,reali));
[sorted_y, order_y]=sort(y_rlz(:,reali));
for i=1:dim beta_rank(order_beta(i))=i; ah_rank(order_ah(i))=i; muv_rank(order_muv(i))=i; Ct0_rank(order_Ct0(i))=i; b1_rank(order_b1(i))=i; b2_rank(order_b2(i))=i; x_rank(order_x(i))=i; y_rank(order_y(i))=i; end
% Fixing first the realization we want rank DATA (done above) FROM say last one "realz" % Compute regression coefficients for a linear model with an interaction term:
X_x_beta = [ones(size(ah_rank')) ah_rank' muv_rank' Ct0_rank' b1_rank' b2_rank']; X_y_beta = [ones(size(ah_rank')) ah_rank' muv_rank' Ct0_rank' b1_rank' b2_rank']; % building linear regression model with y (R0) as response and remaining independent % variables as predictor, see equation (2)on page 181 of the Kirschner % reserach paper [b_x_beta, bint_x_beta, r_x_beta]= regress(x_rank', X_x_beta); [b_y_beta, bint_y_beta, r_y_beta]= regress(y_rank', X_y_beta); % Removes NaN data % building linear regression model with xj(gamma) as response and the % remaining remaining independent variables as the predictors [b_beta_x, bint_beta_x, r_beta_x]= regress(beta_rank', X_x_beta); [b_beta_y, bint_beta_y, r_beta_y]= regress(beta_rank', X_y_beta);
[prcc_x_beta, p_val_x_beta]= corrcoef(r_beta_x, r_x_beta);
[prcc_y_beta, p_val_y_beta]= corrcoef(r_beta_y, r_y_beta);
prcc(1)=prcc_x_beta(1,2); % computing prcc index p_val(1)=p_val_x_beat(1,2); % computing p vlaue
prcc(2)=prcc_y_beta(1,2); % computing prcc index p_val(2)=p_val_y_beat(1,2); % computing p vlaue
X_x_ah = [ones(size(beta_rank')) beta_rank' muv_rank' Ct0_rank' b1_rank' b2_rank']; X_y_ah = [ones(size(beta_rank')) beta_rank' muv_rank' Ct0_rank' b1_rank' b2_rank'];
[b_x_ah, bint_x_ah, r_x_ah]= regress(x_rank', X_x_ah); % Removes NaN data [b_ah_x, bint_ah_x, r_ah_x]= regress(ah_rank', X_x_ah); [b_y_ah, bint_y_ah, r_y_ah]= regress(y_rank', X_y_ah); % Removes NaN data [b_ah_y, bint_ah_y, r_ah_y]= regress(ah_rank', X_y_ah);
%R = corrcoef(X) returns a matrix R of correlation coefficients calculated %from an input matrix X whose rows are observations and whose columns are variables. [prcc_x_ah, p_val_x_ah]= corrcoef(r_ah_x, r_x_ah); [prcc_y_ah, p_val_y_ah]= corrcoef(r_ah_y, r_y_ah);
prcc(3)=prcc_x_ah(1,2); p_val(3)=p_val_x_ah(1,2); prcc(4)=prcc_y_ah(1,2); p_val(4)=p_val_y_ah(1,2);
X_x_muv = [ones(size(beta_rank')) beta_rank' ah_rank' Ct0_rank' b1_rank' b2_rank']; X_y_muv = [ones(size(beta_rank')) beta_rank' ah_rank' Ct0_rank' b1_rank' b2_rank']; % WHAT does R0 stand for in below line? [b_x_muv, bint_x_muv, r_x_muv]= regress(x_rank', X_x_muv); % Removes NaN data % what does Rd below line do ? % [b_muv_rd, bint_muv_rd, r_muv_rd]= regress(muv_rank', X_Rd_muv); [b_muv_x, bint_muv_x, r_muv_x]= regress(muv_rank', X_x_muv);
[b_y_muv, bint_y_muv, r_y_muv]= regress(y_rank', X_y_muv); % Removes NaN data [b_muv_y, bint_muv_y, r_muv_y]= regress(muv_rank', X_y_muv);
[prcc_x_muv, p_val_x_muv]= corrcoef(r_muv_x, r_x_muv);
[prcc_y_muv, p_val_y_muv]= corrcoef(r_muv_y, r_y_muv);
prcc(5)=prcc_x_muv(1,2); p_val(5)=p_val_x_muv(1,2);
prcc(6)=prcc_y_muv(1,2); p_val(6)=p_val_y_muv(1,2);
X_x_Ct0 = [ones(size(mu_rank')) beta_rank' ah_rank' muv_rank' b1_rank' b2_rank']; X_y_Ct0 = [ones(size(mu_rank')) beta_rank' ah_rank' muv_rank' b1_rank' b2_rank'];
[b_x_Ct0, bint_x_Ct0, r_x_Ct0]= regress(x_rank', X_x_Ct0); % Removes NaN data [b_Ct0_x, bint_Ct0_x, r_Ct0_x]= regress(Ct0_rank', X_x_Ct0);
[b_y_Ct0, bint_y_Ct0, r_y_Ct0]= regress(Ct0_rank', X_y_Ct0); [b_Ct0_y, bint_Ct0_y, r_Ct0_y]= regress(Ct0_rank', X_y_Ct0);
[prcc_x_Ct0, p_val_x_Ct0]= corrcoef(r_Ct0_x, r_x_Ct0);
[prcc_y_Ct0, p_val_y_Ct0]= corrcoef(r_Ct0_y, r_y_Ct0);
prcc(7)=prcc_x_Ct0(1,2); p_val(7)=p_val_x_Ct0(1,2); prcc(8)=prcc_y_Ct0(1,2); p_val(8)=p_val_y_Ct0(1,2);
X_x_b1 = [ones(size(beta_rank')) beta_rank' muv_rank' Ct0_rank' ah_rank' b2_rank']; X_y_b1 = [ones(size(beta_rank')) beta_rank' muv_rank' Ct0_rank' ah_rank' b2_rank'];
[b_x_b1, bint_x_b1, r_x_b1]= regress(x_rank', X_x_b1); % Removes NaN data [b_b1_x, bint_b1_x, r_b1_x]= regress(b1_rank', X_x_b1); [b_y_b1, bint_y_b1, r_y_b1]= regress(y_rank', X_y_b1); % Removes NaN data [b_b1_y, bint_b1_y, r_b1_y]= regress(b1_rank', X_y_b1);
%R = corrcoef(X) returns a matrix R of correlation coefficients calculated %from an input matrix X whose rows are observations and whose columns are variables. [prcc_x_b1, p_val_x_b1]= corrcoef(r_b1_x, r_x_b1); [prcc_y_b1, p_val_y_b1]= corrcoef(r_b1_y, r_y_b1);
prcc(9)=prcc_x_b1(1,2); p_val(9)=p_val_x_b1(1,2); prcc(10)=prcc_y_b1(1,2); p_val(10)=p_val_y_b1(1,2);
X_x_b2 = [ones(size(beta_rank')) beta_rank' muv_rank' Ct0_rank' ah_rank' b1_rank']; X_y_b2 = [ones(size(beta_rank')) beta_rank' muv_rank' Ct0_rank' ah_rank' b1_rank'];
[b_x_b2, bint_x_b2, r_x_b2]= regress(x_rank', X_x_b2); % Removes NaN data [b_b2_x, bint_b2_x, r_b2_x]= regress(b2_rank', X_x_b2); [b_y_b2, bint_y_b2, r_y_b2]= regress(y_rank', X_y_b2); % Removes NaN data [b_b2_y, bint_b2_y, r_b2_y]= regress(b2_rank', X_y_b2);
%R = corrcoef(X) returns a matrix R of correlation coefficients calculated %from an input matrix X whose rows are observations and whose columns are variables. [prcc_x_b2, p_val_x_b2]= corrcoef(r_b2_x, r_x_b2); [prcc_y_b2, p_val_y_b2]= corrcoef(r_b2_y, r_y_b2);
prcc(11)=prcc_x_b2(1,2); p_val(11)=p_val_x_b2(1,2); prcc(12)=prcc_y_b2(1,2); p_val(12)=p_val_y_b2(1,2);
end
% Sensitivity Analysis ends here........ % #################################################################
% ################################################################# % Uncertainity Analysis starts here........
if analysis==1 % % mean,median,std,iqr %
mean_r0i_x= mean(x_rlz)';
median_r0i_x=median(x_rlz)';
mean_r0i_y= mean(y_rlz)';
median_r0i_y=median(y_rlz)';
std_r0i_x=std(x_rlz)';
iqr_r0i_x=iqr(x_rlz)';
std_r0i_y=std(y_rlz)';
iqr_r0i_y=iqr(y_rlz)';
for k=1:length(x_rlz(1,:)) % can write as k=1:reali ??? t1=x_rlz(:,k); % pg1r0i(k)=length(t1(find(t1<=2)))/length(t1); % INTRODUCED UNDERSCORE X AS I HAVE ONE LOOP BELOW FOR Y AS WELL pg1r0i_x(k)=length(t1(find(t1<=2.5)))/length(t1); % Use this when mean(gamma)=0.23 end % You can use a logical expression to define X. For example, find(X > 2) % returns linear indices corresponding to the entries of X that are greater than 2. pg1r0i_x=pg1r0i_x'; pg1minusr0_x=1-pg1r0i_x;
%%mean over all realizations
for k=1:length(y_rlz(1,:)) % can write as k=1:reali ??? t1=y_rlz(:,k); % pg1r0i(k)=length(t1(find(t1<=2)))/length(t1); % INTRODUCED UNDERSCORE Y BELOW pg1r0i_y(k)=length(t1(find(t1<=2.5)))/length(t1); % Use this when mean(gamma)=0.23 end % You can use a logical expression to define X. For example, find(X > 2) % returns linear indices corresponding to the entries of X that are greater than 2. pg1r0i_y=pg1r0i_y'; pg1minusr0_y=1-pg1r0i_y;
%%mean over all realizations
mean_alli_x=mean([mean_r0i_x median_r0i_x iqr_r0i_x std_r0i_x pg1r0i_x pg1minusr0_x]);
mean_alli_y=mean([mean_r0i_y median_r0i_y iqr_r0i_y std_r0i_y pg1r0i_y pg1minusr0_y]);
%
%standard error
%
se1_x=std([mean_r0i_x median_r0i_x iqr_r0i_x std_r0i_x pg1r0i_x pg1minusr0_x]);
ser1_x=se1_x/sqrt(length(mean_r0i_x));
se1_y=std([mean_r0i_y median_r0i_y iqr_r0i_y std_r0i_y pg1r0i_y pg1minusr0_y]);
ser1_y=se1_y/sqrt(length(mean_r0i_y));
%
%variation coefficient
%
cv1_x=se1_x./mean_alli_x;
cv1_y=se1_y./mean_alli_y;
end
if analysis==1
figure(1)
subplot(3,3,1); hist(beta,100) title('Distribution of \beta','fontsize',22);
subplot(3,3,2);
hist(ah,100)
title('Distribution of a_h','fontsize',22);
subplot(3,3,3);
hist(muv,100)
title('Distribution of \mu_v','fontsize',22);
subplot(3,3,4);
hist(Ct0,100)
title('Distribution of C_{t0}','fontsize',22);
subplot(3,3,5);
hist(b1,100)
title('Distribution of b_1','fontsize',22);
subplot(3,3,6); hist(b2,100) title('Distribution of b_2','fontsize',22);
subplot(3,3,7);
hist(x_rlz(:,reali),100)
title('Distribution of x','fontsize',22)
subplot(3,3,8);
hist(y_rlz(:,reali),100)
title('Distribution of y','fontsize',22)
end
}
Can someone kindly suggest why I get negative values (type:min(muv) ) for muv’s distribution whereas for ah, all randomly sampled values are positive, as seen in the graph.
Thanks.

Answers (0)

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!