# How to compute lognormal distribution

Valerio Gianforte on 26 May 2020
Commented: Valerio Gianforte on 29 May 2020
Hi everyone,
I have 19 years of data and need to compute the lognormal distribution for each month of each year. I have already computed the monthly mean and the monthly standard deviation for each year. I know the quickest way to do this kind of operation is a for loop, but I'm a beginner in MATLAB and I don't know how to use a for loop in this situation. I think I need two for loops, one indexed over the years and the other over the months, but I'm not sure. The code below does it manually and is far too long; I need to make it automatic. Thank you in advance!
format long g
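folder = 'D:\Valerio\data\IPCC_midcent\RCP4.5\BCC_CSM\BCC_CSM.xlsx';
file = readmatrix(folder); % assumption: the post omits the step that loads the numeric matrix "file"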
dt = datetime([file(:,1:3) file(:,4)/1E4 repmat([0 0],size(file,1),1)]);
tt = timetable(dt, file(:,5:end));
data = tt.Var1;
% file(:, 3:4) = []; % delete day and hour columns as they are not important for yearly mean
[grps, Years, Months] = findgroups(file(:,1), file(:,2));
result_mean = splitapply(@(x) mean(x, 1), file(:,5:end), grps);
result_mean = [Years Months result_mean];
result_std = splitapply(@(x) std(x, [], 1), file(:,5:end), grps);
result_std = [Years Months result_std];
TR1 = timerange('01-Jan-2026 00:00:00','01-Feb-2026 00:00:00');
tt_TR1 = tt(TR1,:);
Hs1 = tt_TR1.Var1(:,1);
Tp1 = tt_TR1.Var1(:,2);
d_Hs1 = lognpdf(Hs1,result_mean(1,3),result_std(1,3));
d_Tp1 = lognpdf(Tp1,result_mean(1,4),result_std(1,4));
TR2 = timerange('01-Feb-2026 00:00:00','01-Mar-2026 00:00:00');
tt_TR2 = tt(TR2,:);
Hs2 = tt_TR2.Var1(:,1);
Tp2 = tt_TR2.Var1(:,2);
d_Hs2 = lognpdf(Hs2,result_mean(2,3),result_std(2,3));
d_Tp2 = lognpdf(Tp2,result_mean(2,4),result_std(2,4));
TR3 = timerange('01-Mar-2026 00:00:00','01-Apr-2026 00:00:00');
tt_TR3 = tt(TR3,:);
Hs3 = tt_TR3.Var1(:,1);
Tp3 = tt_TR3.Var1(:,2);
d_Hs3 = lognpdf(Hs3,result_mean(3,3),result_std(3,3));
d_Tp3 = lognpdf(Tp3,result_mean(3,4),result_std(3,4));
TR4 = timerange('01-Apr-2026 00:00:00','01-May-2026 00:00:00');
tt_TR4 = tt(TR4,:);
Hs4 = tt_TR4.Var1(:,1);
Tp4 = tt_TR4.Var1(:,2);
d_Hs4 = lognpdf(Hs4,result_mean(4,3),result_std(4,3));
d_Tp4 = lognpdf(Tp4,result_mean(4,4),result_std(4,4));
TR5 = timerange('01-May-2026 00:00:00','01-Jun-2026 00:00:00');
tt_TR5 = tt(TR5,:);
Hs5 = tt_TR5.Var1(:,1);
Tp5 = tt_TR5.Var1(:,2);
d_Hs5 = lognpdf(Hs5,result_mean(5,3),result_std(5,3));
d_Tp5 = lognpdf(Tp5,result_mean(5,4),result_std(5,4));

Srivardhan Gadila on 29 May 2020
Edited: Srivardhan Gadila on 29 May 2020
You can try something like below:
format long g
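folder = 'BCC_CSM.xlsx';
file = readmatrix(folder); % assumption: the post omits the step that loads the numeric matrix "file"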
dt = datetime([file(:,1:3) file(:,4)/1E4 repmat([0 0],size(file,1),1)]);
tt = timetable(dt, file(:,5:end));
data = tt.Var1;
% file(:, 3:4) = []; % delete day and hour columns as they are not important for yearly mean
[grps, Years, Months] = findgroups(file(:,1), file(:,2));
result_mean = splitapply(@(x) mean(x, 1), file(:,5:end), grps);
result_mean = [Years Months result_mean];
result_std = splitapply(@(x) std(x, [], 1), file(:,5:end), grps);
result_std = [Years Months result_std];
monthNames = ["Jan", "Feb", "Mar", "Apr", "May", "Jun", "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"];
count = 1;
for i = 1:numel(Years)
    allMonths{count} = "01-"+monthNames(Months(i))+"-"+num2str(Years(i))+" 00:00:00";
    count = count+1;
end
for i = 1:count-2
    TR{i} = timerange(allMonths{i},allMonths{i+1});
    tt_TR{i} = tt(TR{i},:);
    Hs{i} = tt_TR{i}.Var1(:,1);
    Tp{i} = tt_TR{i}.Var1(:,2);
    % If TR, tt_TR, Hs and Tp are only temporary, don't use cell arrays for them.
    % Row i of result_mean and result_std holds the statistics of month i.
    d_Hs{i} = lognpdf(Hs{i},result_mean(i,3),result_std(i,3));
    d_Tp{i} = lognpdf(Tp{i},result_mean(i,4),result_std(i,4));
end
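The code above also covers the time ranges that cross a year boundary, such as '01-Dec-2026 00:00:00' to '01-Jan-2027 00:00:00'. If you don't want those ranges, use the code below instead. Note that its last range in each year ends on 1 December, so December itself is left out: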
Years = ["2026", "2027", "2028"];
Months = ["Jan", "Feb", "Mar", "Apr", "May", "Jun", "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"];
i = 1;
for year = Years
    for monthNum = 1:11
        TR{i} = timerange("01-"+Months(monthNum)+"-"+year+" 00:00:00","01-"+Months(monthNum+1)+"-"+year+" 00:00:00");
        tt_TR{i} = tt(TR{i},:);
        Hs{i} = tt_TR{i}.Var1(:,1);
        Tp{i} = tt_TR{i}.Var1(:,2);
        % If TR, tt_TR, Hs and Tp are only temporary, don't use cell arrays for them.
        % i skips December, so it does not match the row number in result_mean;
        % look up the row for this year and month instead.
        row = find(result_mean(:,1) == str2double(year) & result_mean(:,2) == monthNum);
        d_Hs{i} = lognpdf(Hs{i},result_mean(row,3),result_std(row,3));
        d_Tp{i} = lognpdf(Tp{i},result_mean(row,4),result_std(row,4));
        i = i + 1;
    end
end
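One thing worth double-checking in both versions: lognpdf(x, mu, sigma) takes the mean and standard deviation of log(x), not of x itself, so passing the monthly mean and standard deviation of the raw data may not give the distribution you expect. Below is a minimal sketch of two ways to get the log-scale parameters. The data are synthetic and the names x, m, s, mu1, mu2 are only for illustration, not part of your script:

```matlab
% Synthetic positive data standing in for one month of Hs values (assumption).
rng(0);                              % reproducible example
x = lognrnd(0.5, 0.3, 1000, 1);      % needs the Statistics and Machine Learning Toolbox

% Option 1: estimate the parameters directly from the log of the data.
mu1    = mean(log(x));
sigma1 = std(log(x));

% Option 2: convert the mean m and standard deviation s of the raw data
% (method of moments for a lognormal distribution).
m = mean(x);
s = std(x);
sigma2 = sqrt(log(1 + (s/m)^2));
mu2    = log(m) - sigma2^2/2;

% Evaluate the density with the log-scale parameters.
d = lognpdf(x, mu1, sigma1);
```

The two options should give similar values when the data are close to lognormal. Option 2 is convenient if you want to keep the result_mean and result_std tables you already have.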
Valerio Gianforte on 29 May 2020
Thank you so much, now it works. Next I have to plot the results, one plot per year, where each plot contains the lognormal distribution of every month of that year. I tried putting the scatter plot inside the for loop, but obviously I got 217 plots, one for every month in the 19 years. Do you have any ideas? Thank you.
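One possible approach for the plots is to create the figure in an outer loop over the years and only add lines in the inner loop over the months, with hold on so that all months of a year share one axes. Below is a minimal sketch with synthetic data. The matrix demo and its three columns (year, month, Hs) are assumptions, not your file layout, and the parameters are estimated from log(Hs):

```matlab
% Synthetic stand-in for the real data: columns are [year month Hs] (assumption).
rng(1);
[yy, mm] = ndgrid(2026:2027, 1:12);
demo = [];
for k = 1:numel(yy)
    hs = lognrnd(0.2 + 0.02*mm(k), 0.3, 50, 1);        % 50 fake samples per month
    demo = [demo; repmat([yy(k) mm(k)], 50, 1), hs];   %#ok<AGROW>
end

years = unique(demo(:,1));
figs  = gobjects(numel(years), 1);
for iy = 1:numel(years)
    figs(iy) = figure('Visible', 'off');   % one figure per year; remove 'Visible','off' to display it
    ax = axes(figs(iy));
    hold(ax, 'on');
    for im = 1:12
        x = demo(demo(:,1) == years(iy) & demo(:,2) == im, 3);
        if isempty(x), continue; end        % skip months without data
        mu    = mean(log(x));               % log-scale parameters for lognpdf
        sigma = std(log(x));
        xs = sort(x);                       % sort so the curve is drawn left to right
        plot(ax, xs, lognpdf(xs, mu, sigma), 'DisplayName', sprintf('Month %d', im));
    end
    hold(ax, 'off');
    title(ax, sprintf('Lognormal pdf of Hs, %d', years(iy)));
    xlabel(ax, 'Hs');
    ylabel(ax, 'pdf');
    legend(ax, 'show');
end
```

With the real data, the selection of x would come from your own year and month columns (or from the Hs cell array built in the loop), and scatter can replace plot if you prefer markers to lines.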

