Extract all data for the June months over the years from a timetable

I want to pull only the June data out of this timetable, but I do not want to group it all into a single "June" bin. Ultimately I want to know how many days had precipitation in June of 1948, June of 1949, etc. I have been able to use the groupsummary function to almost get there, but it lumps all the years' data into a single June bin. My data spans 60 years, and I basically want 60 Junes' worth of data. I am new to using timetables in MATLAB and I have read many of the help pages, but I cannot seem to figure out how to pull out just a single month. Thanks for any help!
whos -file preciptimedata.mat
  Name                Size     Bytes  Class        Attributes
  preciptimedata      -       352359  timetable

Accepted Answer

dpb on 2 Sep 2022
Edited: dpb on 2 Sep 2022
Piece o' cake -- granted, it takes a little while to get used to using grouping variables and all, but...
load https://www.mathworks.com/matlabcentral/answers/uploaded_files/1115475/preciptimedata.mat
Error using load
Unable to read file 'https://www.mathworks.com/matlabcentral/answers/uploaded_files/1115475/preciptimedata.mat'. If it is a Version 7 or earlier MAT-file, consider saving your data afresh in Version
7.3 MAT-files to access it from a remote location.
tP=preciptimedata; clear preciptimedata % I like shorter variable names...
tP.Properties.VariableNames={'Rainfall'}; % make meaningful variable name for data
tP=addvars(tP,month(tP.precipdates),year(tP.precipdates), ... % create grouping variables
'NewVariableNames',{'Month','Year'},'Before','Rainfall');
tG=groupsummary(tP,{'Month','Year'},"nnz","Rainfall"); % and do the work..
tJun=tG(tG.Month==6,:)
Oh, bummer... load can't read the MAT-file from the URL here. The same code run on my home system (R2020b) yields:
>> tJun=tG(tG.Month==6,:);
>> head(tJun)
ans =
  8×4 table
    Month    Year    GroupCount    nnz_Rainfall
    _____    ____    __________    ____________
      6      1949        30             21
      6      1950        30             13
      6      1951        30             24
      6      1952        30              8
      6      1953        30             17
      6      1954        30             10
      6      1955        30             16
      6      1956        30             11
>>
Of course, you don't have to make a new table to see June; I did so here only so I could use head as shorthand for the display. tG holds the summary counts for all months, so you can now look up any month you want. @Walter Roberson's solution contains the same data; it's just a slightly different way to get there, using external variables and the findgroups/splitapply workflow instead of sticking with the original timetable. No real advantage one way or t'other that I see, although with the grouping variables in the table you've already got 'em at hand to apply to any other analysis that comes up...
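For anyone who wants to try the grouping-variable workflow without the attached MAT-file, here is a minimal sketch on made-up data. The values are synthetic (not the real precipitation record), and the names tJuneOnly and tByYear are only for illustration. It also shows a second route that filters to June first and then uses the 'year' bin of groupsummary on the row times; both routes are expected to give the same counts.

```matlab
% Synthetic stand-in for the real data (hypothetical values, three years of days)
rng(0);                                             % reproducible random numbers
precipdates = (datetime(1949,1,1):caldays(1):datetime(1951,12,31))';
Rainfall    = max(0, randn(numel(precipdates),1));  % roughly half the days are zero
tP = timetable(precipdates, Rainfall);
tP.Properties.DimensionNames{1} = 'precipdates';    % make sure the row times have this name

% Route 1: grouping variables stored in the timetable, as in the answer above
tP.Month = month(tP.precipdates);
tP.Year  = year(tP.precipdates);
tG   = groupsummary(tP, {'Month','Year'}, "nnz", "Rainfall");
tJun = tG(tG.Month == 6, :);                        % one row per June

% Route 2: keep only the June rows, then bin the row times by year
tJuneOnly = tP(tP.Month == 6, :);
tByYear   = groupsummary(tJuneOnly, 'precipdates', 'year', "nnz", "Rainfall");

% Compare the per-June counts from the two routes (expected to be true)
isequal(tJun.nnz_Rainfall, tByYear.nnz_Rainfall)
```

Route 1 keeps every month in tG, so other months are one logical index away; Route 2 is shorter if June is all you need.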
  1 Comment
dpb on 3 Sep 2022
Edited: dpb on 6 Sep 2022
No idea where the geographical area for these data is; clearly somewhere pretty wet (NOT SW KS). It caught my attention, however, that the years '52-'56 above are on average lower than the average of the first three, and 1952 in particular was very dry. Those years coincide with the great drought of the '50s in the US central plains; a couple of those years were even drier on the family farm here than any during the "Dirty Thirties".
It may be only coincidence, but it is perhaps reflective of global weather pattern relationships -- it is known that our dry periods are correlated with, and in part caused by, the La Niña/El Niño cycles off the S American coast... I've often thought that, if I had the data, it would be interesting to see whether the '30s were also in sync with that phenomenon. These data don't go back that far either, though -- I've never found such data readily accessible, although I've not done a truly extensive search.


More Answers (2)

Walter Roberson on 2 Sep 2022
tested...
pcts = load('preciptimedata.mat');
pct = pcts.preciptimedata;
[y,m,d] = ymd(pct.Properties.RowTimes);
junepct = pct(m == 6, :);
juneyear = y(m == 6);
G = findgroups(juneyear);
rain_days = splitapply(@(year,var1) [year(1),nnz(var1)], juneyear, junepct.Var1, G);
rain_days = timetable(rain_days(:,2), 'VariableNames', {'days of rain'}, 'RowTimes', datetime(rain_days(:,1),1,1,'Format','uuuu'));
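A slightly shorter variant of the same findgroups/splitapply idea is sketched below on made-up data. The timetable contents are synthetic, the variable name Var1 is carried over from the code above as an assumption about the real file, and isJune, juneYears, rainDays and result are illustrative names. The second output of findgroups already lists each year once, so the function passed to splitapply only has to count nonzeros.

```matlab
% Hypothetical stand-in data: three years of daily values, about half of them zero
rng(1);
t = (datetime(1949,1,1):caldays(1):datetime(1951,12,31))';
v = max(0, randn(numel(t),1));
pct = timetable(t, v, 'VariableNames', {'Var1'});

[y, m] = ymd(pct.Properties.RowTimes);              % year and month number of each row
isJune = (m == 6);                                  % logical mask for the June rows
[G, juneYears] = findgroups(y(isJune));             % juneYears holds each year once
rainDays = splitapply(@nnz, pct.Var1(isJune), G);   % nonzero days per June
result = table(juneYears, rainDays)
```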

Isaac Lammers on 6 Sep 2022
Edited: Isaac Lammers on 6 Sep 2022
So I went out of town for Labor Day weekend and was away from service, but I brought my computer. Thank you all for the answers! Here is what I came up with after much trial and error. I think I will probably implement one of your solutions above, though, as they seem more elegant than a for loop haha.
In regard to the data that I have, the geographical location is Boulder, CO. I am fairly certain it is a data set from one of my professor's research projects; however, I am just taking a data analysis course, so I am focused less on what the data is than on learning to manipulate it! Thanks for the help!
My code below would probably need some of the additional variables I have defined further up in the homework set, but it gets the point across I think...
year = transpose(1949:2008);
for i = 1:1:size(year) %range of years of data
    date1 = datetime(year(i),06,01); %Method for pulling only the june data
    date2 = datetime(year(i),07,01);
    TR = timerange(date1,date2);
    precipmonthjune = preciptimedata(TR,:);
    yearjuneprecip = groupsummary(precipmonthjune, 'precipdates','year','nnz');
    yearjuneprecip = table2array(yearjuneprecip(:,3));
    juneprecipdays(i,:) = yearjuneprecip;
end
  2 Comments
Walter Roberson on 6 Sep 2022
Use numel(year), not size(year), because size with a single input always returns a vector.
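A small illustration of why this matters, using the same range of years as the loop above (yearsCol and yearsRow are illustrative names): the colon operator uses only the first element of a nonscalar operand, so 1:size(...) gives the intended range only when the vector happens to be a column.

```matlab
yearsCol = transpose(1949:2008);   % 60-by-1 column, as in the loop above
yearsRow = 1949:2008;              % 1-by-60 row

size(yearsCol)                     % [60 1]
size(yearsRow)                     % [1 60]

% The colon operator takes only the first element of a nonscalar operand
numel(1:size(yearsCol))            % 60, works only because the vector is a column
numel(1:size(yearsRow))            % 1, so a loop over this range would run once
numel(1:numel(yearsRow))           % 60, independent of orientation
```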
dpb on 6 Sep 2022
Edited: dpb on 6 Sep 2022
"the geographical location is Boulder, CO"
AHA! Not all that far away from us...but close enough to the mountains to have the summer monsoons that we do not have the luxury of...but also probably affected somewhat similarly and would find fairly high correlation with our area over time.


Release

R2020a