MATLAB Answers

I'm trying to organize data so it can easily be averaged by date

Asked by Susan Santiago on 13 Oct 2018
Latest activity: answered by Peter Perkins on 17 Oct 2018
I have a cell array of data where the first column is the full date. I'm being asked to organize it in a matrix so that something like H(1,2,:) would return all the data from the second day of January. The dates are in the format yyyymmddHHMM, if that helps. I'm mostly looking for guidance on how to achieve something like this. Any help is appreciated.

  2 Comments

I uploaded my workspace because there are many files and they're all .dat files, which can't be uploaded here. I think the workspace is probably clearer anyway. The cell array I'm concerned with is named C. Each row of C represents one data file.

2 Answers

Answer by jonas on 13 Oct 2018
Edited by jonas on 13 Oct 2018
 Accepted Answer

"...something like H(1,2,:), it would return all the data from the second day of January."
Not a very good layout, in my opinion. How do you deal with the fact that different months have different numbers of days? By padding with NaNs?
It is much easier to put all your data in a timetable. You can then easily access specific days.
% One row time per day, as a column vector
t = (datetime(2000,1,1):days(1):datetime(2001,1,1))';
% Placeholder data: a single variable of zeros
TT = timetable(t,zeros(length(t),1))
You want to access data for a specific date? Easy:
TT('2001-1-1',:)
ans =
  timetable
       Time        Var1
    ___________    ____
    01-Jan-2001     0
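
Since the thread title is about averaging by date, here is a minimal sketch of how that could look once the data is in a timetable. The variable name Temp and the values are made up for illustration; they are not from the asker's workspace.

```matlab
% Hypothetical half-hourly series covering three days (values are made up)
t = (datetime(2018,1,1):minutes(30):datetime(2018,1,3,23,30,0))';
Temp = day(t) * 10;                 % 10 on Jan 1, 20 on Jan 2, 30 on Jan 3
TT = timetable(t, Temp);

% One row per calendar day, each holding the mean of that day's samples
dailyMean = retime(TT, 'daily', 'mean');

% All raw samples from 2 January, not just the average.
% timerange is half-open by default: start included, end excluded.
jan2 = TT(timerange(datetime(2018,1,2), datetime(2018,1,3)), :);
```

A single variable is then available by name, e.g. jan2.Temp, which is also what you would pass to plot.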

  32 Comments

I can answer this on my phone :) The class of the data is wrong. You have to run this line first:
T.Var1 = datetime(num2str(T{:,1}),'inputformat','yyyyMMddHHmm');
But change Var1 to TIMESTAMP_START, and make sure the column really is converted to datetime afterwards.
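
For reference, a small sketch of that conversion on its own, assuming the timestamps are stored as plain numbers in yyyymmddHHMM form. The values in raw are invented, and compose is just one way to get the text; it is not necessarily what the line above does internally.

```matlab
% Hypothetical raw timestamps stored as numbers in yyyymmddHHMM form
raw = [201801020000; 201801020030; 201801020100];

% Write each number as a 12-digit string, then parse it.
% In datetime format strings, MM is the month and mm is the minute,
% so the pattern is yyyyMMddHHmm rather than yyyymmddHHMM.
txt = compose("%012d", raw);
ts  = datetime(txt, 'InputFormat', 'yyyyMMddHHmm');
```

Checking class(ts) or isdatetime(ts) afterwards confirms the column has the right type before you build the timetable.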
Thanks! One last thing: is there any way to change my weirdo code so it's not just giving a daily average but all the results from the day? One of the main uses of this matrix is gonna be plotting the data. Thanks again. And if you don't mind, how would I get just one variable from the matrix?
1. Yes and no. You have data every half hour, if I remember correctly. You could add a fourth dimension to the matrix for the "hour of day". Still, that would not work because you have half hours. You could add a fifth dimension for the "minute of day"... you can see how absurd this method is becoming, especially since roughly 98% of the minutes would be NaNs.
2. All variables are stored in the third dimension:
A(1,1,5)
outputs the "fifth" variable from the first of January. What is the fifth variable? You would have to look it up in some kind of table every time you want to extract one variable.
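
To make the comparison concrete, here is a rough sketch of how the month-by-day-by-variable array of daily means could be filled from a timetable. Everything here (the data, the names X, daily and H, the year 2018) is made up for illustration and is not the asker's actual code.

```matlab
% Hypothetical half-hourly data for 2018 with two placeholder variables
t = (datetime(2018,1,1):minutes(30):datetime(2018,12,31,23,30,0))';
X = [ones(numel(t),1), day(t)];     % variable 2 simply equals the day of month

% Daily means, one row per calendar day
daily = retime(array2timetable(X,'RowTimes',t), 'daily', 'mean');

% 12 months x 31 days x number of variables, padded with NaN
H = NaN(12, 31, size(X,2));
m = month(daily.Time);
d = day(daily.Time);
for k = 1:height(daily)
    H(m(k), d(k), :) = daily{k,:};  % copy all variables for that date
end
```

H(1,2,:) then holds the 2 January means, while slots such as H(2,30,:) stay NaN because that date does not exist. Which slice of the third dimension is which variable still has to be tracked by hand.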
I fully understand that you want to comply with your professor's instructions. However, if you show them these two methods and explain the advantages of using tables (indexing by variable names, options for interpolation, easier access to specific dates, the possibility of storing different classes, easier plotting, as well as a variety of table-specific options that we have not even talked about), he or she would have to be crazy stubborn to opt for the array.


Answer by Peter Perkins on 17 Oct 2018

"return all the data from the second day of January"
Imagine having this timetable:
>> tt = array2timetable(rand(100,2),'RowTimes',datetime(2018,1,1,0:8:792,0,0));
>> head(tt)
ans =
  8×2 timetable
            Time             Var1        Var2
    ____________________    _______    ________
    01-Jan-2018 00:00:00    0.85071     0.55903
    01-Jan-2018 08:00:00    0.56056      0.8541
    01-Jan-2018 16:00:00    0.92961     0.34788
    02-Jan-2018 00:00:00    0.69667     0.44603
    02-Jan-2018 08:00:00    0.58279    0.054239
    02-Jan-2018 16:00:00     0.8154     0.17711
    03-Jan-2018 00:00:00    0.87901     0.66281
    03-Jan-2018 08:00:00    0.98891     0.33083
In recent versions of MATLAB (R2018a and later IIRC), you can do this:
>> tt(timerange('01-Jan-2018','day'),:)
ans =
  3×2 timetable
            Time             Var1       Var2
    ____________________    _______    _______
    01-Jan-2018 00:00:00    0.85071    0.55903
    01-Jan-2018 08:00:00    0.56056     0.8541
    01-Jan-2018 16:00:00    0.92961    0.34788
>> tt(timerange(datetime(2018,1,1),'day'),:)
ans =
  3×2 timetable
            Time             Var1       Var2
    ____________________    _______    _______
    01-Jan-2018 00:00:00    0.85071    0.55903
    01-Jan-2018 08:00:00    0.56056     0.8541
    01-Jan-2018 16:00:00    0.92961    0.34788
In earlier versions, you can do the same thing, with a bit more typing:
>> tt(timerange(datetime(2018,1,1),datetime(2018,1,2)),:)
ans =
  3×2 timetable
            Time             Var1       Var2
    ____________________    _______    _______
    01-Jan-2018 00:00:00    0.85071    0.55903
    01-Jan-2018 08:00:00    0.56056     0.8541
    01-Jan-2018 16:00:00    0.92961    0.34788
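
If the goal is "the second day of January" regardless of year, one possible alternative (not part of the answer above) is a logical row index built from the month and day of the row times. The sketch below rebuilds the same kind of example timetable and assumes the default row-times name Time.

```matlab
% Same shape of example data as above: 100 rows, one every 8 hours
tt = array2timetable(rand(100,2), ...
    'RowTimes', datetime(2018,1,1,(0:8:792)',0,0));

% Logical mask: true for rows whose time falls on 2 January of any year
isJan2 = month(tt.Time) == 1 & day(tt.Time) == 2;

% Keep only those rows, with all variables
jan2 = tt(isJan2, :);
```

With data spanning several years this returns 2 January from every year, which timerange with fixed start and end dates would not.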
