MATLAB Answers

Robert
0

How a large timetable (10 million rows by 15 columns) could be modified to a smaller table?

Asked by Robert
on 2 Jun 2018
Latest activity Answered by Peter Perkins
on 4 Jun 2018
Hello,
Here is an example of my 10 million x 15 columns timetable...
Time Par_Name1 Par_Name2 ... Par_Name12 UniqueID Value
01-Jan-1980 AA X1 ... ... 18 109
02-Jan-1980 AA X1 ... ... 18 23
03-Jan-1980 AA X1 ... ... 18 12
... AA X1 ... ... 18 ...
31-Dec-2017 AA X1 ... ... 18 234
01-Jan-1980 AA X1 ... ... 75 10
02-Jan-1980 AA X1 ... ... 75 21
03-Jan-1980 AA X1 ... ... 75 5
... AA X1 ... ... 75 ...
31-Dec-2017 AA X1 ... ... 75 34
01-Jan-1980 BB X1 ... ... 03 190
02-Jan-1980 BB X1 ... ... 03 12
03-Jan-1980 BB X1 ... ... 03 18
... BB X1 ... ... 03 ...
31-Dec-2017 BB X1 ... ... 03 237
01-Jan-1980 DD X7 ... ... 99 100
02-Jan-1980 DD X7 ... ... 99 21
03-Jan-1980 DD X7 ... ... 99 12
... DD X7 ... ... 99 ...
31-Dec-2017 DD X7 ... ... 99 299
... ... ... ... ... ... ...
... ... ... ... ... ... ...
... ... ... ... ... ... ...
...to a table like this...
Par_Name1 Par_Name2 Par_Name... Par_Name12 UniqueID 01-Jan-1980 02-Jan-1980 03-Jan-1980 ... 31-Dec-2017
AA X1 ... ... 18 109 23 12 ... 234
AA X1 ... ... 75 10 21 5 ... 34
BB X1 ... ... 03 190 12 18 ... 237
DD X7 ... ... 99 100 21 12 ... 299
... ... ... ... ... ... ... ... ... ...
My goals here are: 1) Reduce the size of the table by removing redundancy, and 2) still be able to generate statistics based on a 'Par_Name??' conditions or due to time periods like per Year or Season or a mix of the previous options

  1 Comment

Could you provide a part of the table (. mat) to work with?

Sign in to comment.

1 Answer

Answer by Peter Perkins
on 4 Jun 2018

This looks like an application of unstack.
Not sure what you'll be doing with that table, so I can't say if that "wide" version is a convenient organization, but you are correct that it should reduce the memory footprint by removing redundant data. It might also be possible to create a timetable with one row for each parameter combination (as you've show), but instead of
>> days(datetime(2017,12,31) - datetime(1980,1,1))
ans =
13879
variables, you'd have one variable that's a cell array, each cell containing one 13879x1 timetable. You'd have some unnecessary redundancy (the time vectors), but that organization might be more convenient. Or maybe not.

  0 Comments

Sign in to comment.