Creating a matrix after giving initial conditions

Question

Damith on 12 Mar 2015

0
Link

Direct link to this question

https://www.mathworks.com/matlabcentral/answers/182710-creating-a-matrix-after-giving-initial-conditions

Commented: dpb on 31 Mar 2015

Accepted Answer: dpb

Open in MATLAB Online

Hi,

I need to create a matrix when given the begin year, month, day, time and end year, month, day and time.

for example; (I do not need to see the header in the matrix)

I would like to enter:

begin year = 1998
begin month = 1
begin day  = 1
begin time = 0
end year = 1998
end month = 2
end day  = 15
end time = 12
(as shown in the table below)

But, if I enter leap years (e.g. 2000,2004, etc) it should include Feb 29 and do the same time steps (0 to 21) as shown in the table.

Then, the program should generate the output for the time periods given above. See the sample below.

Year   Month   Day  Time
   1      1     0
   1      1     3
   1      1     6
   1      1     9
   1      1    12
   1      1    15
   1      1    18
   1      1    21
   1      2     0
.
.
. 
   2      15    12

Any help is appreciated.

Thanks in advance.

0 Comments
Show -2 older commentsHide -2 older comments

Sign in to comment.

Sign in to answer this question.

Answer 1

dpb on 12 Mar 2015

0
Link

Direct link to this answer

https://www.mathworks.com/matlabcentral/answers/182710-creating-a-matrix-after-giving-initial-conditions#answer_171030

Edited: dpb on 12 Mar 2015

Open in MATLAB Online

 yr=2000;
 dn=datevec(datenum(yr,1,1,[0:3:24*(365+isleapyr(yr))-1].',0,0))
>> [dn(1:5,:);dn(470:485,:);dn(length(dn)-10,:)]
ans =
         1           1           0           0           0
         1           1           3           0           0
         1           1           6           0           0
         1           1           9           0           0
         1           1          12           0           0
      ...
         2          28          15           0           0
         2          28          18           0           0
         2          28          21           0           0
         2          29           0           0           0
         2          29           3           0           0
         2          29           6           0           0
         2          29           9           0           0
         2          29          12           0           0
         2          29          15           0           0
         2          29          18           0           0
         2          29          21           0           0
         3           1           0           0           0
         3           1           3           0           0
         3           1           6           0           0
         3           1           9           0           0
         3           1          12           0           0
      ...
        12          30          15           0           0
        12          30          18           0           0
        12          30          21           0           0
        12          31           0           0           0
        12          31           3           0           0
        12          31           6           0           0
        12          31           9           0           0
        12          31          12           0           0
        12          31          15           0           0
        12          31          18           0           0
        12          31          21           0           0
>>

I introduced the ellipses in the output to show the breaks clearly...

Salt to suit...

Oh, isleapyr is one of my utilities...

function is=isleapyr(yr)
%  returns T for input year being a leapyear
is=(datenum(yr+1,1,1)-datenum(yr,1,1))==366;

there may be a TMW-supplied equivalent since R2012b which lacks it.

11 Comments
Show 9 older commentsHide 9 older comments

dpb on 12 Mar 2015

Open in MATLAB Online

The timestep is in hours; nd above is the number of 3-hr intervals between to two dates. So, use

dn=datevec(datenum(yr,1,1,[0:3:nd*3].',0,0));

or don't divide by 3 earlier. Again was trying to demonstrate a principle instead of writing code...

Changing variable names to be more reflective and "real" code--

>> nHrs=(datenum(1998,2,15,12,0,0)-datenum(1998,1,1,0,0,0))*24;
>> dv=datevec(datenum(yr,1,1,[0:3:nHrs].',0,0));
>> datestr(dv(end,:))
ans =
15-Feb-1998 12:00:00
>>

Again, the principle is that datenum is smart enough to wrap the lower time fields correctly accounting for leap years and such; the caveat is that for consistency you must use the integer steps rather than a floating point difference of, say, 3/24 fractional days or the floating point roundoff will cause discrepancies if try to do precise date comparisons; the floating point deltas will inevitably be off a bit or two and an exact comparison will fail. If the integer steps are used, they are exact and internally the rounding will always occur the same way so the net result is consistency.

dpb on 12 Mar 2015

Edited: dpb on 15 Mar 2015

Open in MATLAB Online

No problem; sometimes I tend to err on the pedagogical side--it comes from mentoring too many new hires for too long where I was more concerned in getting them to think through how to solve the problem than just answering the question. :)

Anyhoo, to add here, the really bad thing (tm) about the date comparisons if use floating point deltas (as seems perfectly logical to do) is that most of the results will be ok, it's only that occasional case where the rounding is different by that last bit or two so it's possible either to entirely miss that the algorithm isn't exactly correct or it seems infuriatingly near random as to why some/most dates are selected but a few just always get missed. The above eliminates that as long as the generation is consistent.

ADDENDUM

The really, really difficult thing is that when the divisor can be/is exactly represented in floating point, then the two techniques are, in fact, identical. In this case 3/24-->0.125 is, in fact exactly representable and so if remove the datevec cast and return date numbers instead for dv and then

>> df=(datenum(yr,1,[1:3/24:nHrs*10/24].',0,0,0));
>> all(df==dv(1:length(df)))
ans =
   1
>>

we do see in this case could've gotten by. This of course, inevitably leads to changing to a dt of 2 hours instead and then the above scenario striketh and there's seemingly no reason why to the initiate.

Damith on 16 Mar 2015

Open in MATLAB Online

Hi again,

I know this could be frustrating. Thanks for your willingness to understand and I appreciate that. But, I am trying to achieve my ultimate goal of creating a netCDF file using MATLAB. Since I have not done that in MATLAB, I am trying to figure it out through reading online materials.

Let me elaborate here: The first task I realized is to create a 3D array of time, lat, lon of precipitation field. To achieve that, I have hundreds of .csv files (one file is for time (3hrs) in year folders. For example in year 1998 Jan 01 00 time step file name: TRMM_1998_01_0100_newntcl.csv. I have year folders from 1998 to 2014. The last 2 digits on the file name are for 3hr time intervals. Next time is, TRMM_1998_01_0103_newntcl.csv and so on. So, I was able to figure out to read multiple .csv files in one folder in MATLAB and stored them in a cell array.

That is my "out" cell matrix.

My first objective is to create netCDF file year by year. So, I first picked year 1998. First, I need to sum 3hr time intervals for a day e.g. from 00 to 21. That's what I was trying to do through my original post.

In 1998 folder, I have 365 csv files, but I am interested in summing 3hr file for a complete day. Since, I have complete 3hr time interval files upto Feb 14 (TRMM_1998_02_1421_newntcl.csv) that's why I entered the end data here: (See the code below)

   yr=1998;
   nd=(datenum(1998,2,14,21,0,0)-datenum(1998,1,1,0,0,0))*24/3;
   dn=datevec(datenum(yr,1,1,[0:3:nd*3].',0,0));
   dn(:,5:6)=[];
   dnT=dn(:,4)'; % dnT = 360 here

then the code below to get the sum

out = vertcat(out{:});
A = cell2mat(out(:,3));
B = reshape(A(1:360,[],8);
C=sum(B,2);

When I do the vertcat for the original "out" cell array, below image shows how it looks.

Array A has all 3rd col of "out" and has 18863565 rows. But when I do

B = reshape(A(1:360),[],8);

it only creates 45 x 8 of for the 3rd col of first row of "out". But, I need to have 45 x 8 matrices for all the rows (3rd col) in out matrix. I need to have 1 x 365 cell matrix with each cell has 45 x 8 cells.

Hope I explained well here. Any help is appreciated.

Thanks again.

dpb on 17 Mar 2015

OK, so the answer is "ignore days that are incomplete".

I've gotta' do some other things here just now, but I think I have a plan... :) I'll try to get back again in the morning and outline what I think'll work pretty well. Sometimes one only knows how well when actually working out the code if don't write it out on paper and for these exercises, what's the fun in that??? :)

Damith on 18 Mar 2015

yes, ignore the days that are incomplete.

Will see whether your plan works or not :)

Sign in to comment.

Answer 2

dpb on 18 Mar 2015

0
Link

Direct link to this answer

https://www.mathworks.com/matlabcentral/answers/182710-creating-a-matrix-after-giving-initial-conditions#answer_171813

Edited: dpb on 20 Mar 2015

Open in MATLAB Online

Since dir returns a sorted list, I think I'd just process it similarly as you are but simply test the days within the loop by parsing the date/time string within a double loop. One could, as mentioned, get fancy and do some serious attempts at matching datenums but just doesn't seem to be needed here...

I'd start with something like the following which builds an array of the summed third column value in matching days by year. Each day is in a given column, only a small additional logic is needed to keep track of which day is present, a month and day row filled in by column while in the loop...

d=dir('TRMM_*_newntcl.csv');
ymdh=cell2mat(textscan([d.name], ...
            'TRMM_%4d_%2d_%2d%2d_newntcl.csv', ...
            'collectoutput',true));
idx=find(ymdh(:,end)==0);  % get the beginning day 0 hr locations
N=length(idx);
idx=[idx;length(ymdh)+1];
precip=nan(51681,fix(N)); % preallocate on num whole days
k=0;
for i=1:length(idx)-1
  isOK=~any(diff(ymdh(idx(i):idx(i+1)-1,3))) & idx(i+1)-idx(i)==8;
  if ~isOK, continue, end % either aren't eight or it's a wrong day mixed in
  data=zeros(51681,8);      % preallocate based on num lat/lon points
  for j=0:7  % ok, process each of these, they're one day's worth
    data(:,j+1)=csvread(d(idx(i)+j).name,0,2);  % add hourly data
  end
  k=k+1;
  precip(:,k)=sum(data,2);  % the daily sums are in an array 
  fn=d(idx(i)).name;  fn(16:17)=[];  % fn(1:1)='O';
  disp(fn)
  csvwrite(fn,precip(:,k))  
end
precip(:,k+1:end)=[];   % remove columns to leave complete days only

This snippet reads only the third column on the assumption you can easily add the constant position columns as you see fit.

I've never used a netCDF file so on casting it into that form I'm not of much use but the above should provide the basic building block to get the data consolidated by day. Note I specifically avoided cell arrays here and did the summing in place while accumulating the data to keep total memory down to minimal amount...as you've already done, use this inside the higher level loop to traverse the directories.

Unless there aren't multiple partial days, the final value of k will be the same as the floor(L/8) calculation. If there are there will be that many zero columns that can simply remove.

Hope that helps...

30 Comments
Show 28 older commentsHide 28 older comments

Damith on 18 Mar 2015

Edited: Damith on 18 Mar 2015

Open in MATLAB Online

OK. I changed "ix1" to "idx" in the code (See below), and modified

if ~OK, continue, end

to

if ~isOK, continue, end

as OK was undefined.

but I still get an error in the csv read part:

Error using csvread
Too many input arguments.

any idea?

d=dir('TRMM_*_newntcl.csv');
ymdh=cell2mat(textscan([d.name], ...
            'TRMM_%4d_%2d_%2d%2d_newntcl.csv', ...
            'collectoutput',true));
idx=find(ymdh(:,end)==0);  % get the beginning day 0 hr locations
precip=zeros(51681,fix(length(idx/8))); % preallocate on num whole days
k=0;  % counter for valid day column
for i=idx:length(idx)
  isOK=~any(diff(ymdh(idx(i):idx(i+1)-1,3))) & idx(i+1)-idx(i)==8;
  if ~isOK, continue, end % either aren't eight or it's a wrong day mixed in
  for j=0:7  % ok, process each of these, they're one day's worth
    data=zeros(51681,8);      % preallocate based on num lat/lon points
    data(:,j+1)=csvread(d(idx+j).name,0,2);  % add hourly data; reads only 3rd column
  end
  k=k+1;
  precip(:,i)=sum(data,2);  % the daily sums are in an array 
end

dpb on 18 Mar 2015

Edited: dpb on 18 Mar 2015

Open in MATLAB Online

Seems functional here...I copied your file from the link to local...

>> dir TRM*.csv

TRMM_1998_01_0100_newntcl.csv

>> 
>> d=dir('TRM*.csv');
>> for i=1:length(d),data=csvread(d(i).name,0,2);end
>> whos data
Name          Size             Bytes  Class     Attributes
data      51681x1             413448  double              
>> d.name
ans =
TRMM_1998_01_0100_newntcl.csv

How about the section inline from the command window showing everything in context? Maybe can spot something that way I can't see from the above...certainly nothing sticks out to me.

ADDENDUM

for i=idx:length(idx)

Above should be

for i=1:length(idx)

Since idx is a vector, this reassigned the loop index i to that vector so then the argument of d(i) is a multiple group of results instead of just one.

That's a remnant of my initial planning on use i1:i2 as a range and I didn't catch in the cleanup either. The loop is processing each entry in the index vector; the index vector is those locations in the directory list for which the time field is identically zero.

dpb on 19 Mar 2015

Edited: dpb on 19 Mar 2015

Open in MATLAB Online

Oh, yeah, the last 00hrs element won't have a subsequent to test. I don't have time to really think too much about it at the moment but run the loop from 1:length(idx)-1 instead of over the full length.

That change handles all except the end case. So, after the loop finishes on the next-to-last, add one more section that checks to see if the length of the directory listing is greater than idx(end); if so, and is seven more it's another complete (assuming again the days are all the same so it's not just some mishmash of random dates/times that happens to add up to the right number of entries) then run the j loop one more time and increment k one more time and accumulate the sum as well.

Alternatively, you could add an additional if...end clause in the loop for the last iteration and change the test of isOK to the above instead of the existing for that...or you could add one more record as a dummy at the end of the list altho that's probably more effort than either of the other two choices.

Anyway, it's just an end case that needs handling just a little differently than the rest...

ADDENDUM

Woke up this morning and the (almost) trivial fixup came to me...the modification of the index vector is the easiest after all. When create it, simply augment it with the value of length(ymdh)+1 as if were another 00hr entry, but run the outer loop over the original length. Then the +1 index doesn't run over the end on the spacing check but you also don't try to run the loop too many times. See the modification in the ANSWER above...

The changes are...

Add the additional to the end of the index vector...
idx=[idx; length(ymdh)+1];
and change the loop upper limit to length(idx)-1
for i=1:length(idx)-1

and should work.

Damith on 19 Mar 2015

Edited: dpb on 19 Mar 2015

Open in MATLAB Online

test_22.zip

OK. I considered it and the modfied code is below. I tried with reduced number of csv files (18) in attached "test_22" attachment to see whether it sum the precipitation across days. Unfortunately, it does not. "test_22" folder has 18 files starting days from 1998 Jan 01, 1998 Jan 2 and 1998 Jan 3 (incomplete time intervals).

For example, I modified the precipitation data in col 3 for Jan 01 for of 8 csv files. So, the sum of precipitation should be 3.5.

First row -40.13 29.88 of 
1998 Jan 01 (00, 03, 06, 09, 12, 15, 18, 21) = sum(0.5,0.2,0.2,0.1,0.5,1.0,0.8,0.2 ) = 3.5

Please see the below image to see what the code is producing. Code is producing 0.2 (1st col) that represents Jan 01 which is wrong.

I modified here

data=zeros(51681,fix(length(idx/8)));

but still producing the wrong results. Any idea?

d=dir('TRMM_*_newntcl.csv');
ymdh=cell2mat(textscan([d.name], ...
            'TRMM_%4d_%2d_%2d%2d_newntcl.csv', ...
            'collectoutput',true));      
idx=find(ymdh(:,end)==0);  % get the beginning day 0 hr locations
%idx=[idx; length(ymdh)];
precip=zeros(51681,fix(length(idx/8))); % preallocate on num whole days
k=0;  % counter for valid day column
for i=1:length(idx)-1
  isOK=~any(diff(ymdh(idx(i):idx(i+1)-1,3))) & idx(i+1)-idx(i)==8;
  if ~isOK, continue, end % either aren't eight or it's a wrong day mixed in
  for j=0:7  % ok, process each of these, they're one day's worth
    data=zeros(51681,fix(length(idx/8)));      % preallocate based on num lat/lon points
    data(:,j+1)=csvread(d(idx(i)+j).name,0,2);  % add hourly data; reads only 3rd column
  end
  k=k+1;
  precip(:,i)=sum(data,2);  % the daily sums are in an array 
end

dpb on 19 Mar 2015

Edited: dpb on 19 Mar 2015

Open in MATLAB Online

Well, how stupid...the data array is getting reinitialized to zeros every pass thru the j loop so only the last column is ever saved. DOH!!! Can't believe I hadn't noticed that misplacement long before now. Problem with "air code"; often one doesn't pick up on the obvious.

OK, I made a couple of other minor cleanups, and tested it on your subset of data...

>> damith
precip(1:10,:)
ans =
5000    3.0000
5000    3.0000
5000    3.0000
5000    3.0000
5000    3.0000
5000    3.0000
5000    3.0000
5000    3.0000
5000    3.0000
5000    3.0000
>>

That look "more better"???

I updated the script in the Answer section to keep it all together; I just replaced the previous in its entirety rather than try to keep all the iterations...

ADDENDUM

BTW, it seemed to take longer than I'dve thunk it should here so I checked on whether the column-specific read in csvread was the culprit...turns out it's quicker with it than without. I didn't try any other alternative input functions, will leave that as "exercise for the student". :)

I didn't convert the script to a function so that all JIT effects are active, either; that may fix a multitude of sins (or again, may not).

Damith on 19 Mar 2015

Edited: Damith on 19 Mar 2015

Open in MATLAB Online

This is better. I achieved the task like the way below.

Now, is there a way to employ the "csvwrite" function to write the "precip" columns (one by one) corresponding to the "length(idx)-1"? First and second columns (similar to input files) of output files are identical. Or is there any other efficient method to do this?

Basically filename would be 'TRMM_1998_01_01_newntcl.csv' without the time interval.

d=dir('TRMM_*_newntcl.csv');
ymdh=cell2mat(textscan([d.name], ...
            'TRMM_%4d_%2d_%2d%2d_newntcl.csv', ...
            'collectoutput',true));      
idx=find(ymdh(:,end)==0);  % get the beginning day 0 hr locations
idx=[idx; length(ymdh)];
precip=zeros(51681,length(idx/8)-1); % preallocate on num whole days
k=0;  % counter for valid day column
data=zeros(51681,fix(length(idx/8))); 
for i=1:length(idx)-1
  isOK=~any(diff(ymdh(idx(i):idx(i+1)-1,3))) & idx(i+1)-idx(i)==8;
  if ~isOK, continue, end % either aren't eight or it's a wrong day mixed in
  for j=0:7  % ok, process each of these, they're one day's worth
    data(:,j+1)=csvread(d(idx(i)+j).name,0,2);  % add hourly data; reads only 3rd column
  end
  k=k+1;
  precip(:,i)=sum(data,2);  % the daily sums are in an array 
end
precip(:,k+1:end)=[];   % remove columns to leave complete days only

dpb on 19 Mar 2015

Open in MATLAB Online

"I achieved the task like the way below"

...
idx=[idx; length(ymdh)];

The above is wrong; see my code. This will fail if the last day is full because the length computed won't be eight but seven. To prove it to yourself try it with your sample subdirectory if you remove the last two entries and leave 16 instead of 18; you'll only get one day's result instead of two.

If you don't have any reason to have the daily sums as a group there's no point in saving the whole array; I had presumed having those in one location for further analyses would be a primary point of the exercise; continuing to have so many files with so much repetitive data seems counter productive to the max. But, sure, inside the loop on i simply use a substring operation on the file name to remove the time section; doesn't matter which one so may as well use the last when the j loop has completed...

fnameout=d(idx(i)+j).name;  % save to a temporary
fnameout(16:17)=[];         % remove the two hour digits from ddhh
csvwrite(fnameout,[xy precip(:,i)]  % write to a csv file

xy is the array of coordinates saved from an initial read of one file previously, of course, since they're constant across all files.

Again, if this is all you're going to be doing with the sums I'd not even bother to save the precip array at all; just use the result of the sum directly in the concatenation.

I know you've said your goal is a netCDF file but if you're really going to be doing something with these data in Matlab a .mat file would be much smaller and faster keeping stuff together in a larger conglomerate.

Damith on 19 Mar 2015

Open in MATLAB Online

Yes. You are quite correct. I removed the last two files and it produced only the first one. So, I corrected the code like yours. Thanks for correcting me. This shows how ignorant I am in MATLAB.

I tried again to write the output to csv files since I have a another piece of code written in python to convert to a netCDF.

IS your method producing csv file for each day? I need to have a separate csv file for each day? (Only for complete days) For example 'TRMM_1998_01_01_newntcl.csv', 'TRMM_1998_01_02_newntcl.csv'

Anyways, I tried this but give me an error.

fnameout=d(idx(i)+j).name;  % save to a temporary
fnameout(16:17)=[];         % remove the two hour digits from ddhh
csvwrite(fnameout,[xy precip(:,i)])  % write to a csv file
Index exceeds matrix dimensions.

.....................................................

    xy=xlsread('TRMM_1998_01_0100_newntcl.csv', '', 'A1:B51681');
    d=dir('TRMM_*_newntcl.csv');
    ymdh=cell2mat(textscan([d.name], ...
                'TRMM_%4d_%2d_%2d%2d_newntcl.csv', ...
                'collectoutput',true));
    idx=find(ymdh(:,end)==0);  % get the beginning day 0 hr locations
    N=length(idx);             % the unaugmented length
    idx=[idx;length(ymdh)+1];
    precip=nan(51681,fix(N)); % preallocate on number starting days
    k=0;
    for i=1:N
      isOK=~any(diff(ymdh(idx(i):idx(i+1)-1,3))) & idx(i+1)-idx(i)==8;
      if ~isOK, continue, end % either aren't eight or a wrong day mixed in
      data=zeros(51681,8);      % preallocate based on num lat/lon points
      for j=0:7  % ok, process each of these, they're one day's worth
        data(:,j+1)=csvread(d(idx(i)+j).name,0,2);  % add hourly data
      end
      k=k+1;
      precip(:,k)=sum(data,2);  % the daily sums are in an array 
    end
    precip(:,k+1:end)=[];   % remove columns to leave complete days only
fnameout=d(idx(i)+j).name;  % save to a temporary
fnameout(16:17)=[];         % remove the two hour digits from ddhh
csvwrite(fnameout,[xy precip(:,i)])  % write to a csv file

Damith on 20 Mar 2015

Edited: Damith on 20 Mar 2015

Open in MATLAB Online

Bingo, it worked.:) I modified your code a bit. See the code below.

But one problem, I just noticed when I ran this code your for the folder where I had 365 files from 1998 Jan 01 to 1998 Feb 14 (for complete days) this code only produced the output csv files only upto 1998 Jan 31. Code ran smoothly but no error message but no daily output from 1998 Feb 01 to 1998 Feb 14.

Could not figure out why? Any idea?

If I need to obtain the monthly and yearly sum, how can I modify within the loop. Again, I need csv output for each month and year.? Rememver I have years from 1998 to 2014.

    xy=xlsread('TRMM_1998_01_0100_newntcl.csv', '', 'A1:B51681');
    d=dir('TRMM_*_newntcl.csv');
    ymdh=cell2mat(textscan([d.name], ...
                'TRMM_%4d_%2d_%2d%2d_newntcl.csv', ...
                'collectoutput',true));
    idx=find(ymdh(:,end)==0);  % get the beginning day 0 hr locations
    N=length(idx);             % the unaugmented length
    idx=[idx;length(ymdh)+1];
    precip=nan(51681,fix(N)); % preallocate on number starting days
    k=0;
    for i=1:N
    isOK=~any(diff(ymdh(idx(i):idx(i+1)-1,3))) & idx(i+1)-idx(i)==8;
    if ~isOK, continue, end
    data=zeros(51681,8);
    for j=0:7
      data(:,j+1)=csvread(d(idx(i)+j).name,0,2);
    end
    fname=d(idx(i)+j).name; fname(16:17)=[];
    csvwrite(fname,[xy,sum(data,2)])
    end

dpb on 20 Mar 2015

Well, the first thing to do is to check what the list of files returned is and the length of the index vector--firstly, does it have the proper number of entries and then secondly is all(diff(idx)==8) true (1) or false (0)? If false, how many are true (sum(diff(idx)))? That's how many would be at least candidates. Double check that the directory returned is, indeed, sorted and that the names are such that the sorting is correct.

After that, don't see a particular reason otomh, but it surely can't be too complex if you do some diagnostic checking.

As for monthly and yearly sums, you're changing ground rules--I don't know how many times I mentioned "if this is all you're going to do" with the data and you never said word boo! it wasn't... :(

In that case you've got to start accumulating the daily sums again and keeping them such that you can then sum over them over the proper dimensions. Question I'd have is, however, how do you define what to sum over for a month and a year--do you need complete data sets for all days of the month, for example, since we've got cases here where we've got missing days since we ignored days for which didn't have a complete set of 3-hour observations.

dpb on 20 Mar 2015

"...to obtain the monthly and yearly sum ..."

OK, it's just an extension of the above idea but as noted above you've got to decide what it is you want to do about the missing cases. Assuming the data are complete-enough you can do as above and ignore the missing entirely, something like the following logic should work...

Within each subdirectory with this loop you've got a year's worth of data if it is complete.

While processing the above loop for each day, we've built an array of preciptation values by day for complete days but incomplete days were skipped over entirely and the array has no space for them. If, otoh, we built a 3D array that is 31 columns wide for the longest months and 12 planes deep for each month and stored the valid daily data in the proper column then can simply sum over the 2nd dimension (columns) for the monthly sums and over the planes or the yearly. Now whether either of these is valid is dependent upon whether there are any "holes" in it or not and how you want to treat those.

Unless you also want to aggregate over multiple years, that's all you have to do other than simply iterate the above over the various subdirectories.

dpb on 23 Mar 2015

Edited: dpb on 24 Mar 2015

Open in MATLAB Online

Maybe once you accepted the first on the dates it won't let you accept a 2nd answer, too, maybe? I hadn't tried, I only respond rarely ask questions (and the ones I have have been rather esoteric so don't necessarily get answers... :) )

That was discussed way up earlier in response to your first question after the posting of the code:

"...Does it removes/disregard partial days"

Yes, that's what the logic test checks for--that there are eight sets with the same day consecutively in the grouping. If not, the continue goes on to the next 0 hours location.

It's looking at the 3rd column (days) of the ymdh array and checking that there are no differences >0 amongst those in the group between the current 0-hr entry and the next and then that there are a total of eight between groups. The first is the same as

all(ymdh(idx(i):idx(i+1)-1,3)==ymdh(idx(i),3))

I just chose to check the differences were all zero instead of that all were the same value as the first, explicitly. It's the same end result.

If you're going to process the daily files for complete days I don't think that's the place to be modifying...you should be able to simply process the daily data from the precip array as mentioned before after accumulating it for each day (again, presuming you keep track as suggested of the actual day of month instead of just incrementing a counter and store in the column based on day.

Damith on 24 Mar 2015

Thanks.

I was not successful in implementing this and simply do not know where and how to start with the precip array. Some coding help/hint will energize me. I need monthly and annual values for complete sets of data.

Your help is appreciated again.

dpb on 24 Mar 2015

Go back an reread the comments starting on Mar 20@17:45 where I described how to build the 3D array of daily totals that you can subsequently simply sum by column for each plane and by plane for monthly and yearly totals for each year. What's the issue in doing that, it's simply modifying the size of the precip array initially and then using the ymdh array info on day and month to set the indices instead of just incrementing a counter k as doing now.

Sign in to comment.

Answer 3

dpb on 24 Mar 2015

0
Link

Direct link to this answer

https://www.mathworks.com/matlabcentral/answers/182710-creating-a-matrix-after-giving-initial-conditions#answer_172554

Edited: dpb on 29 Mar 2015

Open in MATLAB Online

d=dir('TRMM_*_newntcl.csv');
ymdh=cell2mat(textscan([d.name], ...
          'TRMM_%4d_%2d_%2d%2d_newntcl.csv', ...
          'collectoutput',true));
idx=find(ymdh(:,end)==0);  % get the beginning day 0 hr locations
N=length(idx);
idx=[idx;length(ymdh)+1];
precip=nan(51681,31,12); % preallocate on num whole days
%fprintf('Building daily sums for %d: \n',ymdh(1,1))
dispstat('','init')
for i=1:length(idx)-1
  isOK=~any(diff(ymdh(idx(i):idx(i+1)-1,3))) & idx(i+1)-idx(i)==8;
  if ~isOK, continue, end % either aren't eight or it's a wrong day mixed in
  %msg=sprintf('  Month: %2d  Day: %2d',ymdh(idx(i),2:3));
  %dispstat(msg,'keepthis')
  data=zeros(51681,8);      % preallocate based on num lat/lon points
  for j=0:7  % ok, process each of these, they're one day's worth
    %msg=sprintf('Reading hourly file: %s: ',d(idx(i)+j).name);
    %dispstat(msg)
    data(:,j+1)=csvread(d(idx(i)+j).name,0,2);  % add hourly data
  end
  mo=ymdh(idx(i),2);  % this month
  da=ymdh(idx(i),3);  % this day
  precip(:,da,mo)=sum(data,2);
  fn=d(idx(i)).name;  fn(16:17)=[]; fn(1:1)='O';
  %disp(fn)
  csvwrite(fn,precip(:,k))  
end
mTot=squeeze(sum(precip,2));   % monthly totals in each column 1-12
yTot=sum(mTot,2);              % yearly total

Now just write mTot and yTot to files as you want them. As noted before, you'll probably want to do some checking also on whether there's data in every day of the month before accumulating that total and then for the months before the yearly total or you may have totals that aren't actually representative of the entire month/year.

Using nan to initialize instead of zero is one way, then any sums with NaN as a value are indicative altho there you've got to fix up the detail of how many days are in each month but that's pretty simple bookkeeping...

30 Comments
Show 28 older commentsHide 28 older comments

Damith on 24 Mar 2015

Open in MATLAB Online

OK. I get this error just after running the code below. I am debugging the code to understand what's happening. I think it's I think it's

precip(:,d,m)=sum(data,2); 
Improper index matrix reference.
clear all
cd ('C:\Users\Desktop\test_1998')
%xy=xlsread('TRMM_1998_01_0100_newntcl.csv', '', 'A1:B51681');
d=dir('TRMM_*_newntcl.csv');
ymdh=cell2mat(textscan([d.name], ...
            'TRMM_%4d_%2d_%2d%2d_newntcl.csv', ...
            'collectoutput',true));
idx=find(ymdh(:,end)==0);  % get the beginning day 0 hr locations
N=length(idx);
idx=[idx;length(ymdh)+1];
precip=nan(51681,31,12); % preallocate on max days/mo by 12 mo/yr
k=0;
for i=1:length(idx)-1
  isOK=~any(diff(ymdh(idx(i):idx(i+1)-1,3))) & idx(i+1)-idx(i)==8;
  if ~isOK, continue, end % either aren't eight or it's a wrong day mixed in
  data=zeros(51681,8);      % preallocate based on num lat/lon points
  for j=0:7  % ok, process each of these, they're one day's worth
    data(:,j+1)=csvread(d(idx(i)+j).name,0,2);  % add hourly data
  end
  m=ymdh(idx(i),2);  % this month
  d=ymdh(idx(i),3);  % this day
  precip(:,d,m)=sum(data,2);  % the daily sums in column day of month
  fn=d(idx(i)).name;  fn(16:17)=[];  % fn(1:1)='O';
  disp(fn)
  csvwrite(fn,precip(:,k))  
end
mTot=squeeze(sum(precip,2));   % monthly totals in each column 1-12
yTot=sum(mTot,2);              % yearly total

Damith on 24 Mar 2015

Edited: Damith on 24 Mar 2015

Open in MATLAB Online

I modified the code as shown below and checked the outputs of mTot and yTot and the results seems fine. But I want to write the monthly outputs as csv files. I tried the code below the for loop at the end but did not write the csv file. Any idea?

clear all
cd ('C:\Users\Desktop\test_1999')
dd=dir('TRMM_*_newntcl.csv');
ymdh=cell2mat(textscan([dd.name], ...
            'TRMM_%4d_%2d_%2d%2d_newntcl.csv', ...
            'collectoutput',true));
idx=find(ymdh(:,end)==0);  % get the beginning day 0 hr locations
N=length(idx);
idx=[idx;length(ymdh)+1];
precip=nan(51681,31,12); % preallocate on max days/mo by 12 mo/yr
k=0;
kk=0;
for i=1:length(idx)-1
  isOK=~any(diff(ymdh(idx(i):idx(i+1)-1,3))) & idx(i+1)-idx(i)==8;
  if ~isOK, continue, end % either aren't eight or it's a wrong day mixed in
  data=zeros(51681,8);      % preallocate based on num lat/lon points
  for j=0:7  % ok, process each of these, they're one day's worth
    data(:,j+1)=csvread(dd(idx(i)+j).name,0,2);  % add hourly data
  end
  k=k+1;
  m=ymdh(idx(i),2);  % this month
  d=ymdh(idx(i),3);  % this day
  precip(:,d,m)=sum(data,2);  % the daily sums in column day of month
  fn=dd(idx(i)).name;  fn(16:17)=[];  % fn(1:1)='O';
  disp(fn)
  csvwrite(fn,precip(:,k))  
end
for ii=1:length(m)-1
  mTot=squeeze(sum(precip,2));   % monthly totals in each column 1-12
  fnn=dd(idx(ii)).name;  fnn(13:17)=[];  % fn(1:1)='O';
  kk=kk+1;
  disp(fnn)
  csvwrite(fnn,mTot(:,kk))
end
yTot=sum(mTot,2);

Damith on 25 Mar 2015

Edited: Damith on 29 Mar 2015

Open in MATLAB Online

Thanks again.

I have made mistake previously. See the code below. I think I am having a problem here. I tested the code with 'test_2003' data and looked at the mTot matrix (see the image below). The code does not produce the monthly sum for the months of 2 and 4 (see the image below).

Please download 'test_2003' data at :

https://bft.usu.edu/r7ht8

My problem to obtain the file name to loop through with year and month write to csv files from the _mTot_matrix.

clear all
cd ('C:\Users\Desktop\test_2003')
xy=xlsread('TRMM_2003_01_0100_newntcl.csv', '', 'A1:B51681');
d=dir('TRMM_*_newntcl.csv');
ymdh=cell2mat(textscan([d.name], ...
            'TRMM_%4d_%2d_%2d%2d_newntcl.csv', ...
            'collectoutput',true));
idx=find(ymdh(:,end)==0);  % get the beginning day 0 hr locations
N=length(idx);
idx=[idx;length(ymdh)+1];
precip=nan(51681,31,12); % preallocate on max days/mo by 12 mo/yr
k=0;
for i=1:length(idx)-1
  isOK=~any(diff(ymdh(idx(i):idx(i+1)-1,3))) & idx(i+1)-idx(i)==8;
  if ~isOK, continue, end % either aren't eight or it's a wrong day mixed in
  data=zeros(51681,8);      % preallocate based on num lat/lon points
  for j=0:7  % ok, process each of these, they're one day's worth
    data(:,j+1)=csvread(d(idx(i)+j).name,0,2);  % add hourly data
  end
  k=k+1;
  mo=ymdh(idx(i),2);  % this month
  da=ymdh(idx(i),3);  % this day
  precip(:,da,mo)=sum(data,2);  % the daily sums in column day of month
  fn=d(idx(i)).name;  fn(16:17)=[];  % fn(1:1)='O';
  disp(fn)
  csvwrite(fn,[xy precip(:,k)])  
end
mTot=squeeze(sum(precip,2));

dpb on 25 Mar 2015

Edited: dpb on 25 Mar 2015

"...does not produce the monthly sum for the months of 2 and 4..."

That would indicate there's missing values at least one day within those months. That's what I outlined just a couple of comments ago that you've got to decide how to treat missing data--

"Again, you'll have to decide how to deal with the issue of missing data somehow; perhaps that's not an issue in general but the samples you've given do have at least one day missing a 3-hr period that was ignored so that means by similarity that that corresponding month/year is missing a day and hence also is missing which means that the year altogether would be missing. That seems like probably too onerous of a criterion but I can't tell you how to treat missing values; that's a problem you've got to address."

As that note says, if you carry thru to the higher levels the same logic of ignoring a missing hour or two meaning to ignore that day, then if that day is missing, the corresponding month isn't complete and so that month will be missing. Consequently, at the next step, that month is missing and so the year will be incomplete and the year will also be missing.

I don't (and can't) know what you want to do about missing values...you can pretend they're zero, or try to impute something for those missing times or you can simply live with the resulting holes. The data are what they are, I don't think there's anything wrong with the code altho not having that particular set of data I can't reproduce it here (and won't, this is, after all in the end, your research, not mine...)

If you want to "just see what happens" you can either

Use zeros instead of nan to initialize the array which is the "assume zero" option, or
Use nansum if have the Stat Toolbox to only sum the finite values (which is, in essence, the same assumption, just computed differently and that you do have the info to know which are and are not complete datasets that gets lost the other way)

Alternatively, you could go back and forget about the "missing hours" test and at least pick up partial days where they do exist, but I expect there will probably still be some holes where a full day may have been uncollected.

It's not necessarily an easy problem to solve cleanly...missing data are a pita albeit a fact of life. Again, I can't tell you what is most appropriate for you to do...

dpb on 25 Mar 2015

Edited: dpb on 26 Mar 2015

"due to NaN for months which do not have 31 days..."

Ah, yes. I didn't include it explicitly but my thinking was that one would use another logical array that overlays the rectangular array that is T for existing days in year by month. It would automagically account for leap year by looking at the year.

I tend to simply assume an implementer/end user will do the dirty work of finishing off a basic idea thrown together here, sorry...

ADDENDUM

I presume you've got this all working by now, and I've not done anything else beyond this point to sand off the rough edges but it came to me that perhaps the "clever" way is while doing the loop on days to zero fill the particular column/plane if there is data; then the summation will work as wanted for existing data but you'll still have the NaN indication for the missing locations. Still don't know how significant that's going to be in the larger scheme of things, but it's a starting point that will let you know rather than hiding a potential problem if simply either all zero fill or just nansum w/o checking.

Damith on 26 Mar 2015

Open in MATLAB Online

@dbp,

I was checking the daily output files again for year 2004 and I just noticed that the daily precipitation values calculated using the previous code (i.e. Code 1 below) is correct opposed to the code posted later with sum for monthly (i.e. Code 2). Therefore, I am extending the code 1 to calculate montly sum. Any comments?

Code 1:

d=dir('TRMM_*_newntcl.csv');
ymdh=cell2mat(textscan([d.name], ...
            'TRMM_%4d_%2d_%2d%2d_newntcl.csv', ...
            'collectoutput',true));
idx=find(ymdh(:,end)==0);  % get the beginning day 0 hr locations
N=length(idx);
idx=[idx;length(ymdh)+1];
precip=nan(51681,fix(N)); % preallocate on num whole days
k=0;
for i=1:length(idx)-1
  isOK=~any(diff(ymdh(idx(i):idx(i+1)-1,3))) & idx(i+1)-idx(i)==8;
  if ~isOK, continue, end % either aren't eight or it's a wrong day mixed in
  data=zeros(51681,8);      % preallocate based on num lat/lon points
  for j=0:7  % ok, process each of these, they're one day's worth
    data(:,j+1)=csvread(d(idx(i)+j).name,0,2);  % add hourly data
  end
  k=k+1;
  precip(:,k)=sum(data,2);  % the daily sums are in an array 
  fn=d(idx(i)).name;  fn(16:17)=[];  %fn(1:1)='O';
  disp(fn)
  csvwrite(fn,[xy precip(:,k)])  
end
precip(:,k+1:end)=[];   % remove columns to leave complete days only

Code 2:

d=dir('TRMM_*_newntcl.csv');
ymdh=cell2mat(textscan([d.name], ...
            'TRMM_%4d_%2d_%2d%2d_newntcl.csv', ...
            'collectoutput',true));
idx=find(ymdh(:,end)==0);  % get the beginning day 0 hr locations
N=length(idx);
idx=[idx;length(ymdh)+1];
precip=nan(51681,31,12); % preallocate on max days/mo by 12 mo/yr
k=0;
for i=1:length(idx)-1
  isOK=~any(diff(ymdh(idx(i):idx(i+1)-1,3))) & idx(i+1)-idx(i)==8;
  if ~isOK, continue, end % either aren't eight or it's a wrong day mixed in
  data=zeros(51681,8);      % preallocate based on num lat/lon points
  for j=0:7  % ok, process each of these, they're one day's worth
    data(:,j+1)=csvread(d(idx(i)+j).name,0,2);  % add hourly data
  end
  k=k+1;
  mo=ymdh(idx(i),2);  % this month
  da=ymdh(idx(i),3);  % this day
  precip(:,da,mo)=sum(data,2);  % the daily sums in column day of month
  fn=d(idx(i)).name;  fn(16:17)=[];  % fn(1:1)='O';
  disp(fn)
  csvwrite(fn,[xy precip(:,k)])  
end

dpb on 27 Mar 2015

Edited: dpb on 27 Mar 2015

The difference in the two is that the first simply increments a counter for the column irrespective of whether there is or is not data specifically for that day. The latter, otoh, puts the daily data into the column based on the day of the month.

ADDENDUM

Also remember that the former adds a new column for the same day for the subsequent month whereas the latter puts the total for the day in the same column but in the subsequent plane based on the actual month, 1-12.

I'd probably automate some checks--as suggested once very early on, you could add two rows to the first solution and keep the month and day for each column k in them. Then, use those when traversing the columns to look up the column/plane intersection of the second and see that all() of those corresponding elements are the same. I'm virtually certain it'll check out ok. END

I don't see any reason there should be any other difference between the two nor why there should be any problem in storing the data in the 3D array. Use the debugger to step through a few steps to satisfy yourself the indexing is correct but I'd first suspect k isn't the same as the day value is the cause of any discrepancy.

I suppose it's possible there's some other logic flaw but in all the testing I did here I didn't see any sign of it.

dpb on 27 Mar 2015

Edited: dpb on 27 Mar 2015

Open in MATLAB Online

OK, I did the above test...I modified the original version using simply an index by adding

moda=zeros(2,N); % allocate space for da, mo

and inside the loop when the precip() array is computed and stored, added

mo=ymdh(idx(i),2);  % this month
da=ymdh(idx(i),3);  % this day
moda(1,k)=mo; moda(2,k)=da; % save the date

which is the same logic as the second for the month and day so that's consistent with its storage. I then saved this set of precipitation totals as precip1 to not overwrite the 3D precip array and ran the two versions on the largest set of data I have here which is that for 1998.

After that, I wrote the following test at the command line --

>> for i=1:length(moda)
  if ~all(precip1(:,i)==precip(:,moda(2,k),moda(1,k))),
      fprintf('Mo%d Da%d',moda(:,i)),end,end
>>

Note I got no output so that there were no discrepancies found(*); the storage in the 3D array is, commensurate to that in the 2D array and the data in the corresponding vectors when retrieved for the same month and date are identical, as expected.

I see no reason the 3D won't work correctly for every data set as written.

() Just to prove to meself the logic was actually doing what intended, I did things like put a counter in an *else clause to calculate the number of matches and checked that it did in the end equal the length of the array (45 in this case).

Damith on 27 Mar 2015

Edited: Damith on 27 Mar 2015

Open in MATLAB Online

OK. Are you using the daily csv files to read and calculate monthly? or simply not creating a 3D array?

What is precip1 in the abobe code? or what does this do:

for i=1:length(moda)
  if ~all(precip1(:,i)==precip(:,moda(2,k),moda(1,k)))

Can I simply replace precip(:,da,mo)=sum(data,2); by precip(:,moda(2,k),moda(1,k)))

Do you think I can still use the squeeze function to calculate monthly with precip

d=dir('TRMM_*_newntcl.csv');
ymdh=cell2mat(textscan([d.name], ...
            'TRMM_%4d_%2d_%2d%2d_newntcl.csv', ...
            'collectoutput',true));
idx=find(ymdh(:,end)==0);  % get the beginning day 0 hr locations
N=length(idx);
idx=[idx;length(ymdh)+1];
precip=nan(51681,31,12); % preallocate on max days/mo by 12 mo/yr
k=0;
moda=zeros(2,N);  % allocate space for da, mo
for i=1:length(idx)-1
  isOK=~any(diff(ymdh(idx(i):idx(i+1)-1,3))) & idx(i+1)-idx(i)==8;
  if ~isOK, continue, end % either aren't eight or it's a wrong day mixed in
  data=zeros(51681,8);      % preallocate based on num lat/lon points
  for j=0:7  % ok, process each of these, they're one day's worth
    data(:,j+1)=csvread(d(idx(i)+j).name,0,2);  % add hourly data
  end
  k=k+1;
  mo=ymdh(idx(i),2);  % this month
  da=ymdh(idx(i),3);  % this day
  moda(1,k)=mo; moda(2,k)=da; % save the date
  precip(:,da,mo)=sum(data,2);  % the daily sums in column day of month
  fn=d(idx(i)).name;  fn(16:17)=[];  % fn(1:1)='O';
  disp(fn)
  csvwrite(fn,[xy precip(:,k)])  
end

dpb on 27 Mar 2015

Edited: dpb on 27 Mar 2015

As said, I ran two scripts...the one that uses the incremented column irrespective of what the day really is and the second that builds the the 3D array; so that I still had both in memory when they were done, I just renamed precip in the former to precip1.

The if simply tests that all values in the column in the 3D array for the specific month and day are identical to those for the same month and day in the 2D array accounting for where the two are stored relative to each other. As noted, the linear column addressing doesn't account for which month/day it is, specifically, it just adds another column if there are data; if, for example, there's a day missing for Jan 2 of the first year then Jan 3 will be in column 2 and everybody else will also move up one. The same thing happens if there's any other missing days; the subsequent then will be that many days shifted from the nominal position of day-of-year. OTOH, the 3D solution has the advantage it does put all the data in the right location automagically, each 3rd day of the month in column 3, each month of the year in plane M for the correct month, 1 thru 12. The indexing in that loop simply extracts the day and month that were saved for each of the k columns and uses that to translate into the 3D array to get the same actual month/day data which are, of course, then shown to be identical.

The result of the test is to show that the logic building the 3D array works correctly; I don't see why you would want to do anything different...whatever the issue you've got has to be related to the fact that the columns you're comparing aren't the ones that should be compared owing to the storage location absolute column number difference outlined.

ADDENDUM

Both scripts computed the results independently from the .csv files; yes, that was the point of the exercise--to prove that the results are, in fact, identical if you compare the correct columns to each other.

Damith on 27 Mar 2015

Edited: Damith on 27 Mar 2015

Open in MATLAB Online

OK. So, I modified the code as shown below. squeeze doesn't do the correct job. I calculated the monthly values separately and the values in mTot is different after 1st month (i.e. Jan).

I used zeros for precip initialization (see the code below) and used sum instead of namsum. I noticed the problem now is the squeeze function. It does not produce the expected outcome. Is there a way to get the sum from a 3D array differently w/out using squeeze function.?

d=dir('TRMM_*_newntcl.csv');
ymdh=cell2mat(textscan([d.name], ...
            'TRMM_%4d_%2d_%2d%2d_newntcl.csv', ...
            'collectoutput',true));
idx=find(ymdh(:,end)==0);  % get the beginning day 0 hr locations
N=length(idx);
idx=[idx;length(ymdh)+1];
precip=zeros(51681,31,12); % preallocate on max days/mo by 12 mo/yr
k=0;
moda=zeros(2,N);  % allocate space for da, mo
for i=1:length(idx)-1
  isOK=~any(diff(ymdh(idx(i):idx(i+1)-1,3))) & idx(i+1)-idx(i)==8;
  if ~isOK, continue, end % either aren't eight or it's a wrong day mixed in
  data=zeros(51681,8);      % preallocate based on num lat/lon points
  for j=0:7  % ok, process each of these, they're one day's worth
    data(:,j+1)=csvread(d(idx(i)+j).name,0,2);  % add hourly data
  end
  k=k+1;
  mo=ymdh(idx(i),2);  % this month
  da=ymdh(idx(i),3);  % this day
  moda(1,k)=mo; moda(2,k)=da; % save the date
  %precip(:,da,mo)=sum(data,2);  % the daily sums in column day of month
  precip(:,moda(2,k),moda(1,k))=sum(data,2);  % the daily sums in column day of month
  fn=d(idx(i)).name;  fn(16:17)=[];  % fn(1:1)='O';
  disp(fn)
  csvwrite(fn,[xy precip(:,k)])  
end
mTot=squeeze(sum(precip,2));

dpb on 27 Mar 2015

Open in MATLAB Online

Well, w/o trying to debug what you've modified needlessly, my first guess is that the change you made is wrong. You're now mixing up the k indexing with the 3D and I'm betting it's not consistent that way w/o really working on it which I'm not spending time doing since the original 3D version was shown to be correct above. If you're going to use the 3D array, use it; if not do whatever you want however you see fit.

With the properly aligned 3D array wherein the months are all in the same column on top of each other, squeeze after the summation does "do the right thing"; it collapses the resulting one dimension leaving a subsequent 2D array.

To illustrate, we'll use some dummy data on a smaller scale; the principle is identical...

>> p=repmat([0.1:0.1:0.7],10,1);   % small set of a vector to 2D
>> for i=2:3, p(:,:,i)=i*p(:,:,1); % add two more planes for months
>> p
p(:,:,1) =
1000    0.2000    0.3000    0.4000    0.5000    0.6000    0.7000
1000    0.2000    0.3000    0.4000    0.5000    0.6000    0.7000
1000    0.2000    0.3000    0.4000    0.5000    0.6000    0.7000
    ...
1000    0.2000    0.3000    0.4000    0.5000    0.6000    0.7000
1000    0.2000    0.3000    0.4000    0.5000    0.6000    0.7000
p(:,:,2) =
2000    0.4000    0.6000    0.8000    1.0000    1.2000    1.4000
2000    0.4000    0.6000    0.8000    1.0000    1.2000    1.4000
2000    0.4000    0.6000    0.8000    1.0000    1.2000    1.4000
    ...
2000    0.4000    0.6000    0.8000    1.0000    1.2000    1.4000
2000    0.4000    0.6000    0.8000    1.0000    1.2000    1.4000
2000    0.4000    0.6000    0.8000    1.0000    1.2000    1.4000
p(:,:,3) =
3000    0.6000    0.9000    1.2000    1.5000    1.8000    2.1000
3000    0.6000    0.9000    1.2000    1.5000    1.8000    2.1000
3000    0.6000    0.9000    1.2000    1.5000    1.8000    2.1000
3000    0.6000    0.9000    1.2000    1.5000    1.8000    2.1000
3000    0.6000    0.9000    1.2000    1.5000    1.8000    2.1000
3000    0.6000    0.9000    1.2000    1.5000    1.8000    2.1000
3000    0.6000    0.9000    1.2000    1.5000    1.8000    2.1000
3000    0.6000    0.9000    1.2000    1.5000    1.8000    2.1000
3000    0.6000    0.9000    1.2000    1.5000    1.8000    2.1000
3000    0.6000    0.9000    1.2000    1.5000    1.8000    2.1000
>>

So, above we've got a small test case of 7 days by 3 months...let's create the "monthly" and "yearly" sums...

>> mTot=sum(p,2)
mTot(:,:,1) =
    2.8000
    2.8000
    ...
    2.8000
    2.8000
    2.8000
    2.8000
mTot(:,:,2) =
    5.6000
    5.6000
    5.6000
    ...
    5.6000
mTot(:,:,3) =
    8.4000
    8.4000
    ...
    8.4000
    8.4000
>> whos mTot
  Name       Size             Bytes  Class     Attributes
    mTot      10x1x3              240  double

OK, the above is obviously the correct sum for the seven columns but it's spread out over three planes with the values for each coordinate a vector in that plane. That's inconvenient so instead

>> mTot=squeeze(mTot)
mTot =
8000    5.6000    8.4000
8000    5.6000    8.4000
8000    5.6000    8.4000
8000    5.6000    8.4000
8000    5.6000    8.4000
8000    5.6000    8.4000
8000    5.6000    8.4000
8000    5.6000    8.4000
8000    5.6000    8.4000
8000    5.6000    8.4000
>>

So the squeeze operation removes the singleton second subscript and leaves us with a 2D array of the same values where now the 2nd dimension is that of the months. It's identical in result whether use two steps as above or collapse into one as I did in the original code.

Now, the yearly total is again the sum across the columns which will leave a single vector of length of the number of coordinates.

>> yTot=sum(mTot,2);
yTot =
8000
8000
8000
8000
8000
8000
8000
8000
8000
8000
>>

The size of the arrays is absolutely immaterial to the shape operations as outlined above the only difference in this example and the real thing is the size of the arrays; the resulting shape after the summation and dimension reduction is identical and is, as shown above, indeed what you want/need.

The only "trick" is again treating the shorter months and missing days; as I suggested if you fill the full column with zeros initially for each day that has data, then the only NaN left will be those for missing days and sum will reflect that in the totals.

Damith on 27 Mar 2015

Edited: Damith on 29 Mar 2015

Open in MATLAB Online

@dbp,

So, frustrated, I checked the output using a simple script to observe the daily and monthly outputs. I am sorry to bother you so much but I am kind of frustrated since the code below still gives me the wrong outputs. PLEASE DOWNLOAD THE DATA FROM THE LINK ATTACHED AND TRY IT. (data for year 2004, my intention is to run this code yearly, i.e. in one folder there are no mixed years).

So, please do the necessary changes to code below.

d=dir('TRMM_*_newntcl.csv');
ymdh=cell2mat(textscan([d.name], ...
            'TRMM_%4d_%2d_%2d%2d_newntcl.csv', ...
            'collectoutput',true));
idx=find(ymdh(:,end)==0);  % get the beginning day 0 hr locations
N=length(idx);
idx=[idx;length(ymdh)+1];
precip=zeros(51681,31,12); % preallocate on max days/mo by 12 mo/yr
k=0;
moda=zeros(2,N);  % allocate space for da, mo
for i=1:length(idx)-1
  isOK=~any(diff(ymdh(idx(i):idx(i+1)-1,3))) & idx(i+1)-idx(i)==8;
  if ~isOK, continue, end % either aren't eight or it's a wrong day mixed in
  data=zeros(51681,8);      % preallocate based on num lat/lon points
  for j=0:7  % ok, process each of these, they're one day's worth
    data(:,j+1)=csvread(d(idx(i)+j).name,0,2);  % add hourly data
  end
  k=k+1;
  mo=ymdh(idx(i),2);  % this month
  da=ymdh(idx(i),3);  % this day
  moda(1,k)=mo; moda(2,k)=da; % save the date
  precip(:,da,mo)=sum(data,2);  % the daily sums in column day of month
  fn=d(idx(i)).name;  fn(16:17)=[];  % fn(1:1)='O';
  disp(fn)
  csvwrite(fn,[xy precip(:,k)])  
end

I checked the daily and monthly outputs using the code below from the precip 3D array and compared with the sum of data directly read from the csv (sub-daily) files. I compared the mTot_Jan, mTot_Feb etc with the values I directly read from the sub-daily csv files. So, the monthly values are WRONG for some months.

    for iJan=1:31
        JanT(:,iJan)=precip(:,iJan,1);
    end
    if(mod(ymdh(1,1),4)==0)
       mend=29;
    else mend=28;
    end   
    for iFeb=1:mend
        FebT(:,iFeb)=precip(:,iFeb,2);
    end
    for iMar=1:31
        MarT(:,iMar)=precip(:,iMar,3);
    end
    for iApr=1:30
        AprT(:,iApr)=precip(:,iApr,4);
    end
    for iMay=1:31
        MayT(:,iMay)=precip(:,iMay,5);
    end
    for iJun=1:30
        JunT(:,iJun)=precip(:,iJun,6);
    end
    for iJul=1:31
        JulT(:,iJul)=precip(:,iJul,7);
    end
    for iAug=1:31
        AugT(:,iAug)=precip(:,iAug,8);
    end
    for iSep=1:30
        SepT(:,iSep)=precip(:,iSep,9);
    end
    for iOct=1:31
        OctT(:,iOct)=precip(:,iOct,10);
    end
    for iNov=1:30
        NovT(:,iNov)=precip(:,iNov,11);
    end
    for iDec=1:31
        DecT(:,iDec)=precip(:,iDec,12);
    end
    mTot_Jan=sum(JanT,2);
    mTot_Feb=sum(FebT,2);
    mTot_Mar=sum(MarT,2);
    mTot_Apr=sum(AprT,2);
    mTot_May=sum(MayT,2);
    mTot_Jun=sum(JunT,2);
    mTot_Jul=sum(JulT,2);
    mTot_Aug=sum(AugT,2);
    mTot_Sep=sum(SepT,2);
    mTot_Oct=sum(OctT,2);
    mTot_Nov=sum(NovT,2);
    mTot_Dec=sum(DecT,2);

dpb on 28 Mar 2015

Edited: dpb on 28 Mar 2015

Open in MATLAB Online

I ran the identical code as that I had in the updated "Answer" http://www.mathworks.com/matlabcentral/answers/182710-creating-a-matrix-after-giving-initial-conditions#answer_172554 with the exception of a progress indicator of which file was being processed and the day being built which have no bearing on the actual logic and looked at a particular location...the first node for all days for the first month...

>> precip(1,:,1)
ans =
Columns 1 through 9
     0  0.0400       0       0       0  0.9400       0       0       0
Columns 10 through 18
     0       0       0  0.3900  0.0100       0       0       0       0
Columns 19 through 27
     0       0  0.1600       0       0  8.8100  0.0200  2.1000       0
Columns 28 through 31
     0  0.0200       0       0

That's the first month...day 24 seems interesting; it's got a sizable total accumulation so let's see if it's ok...

>> d24=dir('TRMM_2004_01_24*_newntcl.csv');  % return all files for Jan24
>> tot=0;for i=1:length(d24)
d=csvread(d24(i).name,0,2); 
tot=tot+d(1); disp([d(1) tot]),end
     0     0
    0.1700    0.1700
    8.6400    8.8100
         0    8.8100
         0    8.8100
         0    8.8100
         0    8.8100
         0    8.8100
>>

Voila!! They match. That a given day/month matches since it's a generic routine seems pretty conclusive to me there's not a logic error in the building of the 3D array nor the summations. I have essentially 100% confidence the same result would hold for any combination of node and day and month. I can still only guess that you've got something amiss in the indexing.

Note there are only six months included and only the first 24 days of that month are complete in the data that was in the download unless if you somehow have additional data other than that and are getting some other values that aren't in the processing by a file-naming issue or such.

ADDENDUM

Oh, the other thing that would screw up a comparison would be to include the values from the incomplete June 25 data in your totals as compared to the processed data that requires a full day (or any other missing days; I glanced at the list as it was processed and it looked complete but I didn't check that religiously; it's possible there's another incomplete day in there that I didn't notice...but if you aren't culling out those incomplete days your comparisons will be different assuming there's any accumulation in one of those hourly files that was discarded because the day wasn't complete.

If that were the case then we're back to the previous discussion of your having to decide how you want to treat them.

ADDENDUM 2

Just to check on the monthly totals from raw data as well...

>> jTot=0;
>> dJan=dir('TRMM_2004_01_*_newntcl.csv');
>> for i=1:length(dJan)
     p=csvread(dJan(i).name,0,2);
     jTot=jTot+p(1);
     if p(1)>0
       disp([dJan(i).name ' ' num2str([p(1) jTot])]),end,end
TRMM_2004_01_0206_newntcl.csv 0.04        0.04
TRMM_2004_01_0612_newntcl.csv 0.94        0.98
TRMM_2004_01_1315_newntcl.csv 0.39        1.37
TRMM_2004_01_1412_newntcl.csv 0.01        1.38
TRMM_2004_01_2115_newntcl.csv 0.16        1.54
TRMM_2004_01_2403_newntcl.csv 0.17        1.71
TRMM_2004_01_2406_newntcl.csv 8.64        10.35
TRMM_2004_01_2515_newntcl.csv 0.02        10.37
TRMM_2004_01_2603_newntcl.csv 1.55        11.92
TRMM_2004_01_2606_newntcl.csv 0.55        12.47
TRMM_2004_01_2909_newntcl.csv 0.02        12.49
>> mTot(1,:)
ans =
Columns 1 through 9
 12.4900   21.4800    3.0000    8.7200   27.4500    9.7000         0         0         0
Columns 10 through 12
       0         0         0
>>

Again, the values add up correctly.

dpb on 28 Mar 2015

Edited: dpb on 28 Mar 2015

Open in MATLAB Online

March looks ok to me...

>> mTot(1,:)
ans =
Columns 1 through 9
 12.4900   21.4800    3.0000    8.7200   27.4500    9.7000 ...
Columns 10 through 12
       0         0         0
>> dMar=dir('TRMM_2004_03_*_newntcl.csv');
>> length(dMar)/8  % check it's got full complement
ans =
  31
>> mrTot=0;
>> for i=1:length(dMar),p=csvread(dMar(i).name,0,2);mrTot=mrTot+p(1);if p(1)>0,disp([dMar(i).name ' ' num2str([p(1) mrTot])]),end,end
TRMM_2004_03_0209_newntcl.csv 0.15        0.15
TRMM_2004_03_0212_newntcl.csv 0.02        0.17
TRMM_2004_03_0218_newntcl.csv 0.02        0.19
TRMM_2004_03_0821_newntcl.csv 0.04        0.23
TRMM_2004_03_1200_newntcl.csv 0.41        0.64
TRMM_2004_03_1306_newntcl.csv 0.17        0.81
TRMM_2004_03_1312_newntcl.csv 0.05        0.86
TRMM_2004_03_1618_newntcl.csv 0.01        0.87
TRMM_2004_03_1709_newntcl.csv 0.23         1.1
TRMM_2004_03_1721_newntcl.csv 0.01        1.11
TRMM_2004_03_2009_newntcl.csv 0.56        1.67
TRMM_2004_03_2115_newntcl.csv 0.09        1.76
TRMM_2004_03_2303_newntcl.csv 1.01        2.77
TRMM_2004_03_2412_newntcl.csv 0.01        2.78
TRMM_2004_03_2709_newntcl.csv 0.1        2.88
TRMM_2004_03_2915_newntcl.csv 0.04        2.92
TRMM_2004_03_3003_newntcl.csv 0.08           3
>>

That matches mTot(1,3) as expected. Try another random point...

>> idx-randi(length(precip))
idx =
      9399
>> dMar=dir('TRMM_2004_03_*_newntcl.csv');
>> mrTot=0;
>> for i=1:length(dMar),p=csvread(dMar(i).name,0,2);mrTot=mrTot+p(idx);
if p(idx)>0,disp([dMar(i).name ' ' num2str([p(idx) mrTot])]),end,end
TRMM_2004_03_0809_newntcl.csv 0.01        0.01
TRMM_2004_03_0815_newntcl.csv 2.83        2.84
TRMM_2004_03_1300_newntcl.csv 0.04        2.88
TRMM_2004_03_1512_newntcl.csv 0.08        2.96
TRMM_2004_03_1806_newntcl.csv 0.02        2.98
TRMM_2004_03_2303_newntcl.csv 11.07        14.05
TRMM_2004_03_2306_newntcl.csv 0.04        14.09
TRMM_2004_03_2309_newntcl.csv 1.43        15.52
TRMM_2004_03_2706_newntcl.csv 0.03        15.55
TRMM_2004_03_3015_newntcl.csv 0.08        15.63
TRMM_2004_03_3103_newntcl.csv 0.29        15.92
>> mTot(idx,3)
ans =
 15.9200
>>

I can't seem to find a value that doesn't compare ok as long as there are full days.

I don't understand your other comment regarding what you prefer; as requested the script ignores all days which do not have all eight 3-hr readings as we've discussed before.

Again, how you want to handle the missing values is a decision you have to make as to what it means to have data recorded for a partial day and then how that is going to be propagated up to monthly and yearly sums. I've no real input on how you should do that or whether you even should or not.

As previously noted, however, the zero-fill and the nansum options both equate to the same thing as presuming the data are complete with no precipitation for any of those hours/days which clearly isn't the right answer in general altho it undoubtedly is for specific hourly observations.

It's why I had suggested NaN for the initial full array allocation and then to zero-fill for a full column that is processed before storing the data. That then only propagates a NaN for sums that do include no day but avoids doing so for the months with less than 31 days without negative consequences on those months totals but will show you where there are holes in you cumulative data. As noted, I fear that may result in having no yearly totals because you probably have no records that are truly complete. If so, yet again, then, you're back to the problem of what's a reasonable alternative that doesn't corrupt the database too badly? I've no klew on what to tell you about that conundrum, unfortunately, sorry...

But, my conclusion is the code is working as intended to process the 3-hr data files to aggregate them; I don't see any case that actually generates a bogus result for the summations as compared to the individual files.

dpb on 28 Mar 2015

Edited: dpb on 28 Mar 2015

"As previously noted, however, the zero-fill and the nansum options both equate to the same thing as presuming the data are complete with no precipitation for any of those hours/days ..."

That is, once you accumulate the sums by either of these ways you've lost the indication of which were or were not full days; that piece of information has been lost. Consequently, it appears from all that anybody could tell by reading your newly-created data files that that is what the results were as measured in toto; it's a false impression, however, we know there are at least some days that have partial records that aren't being reported and we also know there are large chunks missing entirely (or at least with the data I've seen so far). That surely should be made clear to whoever is planning on using these data for anything other than just observing numbers where there happen to be some--surely one could make no claim of an annual total of any accuracy that way or compare to other historical data or whatever one had in mind to do with the data.

dpb on 28 Mar 2015

Open in MATLAB Online

OK, one last try to verify the previous conjecture--let's look at June.

>> juTot=0;
>> for i=1:length(dJun)
    p=csvread(dJun(i).name,0,2);
    juTot=juTot+p;
    if any(p)>0,disp([dJun(i).name ]),end,end
TRMM_2004_06_0100_newntcl.csv
TRMM_2004_06_0106_newntcl.csv
TRMM_2004_06_0112_newntcl.csv
TRMM_2004_06_0118_newntcl.csv
...
TRMM_2004_06_2200_newntcl.csv
TRMM_2004_06_2206_newntcl.csv
TRMM_2004_06_2212_newntcl.csv
TRMM_2004_06_2218_newntcl.csv
TRMM_2004_06_2300_newntcl.csv
TRMM_2004_06_2306_newntcl.csv
TRMM_2004_06_2312_newntcl.csv
TRMM_2004_06_2318_newntcl.csv
TRMM_2004_06_2400_newntcl.csv
TRMM_2004_06_2406_newntcl.csv
TRMM_2004_06_2412_newntcl.csv
TRMM_2004_06_2418_newntcl.csv
TRMM_2004_06_2500_newntcl.csv
TRMM_2004_06_2506_newntcl.csv
TRMM_2004_06_2512_newntcl.csv
TRMM_2004_06_2518_newntcl.csv
>> 
>> max(abs(juTot-mTot(:,6)))  % oooh, there is a problem!!!
ans =
 49.6600
>>

OK, now let's limit the loop to only the full days of 1-24...

>> juTot=0;
>> N=8*floor(length(dJun)/8)
   for i=1:N
     p=csvread(dJun(i).name,0,2);
     juTot=juTot+p;
     if any(p)>0,disp([dJun(i).name ]),end,end
TRMM_2004_06_0100_newntcl.csv
TRMM_2004_06_0106_newntcl.csv
TRMM_2004_06_0112_newntcl.csv
TRMM_2004_06_0118_newntcl.csv
TRMM_2004_06_0200_newntcl.csv
TRMM_2004_06_0206_newntcl.csv
TRMM_2004_06_0212_newntcl.csv
TRMM_2004_06_0218_newntcl.csv
TRMM_2004_06_0300_newntcl.csv
TRMM_2004_06_0306_newntcl.csv
...
TRMM_2004_06_2400_newntcl.csv
TRMM_2004_06_2406_newntcl.csv
TRMM_2004_06_2412_newntcl.csv
TRMM_2004_06_2418_newntcl.csv
>> i
i =
 192
>> max(abs(juTot-mTot(:,6)))
ans =
 8.5265e-14
>> all(abs(juTot-mTot(:,6))<5*eps(max(juTot)))
ans =
   1
>>

OK, they're not absolutely identical owing to the difference in order in the summations between the script and the one-at-a-time processing, but the difference is in the expected magnitude of floating point roundoff.

NB: also that this test is accumulating the full-length summation rather than just a single index so we've proven the contention made earlier than the point index used is immaterial to testing the logic; testing one value is as good as testing them all for storage location.

It also proves the point made earlier that as long as one processes only complete data then the sums being built in the script are ok based on the storage location by column/plane by day/month; it's when you add in the values that are collected for a day that isn't complete that a real discrepancy occurs.

So, it's a decision you've got to make about how to handle the incomplete and missing data.

dpb on 29 Mar 2015

Edited: dpb on 29 Mar 2015

Open in MATLAB Online

That's testing whether there is NaN in location that is valid for the day of month for the given month (remembering leap year of course). That's the idea I outlined earlier of the logic array to select those valid days. How much effort you must go to is dependent upon whether it's possible there are months with "holes" in them or whether it's always only whether there are 1:N<=M where N is the number of days of data and M is the number of days in the given month.

Another possible tack is to add a sum for the number of days that have been accumulated for each month by month in the script at the point the precip array is being populated; then you can simply check that the content of that array is the same as the expected number of days in the month for the given year.

Unfortunately, the simple expedient of zero-ing the day vector for the month doesn't work to prevent the partial months from being totalled; it then would include June in the 2004 sum so have to go to the (slight) extra effort.

The second idea above would be something like

nDays=zeros(12,1); % initialize before the outer loop

and then after the computation of the month,day variables mo, da insert

nDays(mo)=nDays(mo)+1;

This will give you the count of the actual full days that were found which you can compare to the expected number.

Damith on 29 Mar 2015

Edited: Damith on 29 Mar 2015

Open in MATLAB Online

OK. I tried this as suggested. But still mTot gives the exact same result as shown in the Excelwork sheet. I checked the number of sub-daily files and there are no missing files. For examples, from Jan 01 - Jun 24 (176 days) and I have 176 * 8 = 1408 files.

I tried something like this by reading daily csv files (e.g.TRMM_2004_01_01_newntcl.csv and so on). Can you please suggest me a efficient method that I can adopt for all years and counting for months which has complete daily csv files?

Now the total is matching with the Excel file. (See the image below)

d=dir('TRMM_*_newntcl.csv');
ymdh=cell2mat(textscan([d.name], ...
            'TRMM_%4d_%2d_%2d_newntcl.csv', ...
            'collectoutput',true));
idx=find(ymdh(:,end)==1);
N=length(idx);
idx=[idx;length(ymdh)+1];
arr=length(idx)-1:-1:1;
Jan=zeros(51681,31);
Feb=zeros(51681,29);
Mar=zeros(51681,31);
Apr=zeros(51681,30);
May=zeros(51681,31);
Jun=zeros(51681,30);
Jul=zeros(51681,31);
Aug=zeros(51681,31);
Sep=zeros(51681,30);
Oct=zeros(51681,31);
Nov=zeros(51681,30);
Dec=zeros(51681,31);
% January
for ik=1:length(idx)-arr(1)
    for j=0:30  % ok, process each of these, they're one day's worth
    Jan(:,j+1)=csvread(d(idx(ik)+j).name,0,2);  % add hourly data
    end    
end
% February
if(mod(ymdh(1,1),4)==0)
   mend=29;
else mend=28;
end   
for ik=1:length(idx)-arr(2)
    for j=0:mend-1  % ok, process each of these, they're one day's worth
    Feb(:,j+1)=csvread(d(idx(ik)+j).name,0,2);  % add hourly data
    end    
end
% March
for ik=1:length(idx)-arr(3)
    for j=0:30  % ok, process each of these, they're one day's worth
    Mar(:,j+1)=csvread(d(idx(ik)+j).name,0,2);  % add hourly data
    end    
end
% April
for ik=1:length(idx)-arr(4)
    for j=0:30-1  % ok, process each of these, they're one day's worth
    Apr(:,j+1)=csvread(d(idx(ik)+j).name,0,2);  % add hourly data
    end    
end
% May
for ik=1:length(idx)-arr(5)
    for j=0:30  % ok, process each of these, they're one day's worth
    May(:,j+1)=csvread(d(idx(ik)+j).name,0,2);  % add hourly data
    end    
end

dpb on 30 Mar 2015

Edited: dpb on 30 Mar 2015

Open in MATLAB Online

Somewhat against my better judgment (I think you need to develop some more critical thinking of solving the details of the logic problem outlined rather than relying on somebody else to provide complete solutions down to every detail), the following should work for any combination of missing days...

In the early portion of the script after building the ymdh array of dates/times from the file names insert

edpm=bsxfun(@eomday,double(ymdh(1)),[1:12].');

Then, after building the precip array, before the summation for the totals,

precip(isnan(precip(:,:,nDays==edpm)))=0;

Then revert to using builtin sum instead of nansum in

mTot=squeeze(sum(precip,2));

and joy should ensue.

I'll leave it to you to decipher why/how it works; hopefully from that exercise you'll gain a little bit of the thought process of using logical addressing within Matlab which is a key concept to grasp in effective use thereof.

ADDENDUM

The above will ensure complete months, if you do want to include any month in which there is a complete day, change the logic test from "==" to ">0". If you want there to be some minimum number of days in a month, then that would replace the 0, of course. In this case there's no need for the expected number of days/month, it's simply a test of whether there were any (or some number of) days counted.

I'll also reiterate the caution made previously in that if you do this, the data files created need to somewhere contain the information that some month(s) may be incomplete. This information is lost in the final output above; only keeping the result of the logic test for the specific month along with the totals will provide any way for the user of such data to know that is so from that file alone. Doing anything less is, in my opinion, a serious breach of scientific ethics.

One way, at a minimum, if you for some reason can't include the data in the files themselves would be to add an additional numeric code into the file name that would be the nDays(mo) value for the month. Then, at least the user would be able to check that against the expected days/month for complete datasets.

Similar arguments apply to the yearly totals, of course, but one presumes there may not be any of those that are complete in having every day of every month for the entire year. Still, encoding the days that are included would be significant information if can't include in the file the complete record of which days are missing/present.

dpb on 31 Mar 2015

Open in MATLAB Online

BTW, if you haven't yet figured this out, you could save tons of time if you were to run the above script once for each year subdirectory and then

save precip

in that working directory. Then you can simply

load precip

for that year and experiment with how to create and save the monthly and yearly sums to your heart's content and build whatever files from there without having to reread the same values over and over and over which is quite slow.

Sign in to comment.

Creating a matrix after giving initial conditions

0 Comments
Show -2 older commentsHide -2 older comments

Accepted Answer

11 Comments
Show 9 older commentsHide 9 older comments

More Answers (2)

30 Comments
Show 28 older commentsHide 28 older comments

30 Comments
Show 28 older commentsHide 28 older comments

See Also

Categories

Tags

Community Treasure Hunt

Creating a matrix after giving initial conditions

0 Comments Show -2 older commentsHide -2 older comments

Accepted Answer

11 Comments Show 9 older commentsHide 9 older comments

More Answers (2)

30 Comments Show 28 older commentsHide 28 older comments

30 Comments Show 28 older commentsHide 28 older comments

0 Comments
Show -2 older commentsHide -2 older comments

11 Comments
Show 9 older commentsHide 9 older comments

30 Comments
Show 28 older commentsHide 28 older comments

30 Comments
Show 28 older commentsHide 28 older comments