How to organise data based on a large range of numbers?

I have a data set that needs to be organised by season. Each row is a sample and each column holds a different measurement for that sample. Column 8 is the day of the year on which the sample was collected, so I want to use it to estimate the season of collection: days 1-59 and 335-366 are summer, days 60-151 are autumn, and so on. Because each season spans many days of the year, I have been finding it difficult to organise the data into seasons. Does anyone know how this could be done? Thanks

Accepted Answer

Cedric on 21 Oct 2013
Edited: Cedric on 22 Oct 2013
You can build a lookup table that maps each day of the year to a season ID, use it to create a vector of season IDs matching column 8 of your data, and then use that vector to select the relevant rows of your data set.
% - Build look up table for seasons, per day: 1: summer, 2: autumn,
% 3: winter, 4: spring.
daySeason_LT = ones(366, 1) ; % Already 1 for summer.
daySeason_LT(60:151) = 2 ; % Autumn.
daySeason_LT(152:243) = 3 ; % Winter (adjust if needed).
daySeason_LT(244:334) = 4 ; % Spring (adjust if needed).
When you index daySeason_LT with the day ID, you get the season ID. For example:
>> daySeason_LT(220)
ans =
3
so season ID = 3 = Winter for day 220. Now you add a column to your data set, or you create a separate vector of season IDs, whose rows are the season IDs matching the day IDs (in column 8). Assuming your data set is stored in a numeric array named data, you build a vector of season IDs as follows:
% - Build vector of season IDs matching day IDs (column 8).
seasonID = daySeason_LT(data(:,8)) ;
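Note that this indexing step only works if every entry of column 8 is an integer between 1 and 366; anything else (0, NaN, a fractional day) raises an indexing error. Below is a minimal sketch of a guard, using a small made-up array in place of the real data set. The sample values and the names dayID and isValid are assumptions for illustration only.

```matlab
% Hypothetical data: 5 samples, 8 columns, day of year in column 8.
data = zeros(5, 8) ;
data(:,8) = [12; 59; 60; 220; 366] ;

% Same lookup table as above: 1 summer, 2 autumn, 3 winter, 4 spring.
daySeason_LT = ones(366, 1) ;
daySeason_LT(60:151)  = 2 ;
daySeason_LT(152:243) = 3 ;
daySeason_LT(244:334) = 4 ;

% Check that the day IDs are valid indices before using them.
dayID   = data(:,8) ;
isValid = dayID >= 1 & dayID <= 366 & dayID == round(dayID) ;
if ~all(isValid)
    error('Column 8 contains %d invalid day IDs.', sum(~isValid)) ;
end

seasonID = daySeason_LT(dayID) ;   % 5x1 vector of season IDs
```

If the check fails, find(~isValid) lists the offending rows, which is usually the quickest way to see what is wrong with the column.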
Now you can easily select relevant data by season as follows: assume you want to get the mean of column 5 for winter days (season ID 3):
lid = seasonID == 3 ;
If you look at lid, you'll see that it is a vector of logicals which flags the locations where seasonID has a value of 3. You can use this vector of logicals for indexing rows of data (this is called logical indexing):
% - Compute mean of winter entries of column 5.
values = data(lid,5) ;
theMean = mean( values ) ;
Here, we first extract column 5 of data but only rows "flagged" by lid, and we take the mean of these values. You can also extract a block of data with all columns but only rows "flagged" by lid:
% Extract block of data for winter entries.
data_winter = data(lid,:) ;
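The same pattern extends to all four seasons with a short loop. The sketch below uses made-up numbers for column 5 and a hand-written seasonID vector; both are assumptions standing in for the real data, as are the names seasonNames and seasonMean.

```matlab
% Hypothetical example: 6 samples, values of interest in column 5.
data = zeros(6, 8) ;
data(:,5) = [10; 20; 30; 40; 50; 60] ;
seasonID  = [1; 1; 2; 3; 3; 4] ;      % season ID of each row

seasonNames = {'summer', 'autumn', 'winter', 'spring'} ;
seasonMean  = nan(4, 1) ;             % stays NaN if a season has no samples

for k = 1 : 4
    lid = seasonID == k ;             % rows belonging to season k
    if any(lid)
        seasonMean(k) = mean( data(lid,5) ) ;
    end
    fprintf('%s: %d samples, mean = %g\n', ...
            seasonNames{k}, sum(lid), seasonMean(k)) ;
end
```

Pre-filling seasonMean with NaN keeps an empty season visible in the result instead of silently reporting it as zero.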
Play a bit with this material and it will become clear. Display intermediary variables when you can and display their sizes as well.
PS: I just answered your previous question as well, in case you don't check its thread anymore.

More Answers (1)

Andrei Bobrov on 21 Oct 2013
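ssn = {'summer','autumn','winter','spring'};
d = [31 28 31 30 31 30 31 31 30 31 30 31]; % days per month (non-leap year)
dd = cumsum(d'); % last day of year for each month
% Bin edges = first day of each season; 367 closes the last bin so that day 366 is included.
b = [1; dd([2 5 8 11]) + 1; 367]; % [1 60 152 244 335 367]'
x = sort(randi(365,20,1)); % example days; use x = data(:,8) for your own data
[~,ii] = histc(x,b); % ii(k) is the bin that x(k) falls into
ii(ii==5) = 1; % days 335-366 are summer too; or ii = rem(ii-1,4)+1;
out = ssn(ii);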
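One way to gain confidence in a binning approach like this is to compare it against a plain lookup table for every possible day. The sketch below assumes the season boundaries given in the question and writes the bin edges out by hand, each edge being the first day of a season. The names edges, LT, days and binID are illustrative.

```matlab
% Edges: first day of each season, plus one past the last day (367).
edges = [1; 60; 152; 244; 335; 367] ;

% Reference lookup table: 1 summer, 2 autumn, 3 winter, 4 spring.
LT = ones(366, 1) ;
LT(60:151)  = 2 ;
LT(152:243) = 3 ;
LT(244:334) = 4 ;

days = (1:366)' ;
[~, binID] = histc(days, edges) ;   % bin k holds edges(k) <= day < edges(k+1)
binID(binID == 5) = 1 ;             % days 335-366 wrap around to summer

assert( isequal(binID, LT) ) ;      % both methods agree on every day
```

The boundary days (59, 60, 334, 335, 366) are where the two methods are most likely to disagree, so they are worth checking by hand if the edges are changed. In recent MATLAB releases histc is flagged as not recommended, with discretize suggested in its place.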
  1 Comment
Natalie on 21 Oct 2013
I have tried to substitute my data into this but I keep running into problems such as 'index exceeds matrix dimensions'. I have played around with it a bit but seem to always be running into some kind of problem... I can get it to read the season from a specific row but cannot get it to do it for the whole column. Sorry I'm new to matlab and am still learning. Thanks
