How to deseasonalize this temperature record?

Hi all!
I have a temperature record (Temp.mat) that holds hourly temperature information from 2005 to 2019. Can anyone please give me an idea on how to deseasonalize the data?
Any feedback will be greatly appreciated!

 Accepted Answer

If you have the Signal Processing Toolbox, I'd use a Savitzky-Golay filter, sgolayfilt, with a low-order polynomial (second order in the code below) and a window size of about 3 months. This will give you something like a sine wave with a one-year period. Then you can subtract that smoothed signal from your original signal to get the hourly deviation from the seasonal "average" for that time.
Try this:
fontSize = 16;
s = load('Temperatures.mat'); % s is a struct with one field, Temp: [131274×1 double]
temps = s.Temp;
subplot(3, 1, 1);
plot(temps, 'b-')
xlabel('Hour', 'FontSize', fontSize)
ylabel('Temperature', 'FontSize', fontSize)
grid on;
title('Actual Temperature', 'FontSize', fontSize)
% Smooth the signal.
windowLength = round(numel(temps) / 60);
% Make sure it's odd, as sgolayfilt requires.
if rem(windowLength, 2) == 0
    windowLength = windowLength + 1;
end
% For this record, windowLength = 2189 hours (about 91 days).
smoothedTemps = sgolayfilt(temps, 2, windowLength);
subplot(3, 1, 2);
plot(smoothedTemps, 'r-');
xlabel('Hour', 'FontSize', fontSize)
ylabel('Temperature', 'FontSize', fontSize)
title('Seasonal Average Temperature', 'FontSize', fontSize)
grid on;
% Get the deviation from average.
deviationtemps = temps - smoothedTemps;
subplot(3, 1, 3);
plot(deviationtemps, 'b-');
xlabel('Hour', 'FontSize', fontSize)
ylabel('Temperature', 'FontSize', fontSize)
grid on;
title('Detrended Temperature', 'FontSize', fontSize)
If you want, you can get the average of all the years and synthesize a smoothed signal from that. Or you can use interp1 to fill in gaps in your original data.
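As a minimal sketch of the gap-filling idea, the snippet below fills NaNs with interp1 before smoothing, so that a missing sample does not spread NaNs across the whole filter window. The data here is synthetic and the variable names (t, filled, seasonal, deviation) are made up for illustration. It assumes one sample per hour and that the gaps are interior. NaNs at the very start or end of the record would stay NaN with plain linear interpolation.

```matlab
% Synthetic stand-in for the hourly record (assumption: one sample per hour).
rng(0);
t = (0:24*365*2 - 1)';                        % two years of hourly indices
temps = 15 + 10*sin(2*pi*t/(24*365.25)) + randn(size(t));
temps(500:520) = NaN;                         % simulate a gap

% Fill interior gaps by linear interpolation over the valid samples.
valid = ~isnan(temps);
filled = temps;
filled(~valid) = interp1(t(valid), temps(valid), t(~valid), 'linear');

% Smooth and subtract, as in the answer above.
windowLength = 2189;                          % about 3 months of hourly data, odd
seasonal = sgolayfilt(filled, 2, windowLength);
deviation = filled - seasonal;
```

Linear interpolation is only reasonable for short gaps. For long gaps the filled values are a guess, and it may be safer to leave those stretches out of the analysis.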

10 Comments

Thank you for this new idea! I did not know about it.
But this is not what I am looking for. As you can see from the 2nd subplot, the seasonality in this data is dominant. I want to remove the seasonality from this data, so that the seasons have no impact on the temperature record. After that, what's left is probably the tidal fluctuation. It is more important to deseasonalize the data set than to detrend it.
Not sure I understand. Isn't the 3rd subplot the data with the seasonality removed? Why not?
Oh yes, you are correct. I did not carefully notice the Y axis range. Can you give me an idea on why we got so many missing data points here?
You have many (7769) elements that are NaN in your original data for some reason.
>> sum(isnan(temps))
ans =
7769
Why are there NaNs in your data?
Do you have time values for every temperature? Can we turn your data into a 2-D array where the row is the day of the year and the columns are the hours for all the years for that particular day? Then we could find the average temperature for that day averaged over all years that you have.
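One way to sketch the day-of-year averaging described above, assuming a timestamp is available for every sample. Instead of building the 2-D array explicitly, this version groups samples by day of year with accumarray, which gives the same per-day average over all years. The data is synthetic and the names (time, temp, climatology) are hypothetical.

```matlab
% Synthetic stand-in: hourly timestamps and temperatures (hypothetical names).
rng(0);
time = (datetime(2005,1,1,0,0,0):hours(1):datetime(2007,12,31,23,0,0))';
doy = day(time, 'dayofyear');                 % day of year, 1..366
temp = 15 - 10*cos(2*pi*doy/365.25) + randn(size(time));
temp(1000:1100) = NaN;                        % simulate missing readings

% Mean temperature for each day of year over all years, ignoring NaNs.
% Days with no samples at all (here day 366) are left as NaN.
climatology = accumarray(doy, temp, [366 1], @(x) mean(x, 'omitnan'), NaN);

% Subtract each sample's day-of-year mean to remove the seasonal cycle.
deseasonalized = temp - climatology(doy);
```

Because every hour of a given calendar day gets the same mean subtracted, the daily cycle stays in the result. Grouping by both day of year and hour of day would remove that as well.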
That would be awesome! Thank you so much. I am very excited to see how you do it.
I can't do anything until you supply a time/date stamp for every temperature.
Hi @Image Analyst, this is the timetable that I am working on. The first column holds the time stamps and the second column holds the corresponding temperature at each time. I hope this data set is appropriate for calculating the average temperature scheme you mentioned in the earlier comment.
I can see there are 7568 NaN temperatures in that column. Can you please give me an idea of which interpolation method would be best suited to this kind of data set? I want to remove the NaNs so that I get a cleaner deseasonalized temperature data set.
I have another temperature record that contains more variability. I want to fit it closely with a sine/cosine curve. Can you please give me an idea of how to do that?
The record can be found in the T.Temp array that I attached to this comment. I will really appreciate any feedback from you :)
I'd again use the Savitzky-Golay filter, because over a short window sine and cosine are well approximated by polynomials (recall their Taylor series expansions). After it's smoothed, find the valleys by calling findpeaks on the negated signal. Then extract the signal (or smoothed signal) between each pair of adjacent valleys to get one hump. Then average all the humps together to get an average season, averaged over several years. Then fit that to a polynomial with polyfit, or to a sine or cosine with fitnlm, to get the formula (if you need the formula). Then replicate the fitted data and subtract it from your signal.
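If the goal is just a sine/cosine fit with a one-year period, a simpler alternative to the findpeaks/fitnlm route is ordinary linear least squares, which needs no toolbox. This is a swapped-in technique, not the method described above. It assumes the period is known (365.25 days), and the data and names below are synthetic and hypothetical.

```matlab
% Synthetic stand-in: three years of hourly data with a gap.
rng(0);
tDays = (0:1/24:3*365)';                      % time in days, hourly steps
y = 12 + 8*sin(2*pi*tDays/365.25 + 0.7) + randn(size(tDays));
y(200:260) = NaN;

w = 2*pi/365.25;                              % assumed annual angular frequency
ok = ~isnan(y);                               % fit only the valid samples

% Model: y = offset + b*sin(w*t) + c*cos(w*t), which is linear in [offset; b; c].
A = [ones(nnz(ok), 1), sin(w*tDays(ok)), cos(w*tDays(ok))];
coef = A \ y(ok);                             % least-squares solution

fitted = coef(1) + coef(2)*sin(w*tDays) + coef(3)*cos(w*tDays);
amplitude = hypot(coef(2), coef(3));
phase = atan2(coef(3), coef(2));              % fitted = offset + amplitude*sin(w*t + phase)
residual = y - fitted;                        % signal with the annual sine removed
```

Extra harmonics (for example a half-year term) can be added as more sin/cos column pairs in A if a single sine does not follow the seasonal shape closely enough.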

More Answers (1)

Try the trenddecomp function or the Find and Remove Trends Live Editor Task.
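A rough sketch of how trenddecomp might be called, based on my reading of the documentation. It needs MATLAB R2021b or later. The example uses synthetic daily data with an assumed period of 365 samples, and the variable names are made up. For an hourly record the period would be about 8766 samples, so averaging to daily values first may be more practical.

```matlab
% Requires MATLAB R2021b or later (trenddecomp is not available in R2021a).
rng(0);
nDays = 4*365;
d = (1:nDays)';
% Synthetic daily temperatures: slow trend + annual cycle + noise.
daily = 15 + 0.002*d + 10*sin(2*pi*d/365) + randn(nDays, 1);

% STL decomposition with an assumed seasonal period of 365 samples.
[LT, ST, R] = trenddecomp(daily, "stl", 365); % long-term, seasonal, remainder

% Remove only the seasonal component, keeping trend and remainder.
deseasonalized = daily - ST;
```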

1 Comment

The trenddecomp function is not working for me. Unfortunately, I only have access to MATLAB R2021a.
