File Exchange

image thumbnail


version 1.0 (16 MB) by

Empirical Orthogonal Functions tailored for spatiotemporal analysis, with a tutorial.



View License

This function simplifies the process of applying Empirical Orthogonal Functions (spatiotemporal principal component analysis) to 3D datasets such as climate data. EOF analysis is not terribly difficult to implement, but much time is often spent trying to figure out how to reshape a big 3D dataset, get the EOFs, and then un-reshape. This function does all the reshaping for you, and performs EOF analysis in a computationally efficient manner. The analysis method is a streamlined and optimized version of Guillame MAZE's caleof function, method 2.

For a full description and an in-depth tutorial describing how to perform EOF analysis on climate data, check out the html example file included in this submission (light bulb icon on this page).

Comments and Ratings (18)

Suman Maity

Hi can zu
I couldn't understand your idea regarding the multiplication of (-1). As I have understood, you are trying to match figure Messie and Chavez (2011). Please try to elaborate clearly for these following queries:
1) In the code of eof.m, there is
%% Flip signs to provide consistent results:
ind = sign(pc(:,1))<0;
eof_maps(:,:,ind) = eof_maps(:,:,ind)*(-1);
pc(ind,:) = pc(ind,:)*-1;
Whether this part is in general or somehow for matching the result with Messie and Chavez (2011)? I mean for an arbitrary data is it required to change?

2) In the example you have mentioned
s = [-1 1 -1 1 -1 1]; % (sign multiplier to match Messie and Chavez 2011)
this 's' is multiplied with first 6 eof.
Whether this part is in general or somehow for matching the result with Messie and Chavez (2011)? I mean for an arbitrary data how I will decide which one to be multiplied with "-1" and which one to be "1"? What for the others?

3) In another part you have mentioned
anomaly(t,-pc(1,:)) % First principal component is enso

Whether this is in general or somehow for matching the result with Messie and Chavez (2011)? I mean for an arbitrary data how to decide whether -1 should be multiplied with or not? whether that -1 should be multiplied with every pc?
Please try to clarify the above. My purpose is not to match the figures of Messie and Chavez (2011) but to understand what could I do for my data. Please reply with clear information with these above three cases.

With Regards

can zu

can zu (view profile)

I wonder that the spatial data need multiply the weights.

I loaded "PacOcean" and a simple problem appears when using the code "[eof_maps, pc, expv] = eof (sst)" :
Subscripted assignment dimension mismatch.
Why? The matrix (sst) is in 3d format (lat, lon,temporal data).
Help me, please!

can zu

can zu (view profile)

Hi Suman,
(1).12 months is one year, four seasons.
(2).As we know,the sun reaches the tropic of cancer once a year in the northern hemisphere and vice versa in the southern hemisphere.It means the period of positive sst phase is semi-annual i.e. 6 months.And in the one cycle , it includes the positive sst phase and negative sst phase in one hemisphere. In this cycle, it is sst change of ditterent months,i.e. seasonal singal. The annual signal is for the change of different years.If you want to get it,you can do the 12-months running average(some papers use the method) i.e. nonseasonal singal.
(3).In the code, it multiplies -1 with both PC and EOF mode. You don't need to do this. The result are the same.Since the signal is product of PC and EOF mode i.e. PC1xMode1, multiplication with "-1" or not do not change for the signal.

Suman Maity

can zu
Thank you very much for your answer. I couldn't understand how you are telling a season consists of 12 months i.e. one year? As I know in India there are four seasons defined as: Winter Season: January-February, Pre Monsoon Season: March-May, Southwest Monsoon Season: June-September, Post Monsoon Season: October - December. Will you want to mean one year as one season?
My 2nd query is what do you mean by "Semi-annual"? Is it having period of 6 months (i.e. 0.5 year)? But in the example shown here depicts 1 cycle/year i.e. the signal having period 1 year (i.e. 12 months) and hence it is an annual signal. How to distinguish?
The 3rd query is about the negative sign. In the example multiplication with -1 has been done to satisfy the figure Messie and Chavez (2011). I have seen in the code that multiplication with "-1" is done. My query is whether -1 multiplication is needed? If I am doing with some other datasets then is it required? how to decide which mode should be multiplied? is it require to multiply -1 with both PC and EOF for a particular mode?
Please clarrify these queries.

Mark Hadfield

I'd like to point out some odd behaviour (arguably a bug) that this package's function mycaleof shares with the caleof function in Guillaume Maze's PCAtools. The variance fraction (expv or expvar) that is returned for each EOF is the fraction of the variance explained by all calculated EOFs, not of the total variance. For example, I take the code in "TUTORIAL: From raw climate reanalysis data to ENSO, PDO, etc" but call the eof function with n (the number of EOFs to be calculated) equal to 6. Modes 1 & 6 account for 57.3% & 5.2% of the variance. I then set rerun it with n equal to 801. All the EOF patterns are identical, but Modes 1 & 6 now account for only 36.5% & 3.3%. If I modify my code so that Guillaume Maze's caleof function is called instead of mycaleof, then I get the above results when using methods 2 and 4 (calling eigs and svds respectively) but with methods 1 and 3 (calling eig and svd respectively) the fraction of explained variance does not depend on the argument n and is equal to 36.5% & 3.3% for Modes 1 & 6. I conclude that (and I haven't looked at this section of the code in detail yet): a) the denominator in expvar is the variance of all the EOFs that have been calculated, not of the total variance; b) caleof methods 2 and 4, and mycaleof, calculate only the EOFs that they return, but caleof methods 1 and 3 calculated all eofs even if they return fewer.

can zu

can zu (view profile)

Hi Suman,
I'm also learning the codes.In my opinion,the "Seasonal Variation" is the singal changing with different seasons (constitute by 12 different months).In the PC1 ,you can see the temporal variation of mode1.It's a semi-annual variations corresponding with SST changes caused by solar motion.

Suman Maity

Hello Chat
This is an excellent software to me as I have just started learning. One point is not clear to me which is mentioned below:
In the example you have shown that if EOF applied on the original sst then the 1st mode will show the seasonal variation with high explained variance. Why it is called "Seasonal Variation"? Because I have seen one cycle takes one year to complete. Will you want to mean a season is equal to one year?
To get rid of that variation basically "monthly anomaly" is found. I couldn't connect the two i.e. "Seasonal Variation" and "Monthly anomaly". Although this is an usual practice but will you please put some light on that?

Raden Susanto

Hallo Chat,
Excellent code. I would like to suggest that the results are normalized, i.e. the temporal eigen-vectors to 1 and the corresponding eigen-values to the total of the variances, so when we multiply the amplitude of eigen-vector and eigen-value is equal to the actual amplitude variability of the time series (i.e. the sst).

can zu

can zu (view profile)

Fig 10 in Deser, Clara, et al(2009) may be not EOF1,it doesn't mark. The SST just be anomaly.

can zu

can zu (view profile)

Hi Chad Greene,

very great code! I wonder that the data (SST dataset in the example) need to be detrended, the mean removed, and the seasonal cycle removed,but in the eof subfunction (mycaleof) be detrended again (F = detrend(M,'constant');).

And another paper Deser, Clara, et al(2009)
Fig 10, just do SST anomaly and get the EOF1 as PDO in North Pacific,this is a little different with Fig5 in Messie and Chavez (2011) which is EOF3 corresponding to PDO.

Chad Greene

Chad Greene (view profile)

Hi Hao and Qiao,

Indeed, following the example file

[eof_maps,pc,expv] = eof(sst);

does produce a complex double result. However, take a look at the magnitude of the imaginary component relative to the total magnitude. Below I'm multiplying by 100 to plot the imaginary component as a percentage of the total magnitude:

caxis([0 100])

The horizontal bands indicate there is exactly zero imaginary component for modes 1 through 795. It goes imaginary beyond mode 795, but in climate science we don't typically use any modes past the first few. And there's a reason for that: If you look at how much variance is explained by mode 796:

ans =

If you get a complex double result, check the magnitude of the imaginary component relative to the rest of the signal. And see how much percent variance is explained. If the imaginary component is negligible, you can get rid of it by

eof_maps = real(eof_maps);
pc = real(pc);

Hope this helps,

Hao Guo

It is good and note worthy, however, I have the same problem with Qiao Sun, the example data has some problem with complex double for the results of eof_maps and pc.

Tao Jia

Qiao Sun

It is great code. I used the example data PacOcean.mat you provided, but in the line [eof_maps,pc,expv] = eof(sst), the results of eof_maps and pc were complex double.


J (view profile)

Yu Long



Typo fix in the documentation.


Added a simple example in the tutorial.

MATLAB Release
MATLAB 8.0 (R2012b)

Download apps, toolboxes, and other File Exchange content using Add-On Explorer in MATLAB.

» Watch video