Stochastic spread method for pairs trading by Elliott et al. (2005) - Kalman filter + EM algorithm in MATLAB, am I doing something wrong?

I am implementing the stochastic spread method for pairs trading by Elliott et al. (2005).
The procedure models the spread between two stocks, log(P1)-log(P2), as a mean-reverting process calibrated from market observations.
The hidden state process for the spread can be written like this:
x_{t+1} = A + Bx_t + Ce_{t+1}
The observation process is:
y_t = x_t + D*w_t
Both e_t and w_t are i.i.d. Gaussian N(0,1).
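To make the model concrete, here is a minimal simulation sketch of the two equations above. All parameter values (A, B, C, D), the seed, and the choice to start the state at A/(1-B) are illustrative assumptions, not estimates from real data or values from the paper.

```matlab
% Simulate x_{t+1} = A + B*x_t + C*e_{t+1} and y_t = x_t + D*w_t.
% Every number below is a hypothetical choice for illustration only.
rng(1);                 % fix the seed so the run is repeatable
A = 0.02; B = 0.9;      % assumed drift and persistence, with |B| < 1
C = 0.05; D = 0.03;     % assumed state and observation noise scales
T = 252;                % same length as the training window

x = zeros(T,1);         % hidden spread
y = zeros(T,1);         % observed spread
mu = A/(1-B);           % long-run mean of the AR(1) state when |B| < 1
x(1) = mu;              % assumption: start the state at its long-run mean
y(1) = x(1) + D*randn;
for t = 1:T-1
    x(t+1) = A + B*x(t) + C*randn;   % state equation
    y(t+1) = x(t+1) + D*randn;       % observation equation
end
```

A series simulated this way, with known parameters, can serve as a check on the EM code before it is run on real log-price differences.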
My procedure: (P is the same as s above)
To generate the spread between two stocks, I take the difference between the log-prices: y=log(p1)-log(p2).
I set a training period of 252 days, where I estimate the initial parameters (A, B, C2 and D2) using the EM algorithm. I implement the EM algorithm using all the data for the training period; that is y(1), y(2), ..., y(252) as well as initial guesses for A, B, C2 and D2:
2a. I set x_{1|1}=y(1). Furthermore I set the MSE, P_{1|1}=D2, my initial guess for D^2.
2b. I recursively compute the Kalman filter quantities x_{t|t}, x_{t+1|t}, P_{t|t}, P_{t+1|t} and k_{t} for all t=1...252 (the entire training period) using my initial guesses for A, B, C2 and D2.
2c. After I have run the Kalman filter over the entire training period, I compute the Kalman smoother quantities by backward recursion over the same period, t=252...1. These are x_{t|T}, P_{t|T}, P_{t,t-1|T} and j_{t}.
I then compute the log-likelihood value and the updated values for A, B, C2 and D2. I repeat the filtering, smoothing and update steps with the new parameter values until the log-likelihood converges, which gives me the final estimates of A, B, C2 and D2.
Is it correct to run the Kalman filter over the entire training period before starting the Kalman smoother? Or should I, for example, run the filter up to t=2, then the smoother for T=2, then the filter up to t=3, then the smoother for T=3, and so on?
Now I have values for A, B, C2 and D2 and can begin my test period, which is also 252 days. I don't update my estimates of A, B, C2 and D2, but keep them constant. For each new observation I compute the Kalman filter quantities (the same as in 2b).
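The ordering described in steps 2b and 2c can be sketched as one forward filtering pass over all 252 points followed by one backward smoothing pass. This is a minimal sketch under stated assumptions: the data are synthetic, the parameter values are hypothetical guesses, the backward pass uses the standard scalar fixed-interval (Rauch-Tung-Striebel) recursions, and the lag-one covariance P_{t,t-1|T} and the EM parameter update are left out. The array names (xf, Pf, xp, Pp, xs, Ps) are made up for this example.

```matlab
% Forward Kalman filter, then backward smoother, for the scalar spread model.
rng(2);
A = 0.02; B = 0.9; C2 = 0.05^2; D2 = 0.03^2;   % hypothetical initial guesses
T = 252;
% Synthetic series standing in for y = log(p1) - log(p2)
x = zeros(T,1); x(1) = A/(1-B);
for t = 1:T-1, x(t+1) = A + B*x(t) + sqrt(C2)*randn; end
y = x + sqrt(D2)*randn(T,1);

xf = zeros(T,1); Pf = zeros(T,1);   % x_{t|t},   P_{t|t}
xp = zeros(T,1); Pp = zeros(T,1);   % x_{t|t-1}, P_{t|t-1} (entry 1 unused)
xf(1) = y(1); Pf(1) = D2;           % initialisation from step 2a
for t = 2:T                         % forward pass over the whole sample
    xp(t) = A + B*xf(t-1);
    Pp(t) = B^2*Pf(t-1) + C2;
    k     = Pp(t)/(Pp(t) + D2);
    xf(t) = xp(t) + k*(y(t) - xp(t));
    Pf(t) = Pp(t) - k*Pp(t);
end

xs = xf; Ps = Pf;                   % at t = T, smoothed equals filtered
for t = T-1:-1:1                    % backward pass, started only after t = T
    J     = B*Pf(t)/Pp(t+1);                      % smoother gain j_t
    xs(t) = xf(t) + J*(xs(t+1) - xp(t+1));        % x_{t|T}
    Ps(t) = Pf(t) + J^2*(Ps(t+1) - Pp(t+1));      % P_{t|T}
end
```

Each backward step at time t uses the filtered values at t and the predicted values at t+1, so in this sketch the forward pass has to reach T before the backward pass can start.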
Elliott gives the Kalman filter equations in his paper, which I have implemented in my code for the updating step:
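function [xt_t,Pt_t,xt_tm,kt,Pt_tm]=EMupdate(DATA_t,xt_t_m1,Pt_t_m1,A,B,C2,D2)
% One Kalman filter step for the scalar spread model.
Pt_tm=B^2*Pt_t_m1+C2;           % P_{t|t-1}: predicted MSE
kt=Pt_tm/(Pt_tm+D2);            % Kalman gain
xt_tm=A+B*xt_t_m1;              % x_{t|t-1}: predicted state
xt_t=xt_tm+kt*(DATA_t-xt_tm);   % x_{t|t}: filtered state
Pt_t=Pt_tm-kt*Pt_tm;            % P_{t|t}: filtered MSE (returned as 2nd output)
end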
where
xt_t is x_{t|t}
xt_t_m1 is x_{t-1|t-1}
xt_tm is x_{t|t-1}
Pt_t is P_{t|t} (the MSE)
Pt_t_m1 is P_{t-1|t-1}
Pt_tm is P_{t|t-1}
kt is the Kalman gain for time t
DATA_t is the observed data for time t, y_t
A, B, C2, D2 are the estimated parameters (which I have estimated using the EM algorithm in another code).
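For reference, the recursion computed by the function can be written out in the notation of the list above. This is a transcription of the code lines, assuming C2 and D2 stand for C^2 and D^2:

```latex
\begin{aligned}
P_{t|t-1}       &= B^{2} P_{t-1|t-1} + C^{2} \\
k_t             &= \frac{P_{t|t-1}}{P_{t|t-1} + D^{2}} \\
\hat{x}_{t|t-1} &= A + B\,\hat{x}_{t-1|t-1} \\
\hat{x}_{t|t}   &= \hat{x}_{t|t-1} + k_t \left( y_t - \hat{x}_{t|t-1} \right) \\
P_{t|t}         &= P_{t|t-1} - k_t P_{t|t-1} = (1 - k_t)\, P_{t|t-1}
\end{aligned}
```

Since k_t lies between 0 and 1, the gain approaches 1 when D^2 is small relative to P_{t|t-1}, and x_{t|t} then moves almost all the way to y_t.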
This update step is done every time a new data point arrives. I am storing all the x's, P's and k's in vectors. I am supposed to compare y_t with x_{t|t-1} and initiate a trade when the two deviate by a large amount. However, the two follow each other very closely, and I am unsure whether I have done something wrong.
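A minimal sketch of the test-period loop and the comparison of y_t with x_{t|t-1} is given below. The data are synthetic and the parameters are hypothetical fixed values. The threshold nSigma and the idea of scaling the deviation by sqrt(P_{t|t-1} + D2), the standard deviation of the one-step prediction error in this model, are assumptions made for illustration and are not a rule taken from the paper. The names xpred, kgain, z and flag are also made up for this example.

```matlab
% Test-period filtering with fixed parameters and a simple deviation flag.
rng(3);
A = 0.02; B = 0.9; C2 = 0.05^2; D2 = 0.03^2;   % hypothetical, held constant
T = 252;
% Synthetic series standing in for the test-period spread
x = zeros(T,1); x(1) = A/(1-B);
for t = 1:T-1, x(t+1) = A + B*x(t) + sqrt(C2)*randn; end
y = x + sqrt(D2)*randn(T,1);

xt_t = y(1); Pt_t = D2;        % same initialisation as in step 2a
xpred = nan(T,1);              % stores x_{t|t-1}
kgain = nan(T,1);              % stores k_t
z     = nan(T,1);              % standardised deviation of y_t from x_{t|t-1}
nSigma = 2;                    % hypothetical trade threshold
for t = 2:T
    Pt_tm = B^2*Pt_t + C2;                     % P_{t|t-1}
    xt_tm = A + B*xt_t;                        % x_{t|t-1}
    kt    = Pt_tm/(Pt_tm + D2);                % Kalman gain
    z(t)  = (y(t) - xt_tm)/sqrt(Pt_tm + D2);   % deviation in std-dev units
    xt_t  = xt_tm + kt*(y(t) - xt_tm);         % x_{t|t}
    Pt_t  = Pt_tm - kt*Pt_tm;                  % P_{t|t}
    xpred(t) = xt_tm; kgain(t) = kt;
end
flag = abs(z) > nSigma;        % true where the deviation exceeds the threshold
```

Inspecting kgain is a quick diagnostic: values close to 1 mean the filter trusts each observation almost fully, in which case x_{t|t} sits very near y_t.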
My results look like this:
In contrast, a paper by Chen, Ren and Lu reports the following results:
NB: Not the same security... but the difference is obvious nonetheless.
I really hope someone can help me out, as I am really stuck. Thanks very much.
Best, Johan
