Let's solve the inner loop first (I prefer "j", because the lowercase "L" looks like a one):
R = rand; r = rand; lambda = rand;
Slm = rand(70); Clm = rand(70);
Plm = rand(70, 71);
Cl = rand(70, 1); Pl = rand(70, 1);
s = 0;
for j = 2:70
  s = s + ((R/r)^j) * (j+1) * Pl(j) * Cl(j) + ...
      sum(((R/r)^j) * (j+1) * Plm(j, 1:j) .* ...
      (Clm(j, 1:j) .* cos((1:j) * lambda) + ...
       Slm(j, 1:j) .* sin((1:j) * lambda)));
end
But this is 4 times slower than the original version under Matlab 2009a!
Let's try to avoid the repeated power, COS and SIN:
Rr = R / r;
RrL = Rr;                          % Running power: holds (R/r)^(j-1) at loop entry
cosLambda = cos((1:70) * lambda);  % Evaluate COS and SIN only once
sinLambda = sin((1:70) * lambda);
u1 = uint8(1);
s = 0;
for j = uint8(2):uint8(70)
  RrL = RrL * Rr;                  % Now (R/r)^j
  q = RrL * (double(j) + 1);
  t = Pl(j) * Cl(j);
  for m = u1:j
    t = t + Plm(j,m) * ...
        (Clm(j, m) * cosLambda(m) + ...
         Slm(j, m) * sinLambda(m));
  end
  s = s + q * t;
end
EDITED: 40% faster with UINT8 loop indices instead of DOUBLEs! INT32 gives the same speed-up, but UINT32 only 25%.
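A side note on the double(j) in the loop with integer indices: in Matlab, arithmetic between a DOUBLE scalar and an integer value returns the integer class, so the result is rounded. A small sketch of the effect, with an arbitrary test value for RrL (my assumption, not taken from the code above):

```matlab
% Mixed integer/double arithmetic returns the integer class.
j    = uint8(3);
RrL  = 0.7;                      % Arbitrary test value (assumption)
qInt = RrL * (j + 1);            % UINT8 result: 2.8 is rounded to 3
qDbl = RrL * (double(j) + 1);    % DOUBLE result: 2.8
disp(class(qInt));               % uint8
disp(class(qDbl));               % double
```

So without the conversion the factor q would silently lose its fractional part.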
The nested-loop version is 12 times faster than the original version - with FOR loops!
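Before comparing speeds it is worth checking that both formulations give the same number. The following is only a sketch: it uses plain DOUBLE indices, and I set r = 1 + rand so that R/r stays below 1 and the powers remain bounded (that choice is my assumption, the test data above uses r = rand).

```matlab
n = 70;
R = rand; r = 1 + rand; lambda = rand;   % r > R keeps (R/r)^j bounded (assumption)
Slm = rand(n); Clm = rand(n); Plm = rand(n, n + 1);
Cl = rand(n, 1); Pl = rand(n, 1);

% Version 1: vectorized inner sum, power and COS/SIN evaluated for every j.
s1 = 0;
for j = 2:n
  m = 1:j;
  s1 = s1 + (R/r)^j * (j + 1) * (Pl(j) * Cl(j) + ...
       sum(Plm(j, m) .* (Clm(j, m) .* cos(m * lambda) + ...
                         Slm(j, m) .* sin(m * lambda))));
end

% Version 2: nested loops with running power and cached COS/SIN.
Rr = R / r;
RrL = Rr;
cosLambda = cos((1:n) * lambda);
sinLambda = sin((1:n) * lambda);
s2 = 0;
for j = 2:n
  RrL = RrL * Rr;                        % (R/r)^j
  t = Pl(j) * Cl(j);
  for m = 1:j
    t = t + Plm(j, m) * (Clm(j, m) * cosLambda(m) + Slm(j, m) * sinLambda(m));
  end
  s2 = s2 + RrL * (j + 1) * t;
end

relErr = abs(s1 - s2) / max(abs(s1), eps);
fprintf('relative difference: %g\n', relErr);
```

The two sums are accumulated in a different order, so expect a difference at the level of rounding errors rather than exactly zero.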
So vectorized does not necessarily mean faster. The JIT acceleration introduced with Matlab 6.5 increases the speed of this loop remarkably. Avoiding repeated powers and trigonometric calculations is also important.
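To see how much the repeated trigonometric calls cost on your own machine, a rough TIC/TOC comparison could look like this. It is a minimal sketch: the repetition count nRep is an arbitrary choice of mine, and the measured times depend strongly on the Matlab version and the computer.

```matlab
lambda = rand;
n      = 70;
nRep   = 1000;   % Repetitions to get a measurable duration (assumption)

% Variant A: recompute COS for every j.
tic;
for k = 1:nRep
  a = 0;
  for j = 2:n
    a = a + sum(cos((1:j) * lambda));
  end
end
tA = toc;

% Variant B: compute COS once, then index into the cached vector.
tic;
for k = 1:nRep
  cosLambda = cos((1:n) * lambda);
  b = 0;
  for j = 2:n
    b = b + sum(cosLambda(1:j));
  end
end
tB = toc;

fprintf('recomputed: %.4f s, cached: %.4f s\n', tA, tB);
```

Run it a few times and ignore the first run, because the first call includes the time for parsing and compiling the code.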
The old tale of the slow FOR loops is very sticky.