Here is one way, though I bet there are others. It is about 10x faster on my hardware/version, but only about 2-3x faster on the web version.
B = rand(pagesz(1),pagesz(2),npages);
c2 = sum(B.*permute(sum(A,2),[3 2 1]),3);
c3 = c3 + sum(reshape(A(:, j), 1, 1, []).*B, 3);
While Chunru's proposal might not generally be as fast as what I proposed, it preserves the order of operations, so the accumulated floating-point error will be smaller.
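To make the accuracy comparison concrete, here is a minimal sketch that computes both vectorized variants alongside an explicit double loop and reports how far each one lands from the loop result. The sizes (`npages`, `n`, `pagesz`), the layout of `A` (one row per page), and the reference loop itself are my assumptions for illustration, not something stated above. It relies on implicit expansion, so it needs R2016b or later.

```matlab
% Assumed sizes (hypothetical; pick whatever matches your data)
npages = 50;
n      = 20;
pagesz = [30 40];

rng(0);                                   % reproducible random inputs
A = rand(npages, n);                      % assumed layout: one row of weights per page
B = rand(pagesz(1), pagesz(2), npages);

% Assumed reference: explicit double loop over columns of A and pages of B
c1 = zeros(pagesz);
for j = 1:n
    for i = 1:npages
        c1 = c1 + A(i, j) * B(:, :, i);
    end
end

% Fully vectorized: collapse A across its columns first, then weight the pages
c2 = sum(B .* permute(sum(A, 2), [3 2 1]), 3);

% Semi-vectorized: one column of A per iteration, pages summed inside
c3 = zeros(pagesz);
for j = 1:n
    c3 = c3 + sum(reshape(A(:, j), 1, 1, []) .* B, 3);
end

% Largest absolute deviation of each variant from the loop result
err2 = max(abs(c2(:) - c1(:)));
err3 = max(abs(c3(:) - c1(:)));
fprintf('max |c2 - c1| = %g\nmax |c3 - c1| = %g\n', err2, err3);
```

Both deviations should be at the level of rounding error. How they compare to each other can depend on the data and on how `sum` orders its additions internally, so it is worth running this on inputs shaped like your own.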