is x' * x faster than x.^2 (x is a vector)

150 views (last 30 days)
Hi~ Somebody recommended me to use x' * x.
Is it faster than sum(x.^2) ? Any thoughts are appreciated. Thanks!
[EDITED, sum() was forgotton at first]
  1 Comment
Bruce Elliott
Bruce Elliott on 9 Jan 2019
I had the same question. Years ago we were taught that even for scalars x*x would be faster than x^2, since the "power" function actually evaluates x^n as exp(n*ln(x)), in order to handle non-integer exponents.
Is that true for Matlab? I just ran some simple timing tests, and it does not seem to be true. I'm guessing that the power function is smart enough to detect integer exponents.

Sign in to comment.

Accepted Answer

John D'Errico
John D'Errico on 3 Oct 2014
Edited: John D'Errico on 3 Oct 2014
Um, the two do completely different things. Or, at least significantly different things.
Assuming that x is a column vector, x'*x will yield the SUM of squares of the elements, returning a SCALAR value. This is a dot product.
x.^2 returns a vector of squares of the elements.
And, of course, if x is a row vector, then x'*x yields an n by n array of all products of elements.
So whoever told you the two do the same thing was simply wrong. Had they told you to compare x'*x (with x a column vector) with sum(x.^2), then they would have been correct that the two do the same thing, as long as x contains only real values. If x has complex numbers in it, then again, they would have been wrong, even if you included that sum.
x = [1 + i; 2 - i]
x =
1 + 1i
2 - 1i
ans =
ans =
3 - 2i
So you need to think carefully about what you need to do. Do you need the sum of the squares? Then you need to compare the dot product with the sum of squares. Do you want to form a vector of squares of the elements? Then compare x.*x with x.^2, with no transpose. Only you know what you are looking for.
IF x is a real column vector AND you use a sum in the second version, then and only then can you compare things, but even then one must be careful. The length of the vectors will be crucial, since MATLAB probably makes decisions on how to do those computations based on the length. Well, actually it is probably the BLAS that makes those decisions. And for small vectors, the difference is nothing to worry about anyway.
But why not test it yourself, rather than asking a question? You will ALWAYS learn something by playing around. Using Steve Eddins timeit from the FEX, I find that...
x = rand(1e6,1);
timeit(@() x'*x)
ans =
timeit(@() sum(x.^2))
ans =
For larger vectors, the difference is similar. This time I'll try a couple of other options, just for kicks.
x = rand(1e7,1);
timeit(@() x'*x)
ans =
timeit(@() sum(x.^2))
ans =
timeit(@() norm(x).^2)
ans =
timeit(@() dot(x,x))
ans =
timeit(@() sum(x.*x))
ans =
timeit(@() sum(bsxfun(@times,x,x)))
ans =
The reason it is a bit faster is probably because the vector dot product is implemented in the BLAS using pretty highly optimized code, but it looks like the dot version costs due to some function overhead. But unless your vectors are seriously large, and you are doing the computation a lot, don't bother. And in any case, make sure you understand when the two will yield different results. (I was surprised by the way that the bsxfun version was second fastest.)
Similarly, if you really wanted to compute a vector of squares of elements, then again, a simple test is best.
x = rand(1e7,1);
timeit(@() x.*x)
ans =
timeit(@() x.^2)
ans =
Note that the two results are VERY close here. By the way, see that my use of timeit is important, as tic and toc are a dangerous way to compare time. (A simple use of tic and toc does not factor in the startup overhead, so the first time a function is called, it takes longer to run. As well, a single call is not a good estimator of the time required, since your own CPU often does things behind the scene that may cost time. You must make replicate calls, and average the time after deleting the first few. timeit does all this for you and more.)
Next, it is important to recognize that there are differences if the comparison is done as a script, versus inside a function. If your computations are done inside functions, then MATLAB does additional accelerations that are not done in scripts. So be careful when comparing things at the command line, and assuming that your timed results will be valid in a function!
And, yes, please accept the answer that does give a clear AND complete answer to your question.
John D'Errico
John D'Errico on 5 Oct 2014
Neat. I've long wondered why Steve's great FEX submission did not get into MATLAB. It finally did.

Sign in to comment.

More Answers (1)

José-Luis on 3 Oct 2014
Edited: José-Luis on 3 Oct 2014
Why don't you test it?
x = rand(10^6,1);
res1 = x.*x; %The transpose is unnecessary;
res2 = x.^2;
Elapsed time is 0.005957 seconds.
Elapsed time is 0.016413 seconds.
Please accept the answer that best solves your problem.
Matheus Rocha
Matheus Rocha on 9 Aug 2021
Edited: Matheus Rocha on 9 Aug 2021
A tic is necessary after the first toc in order to reset the time counter. Without this, the second toc will return the time spent in computing both res1 and res2.

Sign in to comment.


Find more on Performance and Memory in Help Center and File Exchange

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!