Redirecting arithmetic functions from LAPACK/BLAS?

1 view (last 30 days)
Andrew on 28 Jun 2021
Edited: Jan on 3 Jul 2021
I have a specific line of code:
P2_r = 1 + 0.5*abs(U_O(:,:,ctr) - 0.5) + 0.5*abs(cand - 0.5) - 0.5*abs(U_O(:,:,ctr) - cand);
where U_0 is an array that varries in size from 10's of elements to 100,000+ elements depending on the progression of the parent's code. cand is a single element of U_0, and ctr is simply controlling the 3rd dimension. Now, when U_0 hits a size of approximately 130,000+ elements the computation time of the line of code jumps by 2 to 3 orders of magnitude. This becomes a bottleneck because P2_r is at the core of an optimization routine.
I've read that MATLAB will call LAPACK/BLAS when it is determined the overhead is worth it. If MATLAB is truly calling a different math library, is it possible to control the criteria it uses to determine the switching point? Or redirect LAPACK/BLAS calls back to MATLAB's default algorithms? multiply() and abs() are built-in functions, so I do not have access to the source code.
EDIT: The above figure depicts the jump in computation time observed for arrays of varrying dimensions (columns) around 130,000+ elements. The specific line of code is the bottleneck of a larger rountine that computes the following quantity known as discrepancy: I have also attached a profile of the discrepancy routine that shows P2_r as consuming the largest fraction of computation time
Andrew on 28 Jun 2021
The speed gets slower by a factor of 100 to 1000. By size I mean total number of elements in the array. For instance, a 2D array of 10 rows and 6 columns would be 60 elements

Jan on 28 Jun 2021
Edited: Jan on 3 Jul 2021
The question is not clear yet. Can you post some input, e.g. produced by RAND(), which reproduce the timings you are observing?
P2_r = 1 + 0.5 * abs(U_O(:,:,ctr) - 0.5) ...
+ 0.5 * abs(cand - 0.5) ...
- 0.5 * abs(U_O(:,:,ctr) - cand);
If this line is really the bottleneck, reduce the number of arithmetic operations by moving the scalar parts to the front:
P2_r = 1 + 0.5 * abs(cand - 0.5) ...
+ 0.5 * (abs(U_O(:,:,ctr) - 0.5) - abs(U_O(:,:,ctr) - cand));
With some guessing:
cand = 17;
ctr = 12;
U_O = rand(130000, 10, 20);
tic;
for k = 1:1e2
P2_r = 1 + 0.5 * abs(U_O(:,:,ctr) - 0.5) ...
+ 0.5 * abs(cand - 0.5) ...
- 0.5 * abs(U_O(:,:,ctr) - cand);
end
toc
tic;
for k = 1:1e2
P2_r = 1 + 0.5 * abs(cand - 0.5) ...
+ 0.5 * (abs(U_O(:,:,ctr) - 0.5) - abs(U_O(:,:,ctr) - cand));
end
toc
At least some percent computing time. But I do not see a large jump in computing time.
Maybe you oberserve an exhausted RAM, such that the much slower virtual RAM is used? Or the limit of the 2nd level cache size?
Andrew on 29 Jun 2021
Thanks for the tips! my general intention is overall computational efficiency, so any code improvements are always welcomed!

Cleve Moler on 2 Jul 2021
It might help to use the transposes of your arrays or interchange the order ot your loops. MATLAB stores 2-d arrays by coluumns, For example:
A = reshape(1:25,5,5)
A =
1 6 11 16 21
2 7 12 17 22
3 8 13 18 23
4 9 14 19 24
5 10 15 20 25
If the array is very large, it would be faster to do
sum(A,1)
ans =
15 40 65 90 115
than
sum(A,2)
ans =
55
60
65
70
75
-- Cleve

R2020b

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!