Insert efficiently elements into sorted array

I want to insert elements into a sorted array -one per column- so that the result is also sorted. The following example is very simple but the matrices I am working with are quite big and hence my question is how to do it efficiently rather that how to do it.
A=[1 3 5 7; 2 3 6 7; 3 5 8 9]'
v=[6 1 6] --elements to be inserted
so that the result should be
B=[1 3 5 6 7; 1 2 3 6 7; 3 5 6 8 9]'
My first option was
B=sort([A; v])
but this takes very long with big matrices. Any optimization on this?

1 Comment

Jan
Jan on 8 Mar 2018
Edited: Jan on 8 Mar 2018
As usual for optimizing code: The real dimensions matter. It is important, if you are working with [1e3 x 1e6] or [1e6 x 1e3] matrices.

Sign in to comment.

 Accepted Answer

Here is a mex routine to do the operation in case you are interested. It runs in about 1/2 the time of the m-code on my machine. (Sorry Jan ... I couldn't resist!)
/* --------------------------------------------------------------------
File: sorted_insert.c
sorted_insert inserts vector elements into a sorted matrix by columns
to result in a sorted matrix by columns. Syntax:
C = sorted_insert(A,B)
A = full real double 2D matrix that is sorted by columns (MxN)
B = full real double 1D vector that has same number of elements
as A has columns (1xN) or (Nx1)
C = Matrix A with B elements inserted in sorted fashion. I.e., if B is
a row vector then this is the equivalent of C = sort([A;B],1)
Programmer: James Tursa
Date: March 8, 2018
*/
/* Includes ----------------------------------------------------------- */
#include "mex.h"
/* Prototype in case this is earlier than R2015a */
mxArray *mxCreateUninitNumericMatrix(size_t m, size_t n,
mxClassID classid, mxComplexity ComplexFlag);
/* Gateway ------------------------------------------------------------ */
void mexFunction(int nlhs, mxArray *plhs[], int nrhs, const mxArray *prhs[])
{
mwSize i, i1, i2, i3, j, m, n;
double *Apr, *Bpr, *Cpr;
if( nlhs > 1 ) {
mexErrMsgTxt("Too many outputs");
}
if( nrhs != 2 || !mxIsDouble(prhs[0]) || !mxIsDouble(prhs[1]) ||
mxIsComplex(prhs[0]) || mxIsSparse(prhs[0]) ||
mxIsComplex(prhs[1]) || mxIsSparse(prhs[1]) ) {
mexErrMsgTxt("Need two full real double inputs");
}
if( mxGetNumberOfDimensions(prhs[0]) != 2 ) {
mexErrMsgTxt("First input must be a 2D matrix");
}
m = mxGetM(prhs[0]);
n = mxGetN(prhs[0]);
if( (mxGetM(prhs[1]) != 1 && mxGetN(prhs[1]) != 1) ||
mxGetNumberOfElements(prhs[1]) != n ) {
mexErrMsgTxt("Second input must be vector with same number of elements as columns of first input");
}
plhs[0] = mxCreateUninitNumericMatrix(m+1,n,mxDOUBLE_CLASS,mxREAL);
Apr = mxGetPr(prhs[0]);
Bpr = mxGetPr(prhs[1]);
Cpr = mxGetPr(plhs[0]);
for( j=0; j<n; j++ ) { /* For each column */
i1 = 0;
i3 = m;
while( i3-i1 > 1 ) { /* Binary search to find insert spot */
i2 = (i3 + i1) >> 1;
if( Apr[i2] > *Bpr ) {
i3 = i2;
} else {
i1 = i2;
}
}
if( i1 < m && Apr[i1] > *Bpr ) { /* Potential 1st spot adjustment */
i3--;
}
for( i=0; i<i3; i++ ) { /* Copy the stuff before */
*Cpr++ = *Apr++;
}
*Cpr++ = *Bpr++; /* Copy the vector element */
for( i=i3; i<m; i++ ) { /* Copy the stuff after */
*Cpr++ = *Apr++;
}
}
}

11 Comments

Thanks James, I will give it a try, although I have never worked with mex routines. Maybe it is time to start ...
@Manuel: Mex'ing is not hard. Install a C-compiler, run "mex -O sorted_insert.c" and call sorted_insert() afterwards as it is a standard M-function.
@James: Thank you. I'm not sad about finding your solution here already instead of writing it by my own. :-) +1
Is the binary search faster than testing all values linearly? The binary search has less comparisons, but the linear search uses the processor cache more efficiently.
// !!! UNTESTED !!!
Cpr[m] = *Bpr; // Care for exception B>A(end)
for (i = 0; i < m; i++) {
if (*Apr < *Bpr) {
*Cpr++ = *Apr++;
} else {
*Cpr++ = *Bpr;
memcpy(Cpr, Apr, (m - i) * sizeof(double));
Apr += m - i;
Cpr += m - i;
}
}
Bpr++;
I assume modern compilers optimize the loop to copy the values already, such that memcpy does not give a speedup.
I'm in doubt if this is faster than the binary search, because the branch prediction due to the if is most likely a brake. But it is worth to test.
This should run in multiple threads. Does parfor accelerate the M-code substantially? (I should buy the Parallel Processing Toolbox...)
I ran the linear search method and it didn't make much difference. The binary search method ran faster on my 32-bit and 64-bit MATLAB versions, but only by less than 5% on average.
I did not test parallel loops on the j column loop, so not sure how much effect that would have on the overall timing.
@Manuel Barrio: If James' solution is faster, you can unaccept my answer and select his answer as the best solution.
To distribute your code, you could include the C- and M-source, such that a user can compile the C-Mex function for speed or use the M-code as fallback. In addition comparing the results of M- and C-code can help to identify or exclude bugs in the tool.
@James: Thanks for the tests. It is interesting, that the binary search is just 5% faster. Some programming paradigms are influenced massively by caching strategies of the CPU. With growing data, the advantage of the binary search should be increased, so it is the better choice.
Yes, it makes sense. The fastest one should be the accepted answer.
@James, is it too much if I ask you for a version where the result should be in A (maybe the processing as well since an extra A=B takes time) assuming we can delete the last row of the original A to make room for the inserted vector and therefore the size of A doesn't change? That is exactly what I need. Thanks
To be sure I understand, you want to modify A inplace? And whatever the last row would have been, simply ignore it in the solution? There are pitfalls to modifying variables inplace btw ... you can inadvertently change another variable that is sharing the same data as A. But the code should be fairly easy to write if that is what you really want.
Yes that is what I need. Values should be added in order (per column) into A and stay there until they are read and deleted. This happens regularly and this is why I need all the writes/reads to happen in A. The size doesn't change since the last row can be discharged when there is an insertion. The function should return the index of where the insertion has taken place. Thanks indeed
Here is the code for the inplace version. Note that there is no checking to see if A is sharing data with another variable ... it is up to the user to ensure that this is not the case. This code does the data copy in the forward direction. Copying the data in the reverse direction would take a bit less code but might not use the cache as well (I didn't bother testing that method).
/* --------------------------------------------------------------------
File: sorted_insert_inplace.c
sorted_insert_inplace inserts vector elements into a sorted matrix by
columns to result in a sorted matrix by columns. The operation is done
inplace on the matrix A, and what would have been the last row after the
insertion is discarded. A vector is returned that contains the indexes
of the insertion points. Note that if an insertion point is at the end
of a column, the insertion will not happen (since it would have been in
the last row) and the index returned for that spot will be size(A,1)+1.
Syntax:
C = sorted_insert_inplace(A,B)
A = full real double 2D matrix that is sorted by columns (MxN)
B = full real double 1D vector that has same number of elements
as A has columns (1xN) or (Nx1)
C = Vector that contains the indexes of the insertions (1xN)
The A result will be the equivalent of:
A = sort([A;B],1);
A(end,:) = [];
CAUTION: If the A matrix is sharing data with another variable, then
that other variable will be changed inplace also. It is up to the user
to ensure that there are no other variables sharing data with matrix A,
since this routine does not check for this condition.
Programmer: James Tursa
Date: March 12, 2018
*/
/* Includes ----------------------------------------------------------- */
#include "mex.h"
/* Prototype in case this is earlier than R2015a */
mxArray *mxCreateUninitNumericMatrix(size_t m, size_t n,
mxClassID classid, mxComplexity ComplexFlag);
/* Gateway ------------------------------------------------------------ */
void mexFunction(int nlhs, mxArray *plhs[], int nrhs, const mxArray *prhs[])
{
mwSize i, i1, i2, i3, j, m, n;
double *Apr, *Bpr, *Cpr;
double temp1, temp2;
if( nlhs > 1 ) {
mexErrMsgTxt("Too many outputs");
}
if( nrhs != 2 || !mxIsDouble(prhs[0]) || !mxIsDouble(prhs[1]) ||
mxIsComplex(prhs[0]) || mxIsSparse(prhs[0]) ||
mxIsComplex(prhs[1]) || mxIsSparse(prhs[1]) ) {
mexErrMsgTxt("Need two full real double inputs");
}
if( mxGetNumberOfDimensions(prhs[0]) != 2 ) {
mexErrMsgTxt("First input must be a 2D matrix");
}
m = mxGetM(prhs[0]);
n = mxGetN(prhs[0]);
if( (mxGetM(prhs[1]) != 1 && mxGetN(prhs[1]) != 1) ||
mxGetNumberOfElements(prhs[1]) != n ) {
mexErrMsgTxt("Second input must be vector with same number of elements as columns of first input");
}
plhs[0] = mxCreateUninitNumericMatrix(1,n,mxDOUBLE_CLASS,mxREAL);
Apr = mxGetPr(prhs[0]);
Bpr = mxGetPr(prhs[1]);
Cpr = mxGetPr(plhs[0]);
for( j=0; j<n; j++ ) { /* For each column */
i1 = 0;
i3 = m;
while( i3-i1 > 1 ) { /* Binary search to find insert spot */
i2 = (i3 + i1) >> 1;
if( Apr[i2] > *Bpr ) {
i3 = i2;
} else {
i1 = i2;
}
}
if( i1 < m && Apr[i1] > *Bpr ) { /* Potential 1st spot adjustment */
i3--;
}
*Cpr++ = i3+1;
Apr += i3;
if( i3 < m ) {
temp1 = *Apr;
*Apr++ = *Bpr++;
for( i=i3+1; i<m; i++ ) { /* Copy the stuff after */
temp2 = *Apr;
*Apr++ = temp1;
temp1 = temp2;
}
} else {
Bpr++;
}
}
}
Thank you very much James. Hopefully this will save me a great amount of computation. Cheers,
hey hi need answer for modify INSERTION_SORT.m in mtalab to sort an input array A using binary search instead of linear search and compare the time complexities of both insertion sort Algorithms. please help these...

Sign in to comment.

More Answers (3)

Jan
Jan on 8 Mar 2018
Edited: Jan on 8 Mar 2018
Actually a binary search in the subvectors is optimal. But in my experiments find was not slower even if the binary search was performed in a C-mex function.
A = [1 3 5 7; 2 3 6 7; 3 5 8 9]';
v = [6 1 6];
sA = size(A);
B = zeros(sA(1) + 1, sA(2));
for k = 1:sA(2)
index = find(A(:, k) > v(k), 1);
if isempty(index)
B(1:sA, k) = A(:, k);
B(sA+1, k) = v(k);
else
B(1:index - 1, k) = A(1:index - 1, k);
B(index, k) = v(k);
B(index + 1:sA+1, k) = A(index:sA, k);
end
end
Maybe FEX: insertrows solves the problem efficiently, but it works along the rows. Working along columns is faster.
Please post some timings with relevant input data.

3 Comments

Thanks for the answer. I have checked with the following example and the results show that find outperforms sort by 30%
ntest=1e2;
sA1=1e5; sA2=1e2;
A=sort(rand(sA1,sA2));
B=zeros(sA1+1,sA2);
a=rand(ntest,sA2);
% SORT
tic
for w=1:ntest
B=sort([A; a(w,:)]);
end
toc
% FIND
tic
for w=1:ntest
for k=1:sA2
index=find(A(:,k)>a(w,k),1);
if isempty(index)
B(1:sA1, k) = A(:, k);
B(sA1+1, k) = a(w,k);
else
B(1:index - 1, k) = A(1:index - 1, k);
B(index, k) = a(w,k);
B(index + 1:sA1+1, k) = A(index:sA1, k);
end
end
end
toc
Results:
Elapsed time is 10.036303 seconds.
Elapsed time is 7.089176 seconds.
Jan
Jan on 8 Mar 2018
Edited: Jan on 8 Mar 2018
Do you need it faster? Then I could try a C-mex function. But actually I do not see a bottleneck caused by Matlab here, except that find(A(:,k)>a(w,k),1) might create a copy of A(:, k) without need.
The time for sorting grows super-linear with the size of the data, but find has a linear complexity only. Therefore the advantage of find might be growing with the data size. Is 1e5 x 1e2 a typical size?
I need it as fast as possible since my goal is to increase sA2 as much as possible --minimum 1e3 and ideally 1e4, and so far I cannot even reach 1e3. sA1 should always be in the range of 1e5

Sign in to comment.

Additionally, if you wanted the result to be in A and decided to manipulate A directly with the following code
for k=1:sA2
index=find(A(:,k)>a(w,k),1);
if isempty(index)
A(sA1+1, k) = a(w,k);
else
A(index:sA1+1,k)=[a(w,k); A(index:sA1,k)];
end
end
Then performance drops considerably: Elapsed time is 15.218897 seconds.

2 Comments

Try to replace
A(index:sA1+1,k)=[a(w,k); A(index:sA1,k)];
by
A(index+1:sA1+1,k) = A(index:sA1,k);
A(index) = a(w,k);
With the first [a(w,k); A(index:sA1,k)] must be created explicitly, while it could be possible, that shifting the values in the 2nd method is done inplace.
It doesn't work. Same as before. Thanks anyhow for the suggestion

Sign in to comment.

The best way to insert/delete into a sorted array is ... not using array at all, but a more sophisticated data structure, e.g. red-black tree.
Unfortunately MATLAB does not provide any efficient support for list/tree. One need to use C/C++ for efficient implementation.

Categories

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!