Generalized sorting function with user defined (template-)criterion

IIs there a build in function for sorting cell arrays with a user-defined comparison function(-handle) like qsort in ANSI-C (or corresponding template-based technique in C++/STL... or other modern languages)?
To make it clear what I search for, I added an elementary handmade C-Code-example, that I tested with MS-VS-2015:
Below the code a screenshot of the output. One can see that it is possible with this technique to sort according to arbitrary criteria. In the example, the elements are sorted with respect to the character ascending as a primary criterion and with respect to the numeric element in descending order as a secondary criterion (applied when different elements have the same character).
I hoped that such approaches might also be available as Matlab-build-in concepts for more general fields of data-structures?
#include "stdafx.h"
#include <stdio.h>
#include <stdlib.h>
struct myStrDef {
int num;
char lit;
};
void myStrPrint(myStrDef* myStr) {
printf("%c | %d\n", (*myStr).lit, (*myStr).num);
return;
}
int myCmpfunc(const void * a, const void * b) {
int judgement;
myStrDef v, w;
v = *(myStrDef*)a;
w = *(myStrDef*)b;
judgement = v.lit - w.lit; // sort in alphabetic order
if (0 == judgement) {
judgement = -(v.num - w.num); // minus = in desc. order
}
return judgement;
}
int main() {
int idx;
myStrDef myStrucList[5];
myStrucList[0] = { 10, 'y' };
myStrucList[1] = { 10, 'x' };
myStrucList[2] = { 1, 'z' };
myStrucList[3] = { 100, 'a' };
myStrucList[4] = { 10000, 'a' };
printf("before sorting:\n");
for (idx = 0;idx < 5; idx++) {
myStrPrint(&myStrucList[idx]);
}
qsort(myStrucList, 5, sizeof(myStrDef), myCmpfunc);
printf("\nafter sorting:\n");
for (idx = 0;idx < 5; idx++) {
myStrPrint(&myStrucList[idx]);
}
return(0);
}

Answers (2)

AFAIK there is no such thing in stock function in MATLAB, the main reason is MATLAB function calling mechanism will kill the performance with user-defined comparison.
However it is not hard to write your own quicksort algorithm. My implementation below has another drawback since swapping elements requires deep-copy (?) which is slow.
% Example:
s=arrayfun(@(x) sprintf("%d",x), randi(200,1,20))
s = 1×20 string array
"140" "124" "121" "91" "44" "63" "130" "142" "168" "132" "179" "48" "108" "139" "45" "181" "156" "198" "59" "23"
ss=qsort(s, @(x,y) sign(double(x)-double(y)))
ss = 1×20 string array
"23" "44" "45" "48" "59" "63" "91" "108" "121" "124" "130" "132" "139" "140" "142" "156" "168" "179" "181" "198"
(EDIT: optimize qsort function)
function [A, is] = qsort(A, cmpfun)
% function [As, is] = qsort(A, cmpfun)
%
% A: vector array or vector cell array
% cmpfun: comparison function for form @(x,y)
% return -1 if x<y
% return +1 if x>y
% return 0 otherwise
% where x and y are (cell)elements of A
% OUTPUTS:
% As: A, but sorted, i.e. so that cmpfun(A(i),A(j)) <= 0 for any i<=j
% is: index vector so that A(is) is equal to As.
%
% Example:
% s=arrayfun(@(x) sprintf("%d",x), randi(200,1,20))
% ss=qsort(s, @(x,y) sign(double(x)-double(y)))
n = length(A);
is = 1:n;
if nargin < 2
cmpfun = @(x,y) sign(x-y);
end
is = qsort_helper(A, cmpfun, is);
A = A(is);
end % qsort
%%
% quicksort recursive engine
function idx = qsort_helper(A, cmpfun, idx)
if ~isempty(idx)
[p, idx] = partition(A, cmpfun, idx);
idx(1:p-1) = qsort_helper(A, cmpfun, idx(1:p-1));
idx(p+1:end) = qsort_helper(A, cmpfun, idx(p+1:end));
end
end
%%
function [p, idx] = partition(A, cmpfun, idx)
% idx(1:n) is selft-permutation such that
% pivot = A(idx(p));
% all(A(idx(1:p-1))<=pivot) && all(A(idx(p+1:n))>=pivot)
n = size(idx,2);
p = 1;
if ~isempty(idx)
% pivot index: strategy median-of-three
% A(idx(mid)) <= A(idx(1)) <= A(idx(n))
mid = floor((1 + n) / 2);
if cmpfun(A(idx(mid)),A(idx(n))) > 0
idx = swap(idx, n, mid);
end
if cmpfun(A(idx(mid)),A(idx(1))) > 0
idx = swap(idx, 1, mid);
end
if cmpfun(A(idx(1)),A(idx(n))) > 0
idx = swap(idx, 1, n);
end
pivot = A(idx(p));
i = p+1;
j = n;
% Loop to satisfied pivot condition
while true
while i<n && cmpfun(A(idx(i)),pivot) <= 0
i = i + 1;
end
while j>1 && cmpfun(A(idx(j)),pivot) >= 0
j = j - 1;
end
if i < j
idx = swap(idx, i, j);
i = i + 1;
j = j - 1;
else
% i is first element of right segment
if j > 1
% Last swap to put the pivot as the last element of the left subdivision
idx = swap(idx, 1, j);
p = j;
end
return
end
end
end
end
%%
function idx = swap(idx, i, j)
% swapt two elements idx, indexes by i and j
temp = idx(i);
idx(i) = idx(j);
idx(j) = temp;
end

6 Comments

@Bruno Luong: Just a hint:
s = arrayfun(@(x) sprintf("%d",x), randi(200,1,20));
% simpler:
s = string(randi(200,1,20));
Thanks. I don't use string that often, still stick with the old char array.
I want to close this thread but need a confirmation to have the right understanding:
  • Currently there is not build in mechanism in Matlab that allows a user defined comparison function realize sorting of data-structures similar to STL in C++ or collections in modern JIT-type languages
  • The approach might be problematic because a user provided comparison function (handle) would slow down a build in sorting as Bruno presumed.
  • Of course, on can easily implement a quick-sort (see Brunos answer) and apply it to an array. However, to combine it with Matlab data structures it would always be a rather specific process (index the sorting column from the big data-structure, sort it with own comparison-&-sort- routines, reorder the whole data-structure according to the derived sorting order). The sorting process would inevitably involve for-looping... and could not benefit from build in compiled routines.
If this is correct, then I would close the thread and accept the answer.
Still, for me it would make sense, to have such a template-based sorting option also in Matlab in combination with its data structures (arrays, tables,...).
So, if it is not implemented yet I would leave it as a suggestion for further releases.
"Of course, on can easily implement a quick-sort (see Brunos answer) and apply it to an array"*
Not only to array but struct array a more general data such as CELL-ARRAY where you can stores anything on the cell.
s=struct('name',{'smith' 'jackson' 'anna'},'yearofbirth', {40 70 60})
s = 1×3 struct array with fields:
name yearofbirth
s(1), s(2), s(3),
ans = struct with fields:
name: 'smith' yearofbirth: 40
ans = struct with fields:
name: 'jackson' yearofbirth: 70
ans = struct with fields:
name: 'anna' yearofbirth: 60
% struct-array
ss=qsort(s, @(x,y) sign(x.yearofbirth-y.yearofbirth));
% cell-array
c=num2cell(s);
cs=qsort(c, @(x,y) sign(x{1}.yearofbirth-y{1}.yearofbirth));
ss=cat(2,cs{:})
ss = 1×3 struct array with fields:
name yearofbirth
ss(1), ss(2), ss(3),
ans = struct with fields:
name: 'smith' yearofbirth: 40
ans = struct with fields:
name: 'anna' yearofbirth: 60
ans = struct with fields:
name: 'jackson' yearofbirth: 70
PS: The qsort.m attached in the comment is fater than the ons posted in the awswer.
Hi
@Bruno Luong Thanks for the suggested approach
@Jürgen I understand, you want to sort the data using template-based sorting. Currently, this functionality is not available in MATLAB. I have brought this issue to the notice of the concerned people and it might be considered for a future release.

Sign in to comment.

From my understanding, you are trying to sort values on custom criteria. This can be done using sortrows().
value={10,'y';
10,'x';
1,'z';
100,'a';
1000,'a'
};
a=sortrows(value,[2,1],{'ascend','descend'});
This sorts the cell array based on 2nd column as primary criteria in ascending order and 1st column as secondary criteria in descending order
"Please refer to the documentation of sortrows() for more information".

1 Comment

Hi Prasanna,
I know sortrows, but it is still to restricted and does not cover the options that are provided by template-technique that we know from C/qsort or more advanced from C++/STL. Assume, e.g. a slight extension of the above problem:
value={10,'cyabddd';
10,'bxcv';
1,'azqwe';
100,'mac1';
1000,'zaa9'
};
If the taks is to sort it descending according to the second and third character, then any such thing in arbitrary complexity it can be realized with the generic 'comparison-function' approach. To my experience I could solve it in Matlab only wiht loops... and mapping the sort-column intermediately to a column of sortable properties.
Maybe by applying something like arrayfun... but this twists my brain and I guess it will hardly be feasible.
I guess something like STL or Java/C#-Collectioons would still be a helpful paradigm also to include in Matlab. This would allow to avoid loops and run the sorting under the hood of Mathworks excellent memory management and utilize the natural strength of the template concept.

Sign in to comment.

Categories

Products

Release

R2021b

Asked:

on 20 Jan 2022

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!