Very intriguing program, brilliant in its simplicity. It deals very effectively with the over-fitting issue that often appears in regression problems. It is also very easy to use and well documented.
It would be much more practical though if it actually yielded the coefficients of this polynomial (at least have the option to do so). This way the user won't have to run polyfit.m to find them.
Truly excellent. It has everything you could ask of a Matlab program: good structure, excellent comments, simplicity, flow and efficiency. Thank you for sharing.
It is true that there are different ways of going about this problem, one of which is using the unique function. However, the whole idea of developing the nc program was to avoid the "unique" function as it is slower. Note that the speed of a function is more accurately measured via the profiler program of Matlab, or through the cputime function.
Respectfully, I must disagree. First, the MathWorks help makes it clear that tic/toc is to be preferred over cputime. In addition, tic/toc gives the same answer as the "total time" reported by the profiler.
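For anyone who wants to compare the timing methods mentioned here, a minimal sketch follows. The workload f is a hypothetical stand-in (a call to unique on a random vector), not the nc program, and timeit is an extra option that assumes R2013b or later.

```matlab
% Sketch: time the same call with tic/toc, cputime and timeit.
% f is a hypothetical stand-in workload, not the nc program itself.
r = round(randn(1,1e6)*10);                 % made-up test vector
f = @() unique(r);

tic;  f();  tWall = toc;                    % wall-clock time via tic/toc
t0 = cputime;  f();  tCpu = cputime - t0;   % CPU time; may differ from wall-clock time on multi-core machines
tMed = timeit(f);                           % median of several runs (assumes R2013b or later)

fprintf('tic/toc %.3f s, cputime %.3f s, timeit %.3f s\n', tWall, tCpu, tMed);
```

A single tic/toc measurement can be noisy, so repeating the call or using timeit tends to give steadier numbers.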
I am not sure why you say unique is slower. I just rewrote your function using it and got a 3x speed-up over your code (8s compared to 25s for a vector of length 1e7). The code is much neater and easier to understand, and it provides the same output:
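function [q,Q,C,N] = nc2(O)
% Count the classes (distinct values) of the vector O using unique
Q = unique(O);               % sorted distinct values (the classes)
q = length(Q);               % number of classes
N = hist(O,Q);               % number of elements in each class
C = cell(1,q);               % preallocate the cell array of index lists
for k = 1:q
    C{k} = find(O==Q(k));    % indices of the elements belonging to class k
end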
As I said earlier, please do correct me if I'm missing something silly. I'm not trying to make pedantic criticisms, just to help!
I will hold off a rating for now...
Nice idea but I have some comments:
- Code is not commented and is rather long for what it does. You could do 90% of this using a call to unique and call to hist (2 lines!). Using your output labels:
q=length(unique(A));
Q=unique(A);
Then:
N=hist(A,unique(A)); %the number in each class
The class index of each element is the 3rd output of unique. You can either use that as is or turn it into a cell array of indices as you have (C).
- Your code doesn't seem faster to me; it seems about 6 times slower (and my timing does not even build C, although C may not be necessary). What am I missing?
>> r=round(randn(1,10e5)*10);
>> tic,[b,i,j]=unique(r);a=hist(r,b);toc
Elapsed time is 0.570190 seconds.
>> tic,nc(r);toc
Elapsed time is 3.543892 seconds.
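The suggestion above, taking the class membership from the third output of unique and turning it into a cell array, could be sketched roughly as follows. This is only an illustration: the example vector A is made up, accumarray is swapped in for hist and for an explicit loop, and it has not been timed against nc.

```matlab
% Sketch: build q, Q, N and C from a single call to unique.
A = round(randn(1,1e5)*10);            % made-up example data
[Q,~,j] = unique(A);                   % Q: classes, j: class index of each element
q = numel(Q);                          % number of classes
N = accumarray(j(:), 1).';             % number of elements in each class
% Collect the positions of each class; sort because accumarray does not
% guarantee the order in which values reach the function handle.
C = accumarray(j(:), (1:numel(A)).', [], @(idx) {sort(idx).'}).';
```

Each C{k} should then match find(A==Q(k)), and sum(N) should equal numel(A).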
More correct version:
function y = d2b(x)
% Convert a decimal number into a binary array
%
% Similar to dec2bin but yields a numerical array instead of a string and is found to
% be rather faster. x is expected to be a positive integer.
if x==1
    y=1;
    return
end
c = ceil(log(x)/log(2)) + 1; % Number of divisions necessary ( rounding up the log2(x) )
y(c) = 0; % Initialize output array
for i = 1:c
    r = floor(x / 2);
    y(c+1-i) = x - 2*r;
    x = r;
end
% If there is a leading zero, remove it.
if(y(1) == 0)
    y(1) = [];
end
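One way to gain some confidence in the function above is to compare it with dec2bin over a range of inputs. The sketch below assumes the code has been saved as d2b.m on the MATLAB path; the test values are arbitrary choices.

```matlab
% Sketch: check d2b against dec2bin for some positive integers.
% Assumes d2b.m (as posted above) is on the MATLAB path.
for x = [1:1024, 2^20-1, 2^20, 2^20+1]
    expected = dec2bin(x) - '0';       % character digits -> numeric 0/1 row vector
    assert(isequal(d2b(x), expected), 'd2b mismatch at x = %d', x);
end
```

The loop runs silently if every value agrees and stops with an error at the first mismatch.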