R = ALLSTATS(A) returns a structure R with several statistics of the vector A. A grouping factor can be given as an optional second vector, in which case all the statistics are calculated per group. The statistics calculated (returned as fields of R) are:
R.std = standard deviation
R.q2p5 = 2.5th percentile
R.q5 = 5th percentile
R.q25 = 25th percentile
R.q50 = 50th percentile (median)
R.q75 = 75th percentile
R.q95 = 95th percentile
R.q97p5 = 97.5th percentile
% Example. No groups
% Example. With 2 groups
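As a rough illustration of per-group statistics, the following sketch uses only base MATLAB (unique and accumarray) rather than ALLSTATS itself. The data and the variable names A, G, grpMean, grpStd, and grpMed are hypothetical.

```matlab
% Hypothetical data: six observations split into two groups
A = [1 2 3 10 20 30];
G = [1 1 1 2 2 2];

% Map each group label to an index 1..Ngroups
[labels, ~, idx] = unique(G(:));

% Per-group statistics; each result has one row per group
grpMean = accumarray(idx, A(:), [], @mean);    % [2; 20]
grpStd  = accumarray(idx, A(:), [], @std);     % [1; 10]
grpMed  = accumarray(idx, A(:), [], @median);  % [2; 20]
```

Row k of each result corresponds to labels(k), so character or non-consecutive labels could be handled the same way.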
Francisco de Castro (2020). Allstats (https://www.mathworks.com/matlabcentral/fileexchange/25572-allstats), MATLAB Central File Exchange. Retrieved .
Great function, thank you. I've amended it to allow for a character based grouping vector, and allowed for secondary groupings as well. I've posted it up on github (with the license file) for anyone to branch off.
I appreciate the author's flexibility in bringing this function closer to the needs of the community.
It is nice to see some improvements, Francisco! A couple of additional comments: if you are so worried about speed, I suggest you take a look at logical indexing, and replace all the calls to zeros in the struct command with a single call to zeros.
Another point: the behavior for 2D matrices should be explained, especially when groups are specified. I think you did not intend that behavior. For instance, check whether the following calls produce what you want them to produce:
A = rand(3,4) ; allstats(A), allstats(A,1:4), allstats(A,1:3)
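One way to see the ambiguity: A has 12 elements, while 1:4 and 1:3 have 4 and 3, so neither is a per-element grouping vector. If the intent were one label per column (an assumption, not documented behavior of ALLSTATS), the labels could be expanded to match A before grouping. A sketch in base MATLAB with hypothetical names:

```matlab
% 3-by-4 matrix with known values instead of rand, so results are predictable
A = reshape(1:12, 3, 4);

% Hypothetical interpretation: one group label per column
colGroups = 1:4;

% Expand the labels so that every element of A has its own label
G = repmat(colGroups, size(A, 1), 1);

% Now data and groups have the same number of elements
sameSize = numel(G) == numel(A);

% Per-column (per-group) means
colMeans = accumarray(G(:), A(:), [], @mean);   % [2; 5; 8; 11]
```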
As for the help, you could add examples of the expected output rather than giving only the calling syntax.
Below is an outline of more optimized code:
function R = allstats(DATA, GROUP)
% check for the Statistics Toolbox (here by looking for nanmean)
if ~exist('nanmean','file')
    error('Function requires STATS toolbox.') ;
end
if nargin < 2
    % no groups given: treat all data as a single group
    GROUP = ones(numel(DATA),1) ;
    Ngroups = 1 ;
    UniqueGroups = 1 ;
else
    if numel(DATA) ~= numel(GROUP)
        error('Number of elements should match') ;
    end
    UniqueGroups = unique(GROUP) ;
    Ngroups = numel(UniqueGroups) ;
end
Z = nan(Ngroups,1) ;
R = struct('min',Z,'max',Z,'mean',Z) ; % etc.
for k = 1:Ngroups
    Q = GROUP == UniqueGroups(k) ; % logical indexing
    R.min(k) = nanmin(DATA(Q)) ; % fill element k of the preallocated field
    R.max(k) = nanmax(DATA(Q)) ;
    R.mean(k) = nanmean(DATA(Q)) ;
end
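The outline relies on nanmin, nanmax, and nanmean from the Statistics Toolbox. Where that toolbox is not available, a comparable result can probably be had from base MATLAB: min and max skip NaN values by default, and mean accepts an 'omitnan' flag in reasonably recent releases (an assumption about the reader's version). A minimal sketch with made-up data:

```matlab
% Hypothetical data containing one missing value
x = [3 NaN 1 2];

mn = min(x);               % min ignores NaN by default
mx = max(x);               % max ignores NaN by default
mu = mean(x, 'omitnan');   % mean needs the flag, otherwise the result is NaN
```

Here mn is 1, mx is 3, and mu is 2, while mean(x) without the flag would return NaN.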
Tried to address the comments by Hanselman and Jos. The requirement of the Statistics TB is now shown. Help is (hopefully) less ambiguous. Some error checking is included, for numeric arguments and sizes of data and groups. By the way, examples WERE given from the beginning.
Grouping now works for the cases mentioned by Jos, i.e. the following cases all work:
Regarding Jos's comment #2: I don't see the advantage of using NARGIN > 1 instead of ~ISEMPTY(varargin). Also, when no groups are specified, the 'other piece of code' is used to avoid FIND, which can be slow and would be useless in this case.
This submission can be improved considerably, considering the following:
1) When the data and group specification do not match, e.g., allstats([1 2 3],[1 2]), it should error.
2) When the group data is not specified, a different piece of code is executed than when it is specified. Why not set the group data to all zeros when no second argument is given? (And why not use nargin, rather than varargin, to test whether the second argument is specified?)
3) Grouping does not work flawlessly: for instance ALLSTATS(rand(1,5),1:5) returns overall statistics, rather than statistics for the five individual groups
4) As Duane already mentioned, there should be a reference, and some See Also's to the Stats TB and other functions.
So, a lot to improve on this potentially useful pick-of-the-week wrapper function.
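Points 1 and 3 can be sketched in base MATLAB as follows. This is not the author's implementation, only an illustration under the assumption that every distinct label should yield its own result; the data values and variable names are hypothetical.

```matlab
% Hypothetical data: five observations, each in its own group
data  = [4 8 15 16 23];
group = 1:5;

% Point 1: refuse mismatched inputs such as allstats([1 2 3],[1 2])
assert(numel(data) == numel(group), 'Number of elements should match');

% Point 3: one statistic per distinct group label
[ug, ~, idx] = unique(group(:));
grpMean = accumarray(idx, data(:), [], @mean);

% With five singleton groups, each group mean is the observation itself
nGroups = numel(ug);
```

With these inputs nGroups is 5 and grpMean equals data(:), rather than a single overall mean.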
This function calls functions that exist in the Statistics toolbox, but does not make that requirement known. No error checking is done. Help text is ambiguous and no examples are given. The last two help sentences describe unknown concepts that require more than a passing inspection of the function code to understand.
Calculation of the mode is adapted to discrete variables
Take into account the shape of data and groups vectors
Clarified help (hopefully). Included some error checking.
Added kurtosis and skewness