Allstats

version 1.6.0.0 (2 KB) by Francisco de Castro
Many statistics of a vector or matrix

17 Downloads

Updated 21 Jul 2011

Editor's Note: This file was selected as MATLAB Central Pick of the Week

R= ALLSTATS(A) returns a structure R with several statistics of the vector A. An optional second vector can supply a grouping factor, in which case all statistics are calculated per group. The statistics calculated (returned as fields of R) are:
R.min= minimum
R.max= maximum
R.mean= mean
R.std= standard deviation
R.mode= mode
R.q2p5= 2.5th percentile
R.q5= 5th percentile
R.q25= 25th percentile
R.q50= 50th percentile (median)
R.q75= 75th percentile
R.q95= 95th percentile
R.q97p5= 97.5th percentile
R.kurt= Kurtosis
R.skew= Skewness
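
The two shape statistics at the end of the list can be illustrated with central moments. The following is a minimal sketch in base MATLAB, assuming that R.skew and R.kurt follow the default (biased, non-excess) convention of the Statistics Toolbox; the page does not confirm which convention the submission uses, and the data vector is made up for illustration.

```matlab
% Sketch: moment-based skewness and kurtosis without any toolbox.
% Assumption: biased estimators, kurtosis NOT reduced by 3.
x  = [2; 4; 4; 4; 5; 5; 7; 9];   % illustrative data, not from the submission
d  = x - mean(x);                 % deviations from the mean
m2 = mean(d.^2);                  % second central moment (biased variance)
m3 = mean(d.^3);                  % third central moment
m4 = mean(d.^4);                  % fourth central moment
skew = m3 / m2^1.5;               % positive here: the right tail is longer
kurt = m4 / m2^2;                 % a normal distribution would give 3
fprintf('skewness = %.5f, kurtosis = %.5f\n', skew, kurt);
```

With these eight values the mean is 5, so the moments are easy to check by hand.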

% Example. No groups
x= rand(10,1);
R= allstats(x)

% Example. With 2 groups
g= [1;1;1;1;1;2;2;2;2;2];
R= allstats(x,g)
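
To make "per group" concrete, here is a small sketch that computes a few of the listed statistics for each group using only base MATLAB. It is not the submission's implementation: it assumes the group labels are positive integers (as in the example above) and uses fixed data instead of rand so the results can be checked by hand. The variable names are hypothetical.

```matlab
% Sketch: per-group statistics with accumarray (base MATLAB, no toolbox).
% Assumption: group labels are positive integers 1..N.
x = [3; 1; 4; 1; 5; 9; 2; 6; 5; 3];   % illustrative data
g = [1; 1; 1; 1; 1; 2; 2; 2; 2; 2];   % two groups of five values each
grpMin  = accumarray(g, x, [], @min);     % one minimum per group
grpMax  = accumarray(g, x, [], @max);     % one maximum per group
grpMean = accumarray(g, x, [], @mean);    % one mean per group
grpMed  = accumarray(g, x, [], @median);  % one median per group
disp([grpMin grpMax grpMean grpMed]);     % rows = groups, columns = statistics
```

Each result has one row per group, which is the same idea as the per-group fields described above.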

Cite As

Francisco de Castro (2020). Allstats (https://www.mathworks.com/matlabcentral/fileexchange/25572-allstats), MATLAB Central File Exchange. Retrieved .

Comments and Ratings (6)

Matthew

Great function, thank you. I've amended it to allow a character-based grouping vector and secondary groupings as well. I've posted it on GitHub (with the license file) for anyone to branch off.
https://github.com/poot17/allstats

I appreciate the author's flexibility in bringing this function closer to the needs of the community.

Jos (10584)

It is nice to see some improvements, Francisco! A couple of additional comments: if you are so worried about speed, I suggest you take a look at logical indexing, and replace all the zeros calls in the struct command with a single call to zeros.
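
As a brief illustration of the logical indexing suggestion, the sketch below selects the members of one group both with FIND and with a logical mask. The data and group vectors are made up for the example; the point is only that the mask can be used directly, so the call to FIND is unnecessary.

```matlab
% Sketch: selecting one group's values with FIND versus a logical mask.
data  = [10 20 30 40 50 60];   % illustrative data
group = [1 2 1 2 1 2];         % illustrative group labels
idx     = find(group == 2);    % explicit positions of group 2
viaFind = data(idx);           % index with those positions
mask    = (group == 2);        % logical mask, no FIND needed
viaMask = data(mask);          % index directly with the mask
same    = isequal(viaFind, viaMask);   % both select the same elements
disp(viaMask);
```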

Another point: the behavior for 2D matrices should be explained, especially when groups are specified. I think you did not intend that behavior. For instance, check whether the following calls produce what you want them to produce:

A = rand(3,4) ; allstats(A), allstats(A,1:4), allstats(A,1:3)

As for the help, you could add examples of the expected output rather than giving only the calling syntax.

Below is the outline of more optimized code:
function R = allstats(DATA, GROUP)
    % Outline only: min, max and mean are shown, the other statistics follow the same pattern
    if isempty(ver('stats'))
        error('Function requires STATS toolbox.') ;
    end
    if nargin == 1
        GROUP = ones(numel(DATA),1) ; % no grouping given: treat all data as one group
        Ngroups = 1 ;
        UniqueGroups = 1 ;
    else
        if numel(DATA) ~= numel(GROUP)
            error('Number of elements should match') ;
        end
        UniqueGroups = unique(GROUP) ;
        Ngroups = numel(UniqueGroups) ;
    end
    Z = nan(Ngroups,1) ; % one slot per group, created with a single call
    R = struct('min',Z,'max',Z,'mean',Z) ; % etc.
    for k = 1:Ngroups
        Q = GROUP == UniqueGroups(k) ; % logical indexing
        R.min(k) = nanmin(DATA(Q)) ; % fill the k-th slot of the preallocated field
        R.max(k) = nanmax(DATA(Q)) ;
        R.mean(k) = nanmean(DATA(Q)) ;
    end
end

Tried to address the comments by Hanselman and Jos. The requirement of the Statistics TB is now shown. Help is (hopefully) less ambiguous. Some error checking is included, for numeric arguments and sizes of data and groups. By the way, examples WERE given from the beginning.

Grouping now works for the cases mentioned by Jos, i.e. the following cases all work:
allstats(rand(1,5),1:5)
allstats(rand(1,5),(1:5)')
allstats(rand(5,1),1:5)
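
One way to make the three calls above behave identically is to reshape both inputs to columns before grouping. The sketch below shows that idea in isolation; it is an assumption about how such handling could look, not the code of the submission, and the variable names are hypothetical.

```matlab
% Sketch: make grouping independent of row/column orientation.
data   = [0.2 0.4 0.6 0.8 1.0];   % 1-by-5 row (illustrative values)
groups = (1:5)';                   % 5-by-1 column
if numel(data) ~= numel(groups)
    error('Number of elements should match');
end
data   = data(:);                  % force column orientation
groups = groups(:);                % same for the group labels
labels = unique(groups);           % distinct group labels
nPerGroup = zeros(numel(labels), 1);
for k = 1:numel(labels)
    nPerGroup(k) = sum(groups == labels(k));   % members in group k
end
disp(nPerGroup');                  % each of the five groups has one member
```

After the reshape, element-wise comparisons between data and groups always operate on vectors of the same shape, whichever orientation the caller used.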

Regarding Jos's comment #2: I don't see the advantage of using NARGIN > 1 instead of ~ISEMPTY(varargin). Also, when no groups are specified, the 'other piece of code' is used to avoid calling FIND, which can be slow and would be useless in this case.

Jos (10584)

This submission can be improved considerably, considering the following:
1) when the data and group specifications do not match, e.g., allstats([1 2 3],[1 2]), it should error
2) when the group data is not specified, a different piece of code is executed than when it is specified. Why not set the group data to all zeros when no second argument is given? (And why not use nargin, instead of varargin, to check whether the second argument is specified?)
3) Grouping does not work flawlessly: for instance, ALLSTATS(rand(1,5),1:5) returns overall statistics rather than statistics for the five individual groups
4) As Duane already mentioned, there should be a reference, and some See Also's to the Stats TB and other functions

So, a lot to improve on this potentially useful pick-of-the-week wrapper function.

This function calls functions that exist in the Statistics toolbox, but does not make that requirement known. No error checking is done. Help text is ambiguous and no examples are given. The last two help sentences describe unknown concepts that require more than a passing inspection of the function code to understand.

Updates

1.6.0.0

Calculation of the mode is adapted to discrete variables

1.4.0.0

avoid output...

1.3.0.0

Take into account the shape of data and groups vectors

1.2.0.0

Clarified help (hopefully). Included some error checking.

1.1.0.0

Added kurtosis and skewness

MATLAB Release Compatibility
Created with R2009a
Compatible with any release
Platform Compatibility
Windows macOS Linux