Code covered by the BSD License  

3.5 | 2 ratings | 11 Downloads (last 30 days) | File Size: 2 KB | File ID: #25572

Allstats

by Francisco de Castro

 

14 Oct 2009 (Updated 21 Jul 2011)

Many statistics of a vector or matrix

Editor's Notes:

This file was selected as MATLAB Central Pick of the Week


File Information
Description

R= ALLSTATS(A) returns a structure R with several statistics of vector A. A grouping vector can be given as an optional second argument, in which case all the statistics are calculated per group. The statistics calculated (returned as fields of R) are:
R.min= minimum
R.max= maximum
R.mean= mean
R.std= standard deviation
R.mode= mode
R.q2p5= 2.5th percentile
R.q5= 5th percentile
R.q25= 25th percentile
R.q50= 50th percentile (median)
R.q75= 75th percentile
R.q95= 95th percentile
R.q97p5= 97.5th percentile
R.kurt= Kurtosis
R.skew= Skewness

% Example. No groups
x= rand(10,1);
R= allstats(x)

% Example. With 2 groups
g= [1;1;1;1;1;2;2;2;2;2];
R= allstats(x,g)
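
As a rough cross-check of the grouped call above, the per-group minimum, maximum, mean and standard deviation can be reproduced with base MATLAB alone. This is only a sketch, not the submission's implementation: the data vector is a deterministic stand-in for rand(10,1), and the variable names (labels, idx, groupMin and so on) are made up for illustration.

```matlab
% Deterministic stand-in for the rand(10,1) data used in the example above
x = (1:10)';
g = [1;1;1;1;1;2;2;2;2;2];

% Map arbitrary group labels to consecutive indices 1..Ngroups
[labels, ~, idx] = unique(g);

% Per-group statistics using base MATLAB only (no Statistics Toolbox)
groupMin  = accumarray(idx, x, [], @min);
groupMax  = accumarray(idx, x, [], @max);
groupMean = accumarray(idx, x, [], @mean);
groupStd  = accumarray(idx, x, [], @std);

disp([labels groupMin groupMax groupMean groupStd])
```

Each result has one row per group, in the order of the sorted unique labels.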

Required Products: Statistics Toolbox
MATLAB release: MATLAB 7.8 (R2009a)
Comments and Ratings (5)
03 Nov 2009 Duane Hanselman

This function calls functions that exist in the Statistics toolbox, but does not make that requirement known. No error checking is done. Help text is ambiguous and no examples are given. The last two help sentences describe unknown concepts that require more than a passing inspection of the function code to understand.

04 Nov 2009 Jos (10584)

This submission can be improved considerably, considering the following:
1) when the data and group specifications do not match, e.g., allstats([1 2 3],[1 2]), it should error
2) when the group data is not specified, a different piece of code is executed than when it is specified. Why not set the group data to all zeros when no second argument is given? (And why not use nargin to check whether the second argument is specified, instead of varargin?)
3) Grouping does not work flawlessly: for instance, ALLSTATS(rand(1,5),1:5) returns overall statistics rather than statistics for the five individual groups
4) As Duane already mentioned, there should be a reference, and some See Also's, to the Stats TB and other functions

So, a lot to improve on this potentially useful pick-of-the-week wrapper function.
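
The kind of input check asked for in point 1 could look roughly like the sketch below. It is not the code the author later added: the error identifiers (allstats:nonNumeric, allstats:sizeMismatch) and the message wording are hypothetical, and the try/catch is only there so the snippet runs as a script.

```matlab
A = [1 2 3];
G = [1 2];      % deliberately one element short

try
    % Hypothetical checks: numeric inputs with matching element counts
    if ~isnumeric(A) || ~isnumeric(G)
        error('allstats:nonNumeric', 'Data and groups must be numeric.');
    end
    if numel(A) ~= numel(G)
        error('allstats:sizeMismatch', ...
              'Data has %d elements but groups has %d.', numel(A), numel(G));
    end
    msg = '';
catch err
    msg = err.message;   % keep the message instead of aborting the script
end
disp(msg)
```

Comparing numel rather than size lets a row vector of data be paired with a column vector of groups.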

04 Nov 2009 Francisco de Castro

Tried to address the comments by Hanselman and Jos. The requirement of the Statistics TB is now shown. Help is (hopefully) less ambiguous. Some error checking is included, for numeric arguments and sizes of data and groups. By the way, examples WERE given from the beginning.

Grouping now works for the cases mentioned by Jos, i.e. the following cases all work:
allstats(rand(1,5),1:5)
allstats(rand(1,5),(1:5)')
allstats(rand(5,1),1:5)

Regarding Jos's comment #2: I don't see the advantage of using NARGIN > 1 instead of ~ISEMPTY(varargin). Also, when no groups are specified, the 'other piece of code' is used to avoid FIND, which can be slow and would be useless in this case.
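
For reference, FIND is not needed to pick out one group's elements: a logical mask selects the same values. A minimal sketch with made-up data (data, group, viaFind and viaLogical are illustrative names, not taken from the submission):

```matlab
data  = [4 8 15 16 23 42];
group = [1 2 1 2 1 2];

viaFind    = data(find(group == 2));   % explicit list of indices
viaLogical = data(group == 2);         % logical mask, no FIND call

disp(isequal(viaFind, viaLogical))
```

Both expressions return the elements of data that belong to group 2.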

05 Nov 2009 Jos (10584)

It is nice to see some improvements, Francisco! A couple of additional comments: if you're so worried about speed, I suggest you take a look at logical indexing, and replace all the zeros calls in the struct command with a single call to zeros.

Another point: the behavior for 2D matrices should be explained, especially when groups are specified. I think you did not intend that behavior. For instance, check whether the following calls produce what you want them to produce:

A = rand(3,4) ; allstats(A), allstats(A,1:4), allstats(A,1:3)
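
Why these calls are ambiguous can be seen with base functions alone. The sketch below says nothing about what allstats itself does with a matrix. It only shows that column-wise and whole-matrix statistics differ, and that a 4-element or 3-element group vector cannot label all 12 elements unless it is first expanded. The fixed matrix and the names colMin, allMin and colLabels are illustrative.

```matlab
A = [16  2  3 13;
      5 11 10  8;
      9  7  6 12];            % fixed 3x4 stand-in for rand(3,4)

colMin = min(A);              % one minimum per column (1x4)
allMin = min(A(:));           % single minimum over all 12 elements

% A group vector 1:4 has 4 elements while A has 12, so it can only
% label columns if it is expanded to one label per element:
colLabels = repmat(1:4, size(A,1), 1);

disp(colMin), disp(allMin), disp(size(colLabels))
```

The help text would need to state which of these interpretations applies.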

With respect to the help, you can add examples of the expected output, besides giving only the calling syntax.

Below is the outline of a more optimized code:
function R = allstats(DATA, GROUP)
% Outline only: per-group min, max and mean (needs the Statistics Toolbox)
if isempty(ver('stats'))
    error('Function requires STATS toolbox.') ;
end
if nargin == 1
    % No grouping given: treat all data as a single group
    GROUP = ones(numel(DATA),1) ;
    Ngroups = 1 ;
    UniqueGroups = 1 ;
else
    if numel(DATA) ~= numel(GROUP)
        error('Number of elements should match') ;
    end
    UniqueGroups = unique(GROUP) ;
    Ngroups = numel(UniqueGroups) ;
end
Z = nan(Ngroups,1) ; % one preallocated slot per group
R = struct('min',Z,'max',Z,'mean',Z) ; % etc.
for k = 1:Ngroups
    Q = GROUP == UniqueGroups(k) ; % logical indexing
    R.min(k) = nanmin(DATA(Q)) ; % fill the k-th slot of each field
    R.max(k) = nanmax(DATA(Q)) ;
    R.mean(k) = nanmean(DATA(Q)) ;
end

05 Nov 2009 Oleg Komarov

I appreciate the author's flexibility in bringing this function closer to the needs of the community.

Updates
03 Nov 2009

Added kurtosis and skewness

04 Nov 2009

Clarified help (hopefully). Included some error checking.

04 Nov 2009

Take into account the shape of data and groups vectors

04 Nov 2009

avoid output...

21 Jul 2011

Calculation of the mode is adapted to discrete variables
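
The update note does not say what was changed, so the following is background only: it shows how the base MATLAB mode function behaves on continuous versus discrete data, which is presumably why the distinction matters. The sample vectors are made up.

```matlab
continuousSample = [0.31 0.12 0.74 0.58];   % all values distinct
discreteSample   = [3 1 2 3 2 3];           % the value 3 occurs most often

% With no repeated values every element ties at one occurrence, and
% mode returns the smallest of the tied values.
modeContinuous = mode(continuousSample);
modeDiscrete   = mode(discreteSample);

disp([modeContinuous modeDiscrete])
```

For data without repeated values, the mode therefore simply equals the minimum and carries little information.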

Tag Activity for this File
Tag Applied By Date/Time
statistics Francisco de Castro 14 Oct 2009 11:40:32
mean Francisco de Castro 14 Oct 2009 11:40:33
mode Francisco de Castro 14 Oct 2009 11:40:33
median Francisco de Castro 14 Oct 2009 11:40:33
quartile Francisco de Castro 14 Oct 2009 11:40:33
maximum Francisco de Castro 14 Oct 2009 11:40:33
minimum Francisco de Castro 14 Oct 2009 11:40:33
potw Shari Freedman 30 Oct 2009 10:18:29
kurtosis Francisco de Castro 03 Nov 2009 12:09:21
skewness Francisco de Castro 03 Nov 2009 12:09:21
pick of the week Jiro Doke 11 Feb 2011 20:10:02
