Code covered by the BSD License  

notBoxPlot - alternative to box plots.

Rating: 4.9 (9 ratings) | 99 downloads (last 30 days) | File size: 5.06 KB | File ID: #26508

by Rob Campbell

 

28 Jan 2010 (Updated 19 Mar 2012)

This function visualizes raw (grouped) data along with the mean, 95% confidence interval, and 1 SD.

Editor's Notes:

This file was selected as MATLAB Central Pick of the Week

File Information
Description

Whilst box plots have their place, it's sometimes nicer to see all the data making up a distribution rather than hiding them with summary statistics such as the inter-quartile range. This function (with a tongue in cheek name) addresses this problem. To the best of my knowledge there aren't similar functions here on the FEX.

Jittered raw data are plotted for each group. Also shown are the mean and the 95% confidence interval for the mean. This allows one to eyeball the data for significant differences between means (non-overlapping confidence intervals indicate a significant difference at the chosen p-value, which here is 5%); also see http://jcb.rupress.org/cgi/content/abstract/177/1/7 . Finally, 1 SD is also shown. Note that if the data are not normally distributed then these statistics will be less meaningful.

The function has several examples and there are various visualization possibilities in addition to those shown in the above screenshot. For instance, the coloured areas can be replaced by lines.

Although it has worked well in the situations where I've needed it, I will be happy to modify the function if users come up against problems.

%%%%%%
Included functions
notBoxPlot.m - generates plots as shown in screenshot
SEM_calc.m - calculate standard error of the mean. Provided as a separate function file so that it can be used for other purposes.
tInterval_Calc.m - calculate a t-interval. Right now notBoxPlot doesn't make use of this (unless the user edits the code, of course), but it still might be useful. For small sample sizes, the t-interval is larger than the SEM.
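To illustrate the difference between the two kinds of interval, here is a minimal sketch that computes the SEM, a normal-approximation 95% interval, and a t-based 95% interval for one small sample. It is not the code from SEM_calc.m or tInterval_Calc.m; the data are made up, and the t critical value for 9 degrees of freedom is hard-coded (an assumption made so the sketch runs without any toolbox).

```matlab
% Illustrative only: not the code from SEM_calc.m or tInterval_Calc.m.
y = [4.1 5.3 4.8 5.9 5.0 4.4 5.6 4.7 5.2 5.0]';  % made-up sample, n = 10

n   = numel(y);
mu  = mean(y);
sd  = std(y);          % sample SD (normalised by n-1)
sem = sd / sqrt(n);    % standard error of the mean

zCrit = 1.96;          % two-sided 95% critical value, normal distribution
tCrit = 2.262;         % two-sided 95% critical value, t distribution, 9 d.f.
                       % (with the Statistics Toolbox: tinv(0.975, n-1))

ciNormal = mu + [-1 1] * zCrit * sem;   % normal-approximation interval
ciT      = mu + [-1 1] * tCrit * sem;   % t-based interval (wider for small n)

fprintf('mean = %.3f, SD = %.3f, SEM = %.3f\n', mu, sd, sem);
fprintf('normal 95%% CI: [%.3f, %.3f]\n', ciNormal);
fprintf('t 95%% CI:      [%.3f, %.3f]\n', ciT);
```

With only ten observations the t-based interval comes out wider than the normal-approximation one; the two converge as the sample size grows.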

* NOTE *
The statistics toolbox is not required if you install the nantoolbox from here: http://pub.ist.ac.at/~schloegl/matlab/NaN/ Otherwise you will need the statistics toolbox for nan-handling.

MATLAB release: MATLAB 7.8 (R2009a)
Comments and Ratings (21)
29 Jan 2010 Michael Ashby

I like the idea but it seems to be missing the required "SEM_calc" function.

29 Jan 2010 Rob Campbell

Really? I thought I zipped it in there. Thanks for letting me know. I shall re-upload.

30 Nov 2010 Rossella Blatt

Very nice and useful. Thanks!

29 Mar 2011 Mahmoud

Very Useful!
Two questions,
1) How do you add labels to the x-axis like you would with the 'label' option in the boxplot function?
2) How can you specify what range should be plotted on the y-axis of notBoxPlot?

29 Mar 2011 Rob Campbell

Mahmoud,
You can achieve these things in exactly the same way as you would for most other plotting commands. I try to avoid having functions behave too idiosyncratically. So, to answer your question:
clf
h=notBoxPlot(randn(10,2));
set(gca,'XTickLabel',{'GrpA','GrpB'})
ylim([-5,5])

The last two lines are obviously standard ways of setting labels and changing the axis limits. These work with any plot. Note that the notBoxPlot function returns the handles to the plot objects so that you can change their properties or even delete them. For example, you could remove all the data points by doing: delete([h.data])

28 Jul 2011 Dylan

Very useful, code is very well written.

08 Sep 2011 J G

This is very useful thanks! Is it possible to plot the SD as error bars instead of the box?

20 Sep 2011 Rob Campbell

Normally I'd say you should modify the plotted objects with the handles returned by the function. However, it would be awkward to do what you requested in this way. Consequently I've just submitted an update which should do what you want. The 4th argument can now have the value "sdline". If you want to alter the line properties, I recommend doing so by modifying the object properties via the handle returned by the function.

17 Oct 2011 Andrea  
28 Oct 2011 Harry MacDowel

Thanks Rob. Love it.

16 Feb 2012 Kelvin

I’m wondering how notBoxPlot can plot vectors of different lengths.

i.e.
x = rand(5,1);
y = rand(10,1);
z = rand(15,1);
group = [repmat({'First'}, 5, 1); repmat({'Second'}, 10, 1); repmat({'Third'}, 15, 1)];
boxplot([x;y;z], group)

Thanks in advance!

16 Feb 2012 Rob Campbell

Almost the same way: just don't code your groups as a cell array of strings. To modify your example:
 
group = [repmat(1, 5, 1); repmat(2, 10, 1); repmat(3, 15, 1)];
notBoxPlot([x;y;z], group)

You can then change the XTickLabels to strings if needed. I've not found I do this often enough to add cell arrays as an input possibility. Perhaps I should, though (when time allows!).
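A minimal sketch of that workflow, assuming the group names start out as a cell array of strings: `unique` converts them to numeric codes, and the names are put back as tick labels afterwards. The variable names (`groupCell`, `labels`, `junk`) are illustrative, and the plotting step is skipped if notBoxPlot is not on the path. Note that `unique` returns the labels in sorted order, which may differ from the order in which they first appear.

```matlab
% Sketch: unequal group sizes, starting from string group names.
x = rand(5,1);
y = rand(10,1);
z = rand(15,1);
groupCell = [repmat({'First'}, 5, 1); ...
             repmat({'Second'}, 10, 1); ...
             repmat({'Third'}, 15, 1)];

% Map each name to a numeric code; labels come back sorted alphabetically.
[labels, junk, group] = unique(groupCell);

if exist('notBoxPlot', 'file')       % only plot if the function is available
    notBoxPlot([x; y; z], group);    % numeric grouping vector, as above
    set(gca, 'XTick', 1:numel(labels), 'XTickLabel', labels);
end
```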

06 Mar 2012 Ted P Teng

At the moment, I am admiring what I just made with your function. Love it, thank you.
You guys may also want to use this function in conjunction with XTICKLABEL_ROTATE.

19 Mar 2012 Alexander

In my opinion, a better replacement for the builtin boxplot is "Violin Plots for plotting multiple distributions (distributionPlot.m)", which does not require any additional toolboxes. Check:
http://www.mathworks.com/matlabcentral/fileexchange/23661-violin-plots-for-plotting-multiple-distributions-distributionplot-m

19 Mar 2012 Rob Campbell

I will soon be modifying this function to require no additional toolboxes. Otherwise, which function is best probably depends on the size of the data set. For large sample sizes the violin plots work best. For small sample sizes I prefer the plot style on this page, since it doesn't bin the data.

29 Mar 2012 Ian Shapiro

Great tool. It's an excellent way to visualize the distribution in a set of data. However, I've found that it does not appear to work with 'gname' for labeling individual data points, whereas boxplot is able to do this. Any idea why that's the case?

29 Mar 2012 Rob Campbell

Hmmm... Don't know. I will look into it.

20 Apr 2012 Rob Campbell

Ok... For some reason adding a patch object causes gname to fail. If you run notBoxPlot using the "line" plotting style then gname works.

28 May 2012 J G

Great function! Is there a way to have a trend line through the means, i.e. using polyval/polyfit? Thanks!

29 May 2012 J G

How can I use this function with continuous spacing on x-axis?
For example,
p = [0.1 0.25 0.5 0.75 0.9];
boxplot(A,'position',p)
will place the boxplots unevenly spaced along x-axis. Is there a way to do this with this function?

29 May 2012 Rob Campbell

JG:
Q1. The function will return the coordinates of the means so you can use these with polyval. e.g.
H=notBoxPlot(randn(10),[],[],'line');
x=get([H.mu],'XData'), y=get([H.mu],'YData');
Without "line" the above will return two data points for each mean (since the means are lines), but it's easy enough to work with that too. Does that work for you?

Q2. You can do this as follows:
notBoxPlot(randn(10,5),[1,2,5,9,10])
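The two answers can be combined. The following is only a sketch under stated assumptions: rather than reading the means back from the returned handles, it computes them directly from the data (which assumes no NaNs), fits a straight line through them at the custom x positions, and overlays that line on the plot. The variable names are illustrative, and the plotting step is skipped if notBoxPlot is not on the path.

```matlab
% Sketch: linear trend through the group means at custom x positions.
xPos = [1 2 5 9 10];           % x positions, as in the Q2 example
Y    = randn(10, 5);           % one column per group (made-up data)

mu   = mean(Y, 1);             % group means, computed from the data
coef = polyfit(xPos, mu, 1);   % first-order (straight line) fit
xFit = linspace(min(xPos), max(xPos), 50);
yFit = polyval(coef, xFit);    % fitted line evaluated along the x range

if exist('notBoxPlot', 'file') % only plot if the function is available
    notBoxPlot(Y, xPos);
    hold on
    plot(xFit, yFit, 'k--');   % dashed trend line over the groups
    hold off
end
```

A higher polynomial order can be requested by changing the third argument of polyfit, though with only a handful of groups a straight line is usually the safer choice.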

Updates
29 Jan 2010

re-upload because support file (SEM_calc.m) seemed to be missing

29 Jan 2010

Clarify a point in the description.

30 Jan 2010

Add tInterval_Calc and update the comments in SEM_calc

12 Feb 2010

Add link to JCB article on error bars.

24 Feb 2010

Handle to mean line when in patch mode (the default mode) is now returned.

20 Sep 2011

The 4th argument can now also have the value "sdline". This creates plots where the SD is a line instead of a patch.

21 Sep 2011

If "y" is a vector then the function ensures it is a column vector in order to yield one box-plot.

12 Oct 2011

Both x and y can now be vectors, in which case the function behaves like Mathworks' boxplot. An example of this behaviour is provided.

An example of the "sdline" plot style is now provided.

14 Nov 2011

Fix bug that was causing handles not to be returned for one of the plot formats.

10 Jan 2012

Better handles x-ticks and x axis limits. Add missing semicolon.

19 Mar 2012

Update summary to explain that the function works without the stats toolbox if the nan-toolbox is installed.

Tag Activity for this File
Tag                 Applied By          Date/Time
statistics          Rob Campbell        28 Jan 2010 13:55:26
data exploration    Rob Campbell        28 Jan 2010 13:55:26
plotting            Rob Campbell        28 Jan 2010 13:55:26
error bars          Rob Campbell        13 May 2010 10:31:07
data exploration    Chris Seaman        08 Jun 2010 12:55:15
box plot            Chris Seaman        08 Jun 2010 12:55:39
potw                Lindsay Coutinho    07 Oct 2011 13:08:44
pick of the week    Lindsay Coutinho    07 Oct 2011 13:08:44
box plot            Dongzhen Piao       12 Oct 2011 14:46:53
box plot            Brian               03 May 2012 17:19:52
