version 3.2.1.0 (150 KB) by
Christopher Hummersone

Draw a box plot with various display options

**Editor's Note:** This file was selected as MATLAB Central Pick of the Week

NOTE: this function is now available from the IoSR Matlab Toolbox as iosr.statistics.boxPlot.

-------------------------

Alternative box plot function for Matlab with many options. These options include:

- Variable sample sizes (via the tab2box() function).

- Show box sample size.

- Scaled or uniform box spacing.

- Box width scaled by sample size.

- Overlay scatter plots of underlying data.

- Overlay the mean of the data.

- Overlay additional percentiles, and attach labels to them.

- Hierarchical X-labeling and support for multidimensional data.

- Notched boxes.

- Vertical lines to separate groups.

- Automated construction of a legend.

- Set box limits as percentiles.

- Set whisker extent via various methods.

- Use of weighted quantiles.

- Creation of violin plots.

Christopher Hummersone (2021). Alternative box plot (https://github.com/IoSR-Surrey/MatlabToolbox), GitHub. Retrieved .

Created with
R2015a

Compatible with any release

**Inspired by:**
notBoxPlot, Hierarchically grouped boxplot


**Christopher Dürrbeck:** Hi Christopher,

thank you for this great package! I have a question on how to deal with a whole column full of NaNs when plotting N boxes from an MxN array. All boxes up to the column containing only NaNs are plotted perfectly. The only-NaN containing column then leads to an error. The behaviour I would prefer is to plot no box at all for that only-NaN containing column and continue with the other columns coming after that. Is there a solution or workaround for this?

Best regards,

Chris
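One possible workaround, sketched here under the assumption that it is acceptable to drop the empty columns before plotting. The data and variable names are hypothetical, and the final call is left commented because it only follows the boxPlot(x,y) form used elsewhere on this page and has not been checked against the toolbox.

```matlab
% Hypothetical data: 3 columns, the second is entirely NaN
Y = [1 NaN 4; 2 NaN 5; 3 NaN 6];
x = 1:size(Y,2);                 % one x position per column

keep = ~all(isnan(Y), 1);        % true for columns with at least one real value
Yk = Y(:, keep);                 % data without the all-NaN columns
xk = x(keep);                    % matching x positions for the remaining columns

% Assumed call (not verified against the toolbox):
% iosr.statistics.boxPlot(xk, Yk);
```

Whether a visible gap is left at the skipped position would depend on the spacing options in use.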

**Prashant Pandey:** If anyone is having issues with legends working in MATLAB R2020a, you need to add squeeze() to line 1974 of boxPlot.m:

[obj.handles.legend, icons] = legend(squeeze(legendTarget(1,1,order)),lgndstr(order));
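A plausible reason the squeeze() call helps, shown as a small sketch: indexing along the third dimension returns a 1-by-1-by-N array rather than a vector, and squeeze() removes the singleton dimensions. The `targets` array below is only a numeric stand-in for `legendTarget(1,1,order)`.

```matlab
targets = zeros(1,1,3);          % stand-in for legendTarget(1,1,order)
szBefore = size(targets);        % [1 1 3]: not a vector
szAfter  = size(squeeze(targets)); % [3 1]: a column vector
```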

**Jordan Lui:** What version of Matlab is required for this? I'm using an older version of Matlab and cannot run installer because websave() method is not available.

**Ikke89:** I am trying to group my boxes with the GROUPLABELS option. I struggle to understand from the documentation how to do this properly. I have tried all kinds of combinations of cell arrays, but I keep getting the error:

Error using iosr.statistics.boxPlot/drawStyle (line 1850)

The GROUPLABELS option should be a cell vector; the Nth element should contain a vector of length SIZE(Y,N+2)

Could somebody help me with a simple example? E.g. how I would group a 10x4 array (4 boxes) into two groups, so that the adjacent boxes are in the same group?

Thank you very much in advance!
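A minimal sketch of one way to read the error message, assuming (as the hatchfill2 example further down this page suggests) that the second dimension of Y indexes x-groups and the third indexes boxes within an x-group. The labels are hypothetical and the boxPlot call is left commented because it has not been verified against the toolbox.

```matlab
Y = rand(10,4);                              % 10 samples, 4 boxes (hypothetical data)
Y3 = permute(reshape(Y,10,2,2), [1 3 2]);    % 10-by-2-by-2: samples, x-groups, boxes per x-group

% Columns 1 and 2 of Y now share the first x-group; columns 3 and 4 share the second
sameGroup = isequal(Y3(:,1,1), Y(:,1)) && isequal(Y3(:,1,2), Y(:,2));

groupLabels = {{'left','right'}};            % N = 1: length must equal size(Y3,3)

% Assumed call (not verified against the toolbox):
% iosr.statistics.boxPlot({'A','B'}, Y3, 'groupLabels', groupLabels);
```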

**Dominic Yan:** Hi Christopher,

I have a trivial question here: How can I plot different colors for multiple boxes in one plot?

**Jochen2:** I had to change the following in boxPlot.m for 2019a:

obj.handles.samplesTxt(subidx{:}) = text(double(obj.xticks(subidx{2})+gOffset+halfboxwidth-xoffset),...

double(obj.statistics.PU(subidx{:})-yoffset),...

num2str(sum(~isnan(obj.y(subidxAll{:})))),...

'horizontalalignment','right','verticalalignment','top');

(the text command requires doubles for x and y)

Also I changed the following in kernelDensity.m :

if numel(x) < 1

d = 0;

bw = 0;

xd = 0;

return

end

(this allows you to showScatter even if one box only has one data point)

**Jochen2:** Regarding my former comment: It should be

if numel(x) < 1

d = 0;

bw = 0;

xd = 0;

return

end

(this allows you to showScatter even if one box only has one data point)

**Aiyush:** How do I change the font size of the resulting legend? I tried retrieving the legend object from the handle using

h=findobj(gcf,'Type','Legend');

set(h,'FontSize',10)

but this won't work when there are multiple legends on the figure.

**Christopher Hummersone:** Hi Arnold. I no longer have access to MATLAB, so I’m afraid I won’t be doing this anytime soon!

**arnold:** Hi Christopher,

you might want to consider adopting tables, especially categorical input for tab2box instead of just cell arrays out of convenience.

regards

Arnold

**Christopher Hummersone:** Yes, you can see here how the different methods lead to different outcomes: https://uk.mathworks.com/matlabcentral/mlc-downloads/downloads/submissions/46555/versions/10/screenshot.jpg

**Julian Rüdiger:** Thanks Christopher for the fast response. Is the difference expected due to the R-8, R-5 calculation methods? I did compare the results by using R-5 with your function as well. However, I found a workaround for my application by fixing the unweighted index in quantile.m at line 179: q(m,n) = Qp(xSorted,huw). This worked to give the same median in the plots as using the MATLAB function.

**Christopher Hummersone:** Hi Julian. I don’t currently have access to MATLAB, so I’m afraid my ability to help is somewhat limited. However, note that it is expected that the median produced by these functions is different to the median MATLAB calculates. This is described in the iosr.statistics.quantile help.

**Julian Rüdiger:** Hi Christopher,

thanks for this awesome function. I have an issue with the median of an array using your function, which doesn't result in the same number as using the MATLAB median function or the MATLAB boxplot itself. I was trying to find the source of this inconsistency:

"iosr.statistics.boxPlot" uses "iosr.statistics.statsPlot" to calculate the median, which uses "iosr.statistics.quantile(obj.y,.5,[],obj.method,obj.weights)"; so far so good. Using "iosr.statistics.quantile" without weights defined gives the same result as the MATLAB function. But if I use "iosr.statistics.quantile" with the weights that are defined in "iosr.statistics.statsPlot", namely "obj.weights = ones(size(obj.y))", a different result is calculated.

Have you experienced this issue yet?

Thank you!

Cheers,

Julian

**Christopher Hummersone:** Hi Arne. Not directly, no, but you can use the iosr.statistics.tab2box function to put the data in to boxPlot’s required format.

**Arne Graul:** Hello Christopher,
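To illustrate the kind of reshaping involved, here is a base-MATLAB sketch that turns a data vector plus a grouping variable into a NaN-padded matrix with one column per group. It is not the toolbox's implementation of tab2box, only an illustration of the idea with hypothetical data; the boxPlot call is left commented as an assumption.

```matlab
x = [1.2 3.4 2.2 5.1 4.0]';      % hypothetical data vector
g = [1 2 1 2 2]';                % hypothetical grouping variable, as in boxplot(x,g)

[groups, ~, idx] = unique(g);    % idx maps each datum to its group number
counts = accumarray(idx, 1);     % number of samples in each group

Y = NaN(max(counts), numel(groups));   % pad shorter groups with NaN
for k = 1:numel(groups)
    vals = x(idx == k);
    Y(1:numel(vals), k) = vals;
end

% Assumed call (not verified against the toolbox):
% iosr.statistics.boxPlot(groups', Y);
```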

is there a way to plot multiple boxes in one plot, using a grouping variable just like in the Matlab-boxplot function? (boxplot(x,g))

Thanks a lot

**Christopher Hummersone:** See the solution here: https://github.com/IoSR-Surrey/MatlabToolbox/commit/11f8077e6870a961e3106e371f149f838af397f2#commitcomment-23722979

**Laurens:** Very good toolbox! Is there a possibility to add rotation to group labels?

**Till:** Very helpful! Your work is appreciated.

**Christopher Hummersone:** Thanks for that, Filntisis. I've uploaded a fix to GitHub.

**Filntisis Panagiotis:** Hi Christopher:

test = [2,3,2,2,2,3,2,1,2,2,3,2,2,3,2,3,2,2,2,4,3,4,2,5,2,4,1,2,2,1,2,4,2,3,3,4,2,2,2,2,1,3,1,2,2,2,3,4,1,1]';

iosr.statistics.boxPlot(test);

Here the upper limit is 4.5, however when I check the object outliers they are empty (and they are not plotted).

I tried to check a bit (not that familiar with matlab programming) and at line 1896 of boxPlot.m the statsPlot.calculateStats is called and the outliers are populated, however, at line 1904-1905 outliers are overwritten to 0 again.

**Christopher Hummersone:** Hi Filntisis. If the outliers array is empty then are you sure that your data has outliers?! The outliers are calculated in the statsPlot base class. Can you post a minimal example that produces the error?

**Filntisis Panagiotis:** Hi Christopher! Thanks for this. I am trying to use the function but the outliers do not show up. If I check the statistics object the outliers are empty and I looked at the source and I can't find where the outliers are calculated. The outliers show only if I set the ShowScatter option to true, along with all the other data points. (The limit is set to 1.5IQR). Thanks!

**Christopher Hummersone:** Use the 'limit' property to specify the whisker extent.

**Aditya Nanda:** Thanks Christopher, that explains it. There's another issue. The whiskers are supposed to represent the maximum and minimum data. But on manually checking the values, I found that, for some cases, the whiskers were less than the maximum data or more than the minimum data. How is this possible?

**Christopher Hummersone:** Hi Aditya. Thanks for the feedback. The reason the means were not what you were expecting was because boxPlot did not calculate weighted means. I've since modified it so that it does; the fix is live on GitHub. Cheers.

**Aditya Nanda:** I figured out how to plot the mean. So that is all set. But the mean value does not match what I have on record.
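The 4.5 upper limit quoted in this report can be checked by hand in base MATLAB. The sketch below uses a crude order-statistic quartile rather than the toolbox's quantile method, so it is an independent sanity check and not a reproduction of what boxPlot computes.

```matlab
test = [2,3,2,2,2,3,2,1,2,2,3,2,2,3,2,3,2,2,2,4,3,4,2,5,2,4,1,2,2,1, ...
        2,4,2,3,3,4,2,2,2,2,1,3,1,2,2,2,3,4,1,1]';

s = sort(test);
n = numel(s);
q1 = s(ceil(0.25*n));            % crude lower quartile (not the toolbox's method)
q3 = s(ceil(0.75*n));            % crude upper quartile
iqrVal = q3 - q1;

upper = q3 + 1.5*iqrVal;         % upper whisker limit
lower = q1 - 1.5*iqrVal;         % lower whisker limit
outliers = test(test > upper | test < lower);   % values beyond the limits
```

With these data the single value of 5 lies above the upper limit, so one outlier would be expected.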

I am using quadrature points (like Gauss-Hermite) so I have y_i and corresponding weights w_i. I calculate the mean as \sum_i y_i w_i and it does not match the mean plotted by the boxPlot. How is this possible?

**Aditya Nanda:** Hi Christopher, Amazing work on this set of files. There is something I need help with. Now, the standard syntax for the Weighted Box plot (iosr.statistics.boxPlot(x,y,'weights',weights)) is to plot the median, the 25th percentile, the 75th percentile and the outliers.

I am interested in plotting the mean (not the median), and just the 25th percentile and the 75th percentile. How do I do this? I tried changing the name-value-pairs in classdef (CaseInsensitiveProperties = true) boxPlot < iosr.statistics.statsPlot

but that did not help. The mean never shows up. I tried to change the medianColor to white so that it's invisible but the median always shows up.

**Christopher Hummersone:** Hi Roland. Thanks for the feedback and suggestion. I've implemented your suggestion as an optional first argument in the constructor (keeping it consistent with other Matlab functions). The change has been committed to the GitHub repo and should be pulled on to the FX within 24 hours. Thanks!

**Roland:** Hi Christopher,

great work. your box plot looks much better than the official version of the statistics toolbox. however, i have a feature request. can you provide an optional property in the constructor method for feeding an axis handle from outside? or is there already any workaround to come up with an "own" axis handle?
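One thing worth checking in a comparison like this, independent of whether the plotted mean is weighted: a plain weighted sum only equals a weighted mean when the weights sum to one. The sketch below uses hypothetical values to show the difference. Gauss-Hermite weights for the weight function exp(-x^2), for instance, sum to sqrt(pi) rather than 1.

```matlab
y = [1 2 3];                     % hypothetical sample values
w = [0.5 1.0 0.5];               % hypothetical weights; sum(w) is 2, not 1

rawSum = sum(w .* y);            % plain weighted sum
wMean  = sum(w .* y) / sum(w);   % weighted mean, normalized by the total weight
```

Here `rawSum` is 4 while `wMean` is 2, so the two agree only after normalization.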

best, roland

**Christopher Hummersone:** Hi Arnold,

It took me a little while to even show the legend title (for me, the 'visible' property of the legend title object was 'off' by default).

Anyway, I think this speaks to known problems with the legend title in HG2. See: http://undocumentedmatlab.com/blog/plot-legend-title. So I have no other ideas, or any fixes, I'm afraid.

Chris

**arnold:** Hi Christopher,

I had another look at the legend. Since Matlab had finally given the possibility to set a title to a legend I tried to do just that but the title is always off (position wise), in the lower left corner of the entirety of the legend (at least I found it to be reproducible where it is situated).

Do you have any idea why that would be?

I found the Matlab ability to add a title to the figure quite useful and robust so I'm wondering what's going on here.

kind regards

Arnold

**cai onion:** Hi Chris,

Thanks for your suggestion. I have revised the script "boxPlot", but then got further errors, as follows:

Undefined function 'histcounts' for input arguments of type 'double'.

Error in iosr.statistics.boxPlot/xOffset (line 2328)

[N,~,bin] = histcounts(y); % create a histogram

Error in iosr.statistics.boxPlot/drawOutliers (line 1482)

xScatter = X + (0.8.*halfboxwidth.*obj.xOffset(obj.statistics.outliers{subidx{:}}));

Error in iosr.statistics.boxPlot/draw (line 1299)

obj.drawOutliers(subidx);

Error in iosr.statistics.boxPlot (line 658)

obj.draw('all');

Error in Boxplot4Index (line 23)

h11 = iosr.statistics.boxPlot(NewCC,...

It seemed that this tool was not suitable for the older version of Matlab.

**Christopher Hummersone:** Hi Cai,

Unfortunately the matlab.mixin.SetGet class was introduced in R2014b. To make boxPlot work on earlier versions, try modifying line 1, replacing "matlab.mixin.SetGet" with "handle". boxPlot should work OK, but you won't be able to use set(...) and get(...) syntaxes to change boxPlot properties.

Chris
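The features mentioned in this exchange (matlab.mixin.SetGet, histcounts, and websave from an earlier comment) can be probed at run time before attempting to use the toolbox. This is a hedged sketch using only the base `exist` function; the warning text is hypothetical and the list of features is taken from the error messages on this page, not from the toolbox's documentation.

```matlab
hasSetGet     = exist('matlab.mixin.SetGet', 'class') == 8;  % 8 means the name is a class
hasHistcounts = exist('histcounts', 'file') > 0;             % used in the xOffset method per the stack trace
hasWebsave    = exist('websave', 'file') > 0;                % needed by the installer per an earlier comment

if ~(hasSetGet && hasHistcounts)
    warning('This MATLAB release lacks features that boxPlot appears to rely on.');
end
```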

**cai onion:** Dear Christopher,

I tried to use this tool to make a boxplot figure, but it did not work. The error was as follows:

Error using iosr.statistics.boxPlot

The specified superclass 'matlab.mixin.SetGet' contains a parse error or cannot be found on MATLAB's search

path, possibly shadowed by another file with the same name.

Error in Boxplot4Index (line 23)

iosr.statistics.boxPlot({'A','B','C'},NewCC,...

I do not know how to solve it. I am using Windows 7 and MATLAB R2013a. However, this script runs successfully on Windows 7 with MATLAB R2015b. Thanks.

**Christopher Hummersone:** No problem at all (sorry if I sounded defensive).

When boxPlot was a function, it was called box_plot. I renamed it to boxPlot because I reimplemented it as a class, and the two coexisted locally for a short time. Still, in every other context Matlab correctly distinguishes between "boxplot" and "boxPlot", just not with the keyboard/mouse shortcuts.

**arnold:** thanks Chris,

my lack of knowledge and usage on the notches caused that question, now looking at it I get it, it just looked like a glitch in the way the boxes were being set up.

as for the aliasing, of course I did not mean to criticize, just state my surprise as ever since HG2 I didn't consider this could cause a problem anymore. Using a tool like export_fig negates such things anyways.

As of the toolbox-issue. It's ok. When the reinstall didn't help I thought I'm just too stupid to get it. Did you not at one point change the name from box_plot to boxPlot? I guess having a unique name could remedy the problem but I could be wrong.

**Christopher Hummersone:** Hi Arnold,

Notch:

The strange shapes you see are because the notch extends beyond the IQR (boxPlot should warn you about this). The notch is analogous to a confidence interval; its height is ±(1.58*IQR)/sqrt(N). So it will generally be large if you have a small sample size (you can see a similar effect here: https://support.sas.com/documentation/cdl/en/statug/63347/HTML/default/viewer.htm#statug_boxplot_sect012.htm - scroll down to "NOTCHES"). So this behaviour is expected.
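As a worked example of the formula just quoted, the sketch below computes the notch extent for hypothetical quartiles and a small sample, and checks whether it spills outside the box. The numbers are illustrative only.

```matlab
q1 = 2; med = 2.4; q3 = 3;       % hypothetical quartiles and median
N = 10;                          % small sample size

halfNotch = 1.58 * (q3 - q1) / sqrt(N);   % notch half-height, about 0.50 here
notchLow  = med - halfNotch;
notchHigh = med + halfNotch;

% True when the notch extends beyond the box, which gives the odd shapes
exceedsBox = notchLow < q1 || notchHigh > q3;
```

With N = 10 the lower notch edge falls below q1, so `exceedsBox` is true; with a much larger N the same quartiles would give a notch well inside the box.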

Aliasing:

This is something I have no control over; it's determined by whatever functions/drivers you use to create the image file. Do you use export_fig (http://www.mathworks.com/matlabcentral/fileexchange/23629-export-fig)? It's a fantastic tool. Among its MANY options is control over the degree of anti-aliasing.

Toolbox:

I understand your problem. I typed 'import iosr.statistics.*' and F1 (etc) on 'boxPlot' no longer gives help related to iosr.statistics.boxPlot. Yet the 'which' function and other runtime name resolution operations run as expected. I'm afraid this is a shortcoming in Matlab. Perhaps consider filing a bug report? But there is nothing I can do about it I'm afraid.

**arnold:** I see (the problem with legend and hatchfill2). I didn't think it through since I almost always use legendflex.

Notch:

I never used it but it doesn't seem to work for me. I see oddly shaped boxes, I'll send you a screenshot.

Also, I get heavy aliasing at the angled box edges, something I thought Mathworks prevented with their HG2 graphics system.

Toolbox

I have to ask again about the toolbox construct. I reinstalled matlab yesterday just to fix this but it didn't work. I'm on 2016a as you but hitting F1 or CTRL+D to get help or open boxPlot opens the original Mathworks function for me in either case (having installed iosr+ or having imported it at session start).

**Christopher Hummersone:** Hi Arnold.

I've been thinking about this, but I can't think of a good interface without adding every hatchfill2 option to boxPlot. But it is easy enough to use with boxPlot, e.g.:

import iosr.statistics.boxPlot;

h = boxPlot(rand(100,3,3));

hatchfill2(h.handles.box(:,:,1),'single');

will add a hatch to the first box in each x-group.

Of course this won't update the legend, which is a known shortcoming of hatchfill2. There is a discussion about hatchfill2 and legends on the hatchfill2 FX page.

Chris

**arnold:** Hi Chris,

to continue ideas.... implementing hatchfill2 might improve readability of the diagrams.

http://www.mathworks.com/matlabcentral/fileexchange/53593-hatchfill2

I use it quite often for publications/diagrams but haven't tried using it with boxPlot.

I'll never understand why matlab does not have native hatch support.

As far as I know at least legendflex supports it in legends.

**arnold:** Hi Chris,

I see, I somehow overlooked that option, thanks! I'm just too unfamiliar with matlab toolboxes and how to handle them properly otherwise I might have picked up on it myself.

Thanks for the kind words, I'd like to think I gave some useful hints/feedback. Starting at an early point there has been no reason whatsoever anymore to go and use the integrated boxplot function, it is just too limited and cumbersome compared to yours. This sadly applies to a lot of the data visualization features like no proper XY errorbars etc. shame, but for most decent plots (besides boxplots now) one still has to go and copy all data to something like Origin instead.

If I were Mathworks, I'd ask you to have this integrated.

later

Arnold

**Christopher Hummersone:** Hi Arnold,

I've already had this debate with another user: https://github.com/IoSR-Surrey/MatlabToolbox/issues/1. As I mentioned in the discussion, 'Typing "import iosr.statistics.*" will prevent you from having to re-write any code.'

As it mentions in the readme "Basic installation only requires you to add the install directory to the Matlab search path." So you don't need to run iosr.install, just add the install path to your Matlab path. The rest of the tools in the package might not be relevant to you, but they are to myself and colleagues. Since they're only text files, you can ignore them; they take up very little space on your hard disk.

Getting function help (and opening the function for editing) works as normal for me, but I am using R2016a, so I'm afraid I can't help if you're using an earlier version. Typing "help iosr" should present you with a full list of the toolbox contents. In that list, each file should have a clickable link that displays help for that file.

Best,

Chris

P.S. Getting "Pick of the Week" was great. But many of the features of boxPlot were your ideas, so huge thanks for that.

**arnold:** Dear Christopher,

I was offline for quite a while but saw this made it to 'picks of the week', very nice. Alternative boxplot has come a long way.

One thing that confuses me a bit is why you have packed so many different tools into one toolbox. I understand you have put in a lot of work into all these other tools but I don't see how they are related for most users. The installer created another folder called 'SOFA_API'... I have no interest in installing that when I am looking for the 'alternative boxplot'

Bottom line, a ton of stuff that people might not want seems to be added even to the path. I'm not sure everybody thinks this is more tidy than before. iosr.statistics.boxplot is not more readable than box_plot, which it was some versions ago.

One thing I usually use all the time in the editor is mark a function name and use CTRL+D or F1 to either open the function or get the help in order to figure out syntax questions etc. Now, with the function being iosr.statistics.boxplot one can't do that anymore. The only way I see now is more cumbersome, manually digging through the folders and finding that function again.

Is there an easier way to open the function and/or help that I am unaware of?

For now I'll stick with the older version because I don't want to change syntax everywhere just now. Maybe I can figure out an easier way to strip the toolbox of everything but the boxplot, quantile2 and tab2box functions.

**Christopher Hummersone:** Many thanks, Sean. I've uploaded a fix to GitHub.

**Sean de Wolski:** You have a bug on line 842

assert(isnumeric(obj.addPrctilesTxtSize) && isscalar(obj.addPrctilesTxtSize), '''ADDPRCTILESTXTSIZE'' should be a numeric scalar')

It's checking the property and not the new setting: val.
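The corrected check would presumably validate the incoming value instead of the stored property. The sketch below shows only that one line in isolation, with a hypothetical `val` standing in for the argument of the property setter; the actual fix in the repository may differ.

```matlab
val = 9;   % hypothetical new value passed to the property setter

% Validate the new value, not the value currently stored on the object
assert(isnumeric(val) && isscalar(val), ...
    '''ADDPRCTILESTXTSIZE'' should be a numeric scalar');
```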

**Christopher Hummersone:** Hi again Arnold. I think I found and fixed your bug.

I've also been trying, on and off, to implement your sorting suggestion. But after a number of attempts, I've decided that it's too difficult to implement, because the input can have an arbitrary number of dimensions, each of an unknown size.

If you, or anyone else, wants to have a go, feel free to fork the repo!

**Warwick:** Christopher,

I got it to work properly. Probably I had a glitch in my set paths.

Great contribution. Thanks

**Christopher Hummersone:** Hi Warwick.

I also have Matlab's boxplot function, and use this version with no problems. Note that Matlab class/function calls are case-sensitive, so you should be able to call this class as 'boxPlot' with no problems. If you can't, then I can only assume that there is an issue with your path (or it may be an OS thing, as I'm using a Mac). If you still can't get it to work, could you please post the error text, and perhaps a minimal working example? Thanks.

Chris

**Warwick:** Christopher

This looks pretty useful. I like the option to have whiskers at [5,95] percentiles, for example, rather than 1.5*IQR which is not applicable to skewed data. However, because I already have Matlab's boxplot (no caps), your boxPlot is not recognised when I call it even though I have put it into a 'set path'. Or, at least, Matlab by default chooses its own boxplot function. Is there a workaround to this? How would I rename it to boxPlotCH, say? I'm using a Mac and Version 2016a if that is important.

**Christopher Hummersone:** Hi Arnold,

I'm afraid I've been unable to recreate the error. Can you please download the latest version, and post a minimal working example that reproduces the error?

Thanks.

**arnold:** Hi Christopher,

I was playing with the new features when I discovered a bug I can't really make out. It might have something to do with the overhaul of the handles you did? When using samplesize 'true' this is the result:

Reference to non-existent field 'groupsTxt'.

Error in boxPlot/drawGlobalGraphics (line 1672)

set(obj.handles.groupsTxt,'FontSize',obj.groupLabelFontSize);

Error in boxPlot/draw (line 1322)

obj.drawGlobalGraphics();

Error in boxPlot (line 656)

obj.draw('all');

I'm on the current version 2016a, don't have an installation of 2015 up and running to cross check it there.

**Christopher Hummersone:** Hi Royi,

3) The code is now on GitHub.

1) I've added an issue on to the GitHub repo for the percentile enhancement.

2) I have already implemented a 'showOutlier' option.

Thanks!

**Royi Avital:** Hi Christopher,

I wrote my response in context of my previous comment and your reply:

1. It would be great to have a marker for the 5% and 95% Percentile. Just like there is the median, the mean, and 25% / 75%, add another label for 5% / 95% (Maybe optional by default).

2. I think the outliers and the scatter should (And underneath are) 2 separate scatter series. What I would like to have is the opacity and visibility property of those 2 be exposed using the methods of your function. Something like hBoxPlot.scatterVisible = false(); hBoxPlot.outlierVisible = true();.

3. Putting your code on GitHub is a great idea in my opinion.

Thank You.

**Christopher Hummersone:** Hi Royi,

I'm not sure what you mean. Can you post a link to an example?

Thanks,

Chris

**Royi Avital:** Hi Christopher,

1. I think you should add a new cue for 5% and 95%. It can be a Star or any other marker, just to be able to put the 5% and 95% point on the graph (In addition to what we have now).

2. I think you just need to make the outliers and the data have the property "Visible" exposed in your methods, that would be perfect and easy to do on your hand (Also the opacity).

3. Putting it on GitHub is a great idea!

Thank You!

**Christopher Hummersone:** Hi Arnold. Perhaps I should put the file on Github and let you (or others) tinker ;-)

I'll put the weighting and sorting options on my [very long] todo list. Do you have any useful references on implementing a weighting algorithm?

**arnold:** Hi Christopher,

I just now had another idea which should be easy to implement yet very useful visually:

An option to sort the groups so that the boxes within one x-group are ascending/descending.

Of course the similar functionality for the x-groups would also make sense.

This should make it a lot easier to read the data, especially if there are a lot of groups

This contribution is coming along very nicely due to your tireless commitment. Well done!

You should probably set up a donation link. :)

**arnold:** exactly,

I know it is not very common in most plotting tools but it is the scientifically 'more correct' approach if one has measurement errors. Obviously people need to know what they're doing with errors and PDFs anyways for things to be 'correct' but take a guess how many people use boxplots for 3 data-points ^^.

As I said, most scientists then go and do scatter plots with one point each + error bars (either all data points or just replace all by one using proper error propagation).

R does indeed have it. Matlab is quite cumbersome with all of that since, e.g., no total least squares for x errors is supported.

Anyways, back on target:

Depending on how much effort you want to put into it, there are several approaches. I would start with the first and easiest:

Stick to normally distributed errors. As you suggested, the user has the option to give an extra array/vector of 'errors', one for each value or datum as you call it. Usually, for most, that'll be e.g. the standard deviation of the measurement. You can then calculate the weights (1./sigma.^2) for each group and using this get a weighted mean and confidence interval based on those errors.

Off the top of my head I have no clue about a weighted median, though it seems to be defined on Wikipedia, for example.
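The inverse-variance weighting described in this comment can be sketched in a few lines of base MATLAB. The data are hypothetical, the 1.96 factor assumes normally distributed errors as the comment does, and the weighted median uses one common definition (the smallest value at which the cumulative weight reaches half the total). None of this is claimed to be how the toolbox computes its weighted statistics.

```matlab
y     = [10.2 9.8 10.6 10.0];    % hypothetical measurements
sigma = [0.1 0.2 0.4 0.1];       % hypothetical standard deviations, one per measurement

w      = 1 ./ sigma.^2;          % inverse-variance weights
wMean  = sum(w .* y) / sum(w);   % weighted mean
seMean = sqrt(1 / sum(w));       % standard error of the weighted mean
ci95   = wMean + [-1 1] * 1.96 * seMean;   % approximate 95% confidence interval

% Weighted median: smallest value whose cumulative weight reaches half the total
[ys, order] = sort(y);
cw = cumsum(w(order)) / sum(w);
wMedian = ys(find(cw >= 0.5, 1, 'first'));
```

The two precise measurements (sigma = 0.1) dominate both statistics, which is the intended effect of weighting by 1./sigma.^2.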

I think it would go along nicely with box_plot but I could also see it as a standalone contribution on the fileexchange, also using tab2box for grouping and then doing a proper weighted mean with (asymmetrical) error bars in x and y some time in the future maybe.

Again I think the smaller effort of including it here assuming normally distributed errors/stdevs would serve 90% of the demand.

It's really a shame that Mathworks don't seem to bother about these things :)

**Christopher Hummersone:** Royi - thanks for your feedback. I'll upload a modified function soon. To be clear, do you mean that the box should be 5–95%, or the whiskers? Also, the documentation was poorly phrased. What I meant was that the scatter plot does not include the outliers (as you observe, they are plotted separately). I'm not particularly keen on adding an option that would make the plot misleading. A suitable alternative (to me) would be to set 'limit' to 'none'.

Arnold - that's an interesting idea. It's something I've not come across before, at least in Matlab, but I see R has one or two libraries for weighted box plots. So would the idea be that you specify an additional array, the same size as Y, that determines the weight associated with each datum?

**arnold:** Hi Christopher,

what about adding functionality to display weighted boxplots - that would also be really useful and quite unique as no matlab function I know supports it. I could most definitely use it and I'm sure I'm not the only one.

As of now I always go and calculate weighted means and confidence intervals using the matlab fit function (or one by myself) and then usually plot as a scatter plot or errorbar.

regards

Arnold

**Royi Avital:** Great submission.

Few notes:

1. Could you add the option for 5% and 95% bars (In addition to 25% and 75%)?

2. The documentation states that when the option to show scatter data is on the outliers will be removed, though it doesn't happen. Could you just give a different "On / OFF" to each? (Can be done manually using the handles, yet better use the classes.)

Thank You!

**Christopher Hummersone:** @arnold I added the 'theme' property I mentioned. Hopefully

>> set(h,'theme','colorall')

(h is the boxPlot handle) is somewhere close to what you described...?

I know the interface is more complicated, because of the support for an arbitrary number of dimensions in Y. Here are a couple of tips.

1) You no longer need to precisely specify any colormaps; you can just specify a function handle, for example,

>> set(h,'boxColor',@parula)

This functionality is described towards the end of the help.

2) Getting and setting boxPlot properties is asymmetrical, especially for group properties. For example,

>> h.boxColor = 'none';

>> h.boxColor

ans =

'none'

'none'

'none'

So you could try capturing a property value, e.g.

>> bc = get(h,'boxColor');

modifying its value, and returning the modified value to the box plot, e.g.

set(h,'boxColor', bc);

**arnold:** true... maybe using the same color but just darken is more intuitive.

I just don't seem to get the necessary input format for the colormap of the boxes/groups. How do I just assign a standard colormap like parula, hot, etc??

This matlab typical approach by giving it a matrix of RGB vectors doesn't work

'boxColor', parula(size(g{1},1))

---

ah, forget it. it's documented in the classdef but wasn't really clear in the box_plot function.

I find it a bit cumbersome since I sometimes leave out the darkest color of parula for instance by choosing the colormap space to be slightly larger than I need it. parula(6) for 5 groups for instance and then use the lightest 5.
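The "oversized colormap" trick described here can be written out explicitly, and wrapped in a function handle in case the handle-based boxColor option mentioned elsewhere in this thread accepts custom handles. That last point is an assumption that has not been checked against the toolbox; `lightParula` is a hypothetical name.

```matlab
nGroups = 5;
cmap = parula(nGroups + 1);      % one more colour than needed
cmap = cmap(2:end, :);           % drop the first (darkest) row, keep the remaining 5

% The same idea as a function handle taking the number of colours n
lightParula = @(n) subsref(parula(n + 1), substruct('()', {2:n+1, ':'}));
cmap2 = lightParula(nGroups);    % identical to cmap

% Assumed usage (not verified against the toolbox):
% set(h, 'boxColor', lightParula);
```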

**Christopher Hummersone:** @arnold would that not make the scatter plots difficult to see if the box is the same colour?! Or perhaps you mean the outliers? The trick with this is providing a consistent and easy-to-use interface. I'm thinking about adding a 'theme' option that automates various display options (especially colour). For example, one theme option could be 'colorboxes' which would give you something similar to the screenshot attached here; 'grayboxes' would do something similar but in grayscale; 'colorlines' would have unfilled boxes and change all of the line colours; etc. I'd happily add any suggested themes.

**Fabian Schrumpf:** Very useful and versatile function. Lets you adjust a lot of settings that the built-in function boxplot() doesn't give you access to. Thanks for sharing.

**arnold:** have you thought of optionally colouring the scatter plots with the same colour as the corresponding box instead of just gray?

**Christopher Hummersone:** Thanks for the feedback, José. Actually showing the mean was something I was thinking about. It's an easy addition, so I've just uploaded a revised version. The outputs have been modified to include the mean data and a handle to the markers used to plot the means (see updated documentation).

**José Ignacio Orlando:** Awesome contribution!

Is it possible to plot the mean values in the same plot?

**Christopher Hummersone:** Hi Arnold,

I did upgrade to R2015b, but I'm fairly certain the updated function doesn't require any functionality that's new to R2015b.

Unfortunately I had to change quite a bit about the function interface in order to support the new functionality, which does make some things harder to do than they were.

If data.groups has one column, then you can use:

[~,h] = box_plot(...);

legend(squeeze(h.boxes(:,1,:)),g{1},'location','best')

(g has as many cells as data.groups has columns.)

As for boxcolor, it and other parameters should be specified as a cell array of size G-by-I-by-J (check out the help text for more info).

The interface is more cumbersome now, I accept that, so I'd appreciate any insight you can offer that would make it less cumbersome (whilst retaining the flexibility of allowing data.groups to be of arbitrary size). At the moment I'm thinking of further developing the function into a class and providing methods to handle things like legends and plotting parameters.

Chris

arnold

Did you change MATLAB versions in between?

I could upgrade to 2015b but haven't gotten around to it yet. (I'm still at 2014b)

arnold

Hi,

I'm having a couple of issues with the current version, coming from revision 381.

[y,x,g] = tab2box(data.x, data.y, data.groups);

now results in g being a cell array with everything in the first cell, i.e. a 5x1 list. So the legend which used to work doesn't anymore:

legend(squeeze(hb(:,1,:)),g, 'location', 'best');

results in:

Error using legend (line 120)

Invalid argument. Type 'help legend' for more information.

The Group-Coloring of the boxes used to work like:

'boxColor', parula(size(g,1))

but also, because of g ending up as a cell structure with everything in the first cell, it doesn't anymore.

How to apply 'boxColor' now? Using 'boxcolor', parula(5) results in:

Error using box_plot (line 318)

BOX_PLOT needs propertyName/propertyValue pairs

arnold

I see you're still busy adding features and bugfixing, well, I can recommend two or three more things. Don't take it the wrong way, I'm just suggesting what I edit in my plots and what my experience with catchy scientific plotting tells me.

- background scatter plot (with jitter in x) behind each box: it's sometimes nice to see the actual data points too, many modern tools like Origin and JMP offer this option. If you just overlay a scatterplot (randomize x-position according to the width of the boxes for nicer look instead of strict x-position and set color according to box group) it achieves just that. I did it manually in some scripts now, and it looks great.

- An option to display the number of data points which make up the box, e.g. at the top right of each box. Like the prior suggestion, this helps to understand the quality of the statistics behind the boxes, since many people ignorantly use box plots with far too little data.

We've done this in our institute for years and I think it enhances the speed & quality of data interpretation.

- when the x-groups are non-scalar, vertical separator lines between the groups usually improve readability

- the ultimate box-plotting tool would be the combination of your contribution here with "hierarchical box plot" (also found on fileexchange).

... as I said, I don't say you have to include any of this ;)
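
For anyone who wants to try the jittered scatter overlay and the sample-size label by hand, here is a minimal sketch using only base MATLAB graphics. The data, the box width, and the colours are made up for illustration, and this is not how box_plot implements it; draw the boxes afterwards so the points sit behind them.

```matlab
% Hypothetical data: 3 groups of 40 samples each (columns = groups)
rng(0);                                  % reproducible jitter
nSamples = 40;
y = randn(nSamples, 3) + repmat([0 1 2], nSamples, 1);
xPos = 1:size(y, 2);                     % x-position of each box
boxWidth = 0.5;                          % assumed box width in x-axis units
colours = lines(size(y, 2));             % one colour per group

figure; hold on;
for k = 1:size(y, 2)
    % Spread the points uniformly across the box width, centred on xPos(k)
    jitter = (rand(nSamples, 1) - 0.5) * boxWidth;
    scatter(xPos(k) + jitter, y(:, k), 12, colours(k, :), 'filled');
    % Label the number of non-NaN points near the top right of the group
    text(xPos(k) + boxWidth/2, max(y(:, k)), ...
        sprintf('n=%d', sum(~isnan(y(:, k)))), 'VerticalAlignment', 'bottom');
end
hold off;
```

Scaling the jitter by the box width keeps every point inside the horizontal extent of its box.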

Christopher Hummersone

Thanks again for your feedback and suggestions, Arnold. At your suggestion I've added an xSpacing option, that I hope is satisfying. I've also fixed both functions to remove NaNs in x and corresponding y data. Again, I hope that fixes the problem. As for logarithmic spacing, there is undoubtedly a solution, but I think it will make things unnecessarily complicated. Instead, I suggest using the new xSpacing option. The former boxWidthMode evolved into the boxSpacing option, and would not have been helpful in this case.

arnold

Yes, that fixed the bug! Thank you!

Some more suggestions. I miss the option to use an X vector with numbers for labelling but NOT spacing the axis accordingly. When X contains strings it goes and just evenly spaces the categories. It would be nice if evenly spaced x-ticks/groups would be possible for numbered X vectors as well. Maybe you just go and introduce the option 'xSpacing', 'scalar' or 'even'

Regarding this, I also found another bug. When X contains numbers AND NaNs (which happens with some of my datasets), line 390 in box_plot throws an error. Adding the line x(isnan(x)) = [ ] fixes it. I'm not sure if it's generally applicable.
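
One caveat with deleting NaNs from x alone is that x may no longer line up with the data. A minimal sketch of removing the matching data as well, assuming y has one column per element of x (an assumption about the data layout for illustration, not taken from box_plot's source):

```matlab
% Hypothetical inputs: x holds the group positions, y has one column per x value
x = [1 2 NaN 4];
y = [0.1 0.5 0.9 1.3; ...
     0.2 0.6 1.0 1.4];

keep = ~isnan(x);   % logical mask of the valid x positions
x = x(keep);        % drop the NaN entries from x
y = y(:, keep);     % drop the matching columns so x and y stay aligned
```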

A minor thing... When using scalar x-axis-spacing with a numbered X-vector, the box widths get messed up when applying a logarithmic scale. You did have a 'boxwidthmode' which is no longer supported, did that have something to do with it?

I have no elegant solution for this in mind.

Thx very much!

Christopher Hummersone

Ah. Of course the example works for me! There was a bug in quantile2 that I fixed a while back, but forgot to upload the file to the FX. Please download the most recent version:

http://uk.mathworks.com/matlabcentral/fileexchange/46555-quantile-calculation

Hopefully that should help. I've also updated the box_plot documentation to explicitly mention tab2box.

arnold

Hi Christopher,

Interesting function of yours (tab2box). Looks like that, combined with box_plot.m, could be what I've been looking for ... but for me (R2014b) the example given by you in the file itself (lines 35-57) doesn't even work. It throws the same error as it does for my own data table:

Attempted to access x2(1); index out of bounds because numel(x2)=0.

Error in quantile2 (line 156)

q(m,n) = x2(1);

Error in box_plot (line 266)

Z.median = quantile2(y,.5,[],options.method); % median

Christopher Hummersone

Hi Arnold,

Thanks for your feedback and comment. Just to make sure I understand correctly, you're suggesting that the Y input should facilitate plotting samples of different sizes? So the columns could be of different lengths? What if Y could be a cell array?

The tab2box function implicitly offers this functionality, since it will pad Y with NaN when samples are unequal in size. But this relies on the data being in tabular form initially.
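
To illustrate the NaN-padding idea for data that is not in tabular form, here is a minimal sketch that packs samples of unequal size into one matrix. The variable names and data are hypothetical, and this is not tab2box's actual code.

```matlab
% Hypothetical samples of unequal size, one cell per box
samples = {[1 2 3 4], [5 6], [7 8 9]};

n = cellfun(@numel, samples);        % sample size of each group
Y = NaN(max(n), numel(samples));     % pre-fill the matrix with NaN padding
for k = 1:numel(samples)
    Y(1:n(k), k) = samples{k}(:);    % copy each sample into its own column
end
```

Each column of Y then holds one sample, with the shorter samples padded by NaN at the bottom.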

arnold

If you used structures, you could have different x-values and sizes for each set. This would make a great addition, since Matlab and other submissions here have never supported this.

Or you could just introduce a group vector: X an i x 1 vector for the x-positions, Y an m x i matrix for the data, and g an i x 1 vector for the groups....

arnold

Hi Christopher,

great function, thx.

It would be great if you added the ability to use different sizes for the groups which would not have to be integrated into one 3d matrix.

Christopher Hummersone

There's example code in the file help...

Juan Deaton

Can you attach the code example for the figure you have at the top?

Christopher Hummersone

@Alberto you mean the box colour? Use the 'boxColor' option and set it, for example, to [1 1 1; .5 .5 .5] (assuming you have two boxes per y-tick). Setting parameters for each group is described towards the bottom of the help text.

Alberto

Nice function, it works as promised. I miss the option (or I don't know how to do it) to fill the notch with a specified color like in the figure example.