File Exchange

Proportional Venn Diagrams

version 1.0.0.0 (2.53 KB) by Jeremy Heil
Draws a venn diagram for two or three sets with proportional areas.

23 Downloads

Updated 28 Oct 2004

No License

%
% function error = vennX( data, resolution )
%
% vennX - draws an area proportional venn diagram
%
% Draws a venn diagram (either two or three set) using
% circles, where the area of each region is proportional
% to the input values.
%
% INPUT:
% data - a vector of counts for each set partition
%
% For a two circle diagram:
% data is a three element vector of:
% |A|
% |A and B|
% |B|
%
% For a three circle diagram:
% data is a seven element vector of:
% |A|
% |A and B|
% |B|
% |B and C|
% |C|
% |C and A|
% |A and B and C|
%
% resolution - A measure of accuracy of the image (the grid
% spacing); typical values are within 1/100 to 1/1000 of
% the maximum partition count. Note that smaller
% resolutions take longer to compute.
%
% OUTPUT:
% error - the difference in area of each partition
% between the actual area and the input vector
%
% EXAMPLES:
%
% vennX( [ 106 26 257 ], .05 )
%
% vennX( [ 75 143 210 ], .1 )
%
% vennX( [ 16 3 10 6 19 8 3 ], .05 )
%
%
% COMMENTS:
%
% The implementation is trivial. For the two circle case, two circles
% are drawn to scale and moved closer and closer together until the
% overlap is 'near' the desired intersection. For the three
% circle case, this is repeated three times, once for each pair of
% circles. Hence the two circle case is almost exact, whereas the
% three circle case has much more error, since the area |A and B and C|
% is derived rather than fitted. This means that large variations from
% random, especially values close to zero, will have larger errors, for example
%
% vennX( [ 20 10 20 10 20 10 0], .1 )
%
% as opposed to
%
% vennX( [ 20 10 20 10 20 10 10], .1 )
%
% ENHANCEMENTS
%
% The implementation could be sped up tremendously using an MRA
% (multiresolution analysis) type algorithm, e.g. start with a
% resolution of .5 and find the distance between the circles, then use
% that as a seed for a resolution of .1, then .05, .01, etc.
%
% The error vector could be used as a measure to 'perturb' the position
% of the third circle so as to minimize the error. This could be done
% with a simple gradient descent method. This would help the
% exceptions described above where the distribution deviates from
% random.
%
% When small misshapen areas are drawn, the text does not line up, e.g.
% vennX( [ 15 143 210 ], .1 )
%
%
% Original implementation and method by Jeremy Heil, for the Order of
% the Red Monkey, and the Tengu
%
% Oct. 2004
%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
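
The two circle search described under COMMENTS, combined with the coarse-to-fine idea under ENHANCEMENTS, can be sketched roughly as follows. This is not the vennX source. It assumes the three inputs are counts for the disjoint regions (A only, A and B, B only), so that each circle's area is its region count plus the shared count, and the names `overlap` and `stepSize` are made up for illustration.

```matlab
% Assumed region counts [A only, A and B, B only], taken from the first example
data = [106 26 257];
res  = 0.05;                            % grid spacing, like the 'resolution' argument

% Radii chosen so that each circle's area equals its (assumed) set size
rA = sqrt((data(1) + data(2)) / pi);
rB = sqrt((data(3) + data(2)) / pi);

% Grid large enough to hold both circles when they are just touching
[X, Y] = meshgrid(-rA:res:(rA + 2*rB), -max(rA, rB):res:max(rA, rB));
inA = X.^2 + Y.^2 <= rA^2;              % circle A stays fixed at the origin

% Overlap area for centre distance d: pixels inside both circles times pixel area
overlap = @(d) nnz(inA & ((X - d).^2 + Y.^2 <= rB^2)) * res^2;

% Coarse-to-fine search: slide B toward A in large steps, then refine
d = rA + rB;                            % start with the circles touching
for stepSize = [0.5 0.1 0.05 0.01]
    % keep moving while one more step still leaves the overlap below the target
    while d - stepSize > abs(rA - rB) && overlap(d - stepSize) < data(2)
        d = d - stepSize;
    end
end
fprintf('centre distance %.2f, overlap %.2f (target %d)\n', d, overlap(d), data(2));
```

Each coarse pass stops just short of the target overlap, so the next, finer pass starts close to the answer instead of from the touching position.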

Comments and Ratings (11)

Angela Pisco

When using this for a 3-circle diagram, if you replace line 137
[X,Y] = meshgrid( 0:resolution:size_x, 0:resolution:size_y );
with the following 2 lines, the circles are no longer cut off:
sizeXY = max(size_x,size_y);
[X,Y] = meshgrid( (-.05*sizeXY):resolution:(1.05*sizeXY), (-.05*sizeXY):resolution:(1.05*sizeXY));

Yuri K

Does not show anything if there is no overlap. I'd like to see separate circles with sizes proportional to the number of elements in them. Please don't create a new figure by default. Also, how can I make several diagrams share the same (or a similar) resolution? I'm creating multiple diagrams in subplots of the same figure. Another suggestion: it would be nice to be able to customize colors and labels (position, font, etc.).

manoj

Nice work ! Thank you !

liz

Does this not work with the student version? I am attempting to make a simple Venn diagram and it will not work.

Julia

I think the subset situation still works, except that the number labels are not placed optimally for display.

Matt J, I think you might be interpreting the input data slightly differently. I think the function's comments should say that if you are doing 2 sets, the 3 numbers should be:
data(1) = number of elements in A but not in B (as opposed to being interpreted as the number of elements in A);
data(2) = number of elements in the intersect of A and B;
data(3) = number of elements in B but not in A;

The same is true for the 7-element (3-set Venn diagram) data. Each data point corresponds to a single color shade on the graph.
For example,
data(1) in the 7-element vector represents the number of elements in A but not in B or C.
data(2) represents the number of elements in the intersection of A and B but not in C.

I wrote this little utility that calculates these values if you just input your original sets. Your original sets can be 3 vectors or 3 cell arrays (of strings). If you leave the 3rd vector empty you'll get the 2-set diagram. Feedback welcome!

function vennX_calc(x,y,z)
% x, y, z: numeric vectors or cell arrays of strings; pass z = [] for 2 sets
%% Venn diagram for 2 sets
if isempty(z)
    if ~isnumeric(x) % cell arrays of strings: map each string to a numeric ID
        allString = unique([x(:); y(:)]);
        numericVec = 1:length(allString);
        x = numericVec(ismember(allString,x));
        y = numericVec(ismember(allString,y));
    end
    x = unique(x(:)); % column vectors, repeated elements counted once
    y = unique(y(:));
    vec = NaN(1,3);
    xNy = length(union(x,y));         % size of the union of x and y
    vec(1) = xNy - length(y);         % in x only
    vec(2) = length(x)+length(y)-xNy; % in both x and y
    vec(3) = xNy - length(x);         % in y only
%% Venn diagram for 3 sets
else
    if ~isnumeric(x) % cell arrays of strings: map each string to a numeric ID
        allString = unique([x(:); y(:); z(:)]);
        numericVec = 1:length(allString);
        x = numericVec(ismember(allString,x));
        y = numericVec(ismember(allString,y));
        z = numericVec(ismember(allString,z));
    end
    x = unique(x(:)); % column vectors, repeated elements counted once
    y = unique(y(:));
    z = unique(z(:));
    vec = NaN(1,7);
    xIy = intersect(x,y);
    yIz = intersect(y,z);
    zIx = intersect(z,x);
    xIyIz = intersect(xIy,z);
    vec(7) = length(xIyIz);           % in all three sets
    vec(2) = length(xIy) - vec(7);    % in x and y but not z
    vec(4) = length(yIz) - vec(7);    % in y and z but not x
    vec(6) = length(zIx) - vec(7);    % in z and x but not y
    vec(1) = length(x) - vec(2) - vec(6) - vec(7); % in x only
    vec(3) = length(y) - vec(2) - vec(4) - vec(7); % in y only
    vec(5) = length(z) - vec(4) - vec(6) - vec(7); % in z only
end
disp(vec) % display the input vector
%% draw the diagram
vennX(vec,0.01);
end
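
For a quick check of the two-set numbers without the full utility, here is a minimal sketch using only `setdiff` and `intersect`. It assumes the reading described in this comment (each entry counts one disjoint region), and the sets `A`, `B` and `S` are invented examples.

```matlab
% Hypothetical example sets (not from the original post)
A = [1 2 3 4 5 6];
B = [5 6 7 8];

onlyA = numel(setdiff(A, B));      % elements in A but not in B
both  = numel(intersect(A, B));    % elements in both A and B
onlyB = numel(setdiff(B, A));      % elements in B but not in A
data  = [onlyA both onlyB];        % [4 2 2]

% Subset case: S lies entirely inside A, so the "S only" region is empty
S = [2 3];
dataSub = [numel(setdiff(A, S)) numel(intersect(A, S)) numel(setdiff(S, A))];  % [4 2 0]
```

Under this reading, a subset shows up as a zero in the last entry rather than as equal first and second entries.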

matt j

Does this fail for the case where B is a subset of A? I entered A, B, and A&B where A&B=B and did not get what I expected.

Pedro Martins

Richard Moffitt

To get the pretty primary colors, you should change the code at line 142 to look like this:

img = img + 1 ...
img = img + 2 ...
img = img + 4 ...

and then use a colormap like this:

colormap([...
0,0,0;... %0
0,0,1;... %1
0,1,0;... %2
0,1,1;... %3
1,0,0;... %4
1,0,1;... %5
1,1,0;... %6
1,1,1]); %7
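
The idea behind the 1/2/4 increments is that each circle contributes one bit, so every region gets its own code from 0 to 7 and therefore its own colormap row. A rough standalone sketch, assuming made-up circle centres and radius and a hypothetical helper `inCircle` (none of this is taken from the vennX source):

```matlab
% Hypothetical circle centres and radius, for illustration only
[X, Y] = meshgrid(0:0.05:10, 0:0.05:10);
inCircle = @(cx, cy, r) (X - cx).^2 + (Y - cy).^2 <= r^2;

% One bit per circle: 1 for the first, 2 for the second, 4 for the third
img = 1*inCircle(4, 6, 3) + 2*inCircle(6, 6, 3) + 4*inCircle(5, 4, 3);

% Same colormap as above; region code k uses row k+1
cmap = [0 0 0; 0 0 1; 0 1 0; 0 1 1; 1 0 0; 1 0 1; 1 1 0; 1 1 1];
rgb  = reshape(cmap(img(:) + 1, :), [size(img) 3]);   % rows-by-cols-by-3 RGB image
```

Calling image(rgb) then shows the three circles in blue, green and red, with overlaps in the mixed colors and the triple overlap in white.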

Oscar Puig

A nice and simple script with great results.
Thanks!

Steve Haddock

Great! Thanks for this -- I was pretty surprised that I couldn't find it in the stats toolbox. Nice implementation.

John Veysey

Very nice! My incredibly picky comment: The default color scheme has some repetition. It could be made to have each of the 3 circles be (e.g.) primary colors, and then have the overlap regions reflect the colors of a color wheel ... but that may just be nerdy.

MATLAB Release Compatibility
Created with R13SP1
Compatible with any release
Platform Compatibility
Windows macOS Linux
Acknowledgements

Inspired: venn
