A MATLAB implementation of the plot described in Eilers, P.H.C. and Goeman, J.J. (2004) "Enhancing scatterplots with smoothed densities", Bioinformatics 20(5):623-628.
This plot is useful when a 2D scatterplot of your data would result in an uninterpretable mass of overlaid points. The smoothed histogram displays the density of points as greyscale intensity in a 2D image plot.
Optionally, you can also plot outliers as individual points, or plot the image as a surface in 3D.
Hi, I have benefited a great deal from this code; it is exactly what I wanted. However, although I get pretty good results (I use it in Monte Carlo simulations of photon transport in tissue), I need to explain in my thesis exactly how the density plot is produced. The function smooth1D is not very clear to me, especially the equations
P = lambda.^2 .* D2'*D2 + 2.*lambda .* D1'*D1;
Z = (E + P) \ Y;
I would be much obliged if someone could elaborate on these equations or give some links to their origin. I eagerly await a response from any of you. Thanks in advance.
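For readers with the same question, here is one way to read those two lines. This is a sketch derived from the smooth1D listing itself, not a quotation from the paper: the symbols z (one smoothed column), y (one raw column) and I are introduced here for illustration, while D_1, D_2 and P correspond to D1, D2 and P in the code. Solving (E + P) \ Y is what you get by minimizing a least-squares fit term plus penalties on the first and second differences of the smoothed values:

```latex
\begin{align*}
S(z) &= \lVert y - z \rVert^2
      + \lambda^2 \lVert D_2 z \rVert^2
      + 2\lambda \lVert D_1 z \rVert^2 \\[4pt]
\nabla S(z) = 0
  \;&\Longrightarrow\;
  -2\,(y - z) + 2\lambda^2 D_2^{\top} D_2\, z + 4\lambda D_1^{\top} D_1\, z = 0 \\[4pt]
  \;&\Longrightarrow\;
  \bigl( I + \underbrace{\lambda^2 D_2^{\top} D_2 + 2\lambda D_1^{\top} D_1}_{P} \bigr)\, z = y
\end{align*}
```

Because Y has one column per histogram slice, the backslash solves this system for all columns at once. With lambda near zero P vanishes and Z is essentially Y; larger lambda penalizes roughness more heavily. The Eilers and Goeman paper cited at the top of this page is the place to check the motivation for this particular mix of penalties.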
Glad I finally found the script, but have some difficulties...
I don't know whether this is the right place to post questions regarding this script, but I'll try:
How can I change the axes? I would like to have the origin in the lower left corner and not at the upper right one.
Thanks for any suggestions!
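A possible fix, offered as a minimal sketch rather than a patch to the submission: image plots in MATLAB use a reversed y-axis by default, and `axis xy` switches the current axes back to the usual orientation with the origin at the lower left. The matrix `C` below is made-up stand-in data, not output of smoothhist2D.

```matlab
% Stand-in data for illustration only; in practice this would be the
% image that smoothhist2D has just drawn in the current axes.
C = magic(8);
image(1:8, 1:8, C);   % image() sets the y-axis direction to 'reverse'
axis xy               % same effect as set(gca,'YDir','normal')
```

Calling `axis xy` right after smoothhist2D returns should have the same effect, assuming the plot is still the current axes.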
Hi, I modified the function so that you can get the histogram and axes out, and so that you can enter the edges as well (instead of only the number of bins).
Let me know if there are problems. From my tests it looks OK, but I haven't tried it on non-uniform grids yet.
% SMOOTHHIST2D Plot a smoothed histogram of bivariate data.
% [H,X,Y]=SMOOTHHIST2D(X,LAMBDA,NBINS) plots a smoothed histogram of the bivariate
% data in the N-by-2 matrix X. Rows of X correspond to observations. The
% first column of X corresponds to the horizontal axis of the figure, the
% second to the vertical. LAMBDA is a positive scalar smoothing parameter;
% higher values lead to more smoothing, values close to zero lead to a plot
% that is essentially just the raw data. NBINS is a two-element vector
% that determines the number of histogram bins in the horizontal and
% vertical directions.
% SMOOTHHIST2D(X,LAMBDA,NBINS,CUTOFF) plots outliers in the data as points
% overlaid on the smoothed histogram. Outliers are defined as points in
% regions where the smoothed density is less than (100*CUTOFF)% of the
% maximum density.
% SMOOTHHIST2D(X,LAMBDA,NBINS,[],'surf') plots a smoothed histogram as a
% surface plot. SMOOTHHIST2D ignores the CUTOFF input in this case, and
% the surface plot does not include outliers.
% SMOOTHHIST2D(X,LAMBDA,NBINS,CUTOFF,'image') plots the histogram as an
% image plot, the default.
% MODIFICATIONS TO THE ORIGINAL FUNCTION:
% 1. you can also enter the histogram edges instead of the bin numbers
% by making NBINS a CELL array. Example (using X defined below)
% 2. Added outputs (histogram and edges)
% Example:
% X = [mvnrnd([0 5], [3 0; 0 3], 2000);
% mvnrnd([0 8], [1 0; 0 5], 2000);
% mvnrnd([3 5], [5 0; 0 1], 2000)];
% smoothhist2D(X,5,[100, 100],.05);
% smoothhist2D(X,5,[100, 100],[],'surf');
% Reference:
% Eilers, P.H.C. and Goeman, J.J. (2004) "Enhancing scatterplots with
% smoothed densities", Bioinformatics 20(5):623-628.
% Written by Peter Perkins, The MathWorks, Inc.
% Revision: 1.0 Date: 2006/12/12
% This function is not supported by The MathWorks, Inc.
% Requires MATLAB R14.
if nargin < 4 || isempty(outliercutoff), outliercutoff = .05; end
if nargin < 5, plottype = 'image'; end
[n,p] = size(X);
bin = zeros(n,2);
% Reverse the columns of H to put the first column of X along the
% horizontal axis, the second along the vertical.
[dum,bin(:,2)] = histc(X(:,1),edges1);
[dum,bin(:,1)] = histc(X(:,2),edges2);
% Eilers' 1D smooth, twice
G = smooth1D(H,lambda);
F = smooth1D(G',lambda)';
% % An alternative, using filter2. However, lambda means totally different
% % things in this case: for smooth1D, it is a smoothness penalty parameter,
% % while for filter2D, it is a window halfwidth
% F = filter2D(H,lambda);
relF = F./max(F(:));
if outliercutoff > 0
    % linear index into relF of the bin that each observation fell in
    outliers = (relF(nbins2*(bin(:,2)-1)+bin(:,1)) < outliercutoff);
end
nc = 256;
image(ctrs1,ctrs2,floor(nc.*relF) + 1);
% plot the outliers
if outliercutoff > 0
    hold on  % keep the image so the points are overlaid on it
    plot(X(outliers,1),X(outliers,2),'.','MarkerEdgeColor',[.8 .8 .8]);
end
% % plot a subsample of the data
% Xsample = X(randsample(n,n/10),:);
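function Z = smooth1D(Y,lambda)
% Smooth each column of Y with a penalized least-squares smoother.
[m,n] = size(Y);
E = eye(m);
D1 = diff(E,1);   % first-order difference matrix, (m-1)-by-m
D2 = diff(D1,1);  % second-order difference matrix, (m-2)-by-m
P = lambda.^2 .* D2'*D2 + 2.*lambda .* D1'*D1;  % roughness penalty
Z = (E + P) \ Y;  % solve (E + P)*Z = Y for all columns at once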
% This is a better solution, but takes a bit longer for n and m large.
% It solves the same problem as a stacked least-squares system, so the
% weights are the square roots of the penalty weights in P.
% opts.RECT = true;
% D1 = diff(E,1);
% D2 = diff(D1,1);
% Z = linsolve([E; sqrt(2.*lambda).*D1; lambda.*D2],[Y; zeros(2*m-3,n)],opts);
function Z = filter2D(Y,bw)
z = -1:(1/bw):1;
k = .75 * (1 - z.^2); % epanechnikov-like weights
k = k ./ sum(k);
Z = filter2(k'*k,Y);
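To see what the two smooth1D passes do to a histogram, the following self-contained sketch applies the same linear solve to a tiny made-up matrix, first down the columns and then along the rows. The sizes, the counts and the value of lambda are arbitrary choices for illustration, and the smoothing is written inline rather than calling the submission's functions.

```matlab
% Toy demonstration of the two-pass smoothing; all sizes and values
% here are arbitrary choices for illustration.
lambda = 5;                        % smoothing parameter
H = zeros(20,30);                  % empty 20-by-30 "histogram"
H(8,12) = 3;  H(14,20) = 1;        % a few counts in interior bins

% First pass: smooth down the columns, as smooth1D(H,lambda) does
m  = size(H,1);
E  = eye(m);  D1 = diff(E,1);  D2 = diff(D1,1);
G  = (E + lambda.^2 .* (D2'*D2) + 2.*lambda .* (D1'*D1)) \ H;

% Second pass: transpose, smooth again, transpose back
m  = size(G,2);
E  = eye(m);  D1 = diff(E,1);  D2 = diff(D1,1);
F  = ((E + lambda.^2 .* (D2'*D2) + 2.*lambda .* (D1'*D1)) \ G')';

% The difference matrices annihilate constant vectors, so the total
% count should be unchanged up to rounding error
fprintf('total before: %g, after: %g\n', sum(H(:)), sum(F(:)));
```

The smoothing spreads each count over neighbouring bins without changing the total, which is why the listing can divide by max(F(:)) afterwards to get a relative density for the colour scale and the outlier cutoff.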
06 Oct 2008
The author added a 2D filter to make it look unique, although there were already many versions of MATLAB code for the cited paper, e.g.: http://www.mathworks.com/matlabcentral/fileexchange/loadFile.do?objectId=6037&objectType=file
15 Sep 2008
Helpful... but I believe such tools should be a mandatory part of a product like MATLAB.
17 Jan 2007
Really excellent file.
Should have default parameters for lambda and nbins, though.