Asked by John Soong
on 1 Dec 2012

For tightly packed data, hist() wastes many of its bins covering values that are clearly outliers or measurement errors (my measurements occasionally contain anomalies). For example, my curvature values are distributed mostly between 0 and 1, but there are occasional measurements at 100 or 2000. Is there a convenient way to get hist() to behave reasonably? I don't get why hist() doesn't have an option to ignore outliers in its x-scaling.


Answer by Image Analyst
on 1 Dec 2012

Use cumsum(). Here's a snippet for you to study. Earlier in the code I had gotten the counts and the bin values using this line:

    % Make sure you label the axes after imhist() because imhist() will destroy them.
    [PixelCounts, GLs] = imhist(imageArray, numberOfBins);

Here's the snippet:

    % Calculate CDF, Cumulative Distribution Function
    cdf = zeros(1, numberOfBins);
    CDFPercentiles = zeros(100, 1);
    CDFIndexes = uint32(zeros(100, 1));
    % Get the Cumulative Distribution Function.
    cdf = cumsum(PixelCounts);
    % Now normalize the CDF to 0-1.
    biggestValue = cdf(numberOfBins);
    normalizedCDF = cdf / biggestValue;
    % And scale it for the range of the plot.
    maxY = ylim;
    scaledCDF = maxY(2) * normalizedCDF;
    % Find out what gray level corresponds to a given percentile.
    grayLevelAtGivenPercentile = zeros(100, 1);
    % Need to have no duplicate values in the x direction (which is the CDF).
    [x, m, n] = unique(normalizedCDF);
    y = GLs(m);
    % Make up an array to have every percentage from 1% to 100%.
    xi = 0.01 : 0.01 : 1.00;
    % Do the interpolation to get EVERY percent (even if it doesn't appear in the CDF).
    grayLevelAtGivenPercentile = interp1(x, y, xi);

    lowestValueToPlot = grayLevelAtGivenPercentile(1);
    highestValueToPlot = grayLevelAtGivenPercentile(95);
    % Needs to be after bar() function because bar() function automatically
    % sets x limits itself, overriding what you've set.
    xlim([lowestValueToPlot highestValueToPlot]);
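The snippet above works on imhist() output. For a plain vector such as the curvature data in the question, a similar percentile cut might look roughly like the sketch below. The data are synthetic, the variable names are made up for illustration, and the 1% and 95% cutoffs simply mirror the ones used above; adjust them to suit your data.

```matlab
% Synthetic stand-in for the curvature data (hypothetical values).
rng(0);
data = [rand(1, 1000), 100, 2000];   % bulk in [0, 1] plus two outliers

% Empirical CDF from the sorted samples, no toolbox required.
sortedData = sort(data);
n = numel(sortedData);
empiricalCDF = (1:n) / n;            % fraction of samples <= sortedData(k)

% Smallest sample whose CDF reaches each chosen percentile.
lowValue  = sortedData(find(empiricalCDF >= 0.01, 1, 'first'));
highValue = sortedData(find(empiricalCDF >= 0.95, 1, 'first'));

% Bin only the samples inside that range so no bins are spent on outliers.
inRange = data >= lowValue & data <= highValue;
numberOfBins = 50;
[counts, centers] = hist(data(inRange), numberOfBins);
bar(centers, counts, 1);
xlim([lowValue highValue]);          % set after bar(), which resets the limits
```

Filtering before binning means all 50 bins land inside the percentile range, instead of being stretched out to 2000 and then cropped from view.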


## 1 Comment

## Jurgen

What do you mean by wasting bins? You can specify the number of bins you want, right? And maybe use axis() to crop the outliers out of view?
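A minimal sketch of that suggestion, assuming the range of interest is known in advance to be [0, 1] (the data and names here are hypothetical): pass explicit bin centers to hist() and then restrict the x-axis.

```matlab
% Synthetic stand-in for the curvature data (hypothetical values).
rng(0);
data = [rand(1, 1000), 100, 2000];   % bulk in [0, 1] plus two outliers

% Twenty equally spaced bin centers covering [0, 1]: 0.025, 0.075, ..., 0.975.
binCenters = ((1:20) - 0.5) / 20;

% With explicit centers, hist() counts values beyond the ends
% in the first or last bin, so the two outliers land in the last bar.
counts = hist(data, binCenters);
bar(binCenters, counts, 1);
xlim([0 1]);                         % axis([0 1 ymin ymax]) would also work
```

Because the out-of-range values are folded into the end bins, you may prefer to drop them first, for example with hist(data(data <= 1), binCenters).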