File Exchange



version 1.3 (8.07 KB) by

Four-plot for efficient visual exploratory data analysis, with box plot (V3.0, feb 2015)




FOURPLOT(X) creates a "four-plot" of the values in X, which allows for a
    powerful and efficient visual inspection of the four underlying
    assumptions of univariate statistical analyses. Descriptive statistics
    are printed in the command window.
    X is a vector of observational values. It should be numerical and
    cannot contain NaNs or Infs.
    In four subplots, the run sequence plot (X[k] vs k), a lag plot (X[k]
    vs X[k-1]), a histogram, and a normal probability plot are shown. Within
    these axes, the mean value of X is drawn as a straight line. In
    addition, a 5th panel shows a box-and-whisker plot of X.
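
The four panels can be approximated with base MATLAB alone. The following is a minimal sketch, not the FOURPLOT source: the variable names, the plotting positions (k-0.5)/n, and the axis orientation of the probability plot are assumptions made for illustration.

```matlab
% Illustrative sketch only; FOURPLOT itself may compute these differently.
rng(0) ;                          % reproducible example data
X = 20 + randn(100,1) * 10 ;
n = numel(X) ;
k = (1:n)' ;                      % observation index
Xprev = X(1:end-1) ;              % X[k-1], horizontal axis of the lag plot
Xcurr = X(2:end) ;                % X[k],   vertical axis of the lag plot
p = (k - 0.5) / n ;               % plotting positions (assumed convention)
z = -sqrt(2) * erfcinv(2*p) ;     % standard normal quantiles, no toolbox needed
Xsorted = sort(X) ;

figure ;
subplot(2,2,1) ; plot(k, X, '.-') ;         title('run sequence') ;
subplot(2,2,2) ; plot(Xprev, Xcurr, '.') ;  title('lag plot') ;
subplot(2,2,3) ; hist(X, 10) ;              title('histogram') ;
subplot(2,2,4) ; plot(Xsorted, z, '.') ;    title('normal probability') ;
```

If the data are close to normal, the points in the last panel should fall near a straight line.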
    If the four underlying assumptions hold, the four plots will have a
    characteristic appearance.
    1. If the fixed location assumption holds, then the run sequence plot
       will be flat and non-drifting.
    2. If the fixed variation assumption holds, then the vertical spread in
       the run sequence plot will be approximately the same over the
       entire horizontal axis.
    3. If the randomness assumption holds, then the lag plot will be
       structureless and random.
    4. If the fixed distribution assumption holds, in particular if the
       fixed normal distribution holds, then the histogram will be
       bell-shaped, and the normal probability plot will be linear.
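
The first three checks above can be given rough numeric companions. This is a sketch under stated assumptions, not something FOURPLOT is documented to compute: splitting the series into halves, the names locShift, scaleRatio and r1, and the added drift of 0.5 per sample are all hypothetical choices for illustration.

```matlab
% Hypothetical numeric checks; what counts as "close enough" is a judgment call.
rng(1) ;
X = 20 + randn(200,1) * 10 ;            % stationary example series
Xdrift = X + 0.5 * (1:200)' ;           % same series with a linear drift added

half = floor(numel(X) / 2) ;
locShift   = mean(X(half+1:end)) - mean(X(1:half)) ;   % near 0 if location is fixed
scaleRatio = std(X(half+1:end)) / std(X(1:half)) ;     % near 1 if variation is fixed

R = corrcoef(X(1:end-1), X(2:end)) ;    % lag-1 correlation, near 0 if random
r1 = R(1,2) ;
Rd = corrcoef(Xdrift(1:end-1), Xdrift(2:end)) ;
r1drift = Rd(1,2) ;                     % pulled toward 1 by the drift
fprintf('shift %.2f, ratio %.2f, r1 %.2f, r1 (drift) %.2f\n', ...
        locShift, scaleRatio, r1, r1drift) ;
```

Such numbers complement the plots rather than replace them: a lag plot can reveal nonlinear structure that a correlation coefficient misses.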
    The box-and-whisker plot will show the median (red line), mean and SD (in
    blue), the 25th and 75th percentiles (the box), and outliers (plus
    symbols), if any. The whiskers extend to the lowest value still within
    1.5 times the inter-quartile range (IQR) of the lower quartile, and to
    the highest value still within 1.5 IQR of the upper quartile. Raw data
    are plotted in gray.
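
The whisker rule can be worked through by hand. The sketch below uses made-up data and assumes one particular quartile convention (linear interpolation between sorted values placed at (i-0.5)/n); FOURPLOT may use a different one, so the quartiles it reports can differ slightly.

```matlab
% Sketch of the whisker rule; the quartile convention below is an assumption.
x = [1 2 3 4 5 6 7 8 100]' ;           % made-up data with one extreme value
xs = sort(x) ;
n = numel(xs) ;
p = ((1:n)' - 0.5) / n ;               % cumulative position of each sorted value
q = interp1(p, xs, [0.25 0.75]) ;      % lower and upper quartile
iqrX = q(2) - q(1) ;                   % inter-quartile range
loFence = q(1) - 1.5 * iqrX ;
hiFence = q(2) + 1.5 * iqrX ;
whiskerLow  = min(xs(xs >= loFence)) ; % lowest value within 1.5 IQR of Q1
whiskerHigh = max(xs(xs <= hiFence)) ; % highest value within 1.5 IQR of Q3
outliers = xs(xs < loFence | xs > hiFence) ;   % would be drawn as plus symbols
```

For this data the whiskers end at 1 and 8, and the value 100 is flagged as an outlier.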
    STATS = FOURPLOT(X) also returns some statistical values in the
    structure STATS. In this case, descriptive statistics are not printed.
      % case 1: the four assumptions hold
        X = 20 + randn(100,1) * 10 ;
        fourplot(X) % nice, we can use classical statistics!
      % case 2: data is oscillating, which is not immediately clear
        unknown = cumsum(rand(1000,1)) ;
        unknown = unknown(randperm(numel(unknown))) ;
        X = sin(unknown) ; % X looks random (see, e.g., run sequence) ..
        fourplot(X) % .. but it is not!
    The usefulness of a four-plot extends beyond inspection of univariate
    and time series data. For instance, it can be used to inspect the
    residuals of a model fit to determine whether the underlying error term
    of the model fulfills the assumptions, no matter how complicated the
    model may be.
      x = 2*rand(100,1) ; y = exp(x) ; % the data
      par = polyfit(x,y,1) ; % a simple model
      res = y - polyval(par,x) ; % residuals
      fourplot(res) % -> our model is poor!
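
Because x is drawn in random order in the example above, the run sequence and lag plot of res may look unstructured, and the misfit tends to show mostly in the histogram and probability plot. A possible variation, offered as a sketch rather than as part of the original example, orders the residuals by x first so that systematic misfit also appears in the sequence-based panels. The names resSorted, rRandom and rSorted are hypothetical, and the lag-1 correlation from corrcoef is only a numeric stand-in for the lag plot.

```matlab
rng(2) ;                               % reproducible example data
x = 2*rand(100,1) ; y = exp(x) ;       % the data, as in the example above
par = polyfit(x,y,1) ;                 % a simple model
res = y - polyval(par,x) ;             % residuals in original (random) order
[~, order] = sort(x) ;
resSorted = res(order) ;               % residuals ordered by the predictor
R  = corrcoef(res(1:end-1), res(2:end)) ;
Rs = corrcoef(resSorted(1:end-1), resSorted(2:end)) ;
rRandom = R(1,2) ;                     % near 0: random order hides the misfit
rSorted = Rs(1,2) ;                    % near 1: smooth curvature in the residuals
% fourplot(resSorted)                  % requires FOURPLOT on the path
```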
    More information can be found on the internet, e.g.,
    See also normplot (Statistics Toolbox)
             mean, median

Comments and Ratings (6)

Jos (10584)

Thanks jed wang, for the useful comment ...

jed wang

Jos (10584)

Version 3.0 is now online! (thanks Edwin)


Edwin

The file is still the version 2 file.


MURAT

Useful code, thanks.


Edwin



File now updated to version 3.0 (I hope)


In v 3.0 descriptive statistics are printed in the command window.


Version 2.0: added box plot.

MATLAB Release
MATLAB 7.14 (R2012a)
