5.0 | 2 ratings | 254 downloads (last 30 days) | File Size: 27.22 KB | File ID: #22470

LOWESS, Locally Weighted Scatterplot Smoothing for linear and non-linear data (enhanced)

by Jeff Burkey

 

16 Dec 2008 (Updated 27 Oct 2009)

Code covered by BSD License  

Using a robust regression like LOWESS allows detection of a trend in data that would otherwise have too much variance.


File Information
Description

LOWESS: Locally Weighted Scatterplot Smoothing that does not require the Statistics Toolbox in MATLAB.

This regression will work on linear and non-linear relationships between X and Y.

Modifications:
12/19/2008 - added upper and lower LOWESS smooths. These additional smooths show how the distribution of Y varies with X. These smooths are simply LOWESS applied to the positive and negative residuals separately, then added to the original lowess of the data. The same smoothing factor is applied to both the upper and lower limits.

 2/21/2009 - added sorting to the function; data no longer need to be sorted. Also added a routine such that if a user supplies a second dataset, linear interpolations are done on the LOWESS results and used to predict y-values for the supplied x-values.

10/27/2009 - modified the handling of the second, user-provided X-data used for obtaining predictions. The MATLAB function unique sorts by default, and it really was not needed in the section of code that performs linear interpolation of the x-data using the y-predicted LOWESS results. If the user does not supply a second x-data set, the function will use the supplied x-y data set. Thus there is an output (xy) that maintains the original sequence of the input. Additionally, the user can now include a sequence index as the first column of the input data. This can be a datenum or some other ordering index. The output will be sequenced using that index. If a sequence index is provided, a second subplot will be created showing the predicted Y-values in the order of the included sequence index. I suspect this sequence index will most often be a date-time (i.e. datenum). To keep the function generic enough, the X-axis labels are not converted to a nice date format, but the user could easily change that with a call to datetick in the subplot.
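The prediction step described in the 2/21/2009 and 10/27/2009 notes can be sketched roughly as follows. This is a minimal Python illustration, not the MATLAB code in lowess.m. The names interp_linear and predict are hypothetical. It assumes the smoothed x-values are strictly increasing, and it returns NaN outside the fitted range, which is an assumption here because the description does not say how the function treats such values.

```python
from bisect import bisect_right


def interp_linear(xs, ys, xq):
    """Linearly interpolate (xs, ys) at xq; xs must be strictly increasing."""
    if xq < xs[0] or xq > xs[-1]:
        return float("nan")          # assumed behavior outside the fitted range
    if xq == xs[-1]:
        return ys[-1]
    j = bisect_right(xs, xq)         # now xs[j-1] <= xq < xs[j]
    t = (xq - xs[j - 1]) / (xs[j] - xs[j - 1])
    return ys[j - 1] + t * (ys[j] - ys[j - 1])


def predict(xs, yhat, xdata):
    """Predict y for each value of xdata, keeping xdata in its original order."""
    return [(xq, interp_linear(xs, yhat, xq)) for xq in xdata]
```

Because predict walks xdata in the order given, the result keeps the caller's sequence, which is the role the description gives to the xy output.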

Using a robust regression like LOWESS makes it possible to detect a trend in data that may otherwise have too much variance, resulting in non-significant p-values.

Yhat (prediction) is computed from a weighted least squares regression whose weights are a function of both the distance from X and the magnitude of the residual from the previous regression.
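The weighting scheme above can be sketched as follows, in Python rather than MATLAB, so this is not the code in lowess.m. It assumes the conventional LOWESS choices: tricube distance weights, bisquare robustness weights scaled by six times the median absolute residual, and a window of round(f * n) points. The description does not confirm these details, and all function names are hypothetical.

```python
from statistics import median


def tricube(u):
    """Distance weight: (1 - |u|^3)^3 inside the window, 0 outside."""
    u = abs(u)
    return (1.0 - u ** 3) ** 3 if u < 1.0 else 0.0


def bisquare(u):
    """Robustness weight: (1 - u^2)^2 for |u| < 1, 0 otherwise."""
    u = abs(u)
    return (1.0 - u ** 2) ** 2 if u < 1.0 else 0.0


def weighted_line_at(x, y, w, x0):
    """Fit a weighted least-squares line and evaluate it at x0."""
    sw = sum(w)
    xbar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    ybar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - xbar) ** 2 for wi, xi in zip(w, x))
    if sxx == 0.0:                       # no spread in x: use the weighted mean
        return ybar
    sxy = sum(wi * (xi - xbar) * (yi - ybar) for wi, xi, yi in zip(w, x, y))
    return ybar + sxy / sxx * (x0 - xbar)


def lowess_sketch(x, y, f=0.5, robust_iters=2):
    """Return yhat for each x; f is the smoothing factor (0 < f < 1)."""
    n = len(x)
    k = max(2, min(n, round(f * n)))     # points per window (assumed rule)
    rob = [1.0] * n                      # robustness weights start at 1
    yhat = [0.0] * n
    for it in range(robust_iters + 1):
        for i in range(n):
            dist = [abs(xj - x[i]) for xj in x]
            h = sorted(dist)[k - 1]      # distance to the k-th nearest point
            if h > 0:
                dw = [tricube(d / h) for d in dist]
            else:                        # ties: only coincident points count
                dw = [1.0 if d == 0 else 0.0 for d in dist]
            w = [a * b for a, b in zip(dw, rob)]
            if sum(w) == 0.0:            # all neighbors down-weighted to zero
                w = dw
            yhat[i] = weighted_line_at(x, y, w, x[i])
        if it == robust_iters:
            break
        resid = [yi - hi for yi, hi in zip(y, yhat)]
        s = median([abs(r) for r in resid])
        if s == 0.0:                     # perfect fit: nothing to down-weight
            break
        rob = [bisquare(r / (6.0 * s)) for r in resid]
    return yhat
```

Each pass after the first multiplies the distance weights by robustness weights taken from the previous pass's residuals, so a point far from the current fit has little or no influence on the next one.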

The logic of these functions and subfunctions follows the USGS
Kendall.exe routines. Because MATLAB computes in 8-byte (double) precision by default, there are some very small differences between the FORTRAN-compiled results and MATLAB's. Running on a 64-bit OS does not change this: a MATLAB double is 8 bytes on any platform.

Earlier versions expected the data to be sorted on the first column of datain prior to input; since the 2/21/2009 update the function does the sorting itself.

There is a very simple subfunction to create a plot of the data and regression if the user so chooses with a flag in the call to the lowess function. BTW, the png file looks much better than the figure does on screen.

There are loops in these routines to keep the memory requirements to a minimum, since it is foreseeable that one may have very large datasets to work with.

f = a smoothing factor between 0 and 1. The closer to one, the more smoothing done.

Syntax:
   [dataout lowerLimit upperLimit]
                    = lowess(datain,f,wantplot,imagefile)

  datain = n x 2 matrix
  dataout = n x 3 matrix
  wantplot = scalar (optional)
       if ~= 0 then create plot
  imagefile = full path and file name of the PNG file (600 dpi) that
       the figure is written to.
       e.g. imagefile = 'd:\temp\lowess.png';

where:
  datain(:,1) = x
  datain(:,2) = y
  f = scalar (0 < f < 1)
  wantplot = scalar
  imagefile = string

 Before the 2/21/2009 update, datain had to be sorted on the x-value
 prior to loading into this function, because the user may want the end result to be unsorted (e.g. time sort). The function now sorts internally, and the xy output maintains the original sequence of the input.

dataout(:,1) = x
dataout(:,2) = y
dataout(:,3) = y-prediction (aka yhat)
lowerLimit(:,1) = x with negative residuals
lowerLimit(:,2) = y-prediction of residuals + original y-prediction
upperLimit(:,1) = x with positive residuals
upperLimit(:,2) = y-prediction of residuals + original y-prediction
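
The construction of lowerLimit and upperLimit described above can be sketched as follows, in Python rather than MATLAB, so this is not the code in lowess.m. The name residual_bands is hypothetical, and smooth stands for any smoother that returns one fitted value per point (in the real function this would be LOWESS with the same smoothing factor). mean_smoother is only a trivial stand-in that keeps the example short. Points with a residual of exactly zero are left out of both groups, which is an assumption.

```python
def residual_bands(x, y, yhat, smooth):
    """Return (lower, upper), each a list of (x, limit) pairs."""
    rows = [(xi, yi - hi, hi) for xi, yi, hi in zip(x, y, yhat)]
    neg = [r for r in rows if r[1] < 0]   # points below the smooth
    pos = [r for r in rows if r[1] > 0]   # points above the smooth

    def band(group):
        if not group:
            return []
        xs = [g[0] for g in group]
        rs = [g[1] for g in group]
        rhat = smooth(xs, rs)             # smooth the residuals against x
        # limit = smoothed residual + original y-prediction
        return [(g[0], g[2] + r) for g, r in zip(group, rhat)]

    return band(neg), band(pos)


def mean_smoother(xs, rs):
    """Stand-in smoother (not LOWESS): the mean residual at every point."""
    m = sum(rs) / len(rs)
    return [m] * len(rs)
```

Each limit only has rows for the x-values whose residuals have the matching sign, which is why lowerLimit and upperLimit generally have fewer rows than dataout.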

Requirements: none

Written by
Jeff Burkey
King County Department of Natural Resources and Parks
email: jeff.burkey@kingcounty.gov
12/16/2008

MATLAB release: MATLAB 7.7 (R2008b)
Zip File Content: license.txt, lowess.m, lowess.pdf, skl1a.mat
Comments and Ratings (3)
21 Jun 2009 Jeff Evans

Fantastic! Worked perfectly the first time. I did modify it just a bit to fit my purposes by suppressing the screen output and only returning the dataout variable, since I don't want to plot the confidence limits. Thanks for the contribution.

- Jeff

05 Aug 2009 Kaite

Awesome! This is exactly what I need. I am a little confused about one thing... I have an n x 2 matrix in datain that consists of x=precip and y=groundwater level that I would like to do the regression on and get the y-predictions using the lowess regression results. I modified the function to not include the "xdata" and "xy" but I am getting errors about the try-catch syntax on line 298 of the m-file. I have version 7.4.0 (R2007a). The error states that it will continue to run but I don't get results. Am I putting the data in the wrong place? Do I need to put a second data set in? I wanted to take the resulting predicted y values and run the Mann-Kendall test to see if I see a significant trend. What is the purpose of the second dataset of x-values? As you can tell I am new to MATLAB and it would be wonderful if I could figure this out! Thanks so much!! If I can get this to work for my data this would be AMAZING!

05 Aug 2009 Jeff Burkey

You shouldn’t need to modify the function to not input the “xdata”. If you don’t input it, the function will simply not perform that task. If that is the case, don’t specify the “xy” as part of your output either.

The “dataout” variable contains the predicted y for each x-value supplied in the “datain” variable. The “xdata” is for when you want a different set of y-predictions using the lowess function for other x-data not provided in the “datain” variable.

I don’t remember if the function I posted sorts or not, but if you are wanting to regress as a function of time, you may not want to sort.

I hope this helps.

- Jeff

Updates
19 Dec 2008

Added computation of upper and lower smooths of the residuals, also will plot on figure.

21 Feb 2009

Added a sorting call. Also added a few lines of code to allow the user to supply a second set of x-values to be used for y-predictions using the results of the lowess regression.

27 Oct 2009

Updated graphing. Revised input. Includes example data file. Revised output.

Tag Activity for this File
Tag | Applied By | Date/Time
regression | Jeff Burkey | 17 Dec 2008 14:34:02
line fit | Jeff Burkey | 17 Dec 2008 14:34:02
weighted regression | Jeff Burkey | 17 Dec 2008 14:34:02
robust regression | Jeff Burkey | 17 Dec 2008 14:34:02
bisquare | Jeff Burkey | 17 Dec 2008 14:34:02
statistics | Jeff Burkey | 17 Dec 2008 14:34:02