Thread Subject:
Question From "Data Driven Fitting" Webinar By Richard Willey

From: Kevin Ellis

Date: 17 Apr, 2013 17:41:08

Message: 1 of 1

Hello,

I have been trying for a couple of days now to apply the lessons and code from Richard Willey's webinar "Data Driven Fitting". I am most interested in how he applied a nonparametric fitting routine to a set of data points.

I am trying to create a nonparametric fit and interpolate values using the function "smooth" for 218 different utilities, each with many data points. I have tried to break his code apart and have read everything I could find about it, but there are a couple of lines I cannot figure out, and I am hoping someone here can explain these sections of code:

%%Fitit
% Copyright (c) 2011, The MathWorks, Inc.

function [myfit,varargout] = fitit(X,Y,varargin)
....
....
% Finding optimal span for lowess
num = 99;
spans = linspace(.01,.99,num);
sse = zeros(size(spans));
cp = cvpartition(100,'k',10);

for j=1:length(spans),
    f = @(train,test) norm(test(:,2) - mylowess(train,test(:,1),spans(j)))^2;
    sse(j) = sum(crossval(f,[X,Y],'partition',cp));
end

[~,minj] = min(sse);
span = spans(minj);

I have read all the help documentation for "cvpartition" and "crossval", and I have looked at the code for the mylowess function, which is given further down. The part I do not understand is this loop:

for j=1:length(spans),
    f = @(train,test) norm(test(:,2) - mylowess(train,test(:,1),spans(j)))^2;
    sse(j) = sum(crossval(f,[X,Y],'partition',cp));
end

I understand that he is trying to minimize the sum of squared errors, but how are the variables train and test created? Do they come from cvpartition? Why is the norm taken of test(:,2) minus the result of the mylowess function? What exactly is crossval doing? And why do we linearly interpolate (inside mylowess) at test(:,1), and what does test(:,1) contain?
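My tentative reading of the documentation is that crossval calls f once per fold, passing it the training rows and the held-out rows of [X,Y]. Under that assumption, the loop body for a single span could be written out by hand roughly as below. This is only a sketch, not code from the webinar: the synthetic data and the names data, predictFn, trainRows, testRows and foldSSE are made up for illustration, and a straight-line fit stands in for mylowess so the snippet runs on its own (it still needs cvpartition from the Statistics Toolbox).

```matlab
% Sketch only: an explicit loop mirroring what
% crossval(f,[X,Y],'partition',cp) appears to do for one span.
rng(0);                                    % reproducible synthetic data
X = linspace(0,10,100)';                   % 100 rows of made-up x values
Y = sin(X) + 0.2*randn(size(X));           % noisy made-up y values
data = [X,Y];                              % column 1 = x, column 2 = y

% Build the partition from the actual row count of the data.
cp = cvpartition(size(data,1),'KFold',10);

% Hypothetical stand-in for mylowess(train,xs,span):
% fit a straight line to the training rows and evaluate it at xs.
predictFn = @(train,xs) polyval(polyfit(train(:,1),train(:,2),1),xs);

foldSSE = zeros(cp.NumTestSets,1);
for i = 1:cp.NumTestSets
    trainRows = data(training(cp,i),:);    % rows used for fitting in fold i
    testRows  = data(test(cp,i),:);        % held-out rows in fold i
    % Residuals: observed y of held-out rows minus predictions at their x.
    resid = testRows(:,2) - predictFn(trainRows,testRows(:,1));
    foldSSE(i) = norm(resid)^2;            % equals sum(resid.^2)
end
sse = sum(foldSSE);                        % total squared error over all folds
```

If this reading is right, test(:,1) in the original anonymous function is just the x values of the held-out rows and test(:,2) their observed y values, and norm(...)^2 is the sum of squared prediction errors for that fold. It would also suggest that the hard-coded 100 in cvpartition(100,'k',10) has to match the number of rows in [X,Y].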

And the accompanying mylowess function is given by:

function ys=mylowess(xy,xs,span)
%MYLOWESS Lowess smoothing, preserving x values
% YS=MYLOWESS(XY,XS) returns the smoothed version of the x/y data in the
% two-column matrix XY, but evaluates the smooth at XS and returns the
% smoothed values in YS. Any values outside the range of XY are taken to
% be equal to the closest values.

if nargin<3 || isempty(span)
    span = .3;
end

% Sort and get smoothed version of xy data
xy = sortrows(xy);
x1 = xy(:,1);
y1 = xy(:,2);
ys1 = smooth(x1,y1,span,'loess');

% Remove repeats so we can interpolate
t = diff(x1)==0;
x1(t)=[]; ys1(t) = [];

% Interpolate to evaluate this at the xs values
ys = interp1(x1,ys1,xs,'linear',NaN);

% Some of the original points may have x values outside the range of the
% resampled data. Those are now NaN because we could not interpolate them.
% Replace NaN by the closest smoothed value. This amounts to extending the
% smooth curve using a horizontal line.
if any(isnan(ys))
    ys(xs<x1(1)) = ys1(1);
    ys(xs>x1(end)) = ys1(end);
end
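The last two steps of mylowess (interpolate, then replace out-of-range NaN values) can be tried in isolation. A small sketch with toy numbers, where x1, ys1 and xs are made-up values rather than data from the webinar:

```matlab
% Toy values standing in for the sorted x data and its smoothed y values.
x1  = [1; 2; 3];
ys1 = [10; 20; 30];
xs  = [0; 1.5; 4];          % query points: below, inside, and above the range

% Linear interpolation; points outside [x1(1), x1(end)] come back as NaN.
ys = interp1(x1,ys1,xs,'linear',NaN);      % [NaN; 15; NaN]

% Extend the curve horizontally using the nearest end value.
if any(isnan(ys))
    ys(xs<x1(1))   = ys1(1);
    ys(xs>x1(end)) = ys1(end);
end
% ys is now [10; 15; 30]
```

So a query point to the left of the data takes the first smoothed value, and one to the right takes the last.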

If anyone can explain exactly what this code is doing, that would be great. Thanks.

Kevin
