Rating: 5.0 (10 ratings) | 318 downloads (last 30 days) | File Size: 292.29 KB | File ID: #24443

SLM - Shape Language Modeling

by John D'Errico

 

15 Jun 2009 (Updated 29 Oct 2009)

Code covered by BSD License  

Least squares spline modeling using shape primitives

File Information
Description

If you could only download one curve fitting tool to your laptop on a desert island, this should be it.

For many years I have recommended that people use least squares splines for their curve fits, with a caveat. Splines offer tremendous flexibility to build a curve in any shape or form. They can nicely fit almost any set of data you will throw at them. This same flexibility is their downfall at times too. Like polynomial models, splines can be too flexible if you are not careful. The trick is to bring your knowledge of the system under study to the problem.

As a scientist, engineer, data analyst, etc., you often have knowledge of a process that you wish to model. Sometimes that knowledge comes from physical principles, sometimes it arises from experience, and sometimes the knowledge just comes from looking at a plot of the data. Regardless of the source, we often want to build this prior knowledge of a process into our modeling efforts. This is perhaps the biggest reason why nonlinear regression tools are used and, I'll argue, the worst reason. If you are fitting a sigmoid function to your data only because it happens to be monotone and your data appear to have that property, then you have made the wrong choice of modeling tool. (If you are fitting a sigmoid because this is known to be the proper model for your process, then go ahead and fit the sigmoid.)

I'll argue the proper tool when you merely need a monotonic curve fit is a least squares spline, but a spline that is properly constrained to have the fundamental shape you know to be there. This is a very Bayesian approach to modeling, and a very useful one in my experience.

The SLM tools provided here give you an easy-to-use interface to build an infinite number of curve types from data. SLM stands for Shape Language Modeling. The idea is to provide a prescription for a curve fit using a set of shape primitives. If your curve is monotone, then build that information into the model, so you can estimate the monotone curve that best fits your data. What you will find is that once you employ the proper set of constraints, you will wonder why you ever used nonlinear regression in the past!

For example, the screenshot for this file was generated for the following data:

x = (sort(rand(1,100)) - 0.5)*pi;
y = sin(x).^5 + randn(size(x))/10;

slm = slmengine(x,y,'plot','on','knots',10,'increasing','on', ...
'leftslope',0,'rightslope',0)
slm =
            form: 'slm'
          degree: 3
           knots: [10x1 double]
            coef: [10x2 double]
    prescription: [1x1 struct]
               x: [100x1 double]
               y: [100x1 double]

You can evaluate the spline or its derivatives using slmeval.

slmeval(1.3,slm)
ans =
      0.79491
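The call above evaluates the fitted function at a single point. As a minimal sketch of extracting the whole fitted curve and its slope on a grid, the following assumes that slmeval accepts an optional third argument selecting the derivative order (1 for the first derivative). That argument is not shown in this description, so check `help slmeval` before relying on it. The variable names xev, yhat and dyhat are illustrative only.

```matlab
% Assumes SLMtools (slmengine, slmeval) is on the MATLAB path.
x = (sort(rand(1,100)) - 0.5)*pi;
y = sin(x).^5 + randn(size(x))/10;

% Same prescription as the example above, without the plot.
slm = slmengine(x,y,'knots',10,'increasing','on', ...
    'leftslope',0,'rightslope',0);

% Fitted curve on a fine grid inside the data range.
xev = linspace(min(x),max(x),200);
yhat = slmeval(xev,slm);

% ASSUMPTION: a third argument of 1 requests the first derivative.
dyhat = slmeval(xev,slm,1);

% With 'increasing' on, the sampled slopes should be nonnegative
% up to solver tolerance.
fprintf('smallest sampled slope: %g\n', min(dyhat));
```

The pair (xev, yhat) is the fitted curve by itself, ready to plot or export without the original data.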

You plot these splines using plotslm.

plotslm(slm)

The plotslm function is nice because it is a simple gui, allowing you to plot the curve, residuals, its derivatives or the integral. You can also evaluate various parameters of the spline, such as the maximum function value over an interval, the minimum or maximum slope, etc.

slmpar(slm,'maxslope')
ans =
       1.5481

You provide all this information to slmengine using a property/value pair interface. slmset mediates this interaction, so you can use it to create the set of properties that will be used. The default set of properties and their values are given by slmset. Everything about the shape, slopes, curvature, values, etc., of your function can be controlled by a simple command. SLMENGINE also offers the ability to generate splines of various orders, as well as free knot splines.
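As a hedged sketch of the property/value interface, the prescription can be built once with slmset and then reused. This assumes that slmengine accepts the structure returned by slmset as its third argument, which is how the "prescription" field in the output above suggests it works, but the exact calling form should be confirmed with `help slmengine`.

```matlab
% Assumes SLMtools (slmset, slmengine) is on the MATLAB path.
x = (sort(rand(1,100)) - 0.5)*pi;
y = sin(x).^5 + randn(size(x))/10;

% Collect the shape information in one prescription structure.
% These are the same properties used in the example above.
prescription = slmset('knots',10,'increasing','on', ...
    'leftslope',0,'rightslope',0);

% ASSUMPTION: slmengine takes the prescription struct directly.
slm = slmengine(x,y,prescription);

% The prescription actually used is stored with the model.
disp(slm.prescription)
```

Keeping the prescription in a variable makes it easy to apply one set of shape constraints to several data sets.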

For a complete set of examples of the SLM tools in action, see the included published tutorial with this submission. There is also a small treatise included on the concept of Shape Language Modeling for curve fitting.

The SLM toolkit will be considerably improved over the next few months. I expect to add a graphical interface, as well as at least one helper application. As well, if I have missed any natural shape primitives, please let me know. While I have tried to be very inclusive, surely there is something I've missed. If I can add your favorite to the list above I will do so.

Finally, the SLM tools require the Optimization Toolbox to solve the various estimation problems.

Required Products: Optimization Toolbox
MATLAB release: MATLAB 7.5 (R2007b)
Zip File Content  
Published M Files slm_tutorial
Other Files
license.txt,
SLMtools/html/slm_tutorial.png,
SLMtools/html/slm_tutorial_01.png,
SLMtools/html/slm_tutorial_02.png,
SLMtools/html/slm_tutorial_03.png,
SLMtools/html/slm_tutorial_04.png,
SLMtools/html/slm_tutorial_05.png,
SLMtools/html/slm_tutorial_06.png,
SLMtools/html/slm_tutorial_07.png,
SLMtools/html/slm_tutorial_08.png,
SLMtools/html/slm_tutorial_09.png,
SLMtools/html/slm_tutorial_10.png,
SLMtools/html/slm_tutorial_11.png,
SLMtools/html/slm_tutorial_12.png,
SLMtools/html/slm_tutorial_13.png,
SLMtools/html/slm_tutorial_14.png,
SLMtools/html/slm_tutorial_15.png,
SLMtools/html/slm_tutorial_16.png,
SLMtools/html/slm_tutorial_17.png,
SLMtools/html/slm_tutorial_18.png,
SLMtools/html/slm_tutorial_19.png,
SLMtools/html/slm_tutorial_20.png,
SLMtools/html/slm_tutorial_21.png,
SLMtools/html/slm_tutorial_22.png,
SLMtools/html/slm_tutorial_23.png,
SLMtools/html/slm_tutorial_24.png,
SLMtools/html/slm_tutorial_25.png,
SLMtools/html/slm_tutorial_26.png,
SLMtools/html/slm_tutorial_27.png,
SLMtools/html/slm_tutorial_28.png,
SLMtools/html/slm_tutorial_29.png,
SLMtools/html/slm_tutorial_30.png,
SLMtools/html/slm_tutorial_31.png,
SLMtools/html/slm_tutorial_32.png,
SLMtools/html/slm_tutorial_33.png,
SLMtools/html/slm_tutorial_34.png,
SLMtools/html/slm_tutorial_35.png,
SLMtools/html/slm_tutorial_36.png,
SLMtools/html/slm_tutorial_37.png,
SLMtools/html/slm_tutorial_38.png,
SLMtools/html/slm_tutorial_39.png,
SLMtools/plotslm.m,
SLMtools/private/hermite2slm.m,
SLMtools/private/parse_pv_pairs.m,
SLMtools/private/slmfit.m,
SLMtools/scann.mat,
SLMtools/shape prescriptive modeling.rtf,
SLMtools/SLM.jpg,
SLMtools/slm2pp.m,
SLMtools/slm_tutorial.m,
SLMtools/slmengine.m,
SLMtools/slmeval.m,
SLMtools/slmpar.m,
SLMtools/slmset.m
Comments and Ratings (16)
17 Jun 2009 S B  
19 Jun 2009 wee  
30 Jul 2009 Peter Simon

A superb package. Well done! It is perfect for fitting a curve to in-orbit test (IOT) antenna pattern measured data points, used in validating the performance of our satellite antennas. Thanks very much.

Peter Simon
Space Systems/Loral
Antenna Subsystems Operations

13 Aug 2009 Pete sherer  
15 Sep 2009 Pete sherer

Would it be possible to add a GUI so that users can simply drag/adjust the fitted line the way they want? The scripts would then return the functional form and coefficients of the adjusted line. Maybe as an additional script or something.

16 Sep 2009 John D'Errico

Pete - A goal to be finished as soon as I can is to write a gui that will wrap SLM inside. Note that the computational tool in SLM is called SLMENGINE. This was purposeful, since my expectation is that most users would use the gui form when I am able to offer it. So I have definitely been planning for a gui wrapper.

Even at that though, SLM would not have allowed you to just draw a curve through your data freehand, and then return the coefficients of that curve. A freehand drawn curve may be arbitrarily complex, or not even a single-valued function. SLM is best when you fit the curve, then apply your own special knowledge of a system to be modeled, in the form of constraints on the overall shape of the underlying functional form. But a freehand drawn curve is too broad a constraint to follow. So SLM might not be capable in general of returning such a functional form in that eventuality.

As well, this becomes a unique problem. Is the problem then to find the curve that fits the original data, and has the general shape of the freehand drawn curve? This involves two separate problems of approximation. It seems one must use a tool to smooth and approximate the drawn curve, then take that same tool and try to fit the drawn curve through the data. This task would take a serious amount of effort, and could never be done in a way that is transparent to the user. Worse, suppose the curve in the end had some lack of fit to the data (as perceived by the user). Where did the lack of fit arise? Is it lack of fit to the freehand drawn curve? Or is it lack of fit to the data itself?

17 Sep 2009 Joshua  
01 Oct 2009 Jeem79 Olsen

I just used your package to fit my stress-strain curves obtained from image acquisition and processing, and it works perfectly. Thanks for a great job! However, is it possible to extract only the fitted curve? I didn't see an obvious option for that in your documentation.

Jeem79

03 Oct 2009 Fabian Kloosterman

Dear John,

I wonder if it is possible with your functions to fit a spline to a set of (x,y) coordinates over time (each data point also has an associated weight). I want to constrain the velocity of the fitted spline trajectory, which means I can't just fit (x,t) and (y,t) separately. If this is not possible with your SLM tools, do you have any suggestions how to approach this problem?

Thanks, Fabian

04 Oct 2009 John D'Errico

Fabian - This is more difficult to solve. Ordinarily, one would simply fit x(t) and y(t) independently. However, your constraint is on the term

sqrt((dx/dt)^2 + (dy/dt)^2)

Do you need a cubic result? If so, then it is more complex yet, since any overall slope type of constraint is bad enough to formulate for a cubic.

I might use a brute force approach. Use a pair of independent models, x(t), y(t). You can put a global constraint on dx/dt and dy/dt on these models, limiting the maximum and minimum slopes attained.

Now, go back, and test the actual velocity when the two curves are united into a parametric path in the (x,y) plane. If sqrt((dx/dt)^2 + (dy/dt)^2) never exceeds your velocity limit, then you are done. Otherwise, you will now need to use fmincon to perturb the parameters of the splines, while minimizing the global sum of squares of errors to (x(t), y(t)).

The constraints for this will clearly be nonlinear. You might set one constraint at every point, but this will not constrain the true maximum velocity attained. So you might sample each curve at perhaps 1000 points, returning 1000 nonlinear constraints along the curve. This will give you a necessary condition on the velocity, but it need not be truly sufficient. Thus the fitted path might exceed the target maximum velocity by a tiny amount.
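The sampled velocity check described above can be sketched as follows. This is only an illustration under stated assumptions: the path data and the limit vmax are made up, and the derivative call assumes slmeval takes a third argument of 1 for the first derivative (check `help slmeval`). It covers the test step only, not the fmincon refinement.

```matlab
% Assumes SLMtools (slmengine, slmeval) is on the MATLAB path.
% Hypothetical noisy path data, for illustration only.
t  = linspace(0,1,200)';
xd = cos(2*pi*t) + randn(size(t))/50;
yd = sin(2*pi*t) + randn(size(t))/50;
vmax = 8;    % hypothetical velocity limit

% Independent models x(t) and y(t).
slmx = slmengine(t,xd,'knots',12);
slmy = slmengine(t,yd,'knots',12);

% Sample both derivatives at 1000 points along the curve.
% ASSUMPTION: third argument 1 = first derivative.
ts = linspace(min(t),max(t),1000)';
dx = slmeval(ts,slmx,1);
dy = slmeval(ts,slmy,1);

% Speed of the parametric path at each sample.
speed = sqrt(dx.^2 + dy.^2);
[worst,k] = max(speed);

if worst <= vmax
    disp('Sampled speed stays within the limit.');
else
    fprintf('Limit exceeded near t = %g (speed %g).\n', ts(k), worst);
end
```

A simpler, more conservative route is to bound each component slope within plus or minus vmax/sqrt(2) in the two independent fits. Then sqrt((dx/dt)^2 + (dy/dt)^2) cannot exceed vmax anywhere, at the cost of a tighter feasible region than the true velocity constraint.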

Finally, the optimization over the spline parameter space will also have other linear constraints on those parameters. I suppose one could (if you were adventuresome) go into the SLM code to extract (and return) the actual equations used to estimate the model as it is sent to lsqlin. Fmincon would need to employ those constraints too.

HTH,
John

16 Oct 2009 Didi Cvet

Hi
I think that this kind of toolbox is great idea and I am doing some tests over this file. I have very specific data points and while doing my experiments I've tried to use 'knots', 'free' option but I get this warning message:

Warning: Options LargeScale = 'off' and Algorithm = 'trust-region-reflective' conflict.
Ignoring Algorithm and running active-set method. To run trust-region-reflective, set
LargeScale = 'on'. To run active-set without this warning, use Algorithm = 'active-set'.

I've tried to add this row
>>fminconoptions.Algorithm = 'Active-set';
in your slmengine.m code but it doesn't work.

Thanks for sharing this!
Best wishes
Dijana

16 Oct 2009 John D'Errico

The warning message that fmincon returns is new, due apparently to a change in the optimization toolbox. It is only a warning though, that does not hurt the operation of the code itself.

I'll fix the problem.

20 Oct 2009 Chris

Very nice John, I will cite you in our papers.

I also got this warning:
Warning: Options LargeScale = 'off' and Algorithm = 'trust-region-reflective' conflict.
Ignoring Algorithm and running active-set method. To run trust-region-reflective, set
LargeScale = 'on'. To run active-set without this warning, use Algorithm = 'active-set'.

and added this line after line number 158 in your slmengine.m file:
fminconoptions.Algorithm = 'active-set';

now I don't get the warning anymore.

29 Oct 2009 John D'Errico

The new version just got uploaded to repair the 'active-set' problems incurred with the newer optimization toolbox releases.

02 Nov 2009 James  
21 Nov 2009 Raymond Cheng

Thanks for your sharing.

Updates
02 Oct 2009

New capability - predictions at a list of points are also generated.

29 Oct 2009

Fix for the 'active-set' problem, incurred with newer versions of the optimization toolbox. Also repaired a problem when the pp form is requested for free knot problems.

Tag Activity for this File
Tag | Applied By | Date/Time
slm | John D'Errico | 15 Jun 2009 10:38:18
curve fitting | John D'Errico | 15 Jun 2009 10:38:19
splines | John D'Errico | 15 Jun 2009 10:38:19
estimation | John D'Errico | 15 Jun 2009 10:38:19
least squares | John D'Errico | 15 Jun 2009 10:38:19
smoothing | John D'Errico | 15 Jun 2009 10:38:19
spline | John D'Errico | 15 Jun 2009 10:38:19
interpolation | John D'Errico | 15 Jun 2009 10:38:19
modeling | John D'Errico | 15 Jun 2009 10:38:19
approximation | John D'Errico | 15 Jun 2009 10:38:19
shape | John D'Errico | 15 Jun 2009 10:38:19
free knots | John D'Errico | 15 Jun 2009 10:38:19
breaks | John D'Errico | 15 Jun 2009 10:38:19
 
