Joseph,
The point is that the knots MUST contain the data. slmengine will throw an error if that is not true, telling you something to that effect. In fact, I don't know where you got the idea that the knots should lie wholly inside the range of the data, as that is not something I have shown in any example in the tutorials. Note that the default for 'knots' is a set of 6 equally spaced points that COMPLETELY span the data.
If the knots were to fall entirely inside the data, with data points that lie outside, then slmengine would be forced to extrapolate, something which I am strongly opposed to doing with a spline, since extrapolation of a cubic polynomial segment will give virtually random crap.
If, however, I force you to contain the data inside the knots of the spline, then you can control the shape of the spline over the entire knot range. This is how the tool works, and it does so for very good reasons. Note that if you DO wish to extrapolate, then since you can control the shape of the spline out as far as the knots go, you can force the tool to extrapolate intelligently, at least within the bounds of the knots.
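A minimal sketch of the rule above, in plain MATLAB without calling SLM itself. The condition mirrors the check quoted from slmengine.m in this thread; the helper `knotsMissData` and the variable names are hypothetical, made up for illustration.

```matlab
x = 1:10;                        % the data from the question

% Hypothetical helper: true when the knots fail to contain the data,
% i.e. the same condition as the check quoted from slmengine.m.
knotsMissData = @(knots, x) (knots(1) > min(x)) || (knots(end) < max(x));

knotsInside = [2, 5, 7];                     % knots interior to the data
knotsSpan   = linspace(min(x), max(x), 6);   % 6 equally spaced knots spanning the data
knotsWide   = [0, 2, 5, 7, 12];              % knots extending past the data

rejected     = knotsMissData(knotsInside, x);   % true: this knot set would be refused
accepted     = ~knotsMissData(knotsSpan, x);    % true: knots contain the data
acceptedWide = ~knotsMissData(knotsWide, x);    % true: extra room beyond the data is fine
```

The last case is the one that matters for extrapolation: knots that reach beyond the data leave room to prescribe the shape where no data exists.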
Hi John,
Greatly appreciate your fast response.
Turning C2 off at specific points is probably what I needed, but I am not complaining that it isn't there. Also, I didn't realize the concave up/down can be applied to more than one region, but other aspects of my simulations don't keep the concavity the same. I do appreciate those suggestions.
I kinda knew I would have to overcome my laziness and start using custom placed knots for each of the intervals to capture the inflections and avoid overfitting certain regions.
However, I am a little confused by lines 176-178 of slmengine.m.
if (knots(1)>min(x)) || (knots(end)<max(x))
error(... 'Knots do not contain the data. Data range: ',num2str([min(x),max(x)])])
Suppose my data is just x = 1:10 with the knots = [2, 5, 7] (i.e. knots contained in the data range). That would cause line 176 to evaluate to true and then throw an error.
(2 > 1) || (7 < 10) => true || true => true
(If I am not mistaken or confused; but probably just confused) Shouldn't that knot placement for the data set be acceptable because all knots are interior to the data set?
Or is it that the range of the knots must include the data range as a subset?
Sorry if this is a basic question, but I don't understand what the significance of having knots placed outside the range of data would be.
Hi Joseph,
I had thought about allowing a user to turn off C2 continuity at specific knots, but it seemed a bit kludgy in terms of the interface.
If you have a decent idea of where those breaks occur, then you could add a spare knot or two in those areas. If you are trying to do it semi-automatically, then you might need to do it in two passes, using the first fit to find roughly where those second derivative breaks live. So the first pass would employ lots of knots in the fit.
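As a rough sketch of the "find where the breaks live" step, using only base MATLAB: here the derivative pseudo-example from the original question stands in for the derivative of a first-pass, many-knot fit (an assumption made purely for illustration), and the variable names are hypothetical. The turning points of that derivative mark where spare knots might go in the second pass.

```matlab
% Stand-in for the derivative of a first-pass fit (pseudo-example from the question)
d = [exp(0:0.1:1) exp(0.9:-0.1:0.5) exp(0.4:0.1:1.5) exp(1.4:-.1:1)];

% Sign of the slope of d between consecutive samples
slopeSign = sign(diff(d));

% Indices of d where the slope changes sign, i.e. its peaks and troughs
breakIdx = find(diff(slopeSign) ~= 0) + 1;
```

With real data the first-pass derivative will be noisy, so expect to smooth it or to merge nearby candidates before choosing knots.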
You can also specify a region or set of regions where the curve will be concave down or up. Make sure those regions don't overlap, else the fit will be too strongly constrained to be useful, and make sure there are a couple of knots between each pair of consecutive regions.
Finally, remember that derivative estimation is an ill-posed process. So it tends to be a noise amplification process.
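To see the noise amplification in numbers, here is a small base MATLAB sketch. The test function, noise level, and spacing are arbitrary choices for illustration, and forward differences stand in for whatever derivative estimate is actually used.

```matlab
rng(0);                                % fixed seed so the run is repeatable
h = 0.01;                              % sample spacing
x = 0:h:1;
sigma = 0.01;                          % noise level on the data
yClean = exp(x);
yNoisy = yClean + sigma*randn(size(x));

% Forward differences as a crude derivative estimate
dClean = diff(yClean)./diff(x);
dNoisy = diff(yNoisy)./diff(x);

noiseInData   = std(yNoisy - yClean);  % close to sigma
noiseInDeriv  = std(dNoisy - dClean);  % much larger
amplification = noiseInDeriv/noiseInData;  % roughly sqrt(2)/h for independent noise
```

Halving the spacing h roughly doubles the amplification, which is why constraints on the shape of the fit matter so much when the derivative is the quantity of interest.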
Dear John,
I very much appreciate your SLM tool (plus all your other submissions and useful tips).
I have a question regarding fitting monotonically increasing data (gas effusion data) that will have inflections (due to increasing or holding the temperature during the simulation over time, which has an exponential scaling effect on the effusion rate). I am able to get a good fit to the raw data, but I am more interested in the derivative of the spline because that gives me information about the effusion rate.
I expect the derivative to always be positive (due to monotonicity) and there to be sharp peaks in the derivative due to the inflections in the raw data at temperature changes. Essentially, exponentially increasing then exponentially decreasing, followed by exponentially increasing and exponentially decreasing, etc.
If I were to fit the original data with separate splines defined over intervals of constant temperature behavior, I would have jump discontinuities in the first derivatives at the boundaries of the temperature intervals, which is unacceptable.
To give a pseudo-example of what I expect from the first derivative:
d = [exp(0:0.1:1) exp(0.9:-0.1:0.5) exp(0.4:0.1:1.5) exp(1.4:-.1:1)];
plot(d);
I am fine with the peaks being smoothed a bit due to the smoothness conditions of the spline fit of the original data. I did try changing the C2 to 'off' but then each interval of the derivative was not sufficiently smooth.
Any suggestions on what prescriptions to use when I know my first derivative should be smooth in an interval but have a sharp peak?
Hi Peter,
Yes, I'm afraid that you did misunderstand the intent, although I can understand your confusion.
SLMEVAL does not look at the prescription you have provided to know how to extrapolate. In fact, it looks only at the spline itself. (The prescription field is returned to you as a field of the model for several reasons. For example, you can use that prescription field to fit a similar spline to other sets of data, passing the prescription itself into slmengine. It also represents a form of documentation for what you had done to build the spline itself.)
From the help for slmeval, I quote:
"As opposed to extrapolation as I am, slmeval will not
extrapolate. If you wanted to extrapolate, then you
should have built the spline differently."
The point is, if you provide a point that lies outside of the support of the spline, it uses the value of the spline at the end point knot for the value it returns. I suppose you can view this as constant extrapolation.
My point about needing to build the spline "differently" is that you need to supply knots that extend over the region where you will expect to extrapolate. When you do this, now you can tell slmengine how to build the spline over those regions, especially if there is no data out there. Your knowledge as the user is of paramount importance where no data exists to fill a void.
For example, since you are building a purely linear spline (as opposed to a piecewise linear spline), there is no need to use even the default set of 6 equally spaced knots. So you might have done this:
X = linspace(5, 10, 100);
Y = 0.5 + 2*X + 0.001*X.^2;
slm = slmengine( X, Y,'plot','on','knots',[-50,50],'degree','linear');
slmeval( -10, slm )
ans =
-19.704
See that now slmeval can evaluate the function at any point in the desired region.
Or you might have specified the end conditions for the spline. The natural end conditions indicate a spline that has zero second derivatives at each end of the spline. Since this is a two knot cubic spline, that forces the spline to be linear over the entire range, although it is still explicitly cubic. Again, as long as the knots go all the way to where you will need it evaluated, slmeval has no problem.
slm = slmengine( X, Y,'plot','on','knots',[-50,50],'endc','natural');
slmeval( -10, slm )
ans =
-19.704
Even here though, SLMEVAL will not extrapolate beyond the knots of the spline, except as a constant. So if I try to force it to do so, SLMEVAL will refuse to cooperate:
slmeval( -100:20:100, slm )
ans =
-100.3 -100.3 -100.3 -80.154 -39.854 0.44588 40.746 81.046 101.2 101.2 101.2
No warnings of this behavior are generated, although I suppose I could have built that into SLM too. So the next time I update SLM, I might consider adding an "extrapolation" option. The options available to the user might then arguably be:
{'error', 'warning', 'constant', 'linear', 'cubic'}
Thus 'error' would produce an error whenever any extrapolation is done. 'warning' would issue a warning message, but then extrapolate as a constant. 'constant' is what is currently done. 'linear' would extrapolate linearly from the end knots, etc.
I suppose the most logical default would be 'warning' to tell the user something strange is being done, although for consistency, 'constant' seems right.
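To make the difference between those options concrete, here is a sketch in base MATLAB. It does not call SLM and is not how slmeval is implemented: a two-knot linear interpolant built with interp1 stands in for the linear spline from the example, using the end values rounded from the output shown above, and all variable names are hypothetical.

```matlab
% Stand-in for the two-knot linear spline: knots and the values there
knots = [-50, 50];
vals  = [-100.3, 101.2];
xq    = -100:20:100;

% 'constant': clamp the query points to the knot range, then evaluate
xClamped  = min(max(xq, knots(1)), knots(end));
yConstant = interp1(knots, vals, xClamped);

% 'linear': continue the end segments as straight lines
yLinear = interp1(knots, vals, xq, 'linear', 'extrap');

% 'warning': same values as 'constant', but tell the user about it
outside = (xq < knots(1)) | (xq > knots(end));
if any(outside)
    warning('sketch:extrap', '%d points lie outside the knots.', nnz(outside));
end
```

Inside the knots the two results agree. Outside them, yConstant repeats the end values while yLinear keeps following the end slope.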
A final note: deep in my past, I once wrote a tool that would allow you to extrapolate an existing spline, based on ideas not unlike those in the SLM tools. That is, given a spline, it would attach new knots to that spline, and a shape for the spline that was consistent with any goals that the user supplied. So you could specify a new spline that maintained the shape at the end of the old curve, smoothly extrapolating out to a specific point, AND such that the spline was monotonic over that region, or linear over that region, etc. I included ways to specify that the spline could not go above a maximum, below a minimum, have a given slope at the ends, etc. Essentially anything you wanted to do, it allowed you to do it over the extrapolated region. I suppose one day I'll write a tool like that to work with SLM.