The point is that the knots MUST contain the data. slmengine will throw an error if that is not true, telling you something to that effect. In fact, I don't know where you got the idea that the knots should lie wholly inside the range of the data, as that is not something I have shown in any example in the tutorials. Note that the default for 'knots' is a set of 6 equally spaced points that COMPLETELY span the data.
If the knots were to fall entirely inside the data, with data points that lie outside, then slmengine would be forced to extrapolate, something which I am strongly opposed to doing with a spline, since extrapolation of a cubic polynomial segment will give virtually random crap.
If, however, I force you to contain the data inside the knots of the spline, then you can control the shape of the spline over the entire knot range. This is how the tool works. It does so for very good reasons. Note that if you DO wish to extrapolate, you can force the tool to extrapolate intelligently, at least within the bounds of the knots, since you can control the shape of the spline out as far as the knots go.
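The containment requirement described above can be sketched in a few lines of plain MATLAB. This is only an illustration of the test quoted from slmengine.m, not the tool's actual code; the helper name containsData and the variable names are hypothetical.

```matlab
% Sketch of the "knots must contain the data" test, applied to two knot sets.
x = 1:10;

% Hypothetical helper: true when the knots span the full data range.
containsData = @(knots) (knots(1) <= min(x)) && (knots(end) >= max(x));

interiorKnots = [2 5 7];                      % knots strictly inside the data
spanningKnots = linspace(min(x), max(x), 6);  % 6 equally spaced knots covering the data

okInterior = containsData(interiorKnots);     % false: this is the case that raises the error
okSpanning = containsData(spanningKnots);     % true: the data lie inside the knots
```

Knots may also extend beyond the data on either side; the test only rejects knot sets that leave some data point outside the knot range.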
Greatly appreciate your fast response.
Turning C2 off at specific points is probably what I needed, but I am not complaining that it isn't there. Also, I didn't realize the concave up/down constraints can be applied to more than one region, but other aspects of my simulations don't keep the concavity the same. I do appreciate those suggestions.
I kinda knew I would have to overcome my laziness and start using custom-placed knots for each of the intervals to capture the inflections and avoid overfitting certain regions.
However I am a little confused by lines 176-178 of slmengine.m.
if (knots(1)>min(x)) || (knots(end)<max(x))
error(... 'Knots do not contain the data. Data range: ',num2str([min(x),max(x)])])
Suppose my data is just x = 1:10 with the knots = [2, 5, 7] (i.e. knots contained in the data range). That would cause line 176 to evaluate to true and then throw an error.
(2 > 1) || (7 < 10) => true || true => true
Unless I am mistaken (and I am probably just confused), shouldn't that knot placement be acceptable for this data set, because all knots are interior to the data?
Or is it that the range of the knots must include the data range as a subset?
Sorry if this is a basic question, but I don't understand what the significance of having knots placed outside the range of data would be.
I had thought about allowing a user to turn off C2 continuity at specific knots, but it seemed a bit kludgy in terms of the interface.
If you have a decent idea of where those breaks occur, then you could add a spare knot or two in those areas. If you are trying to do it semi-automatically, then you might need to do it in two passes, using the first fit to find roughly where those second derivative breaks live. So the first pass would employ lots of knots in the fit.
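A rough sketch of the second half of that two-pass idea, in plain MATLAB: given a first-pass estimate of the derivative, find where its slope changes sign and add knots there. As an assumption for illustration, the pseudo-example vector d from the question stands in for the first-pass derivative (in practice it would come from the many-knot fit), and the abscissa t is hypothetical.

```matlab
% Stand-in for a first-pass derivative estimate (pseudo-example from the question).
d = [exp(0:0.1:1) exp(0.9:-0.1:0.5) exp(0.4:0.1:1.5) exp(1.4:-0.1:1)];
t = 1:numel(d);                 % hypothetical abscissa, one unit per sample

s = sign(diff(d));              % +1 where d rises, -1 where it falls
turnIdx = find(s(1:end-1) ~= s(2:end)) + 1;   % samples where the slope changes sign
breakLocs = t(turnIdx);         % candidate locations for extra knots

% Second-pass knots: a coarse set spanning the data, plus the detected breaks.
knots = unique([linspace(t(1), t(end), 6), breakLocs]);
```

With noisy data the sign test would pick up spurious turning points, so some smoothing or a threshold on the size of the slope change would probably be needed before trusting breakLocs.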
You can also specify a region or set of regions where the curve will be concave down or up. Make sure those regions don't overlap, else the fit will be too strongly constrained to be useful, and make sure there are a couple of knots between each pair of consecutive regions.
Finally, remember that derivative estimation is an ill-posed process. So it tends to be a noise amplification process.
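To illustrate the noise amplification, here is a small sketch using a plain finite difference rather than a spline fit. The test function exp(t), the step size, and the deterministic alternating perturbation are all assumptions chosen to keep the example reproducible.

```matlab
% A small perturbation in the data becomes a large error in the derivative.
t = linspace(0, 1, 101);
h = t(2) - t(1);                          % sample spacing

y = exp(t);                               % clean data
noise = 1e-3 * (-1).^(1:numel(t));        % alternating perturbation of size 1e-3
yNoisy = y + noise;

dClean = diff(y) / h;                     % forward-difference derivative, clean data
dNoisy = diff(yNoisy) / h;                % same estimate from the perturbed data

dataErr  = max(abs(yNoisy - y));          % error in the data
derivErr = max(abs(dNoisy - dClean));     % error in the derivative estimate
ratio = derivErr / dataErr;               % about 2/h for this alternating perturbation
```

Halving h roughly doubles the ratio, which is why finer sampling alone does not rescue a derivative estimate from noisy data.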
I very much appreciate your SLM tool (plus all your other submissions and useful tips).
I have a question regarding fitting monotonically increasing data (gas effusion data) that will have inflections (due to increasing or holding the temperature over time during the simulation, which has an exponential scaling effect on the effusion rate). I am able to get a good fit to the raw data, but I am more interested in the derivative of the spline because that gives me information about the effusion rate.
I expect the derivative to always be positive (due to monotonicity) and to have sharp peaks due to the inflections in the raw data at temperature changes. Essentially, it is exponentially increasing then exponentially decreasing, followed by exponentially increasing and exponentially decreasing, etc.
If I were to fit the original data with separate splines defined over intervals of constant temperature behavior, I would have jump discontinuities in the first derivative at the boundaries of the temperature intervals, which is unacceptable.
To give a pseudo-example of what I expect from the first derivative:
d = [exp(0:0.1:1) exp(0.9:-0.1:0.5) exp(0.4:0.1:1.5) exp(1.4:-.1:1)];
I am fine with the peaks being smoothed a bit due to the smoothness conditions of the spline fit of the original data. I did try changing C2 to 'off', but then each interval of the derivative was not sufficiently smooth.
Any suggestions on what prescriptions to use when I know my first derivative should be smooth in an interval but have a sharp peak?