%% Data Based Modeling of Nonlinear Dynamic Systems
% Copyright 2009-2010 The MathWorks, Inc.
% This demo discusses some popular data-driven modeling techniques using
% the engine throttle body as an example. Data collected by testing on the
% throttle is used for estimation of a variety of linear and nonlinear
% models. The models are created both from purely data-centric
% considerations as well as using knowledge from first principles. Various
% types of models available in System Identification Toolbox are described.
%
% This demo is command-line based, that is, it discusses how one can write
% MATLAB commands that would perform the task of creating dynamic models
% using data-driven techniques. For GUI-centric approaches, see the
% document titled "Data Based Modeling of Nonlinear Dynamic Systems".
% It is recommended that you read that document before using MATLAB file
% anyway, since it provides many useful insights into the art of data-based
% modeling.
%% Data Preparation
% The following script loads raw data sets, decimates them and uses them to
% create IDDATA objects that are used with System Identification Toolbox.
dataprep
%% Estimating Linear Models
% For linear analysis, we need a data set where the signals are constrained
% in the linear operating range of the system. The closest data set we have
% is data3:
plot(data3)
% The output barely touches the hard stops at 90 degrees, but when the
% input excitation is removed, it still shows an offset of 15 degrees. A
% linear model cannot capture such offsets. So to turn this data into
% something "suitable" for linear model estimation, we must subtract off
% the offset.
T = getTrend(data3);
T.OutputOffset = 15;
data3lin = detrend(data3, T);
plot(data3, data3lin)
%% Linear ARX Model
% Let us estimate a model of the form Ay = Bu + e, where A is a second
% order polynomial and B is a constant:
mlin1 = arx(data3lin, [2 1 0], 'Focus', 'simulation')
% Estimate a linear "output error" model which is represented by equation y
% = B/F u e. This is the well-known "transfer function" model of the
% system. We try two different sets of orders for polynomials B and F
% assuming input delay = 0.
%
% Try [nb nf nk] = [2 2 0]
mlin2 = oe(data3lin, [2 2 0]);
%%
% Try [nb nf nk] = [4 4 0]
mlin3 = oe(data3lin, [4 4 0]);
%%
% Note that we did not have to set the focus explicitly to 'simulation' for
% output error models mlin2 and mlin3 because there is no difference
% between prediction and simulation errors for such models.
%%
% Next we try linear state-space models. We need to specify the order of
% the system which is the number of states in the model. Suppose we do not
% know that. We can specify a range of values (say 1 to 10) and the
% software will suggest an optimal value based on analysis of Hankel
% singular values.
mlin4 = n4sid(data3lin, 1:10, 'nk',0, 'Focus', 'simulation');
%%
% We select order = 3. Thus mlin4 is a state-space model containing 3
% states. For comparison, let us also estimate a 2nd and a 4th order
% state-space models:
mlin5 = n4sid(data3lin, 2, 'nk', 0, 'Focus', 'simulation');
mlin6 = n4sid(data3lin, 4, 'nk', 0, 'Focus', 'simulation');
%%
% The above linear models were all discrete-time. If you want
% continuous-time models, you can convert the above models into
% continuous-time using the "d2c" command. Also, System Identification
% Toolbox provides "process models", represented by IDPROC objects, that
% are directly created in continuous-time. Let us create a continuous-time
% model containing 3 poles including an underdamped pair and one zero.
mlin7 = idproc('p3uz'); % template process model with 3 poles, 1 zero
mlin7 = pem(data3lin, mlin7)
%% Comparing Responses of Linear Models to Estimation data
% How good are these models? There are various measures of model quality
% both qualitative and quantitative. The most obvious measure is how
% closely the model's response fits the measured output signal when
% subjected to the same input. This can be determined by using the
% "compare" command. It creates a plot where the models responses are
% overlaid on the measured throttle angle data in data3lin. It shows a
% quantitative measure of goodness of fit using the formula:
% Fit = (1-norm(ydata-ymodel)/norm(ydata-means(ydata)))*100.
% where Fit is a number reflecting percent fit, ydata is the measured
% response (=data3lin.OutputData here) and ymodel is the model's response
% to input signal corresponding to ydata; here, that input signal is
% data3lin.InputData.
compare(data3lin, mlin1, mlin2, mlin3, mlin4, mlin5, mlin6, mlin7)
%%
% Based purely on fit to estimation data as selection criterion, we would
% select mlin3 as the "best model". However there are also other criteria
% to pay attention to:
% 1. Residual analysis: Are the model residual errors
% white (no auto-correlation) and independent of input. See "resid"
% command for more info.
%
% 2. Uncertainty analysis: Each model has certain parameters that are
% estimated by the software. How reliable are those estimates? Parameter
% uncertainty information is contained in a model's "CovarianceMatrix":
mlin3.CovarianceMatrix
%%
% However, viewing this matrix by itself is not very informative. We
% should look at more instructive "views" such as 1. s.d. marginal
% uncertainty on individual parameters, or even better, the effect of
% uncertainties on model's poles/zeros, step response or bode response.
% Such views give an idea about how reliable a model is and provide
% quantifiers for this reliability: which poles/zeros are estimated more
% reliably than others, or what frequency range the model is most
% reliable in.
%%
% View individual parameter uncertainties:
present(mlin3)
%%
% View effect of uncertainties on pole-zero locations, step and frequency
% responses:
figure
pzmap(mlin3, 'sd', 1)
figure
bode(mlin3, 'sd', 1, 'fill')
figure
step(mlin3, 'sd', 1, 'fill') % shows rather large uncertainty
%%
% Ultimately, the user is the best judge of model quality. The model has to
% be sufficiently accurate in the operating region of the system that the
% user is interested in capturing. Other considerations like cost and
% feasibility of implementation also play a role in selecting a candidate
% model from the set of competing choices.
%%
% The focus of this demo is nonlinear model creation. To motivate nonlinear
% approach, let us first look at the limitation of the linear models. They
% all do well on the detrended low-amplitude data. However, when validated
% on data sets containing higher input amplitudes, the fit is not good:
figure
compare(data1, mlin1, mlin2, mlin3, mlin4, mlin5, mlin6, mlin7)
figure
compare(data2, mlin3, mlin7)
% etc..
%%
% The linear models are not able to capture the hard stops at 15 and 90
% degrees even though they may capture the dynamics in the linear range. We
% clearly need nonlinear models.
%% Nonlinear ARX Models
% Nonlinear ARX models are nonlinear extensions of linear ARX models. This
% extension is achieved by allowing model regressors to be arbitrary,
% user-configurable (possibly nonlinear) mathematical functions and by
% letting the output to be a nonlinear function of its regressors (a linear
% ARX model by comparison uses a weighted sum of its regressors). The
% simplest Nonlinear ARX model one can create is one that is the same as
% the linear one, but allows for a non-zero offset. This is achieved by
% setting the nonlinear function of the model to 'linear' or simply [].
% Thus the simplest counterpart of model mlin1 is the following Nonlinear
% ARX model:
mnonlin1 = idnlarx([2 1 0], []);
%%
% The parameters of this nonlinear model can be estimated using the "nlarx"
% command. Suppose we use data3 as estimation data, we get:
mnonlin1 = nlarx(data3, mnonlin1, 'Focus', 'simulation')
%%
% The above estimation can also be carried out in one shot as follows:
mnonlin1 = nlarx(data3, [2 1 0], [], 'Focus', 'simulation');
%%
% If we now compare the responses of models mnonlin1 and mlin1 to data3, we
% find that their fits are almost identical, except that the nonlinear
% model is also able to capture the resting offset value of 15 degrees:
compare(data3, mlin1, mnonlin1)
%%
% This shows that having more (nonlinear) flexibility can improve model
% fidelity. However, if we compare the response of the model to other data
% sets, we find that it does not do well:
figure
compare(data1, mlin1, mnonlin1)
figure
compare(data2, mlin1, mnonlin1)
%%
% This means that the nonlinear model structure we used is not flexible
% enough to capture the nonlinear phenomena exhibited by the system. We
% need to add more teeth to our Nonlinear ARX model structure. Also, we
% must use data sets that excite the nonlinearities "sufficiently". data3
% is not a good candidate. data1 and data2 seem to be better candidates
% since their output plots show that the hard stop at 90 degrees is hit
% more obviously in them:
plot(data1, data2)
%%
% Hence we will use these two data sets together for further estimations.
% In order to use multiple data sets for estimation, we must first merge
% them into a "multi-experiment" data set using the "merge" command:
MultiExpData = merge(data1, data2)
%%
% Note that we have not detrended these data sets because the nonlinear
% models are expected to capture the offsets.
%%
% We have prepared the data now, but what model structure we are going to
% try? One option is to add targeted regressors to the model that are
% derived using physical insight into the nature of the system. The overall
% system seems to be behaving like a second order system (at least in
% linear range). When the throttle angle becomes 90 degrees, there is a
% hard stop which can be described by a very hard spring that resists the
% throttle angle to be greater than 90 degrees. Similarly, the 15 degree
% resting offset can be captured using a spring force that is activated
% only when the angle becomes less than 15 degrees and whose job is to
% restore the resting angle. Thus the nonlinear resistive force, which
% comes into play only when the throttle angle is either less than 15 or
% greater than 90 degrees can be expressed as:
%
% Fnl(t) = K1*(max(y(t),90)-90) + K2*(min(y(t),15)-15)
%
% where y(t) is the valve angle at time t. For simplicity we can assume K1
% = K2, so that the nonlinear spring force becomes: K1*(max(y(t), 90) +
% min(y(t),15)-105). We can include a formula such as "max(y(t), 90) +
% min(y(t),15)-105" as a regressor in the model and the job of the
% estimation method will be to determine the value of K1, in addition to
% previous coefficients. The model of this configuration can be estimated
% as follows:
mnonlin2 = nlarx(MultiExpData, [2 1 0], [], 'CustomRegressors', ...
{'max(Valve Angle(t-1),90)+min(Valve Angle(t-1),15)-105'}, 'Focus', 'simulation');
%%
% Or, in multiple steps as follows:
C1 = customreg(@(x)max(x,90)+min(x,15)-105, {'Valve Angle'}, 1, true);
mnonlin2 = idnlarx([2 1 0],[],'InputName', 'Step Command', ...
'OutputName', 'Valve Angle', 'CustomReg', C1);
mnonlin2 = nlarx(MultiExpData, mnonlin2, 'Focus','simulation')
%%
% In the above, we used the object representation of the custom regressor,
% which is more flexible and easier to manage than directly typing the
% formula as a string, which can get cumbersome if there are many custom
% regressors. Now compare this model to its estimation data:
compare(MultiExpData, mnonlin2)
%%
% We now see fairly good fits to the two data sets in MultiExpData although
% the model still seems to be having some trouble capturing the hard stop
% levels. A test for model's usefulness is how well it performs on an
% independent data set that is obtained under similar experimental
% conditions but not used for estimation. We have several such data sets:
% data3, data4 and data5.
figure, compare(data3, mnonlin2) % good
figure, compare(data4, mnonlin2) % not so good
figure, compare(data5, mnonlin2) % good
%%
% So our quick conclusion is that adding the physically motivated custom
% regressor makes a difference! This leads to two questions that are good
% to consider:
% 1. What if the physical insight is not handy? In many applications
% mathematical relationships between input-output variables might be
% difficult to realize; in fact, data-based "black box" modeling
% approach is used largely to get around this problem! Also, even if
% such equations could be derived they may not be amenable to
% representation as custom regressors. In such cases, we will approach
% the modeling task from purely a function-approximation point of view:
% we will try out various model structures whose forms will not be
% guided by any physical insights. We shall explore this approach in
% detail in the following sections.
%
% 2. Custom regressors need not correspond to physical relationships. It
% is actually quite common to create nonlinear models that employ a set
% of polynomials of various degrees as regressors. The idea then is to
% simply increase the model's flexibility in capturing arbitrary
% nonlinear phenomena without necessarily trying to relate individual
% regressors to physically meaningful behavior.
%% Nonlinear Models using Polynomial Regressors
% Let us first see how a nonlinear model that uses polynomial type custom
% regressors can be created. Note that the model will still be "linear in
% regressors", that is, the output will be expressed as a weighted sum of
% its regressors. Any nonlinear behavior is thus limited to the definition
% of its regressors. Later on, we will make use of nonlinear functions of
% regressors such as sigmoid and neural networks.
mnonlin3 = idnlarx([2 1 0], []); % nonlinear ARX model template
C2 = polyreg(mnonlin3, 'MaxPower', 3); % create polynomial type regressors; a set of 6 regressors
mnonlin3.CustomRegressors = C2; % insert the set of polynomial regressors as custom regressors of the model
getreg(mnonlin3) % display the set of model regressors
% Perform estimation with option to watch estimation progress.
mnonlin3 = nlarx(MultiExpData, mnonlin3, 'Focus', 'Simulation', 'Display', 'on')
%%
% This model works almost as good as mnonlin2, although the fit to data2
% (second experiment in estimation data) and data4 is not so good. This
% model thus makes use of regressors that are not motivated by physical
% insight. For easier validation, we create a new "validation" data set by
% merging data3, data4 and data5 into one multi-experiment data set. This
% will help us avoid issuing multiple "compare" commands, one for each data
% set.
ValidationData = merge(data3, data4, data5);
figure, compare(MultiExpData, mnonlin3) % comparison to estimation data sets
figure, compare(ValidationData, mnonlin3) % comparison to validation data sets
%%
% Note that the "polyreg" command is just a convenience function for
% quickly creating a set of polynomial type custom regressors. You can
% always create each regressor individually using the "customreg"
% constructor explicitly. Further note that the above model has 9
% regressors - 3 standard regressors added implicitly by defined model
% orders (=[2 1 0]) and the other 6 added as custom regressors explicitly.
% Refer to the product documentation for details on definition and
% management of regressors.
%% Estimating Models that Employ Nontrivial Nonlinear Functions
% All the Nonlinear ARX models estimated above used a linear weighted sum
% of regressors. System Identification Toolbox facilitates the use of
% several nonlinear functions of regressors. For example, you can create a
% wavelet network that transforms the regressors into model output by using
% a wavelet series. Similarly, you can use sigmoid networks, binary trees
% and multi-layer neural networks as nonlinear functions. These nonlinear
% functions may be used together with arbitrary, possibly nonlinear
% regressors, such as those defined above.
%%
% First, we create a Nonlinear ARX model that uses a Wavelet Network as its
% nonlinear function. We will create several models of this configuration
% by varying the types of regressors used.
mnonlin4 = nlarx(MultiExpData, [2 1 0], 'wavenet', 'Focus', 'Simulation'); % use only standard (implicit) regressors
% Use custom regressor C1 and determine number of wavelet units
% interactively
mnonlin5 = idnlarx([2 1 0], wavenet('Num','interactive'), 'InputName', 'Step Command', ...
'OutputName', 'Valve Angle','CustomRegressors', C1); % use custom regressor C1
mnonlin5 = nlarx(MultiExpData, mnonlin5, 'Focus', 'Simulation', 'Display', 'on');
% Use polynomial type custom regressors with default wavelet network
mnonlin6 = idnlarx([2 1 0], 'wavenet', 'CustomRegressors', C2); % use custom regressor set C2
mnonlin6 = nlarx(MultiExpData, mnonlin6, 'Focus', 'Simulation', 'Display', 'on');
%%
% Note that the nonlinear function, wavenet, can be specified in various
% ways; if a network with default configuration is to be used, you may
% simply specify its name as a string. However, if you need to configure
% the properties of the nonlinear function (such as setting its
% NumberOfUnits property), you must use the object form - invoke the
% nonlinear function's constructor and configure its properties. For model
% mnonlin5, the number of units was set to be determined interactively.
% This means that the user will be prompted to choose it based on the UOV
% criterion ("unexplained output variance") shown by a bar chart during
% estimation.
figure, compare(MultiExpData, mnonlin4, mnonlin5, mnonlin6)
figure, compare(ValidationData, mnonlin4, mnonlin5, mnonlin6)
%%
% Of the three, only mnonlin6 seems to provide something close to a
% satisfactory result; others do not work at all. Overall, use of wavelet
% networks for this problem does not seem to work, even though we have only
% scratched the surface in terms to exploring all the estimation options
% (such as model orders, other types of custom regressors, nonlinear
% least-squares search algorithm options, configuration of wavelet network
% properties etc). One of the main lessons in nonlinear model estimation
% methodology is to take a trial and error perspective and patiently try
% out a variety of forms and estimation options. In this demo, we just
% explore some of these options; the trials shown here are not exhaustive
% by any means.
%%
% Next we try sigmoid networks as choice of nonlinear function for the
% Nonlinear ARX model.
%%
% Create a Sigmoid Network based nonlinear ARX model of order [2 1 0] and
% default configuration of the sigmoid network. Set the maximum number of
% iterations to 50.
mnonlin7 = nlarx(MultiExpData, [2 1 0], 'sigmoid', 'Focus', 'Simulation',...
'MaxIter', 50, 'Display', 'on');
%%
% Create a model of order [3 3 0] that uses 10 sigmoid units.
NL = sigmoidnet('NumberOfUnits', 10);
mnonlin8 = nlarx(MultiExpData, [3 3 0], NL, 'Focus', 'Simulation', ...
'Display', 'on');
%%
% Use the above model but with additional polynomial regressors of orders 2
% and 3.
mnonlin9 = idnlarx([3 3 0], NL);
PolyReg = polyreg(mnonlin9, 'MaxPower', 3); % polynomial regressor set
mnonlin9.CustomRegressors = PolyReg;
mnonlin9 = nlarx(MultiExpData, mnonlin9, 'Display', 'on', 'Focus', 'Simulation');
figure, compare(MultiExpData, mnonlin6, mnonlin7, mnonlin8, mnonlin9)
figure, compare(ValidationData, mnonlin6, mnonlin7, mnonlin8, mnonlin9)
%%
% The additional flexibility provided by custom regressors helps it to
% improve the fit to estimation data. However, fit to validation data
% reduces indicating that the extra flexibility adds variability in the
% model that may render it less robust, or difficult to extrapolate. Based
% on the above analysis, mnonlin8 seems to be the best candidate for use.
%% Using a Linear Model to Initialize the Nonlinear ARX Model
% We started the modeling exercise by trying out a variety of linear
% models. This gave us an idea regarding the orders to use. With linear ARX
% models we can go further: we can use it directly to initialize the
% configuration of a nonlinear ARX model whose parameters may then be
% updated using the "nlarx" command. This gives us an alternative way of
% initializing nonlinear models which may work better than the default
% initialization scheme in cases where the system is weakly nonlinear and
% we already have a very good representation (i.e. model) of its linear
% component.
%%
% For example, we may have a good linear model of a system from experiments
% where it is subjected to low-amplitude excitations. We may want to
% "extend" this linear model by adding nonlinear elements in order to
% extend the applicability of the model into higher-amplitude input
% excitations that cause the nonlinearities (such as saturation) to kick
% in. Let us explore the syntax for achieving this type of initialization:
%%
% Suppose we estimate a linear ARX model of orders [3 3 1] for the low
% input amplitude dataset data3lin:
LinearModel = arx(data3lin, [3 3 1], 'Focus', 'simulation')
compare(data3lin, LinearModel)
%%
% The plot shows a decent fit to the data. Now suppose we want to create a
% sigmoid network based nonlinear model that uses the above linear model as
% its linear component. We can achieve this configuration by using the
% "idnlarx" command that creates a nonlinear ARX model:
mnonlin10 = idnlarx(LinearModel, 'sigmoidnet') % a template nonlinear ARX model
%%
% We use the linear model as an input argument to idnlarx command, in place
% of the (more usual) orders matrix. This creates a nonlinear model whose
% orders. Plus, its linear component (see
% mnonlin10.Nonlinearity.Parameters.LinearCoef) has been pre-computed using
% the coefficients of the linear model. We can now update all the
% parameters of the nonlinear model to maximize fit to a higher-amplitude
% nonlinear data as before:
mnonlin10 = nlarx(MultiExpData, mnonlin10, 'Display', 'on', 'Focus', 'Simulation');
%%
% The above multi-step configuration (use of idnlarx to create a template
% followed by estimation using nlarx) can be combined together into one
% estimation call as follows:
mnonlin10 = nlarx(MultiExpData, LinearModel, 'sigmoidnet', ...
'Display', 'on', 'Focus', 'Simulation')
%%
% Finally, let us compare mnonlin10 against a similar model that was
% estimated using default initialization scheme:
figure, compare(MultiExpData, mnonlin8, mnonlin10)
figure, compare(ValidationData, mnonlin8, mnonlin10)
%%
% mnonlin10 appears to be a good model based on validation test. Note that,
% in general, there is no automatic guarantee that a nonlinear ARX model
% created using a linear component (rather than orders) would work better.
% It simply offers an alternative to the default initialization scheme used
% when you specify the order matrix to create the nonlinear model. However,
% this alternative should be considered when a good linear model of the
% process is handy and when the nonlinearities are known to be
% weak/limited.
%% Hammerstein-Wiener Models
% Hammerstein-Wiener models are nonlinear extensions of linear transfer
% functions wherein the extension is achieved by adding memoryless
% nonlinear functions to the input and/or output signal(s) of the linear
% system. These models are also called "models with I/O nonlinearities".
% Very often, the nature of the nonlinear functions connected to the linear
% function are determined based on physical insight, but that is not a
% necessity. In the following we will explore two configurations: one where
% the nature of nonlinearity is determined by physical reasoning and
% another where we use arbitrary functions in a "black box" spirit.
%%
% The simplest counterpart of a linear transfer function (for example,
% model mlin7 or mlin3) in Hammerstein-Wiener structure is a model with no
% nonlinearities. This is created by using [] for nonlinearity in the
% model constructor (idnlhw) or its estimator (nlhw).
hw1 = idnlhw([4 4 0], [], []) % Input NL = Output NL = []
hw1 = nlhw(data3lin, hw1, 'Display', 'on')
%%
% The above two calls can be combined into a single call to "nlhw" by
% specifying the orders and nonlinearities directly as input arguments to
% "nlhw":
hw1 = nlhw(data3lin, [4 4 0], [], [], 'Display', 'on')
%%
% The above nonlinear model (hw1) is structurally similar to the linear
% model mlin3, although its parameters are different owing to differences
% in handling of initial states. Of course, hw1 is no good at modeling
% nonlinear behavior since it has no nonlinear elements.
compare(MultiExpData, hw1)
%%
% Let us now start building some meaningful Hammerstein-Wiener models. We
% know intuitively that the system has a linear range. At angles less than
% 15 degrees and greater than 90 degrees resistive forces kick in. The ones
% at 90 degrees are strong and can be considered almost as a complete hard
% stop. Such phenomena can be described by a saturation function. Thus
% adding a saturation function at the output of the model should capture at
% least the 15 and 90 degree steady-state signal levels present in the
% output, if not the transient behavior. The saturation of output can be
% achieved by adding a saturation nonlinearity at the output port of the
% linear system. In other words, let us create a Hammerstein-Wiener model
% that uses saturation as its "output nonlinearity" but has no input
% nonlinearities. The mathematical name for this configuration is a "Wiener
% model":
hw2 = nlhw(MultiExpData,[3 3 0], [], 'saturation'); % a template Wiener model
%%
% hw2 uses saturation nonlinearity in its default configuration. But we
% know more: we know that the saturation limits ought to be close to 15
% (lower limit) and 90 degrees (upper limit). We can add this information
% by customizing a saturation nonlinearity object before adding it to the
% Hammerstein-Wiener model:
SatNL = saturation([15 90]) % a saturation nonlinearity with prescribed saturation bounds
%%
% or, we could also do:
SatNL = saturation;
SatNL.LinearInterval = [15 90];
%%
% Now use SatNL as a component of the Hammerstein-Wiener model:
hw3 = nlhw(MultiExpData,[3 3 0], [], SatNL, 'Display', 'on');
hw3.OutputNonlinearity % post-estimation values of saturation limits
%%
% The above models (hw2, hw3) have implicitly used zero initial conditions.
% Sometimes estimation of initial conditions along with the model's
% parameters improves the fit to data.
hw4 = nlhw(MultiExpData,[3 3 0], [], SatNL, 'Display', 'on', 'InitialState', 'Estimate');
figure, compare(MultiExpData, hw2, hw3, hw4)
%%
% While hw2 and hw3 are more or less the same, hw4 is better at fitting the
% estimation data. Let us now validate these models on independent
% validation data sets:
figure, compare(ValidationData, hw2, hw3, hw4)
%%
% The validations are good for hw4, except that it performs markedly worse
% on third experiment in the validation data set. We will now explore if
% the situation can be improved by increasing model orders:
hw5a = nlhw(MultiExpData,[4 4 0], [], SatNL, 'Display', 'on', 'InitialState', 'Estimate');
hw5b = nlhw(MultiExpData,[4 4 0], [], SatNL, 'Display', 'on', 'InitialState', 'Zero');
figure, compare(MultiExpData, hw2, hw3, hw4, hw5a, hw5b)
figure, compare(ValidationData, hw2, hw3, hw4, hw5a, hw5b)
%%
% Thus, using orders [4 4 0] with the option to estimate initial states
% seems to help; hw5a seems to be a better model overall. However, the fits
% are not as good as those obtained using nonlinear ARX models. This, by
% itself, is not a cause for concern: not all model structures are suited
% for a particular problem. However, we can continue to explore other types
% of I/O nonlinearities in the Hammerstein-Wiener models. We can consider
% more flexible forms such as piecewise linear functions in place of
% saturation. Note that such a choice is made simply as a trial-and-error
% alternative (the idea being use of more flexible nonlinear elements) and
% is not guided by any physical insight.
%%
% Consider now a model structure that uses piecewise linear functions as a
% form of nonlinearity and order = [3 3 0]
hw6 = nlhw(MultiExpData,[3 3 0], [], 'pwlinear', 'MaxIter', 200);
hw7 = nlhw(MultiExpData,[3 3 0], pwlinear('NumberOfUnits',7), ...
pwlinear('NumberOfUnits',10), 'MaxIter', 200, 'Display', 'on');
figure, compare(MultiExpData, hw5a, hw6, hw7)
figure, compare(ValidationData, hw5a, hw6, hw7) % hw7 appears to validate well
%%
% There are several other types of nonlinear functions you could explore
% using in a Hammerstein-Wiener model. These include Wavelet Network
% (wavenet), 1-dimensional Polynomial (poly1d), Dead Zone (deadzone),
% Sigmoid Network (sigmoidnet).
%%
% Note that estimation command calls do not use 'Focus'/'simulation' as
% property-value pair input argument. This is because the
% Hammerstein-Wiener models can only minimize the simulation error between
% model and measured output data, while a nonlinear ARX model can minimize
% both prediction error (focus = 'prediction') and simulation error (focus
% = 'simulation')
%% Using a Linear Model to Initialize the Hammerstein-Wiener Model
% We started the modeling exercise by trying out a variety of linear
% models. This gave us an idea regarding the orders to use. With linear OE
% models (models created using "n4sid" or "oe") we can go further: we can
% use it directly to initialize the configuration of a Hammerstein-Wiener
% model whose parameters may then be updated using "nlhw" command. This
% gives an alternative way of initializing nonlinear models which may work
% better than the default initialization scheme in cases where the system
% is weakly nonlinear and we already have a very good representation (i.e.
% model) of its linear component.
%%
% For example, we may a good linear model of a system when it is subjected
% to low-amplitude excitation. We may want to "extend" this linear model by
% sandwiching it between nonlinear elements in order to extend the
% applicability of the model into higher-amplitude input excitations that
% cause the nonlinearities (such as saturation) to kick in. Let us explore
% the syntax for achieving this type of initialization:
%%
% Suppose we estimate a linear transfer function model of orders [4 3 0]
% for the low input amplitude dataset data3lin:
LM = oe(data3lin, [4 3 0])
compare(data3lin, LM)
%%
% The plot shows a decent fit to the data. Now suppose we want to create a
% nonlinear model that uses the above linear model (LM) as its linear
% component and pwlinear functions as its input and output nonlinearities.
% We can achieve this configuration by using the "idnlhw" command that
% creates a Hammerstein-Wiener model:
hw8 = idnlhw(LM, pwlinear('num',7), 'pwlinear')
%%
% We use the linear model as an input argument to idnlhw command, in place
% of the (more usual) orders matrix. This creates a nonlinear model whose
% orders are same as those of linear model LM. Plus, its linear component
% (see hw8.LinearModel) is same as LM. We can now update all the parameters
% of the nonlinear model to maximize fit to a higher-amplitude nonlinear
% data as before:
hw8 = nlhw(MultiExpData, hw8, 'Display', 'on', 'MaxIter', 200);
%%
% The above multi-step configuration (use of idnlhw to create a template
% followed by estimation using nlhw) can be combined together into one
% estimation call as follows:
hw8 = nlhw(MultiExpData, LM, pwlinear('num',7), 'pwlinear', 'Display', 'on', 'maxiter', 200);
hw8 = nlhw(MultiExpData, hw8, 'SearchMethod', 'lm'); % run more iterations using 'Levenberg-Marquardt' search method
%%
% Finally, let us compare the various Hammerstein-Wiener models
figure, compare(MultiExpData, hw5a, hw6, hw7, hw8)
figure, compare(ValidationData, hw5a, hw6, hw7, hw8)
%%
% hw8 appears to be a good model based on validation test. Note that, in
% general, there is no automatic guarantee that a Hammerstein-Wiener Model
% model created using a linear component (rather than orders) would work
% better. It simply offers an alternative to the default initialization
% scheme used when you specify the order matrix to create the nonlinear
% model. However, this alternative should be considered when a good linear
% model of the process is handy and when the nonlinearities are known to be
% weak/limited.
%% Residue Test
% In all of the above experiments, we have used fit to output as the sole
% criterion for judging the quality of a model. While that test is the
% primary qualification test, there are other tests of model quality that
% are often employed. One particular type of test, called "residue test"
% analyses the information content in the error signal, wherein the error
% is defined as the difference between the predicted response of the model
% and the measured data. The prediction error can be analyzed in two ways:
% by studying its auto-correlation ("whiteness test") and its correlation
% to input ("independence test"). These tests can be carried out using the
% "resid" command. For example, the residues of models mnonlin8 and hw8 can
% be studied as follows:
figure, resid(MultiExpData, mnonlin8)
figure, resid(ValidationData, mnonlin8)
figure, resid(MultiExpData, hw8)
figure, resid(ValidationData, hw8)
%%
% A good model should produce an error that is as close as possible as
% possible to white noise, indicating that it has no useful information
% left in it. This is indicated by the plotted values being inside the
% "statistically insignificant" region shown by the yellow region in the
% plot. The top axis shows error correlation (should be zero for all lags
% except 0) and cross correlation between input and error (should be zero
% for all lags). The model mnonlin8 seems to pass the residue test since
% the error correlation values are more or less inside the yellow region.
% However, the same cannot be said of the Hammerstein-Wiener model hw8
% where the error signal auto-correlation is significant. However, this is
% no cause for concern as long as the error is uncorrelated with inputs.
% This is because a Hammerstein-Wiener model, by structure, as no
% flexibility (i.e. dedicated dynamic elements) to model noise. This is
% unlike nonlinear ARX model which implicitly models both the measured
% behavior (relationship of input to output) and noise (i.e. output
% disturbances). Even with auto-correlated error, the Hammerstein-Wiener
% model can be considered a good model if it fits the validation data well
% and its residues are uncorrelated with input signal.
%% Concluding Remarks on Black Box Modeling
% This concludes our black-box modeling exercise. We created several
% linear, nonlinear ARX and Hammerstein-Wiener models of orders in the 2-4
% range. We explored several nonlinear functions such as Wavelet Network,
% Sigmoid Network, saturation and Piecewise linear functions as components
% of our nonlinear models. We configured the properties of these nonlinear
% functions in some cases by specifying the number of units for
% network-type nonlinear functions and setting linear interval for
% saturation. In the estimation process, we tweaked some estimation options
% such as MaxIter, and SearchMethod.
%
% Most experiments were trial-and-error in nature. However, we used
% physical insight to configure the model structures in some cases:
% constructing custom regressor for nonlinear ARX models, choosing
% saturation as output nonlinearity for Hammerstein-Wiener models as well
% as specifying the initial guess for saturation limits. Furthermore, we
% explored how a good linear model of ARX form and OE form (transfer
% function) may be directly used as a component of the nonlinear model;
% options for such incorporation exist both in the model constructors
% (idnlarx, idnlhw) as well as their estimators (nlarx, nlhw).
%% Grey Box Modeling of Throttle Dynamics: An ODE Coefficient Estimation Approach
% In many cases, the nonlinear dynamics of the observed phenomenon are
% well understood and mathematical equations for their motion can be
% derived. In a lumped-parameter representation, the equations of motion
% can often be represented by a set of nonlinear differential equations.
%%
% To study this approach, let us investigate the dynamics of throttle valve
% system more closely. The DC motor operating the valve responds very
% quickly to the step command. Hence compared to the throttle valve, the
% dynamics of the motor can be captured simply as a steady-state gain. The
% butterfly valve behaves like a mass-spring-damper system (second order),
% except for the hard stops. The hard stops can be described by a resistive
% hard spring. Hence the overall dynamics can be represented by a second
% order differential equation that uses a nonlinear resistive torsional
% spring:
%%
% J d^2/dt^2 + c dy/dt + ky + K sat(y,[15 90]) = b F(t)
%
% where J is the rotational inertia of the throttle valve, c is its damping
% coefficient, k is its torsional spring constant (linear stiffness) and K
% is the stiffness constant related to the nonlinear resistive force. The
% nonlinear displacement (displacement to which the nonlinear spring
% responds) is sat(y,[15 90]) is (max(y,90)-90+min(y,15)-15).
%%
% The parameters of this model are the coefficients J, c, k, K and b. We
% can reduce one parameter by dividing the whole equation by J. The
% (normalized) equation of motion forms the basis for grey-box estimation.
% We can compute the values of the constants c, k, K and b by following
% the following 3-step approach:
%%
%
% # Represent the ODE as a set of first order differential equations in a
% MATLAB or MEX function. The function must return the value of the
% output and state-derivatives at a given time, computed as a function of
% input values, state values, and parameter values.
%
% # Instantiate an IDNLGREY model object that uses the above ODE file by
% calling the IDNLGREY constructor. The "parameters" of the IDNLGREY
% model are the ODE coefficients.
%
% # Use the "pem" command to estimate the parameters of the IDNLGREY model.
%%
% If x1(t) = y(t) and x2(t) = dy/dt are used as state variables, the 2nd
% degree equation of motion can be represented by a set of two 1st order
% equations in state variables. This form is called "state-space" form. An
% ODE file that returns y(t), dx1(t)/dt and dx2(t)/dt as output arguments
% is throttleODE.m.
type throttleODE.m
%%
% The parameters defined by this ODE file are c, k, K and b, all normalized
% by moment of Inertia J.
%%
% Create an IDNLGREY model that wraps this ODE function in a dynamic model.
% To do this we need the following: (1) name of ODE file (2) number of
% inputs, outputs and states in the system (3) initial (guess) values of
% system parameters. Optionally we may also specify (4) the initial guess
% values for state values at time t = 0 and (5) model sample time. Let us
% consolidate this information first.
%%
% ODE file name:
ODEFile = 'throttleODE'; % file must exist in current working directory or be on MATLAB path
%%
% Dimensions [ny nu nx]
Orders = [1 1 2]; % we have 2 states: displacement x1(t) and velocity x2(t)
%%
% Initial parameter values
% For our example, some values could come from a linear modeling exercise:
% estimating a black box linear model of second order would suggest values
% for c, k and b. The nonlinear spring constant could be a large value
% (some multiple of linear spring constant k). The following values were
% chosen following such analysis.
c0 = 50;
k0 = 120;
K0 = 1e6; % mostly a wild guess
b0 = 4e4;
Par0 = {c0; k0; K0; b0}; % initial guess of parameter values
%%
% We also know that the rest position of the valve is 15 degrees. So initial
% value of state x1(t) is known. Initial velocity could be assumed to be
% zero. Now we have a reasonable guess for model's initial states:
x0 = [15; 0];
%%
% We want to create a continuous-time model (since our description is in
% terms of differential equations, not difference equations). So we choose
% the model sample time to be zero.
Ts = 0;
%%
% Now we are ready to call the IDNLGREY constructor using the above
% variables as input arguments:
Model0 = idnlgrey(ODEFile, Orders, Par0, x0, Ts);
%%
% We can assign input-output and state names to this model so that it is
% compatible with the estimation data:
Model0.InputName = 'Step Command';
Model0.OutputName = 'Valve Angle';
setinit(Model0,'Name',{'Position';'Velocity'}) % set state names
display(Model0)
%%
% Before we perform estimation, we can check how good the initial guess
% model Model0 is in emulating observed output. To do this, we compare the
% output signals in estimation data to simulated response of the model. In
% doing so, let us also use the model's own guess of initial states
% (chosen by using 'init'/'model' as property-value pair input to the
% "compare" command):
compare(MultiExpData, Model0, 'init', 'model')
%%
% The guess model gives about 60% fit to estimation data. Now let us
% perform an estimation using "pem" command to improve the fit to
% estimation data. Note that by default, the initial states are fixed to
% their initial values and only parameters are updated. You can also
% estimate initial state values, fix certain parameters to their original
% values and/or specify min/max bounds for estimation using the
% "Parameters" and "InitialStates" properties of the model. There are also
% some dedicated commands that simplify such settings: "setpar" and
% "setinit".
Model1 = pem(MultiExpData, Model0, 'Display', 'on', 'MaxIter', 20);
%%
% By default, the estimation algorithm uses 'lsqnonlin' search method which
% requires Optimization Toolbox. If that toolbox is not available on user's
% copy of MATLAB, then System Identification Toolbox's built-in solvers are
% used. For example, we could use 'Levenberg-Marquardt' solver that does
% not require Optimization Toolbox:
Model2 = pem(MultiExpData, Model0, 'Display', 'on', 'SearchMethod', 'lm', 'MaxIter', 40);
%%
% As before, let us compare the response of these models to estimation and
% validation data sets:
figure, compare(MultiExpData, Model1, Model2, 'Init', 'Model')
figure, compare(ValidationData, Model1, Model2, 'Init', 'Model')
%%
% The results are mixed: some outputs are reproduced well, others are not.
% The trick to getting good results in grey-box case is trying out various
% initial guesses for parameters, as well as freeing initial states for
% estimation. For example, we could add x2(0) to our list of entities to
% estimate. This will not only estimate x2(0) value for each experiment in
% the estimation dataset (MultiExpData has two experiments), but also
% change the parameter estimates by separating out the influence of
% non-zero initial velocity from overall response.
Model0.InitialStates(2).Fixed = false;
Model3 = pem(MultiExpData, Model0, 'Display', 'on', 'SearchMethod', 'lm', 'MaxIter', 40);
%%
% However, in doing so, we have estimated initial states that are
% tailor-made for estimation data sets. Validating using independent data
% sets may require initial states to be estimated separately for those
% datasets. This can be achieved by using 'Init'/'Estimate' in "compare"
% command.
figure, compare(MultiExpData, Model2, 'Init', 'Model') % estimated initial states may be used for estimation data
% Prepare for validation using 3 data sets. We have to create a new model
% because we are validating using a dataset that has a different number of
% experiments (3) than the estimation dataset (2).
x0val = {[15 15 15];[0 0 0]};
Model3a = idnlgrey(ODEFile, Orders, getpar(Model3), x0val, Ts);
Model3a.InputName = Model3.InputName;
Model3a.OutputName = Model3.OutputName;
%%
% The new model Model3a has same parameters as those of the estimated model
% Model3. Its initial states will be estimated to maximize fit to
% individual datasets in ValidationData.
figure, compare(ValidationData, Model3a, 'Init', 'Estimate')
%%
% To view/fetch parameter values use the "getpar" command.
getpar(Model2)
%% Uncertainty Analysis
% A nice feature of grey-box estimation is that its parameters are few and
% have clear physical meaning. The parameters of a black box model are
% often arbitrary coefficients. Because the parameters are well defined for
% grey-box models, it is possible to view the standard deviations which
% provides another measure of the reliability of their estimates. Remember
% that the estimated parameter values are simply nominal values of
% uncertain entities that have a covariance matrix associated with them. To
% view the parameter values of the model along with their +/- 1 std errors,
% use the "present" command:
present(Model2)
%%
% For those who are mathematically inclined, the entire covariance matrix
% may be fetched from the "CovarianceMatrix" property of the model, as in:
Cov = Model2.CovarianceMatrix
%%
% This concludes our discussion of grey-box modeling approach. With this
% approach there are two main tasks that users have to do:
% (1) Deriving accurate equations of motion that use well defined
% parameters.
% (2) Use reasonable initial guesses for parameter values.
%
% Some related requirements are:
% - Parameterize the equation in a well conditioned way. Avoid parameters
% that can potentially make the output values or state derivative values
% infinite for finite parameter values.
% - Reduce search space by fixing known parameters or prescribing min/max
% bounds on free parameters.
%% Simulink Blocks For Simulation and Code Generation
% Identified nonlinear models can be simulated using the "sim" command.
% They can be linearized about chosen operating points. They can be
% exported to Simulink. System Identification Toolbox provides dedicated
% blocks that facilitate importing estimated models into Simulink.
% Simulation of estimated models
sim(mnonlin8, getexp(MultiExpData,1))
sim(Model2, getexp(MultiExpData,2))
sim(hw8, getexp(ValidationData,3))
% Simulink block library
slident
open_system('Models_in_Simulink') % an example diagram
%%
% RTW supports code generation for black box models, as long as custom
% regressors are not used in them.