Statistics

Overview of Radial Basis Function Statistics

Let A be the matrix such that the weights are given by $\beta = A^{-1}X'y$, where X is the regression matrix and y is the target vector. The form of A varies depending on the basic fit algorithm employed.

In the case of ordinary least squares, we have A = X'X.

For ridge regression (with regularization parameter $\lambda$), A is given by $A = X'X + \lambda I$.

For the regularized orthogonal least squares (Rols) algorithm, X is decomposed using the Gram-Schmidt algorithm to give X = WB, where W has orthogonal columns and B is upper triangular. The corresponding matrix A for Rols is then $A = X'X + \lambda B'B$.
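The factorization X = WB can be sketched in a few lines. This is a minimal illustration only: the function name `gram_schmidt` is hypothetical, it uses the modified Gram-Schmidt variant, it assumes X has full column rank, and it fixes B to be unit upper triangular (the text above only requires B to be upper triangular). It does not perform the term selection that the Rols algorithm also carries out.

```python
def gram_schmidt(X):
    """Factor X (a list of rows) as X = W B.

    W has mutually orthogonal (not normalized) columns and B is unit
    upper triangular. Assumes X has full column rank.
    """
    n, p = len(X), len(X[0])
    # Work column-wise: W[j] starts as column j of X and becomes column j of W.
    W = [[X[i][j] for i in range(n)] for j in range(p)]
    B = [[1.0 if r == c else 0.0 for c in range(p)] for r in range(p)]
    for j in range(p):
        sj = sum(v * v for v in W[j])  # w_j'w_j
        for k in range(j + 1, p):
            # Coefficient of column k along the orthogonal column j.
            B[j][k] = sum(a * b for a, b in zip(W[j], W[k])) / sj
            # Remove that component from column k.
            W[k] = [b - B[j][k] * a for a, b in zip(W[j], W[k])]
    # Return W as a list of rows, matching the layout of X.
    W_rows = [[W[j][i] for j in range(p)] for i in range(n)]
    return W_rows, B
```

Multiplying the returned W by B reproduces X up to rounding, and the columns of W are mutually orthogonal.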

The matrix $H = XA^{-1}X'$ is called the hat matrix, and the leverage $h_i$ of the ith data point is given by the ith diagonal element of H. All the statistics derived from the hat matrix, for example, PRESS, studentized residuals, confidence intervals, and Cook's distance, are computed using the hat matrix appropriate to the particular fit algorithm.
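As a small illustration of how the leverages depend on the choice of A, the sketch below computes the diagonal of the hat matrix for a regression matrix with exactly two columns, using A = X'X + lam * I (lam = 0 gives ordinary least squares). The function name `ridge_leverages` and the restriction to two columns are assumptions made here to keep the example short; they are not part of the toolbox.

```python
def ridge_leverages(X, lam):
    """Leverages h_i = x_i A^{-1} x_i' for a two-column regression matrix X,
    with A = X'X + lam * I (lam = 0 gives ordinary least squares)."""
    # Entries of the symmetric 2x2 matrix A = X'X + lam * I.
    a = sum(r[0] * r[0] for r in X) + lam
    b = sum(r[0] * r[1] for r in X)
    d = sum(r[1] * r[1] for r in X) + lam
    det = a * d - b * b
    # Closed-form inverse of a symmetric 2x2 matrix.
    inv = [[d / det, -b / det], [-b / det, a / det]]
    # Diagonal elements of X A^{-1} X', one per data point.
    return [
        r[0] * (inv[0][0] * r[0] + inv[0][1] * r[1])
        + r[1] * (inv[1][0] * r[0] + inv[1][1] * r[1])
        for r in X
    ]
```

For example, with rows [1, 0], [1, 1], [1, 2], [1, 3] and lam = 0 the leverages sum to 2, the number of terms; with lam > 0 every leverage shrinks and the sum falls below 2.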

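Similarly PEV, given in the Toolbox Terms and Statistics Definitions as

$$\mathrm{PEV}(x) = \mathrm{MSE}\; x\,(X'X)^{-1}x'$$

becomes

$$\mathrm{PEV}(x) = \mathrm{MSE}\; x\,A^{-1}x'$$

PEV is computed using the form of A appropriate to the particular fit algorithm (ordinary least squares, ridge, or Rols).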

GCV Criterion

Generalized cross-validation (GCV) is a measure of the goodness of fit of a model to the data that is minimized when the residuals are small, but not so small that the network has overfitted the data. It is easy to compute, and networks with small GCV values should have good predictive capability. It is related to the PRESS statistic.

The definition of GCV is given by Orr (4, see References).

$$\mathrm{GCV} = \frac{N\,y'P^{2}y}{\left(\mathrm{trace}(P)\right)^{2}}$$

where y is the target vector, N is the number of observations, and P is the projection matrix, given by $P = I - XA^{-1}X'$. See Statistics for definition of A.

An important feature of using GCV as a criterion for determining the optimal network in our fit algorithms is the existence of update formulas for the regularization parameter $\lambda$. These update formulas are obtained by differentiating GCV with respect to $\lambda$ and setting the result to zero. That is, they are gradient-based.

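This gives the general equation (from Orr, 6, References)

$$\lambda = \frac{y'P^{2}y\;\mathrm{trace}\!\left(A^{-1}-\lambda A^{-2}\right)}{\beta'A^{-1}\beta\;\mathrm{trace}(P)}$$

where $\beta = A^{-1}X'y$ is the vector of weights.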
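As a sketch of where a re-estimation formula of this kind comes from, assume the ridge form $A = X'X + \lambda I$, weights $\beta = A^{-1}X'y$, and $\mathrm{GCV} = N\,y'P^{2}y/(\mathrm{trace}(P))^{2}$ with $P = I - XA^{-1}X'$. The symbols are the ones used in this section; the intermediate steps are filled in here for illustration and are not quoted from Orr.

```latex
\begin{aligned}
\frac{\partial P}{\partial\lambda} &= XA^{-2}X', \qquad PX = \lambda XA^{-1},\\[4pt]
\frac{\partial}{\partial\lambda}\,y'P^{2}y
  &= 2\,y'P\,XA^{-2}X'y
   = 2\lambda\,y'XA^{-3}X'y
   = 2\lambda\,\beta'A^{-1}\beta,\\[4pt]
\frac{\partial}{\partial\lambda}\,\mathrm{trace}(P)
  &= \mathrm{trace}\!\left(A^{-2}X'X\right)
   = \mathrm{trace}\!\left(A^{-1}-\lambda A^{-2}\right),\\[4pt]
\frac{\partial\,\mathrm{GCV}}{\partial\lambda} = 0
  \;&\Longrightarrow\;
  \lambda\,\beta'A^{-1}\beta\;\mathrm{trace}(P)
  = y'P^{2}y\;\mathrm{trace}\!\left(A^{-1}-\lambda A^{-2}\right).
\end{aligned}
```

Solving the last line for the $\lambda$ on the left-hand side gives a formula whose right-hand side still depends on $\lambda$, so it is used as an update: evaluate the right-hand side at the current $\lambda$ to obtain the next one.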

We now specialize these formulas to the case of ridge regression and to the Rols algorithm.

GCV for Ridge Regression

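It is shown in Orr (4), and stated in Orr (5, see References), that for the case of ridge regression GCV can be written as

$$\mathrm{GCV} = \frac{N\,y'P^{2}y}{(N-\gamma)^{2}}$$

where $\gamma$ is the "effective number of parameters" that is given by

$$\gamma = \mathrm{NumTerms} - \lambda\,\mathrm{trace}\!\left(A^{-1}\right)$$

where NumTerms is the number of terms included in the model.

For RBFs, 'p' is the effective number of parameters, that is, the number of terms minus an adjustment to take into account the smoothing effect of $\lambda$ in the fitting algorithm. When $\lambda = 0$, the effective number of parameters is the same as the number of terms.

The formula for updating $\lambda$ is given by

$$\lambda = \frac{\eta}{N-\gamma}\;\frac{y'P^{2}y}{\beta'A^{-1}\beta}$$

where $\beta = A^{-1}X'y$ is the vector of weights and

$$\eta = \mathrm{trace}\!\left(A^{-1}-\lambda A^{-2}\right)$$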

In practice, the preceding formulas are not used explicitly in Orr (5, see References). Instead a singular value decomposition of X is made, and the formulas are rewritten in terms of the eigenvalues and eigenvectors of the matrix XX'. This avoids taking the inverse of the matrix A, and it can be used to cheaply compute GCV for many values of $\lambda$. See Statistics for definition of A.
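One way to see why the decomposition makes repeated evaluation cheap is sketched below. It assumes the ridge form $A = X'X + \lambda I$ and writes $\mu_j$ for the nonzero eigenvalues of XX' (the squared singular values of X) and $u_j$ for the corresponding unit eigenvectors; these symbols are introduced here for illustration and the expressions are not necessarily in the exact form used by Orr.

```latex
\begin{aligned}
XA^{-1}X' &= \sum_{j}\frac{\mu_j}{\mu_j+\lambda}\,u_j u_j',\\[4pt]
\gamma = \mathrm{trace}\!\left(XA^{-1}X'\right) &= \sum_{j}\frac{\mu_j}{\mu_j+\lambda},\\[4pt]
y'P^{2}y &= y'y-\sum_{j}\frac{\mu_j\,(\mu_j+2\lambda)}{(\mu_j+\lambda)^{2}}\,(u_j'y)^{2},\\[4pt]
\mathrm{GCV}(\lambda) &= \frac{N\,y'P^{2}y}{(N-\gamma)^{2}}.
\end{aligned}
```

The quantities $\mu_j$ and $u_j'y$ do not depend on $\lambda$, so after one decomposition each further value of $\lambda$ costs only a sum over the terms.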

GCV for Rols

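In the case of Rols, the components for the formula

$$\mathrm{GCV} = \frac{N\,y'P^{2}y}{(N-\gamma)^{2}}$$

are computed using the formulas given in Orr [6; see References]. Recall that the regression matrix is factored during the Rols algorithm into the product X = WB. Let $w_j$ denote the jth column of W; then we have

$$y'P^{2}y = y'y-\sum_{j}\frac{(2\lambda+w_j'w_j)\,(y'w_j)^{2}}{(\lambda+w_j'w_j)^{2}}$$

and the "effective number of parameters" is given by

$$\gamma = \sum_{j}\frac{w_j'w_j}{\lambda+w_j'w_j}$$

where the sums run over the columns of W. This is equivalent to 'p' (the effective number of parameters) defined in GCV for Ridge Regression.

The reestimation formula for $\lambda$ is given by

$$\lambda = \frac{\eta}{N-\gamma}\;\frac{y'P^{2}y}{g'(W'W+\lambda I)^{-1}g}$$

where $g = (W'W+\lambda I)^{-1}W'y$ is the weight vector in the orthogonalized basis, and where additionally

$$\eta = \sum_{j}\frac{w_j'w_j}{(\lambda+w_j'w_j)^{2}}$$

and

$$g'(W'W+\lambda I)^{-1}g = \sum_{j}\frac{(y'w_j)^{2}}{(\lambda+w_j'w_j)^{3}}$$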

Note that these formulas for Rols do not require the explicit inversion of A. See Statistics for definition of A.
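The point that no matrix inversion is needed can be illustrated with a short sketch that works only with dot products of the orthogonal columns. The function name `rols_gcv_terms` and its return layout are hypothetical. It assumes the projection matrix takes the form P = I - W(W'W + lam*I)^{-1}W' and that the update is obtained by setting the derivative of GCV with respect to lam to zero in the orthogonalized basis; it is not taken from the toolbox source.

```python
def rols_gcv_terms(W_cols, y, lam):
    """Return (gcv, gamma, lam_new) from the orthogonal columns of W.

    W_cols: list of mutually orthogonal columns w_j (each a list of floats)
    y:      target vector
    lam:    current regularization parameter
    """
    n = len(y)
    yPPy = sum(v * v for v in y)  # accumulates y'P^2y, starting from y'y
    gamma = 0.0                   # effective number of parameters
    eta = 0.0                     # sum of s_j / (lam + s_j)^2
    gAg = 0.0                     # sum of (y'w_j)^2 / (lam + s_j)^3
    for w in W_cols:
        s = sum(v * v for v in w)             # w_j'w_j
        c = sum(a * b for a, b in zip(w, y))  # y'w_j
        yPPy -= (2.0 * lam + s) * c * c / (lam + s) ** 2
        gamma += s / (lam + s)
        eta += s / (lam + s) ** 2
        gAg += c * c / (lam + s) ** 3
    gcv = n * yPPy / (n - gamma) ** 2
    # One re-estimation step for the regularization parameter.
    lam_new = eta / (n - gamma) * yPPy / gAg
    return gcv, gamma, lam_new
```

In use, the returned `lam_new` would be fed back in as `lam` until the value stops changing. With `lam = 0`, `gamma` equals the number of columns, which matches the statement that the effective number of parameters is then the same as the number of terms.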

References

  1. Chen, S., Chng, E.S., Alkadhimi, Regularized Orthogonal Least Squares Algorithm for Constructing Radial Basis Function Networks, Int. J. Control, 1996, Vol. 64, No. 5, pp. 829-837.

  2. Hassoun, M., Fundamentals of Artificial Neural Networks, MIT, 1995.

  3. Orr, M., Introduction to Radial Basis Function Networks, available from http://www.anc.ed.ac.uk/rbf/rbf.html.

  4. Orr, M., Optimizing the Widths of Radial Basis Functions, available from http://www.anc.ed.ac.uk/rbf/rbf.html.

  5. Orr, M., Regularisation in the Selection of Radial Basis Function Centers, available from http://www.anc.ed.ac.uk/rbf/rbf.html.

  6. Wendland, H., Piecewise Polynomials, Positive Definite and Compactly Supported Radial Basis Functions of Minimal Degree, Advances in Computational Mathematics 4 (1995), pp. 389-396.

  


© 1984-2009 The MathWorks, Inc.