Local Models

What Is Local Level?

When you select a local node in the model tree, the local level view appears. At the local level you can work with the plots, statistics, outlier tools, and response features described in the following sections.

Note that after the two-stage model is calculated, the local node icon changes to a two-stage icon to reflect this. See the model tree for clarification. The response node also has a two-stage icon, but selecting it produces the response level view instead.

The following example shows a local model of torque/spark curves. In this example the two-stage model has been calculated so you can compare the local fit and the two-stage fit on the plots.

The default view is the Model tab, with plots described below. You can click the Data tab to view plots of other variables. See Data Tab.

Using Local Model Plots

Local Special Plots

You can scroll through all the local models by using the up and down test buttons, by typing a test number directly in the edit box, or by clicking Select Test to go directly to a test number.

The lower plots are referred to as special plots as they can be different for different models.

The lower plot at the local level shows the local model fit to the data for the current test only, with the datum point if there is a datum model. If there are multiple inputs to the local model, a predicted/observed plot is displayed instead. In this case, to examine the model surface in more detail, you can use Model > Evaluate. See Model Evaluation Window.

To examine the local fit in more detail, double-click the arrows (indicated in the preceding figure) to hide the scatter plot and expand the lower plot. You can zoom in on parts of the plot by Shift-click-dragging or middle-click-dragging on the place of interest on the plot. Return to full size by double-clicking.

You can select the plot type from the drop-down menu at the top of the plot. The available choices depend on the model.

Both plots have right-click context menus. On both plots you can manipulate outliers with all the same commands available in the Outliers menu. See Outliers Menu (Local Level) for details.

The Print to Figure command opens a MATLAB figure showing the current plot. On the special plots you can switch the confidence intervals and legend on and off, and hide or show removed data. For both plots you can switch the display of Record Numbers on and off. Record numbers are similar to test numbers for global models, but identify individual records within tests.

Local Scatter Plots

The upper plots are referred to as scatter plots. They can show various scatter plots of statistics for assessing goodness-of-fit for the current local model. If you resize the Browser so that there is too little room, the upper scatter plots are replaced by an icon.

The statistics available for plotting are model dependent.

The drop-down menus on the scatter plot let you change the x and y factors. In this example, spark is the local input factor and torque is the response. The local inputs, the response, and the predicted response are always available in these menus. The observation number is also always available.

The other options are statistics that are model dependent, and can include residuals, weighted residuals, studentized residuals, and leverage. At local level these are internally studentized residuals.
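For reference, the usual textbook definition of an internally studentized residual for a linear least-squares model is sketched below. The symbols are introduced here for illustration and are not taken from the toolbox: e_i is the ordinary residual for observation i, h_i is its leverage, n is the number of observations, and p is the number of parameters.

$$
r_i = \frac{e_i}{s\,\sqrt{1 - h_i}}, \qquad s^2 = \frac{1}{n - p}\sum_{j=1}^{n} e_j^2
$$

"Internally" means that the error estimate s is computed from all n observations, including observation i itself.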

Data Tab

When you click the Data tab at the local level, or select View > Data Plots, you can view plots of the data for the current test.

In the Plot Variables Setup dialog box, you can choose to view any of the data signals in the data set for the current test (including signals not being used in modeling).

Test Notes Pane

You can use the Test Notes pane to record information on particular tests. Each test has its own notes pane. Data points with notes recorded against them are colored in the global model plots. You choose the color using the Set Color button in the Test Notes pane.

Viewing Local Model Statistics

Diagnostic Statistics Pane

Use the drop-down menu in the Diagnostic Statistics pane to select the information displayed in the pane.

Scroll bars appear if the pane is too small to show all the information.

Pooled Statistics

These statistics appear at the local node (when two-stage modeling) in the Pooled Statistics table, and at the response node in the list of local models. If you have a selection of local or two-stage models, use these statistics to help you choose the best model.

Local RMSE

Root mean squared error between the local model and the data for all tests. The divisor used for RMSE is the number of observations minus the number of parameters.

Two-Stage RMSE

Root mean squared error between the two-stage model and the data for all tests. You want this error to be small for a good model fit.

PRESS RMSE

Root mean squared error of predicted errors, useful for indicating overfitting; see PRESS statistic. The divisor used for PRESS RMSE is the number of observations. This statistic is not displayed for MLE models because the simple univariate formula cannot be used.
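The two divisors described above can be illustrated with a minimal sketch for a single linear least-squares fit. The data values and variable names are hypothetical, and the sketch uses the standard identity that a leave-one-out (PRESS) residual equals the ordinary residual divided by one minus the leverage. It is not the toolbox's implementation, which pools these quantities over all tests.

```matlab
% Hypothetical spark/torque data for one test (illustration only)
x = (0:5:40)';
y = [18.2; 26.9; 33.1; 38.4; 40.2; 40.9; 38.7; 34.1; 27.6];

% Quadratic model fitted by ordinary least squares
X    = [ones(size(x)), x, x.^2];
beta = X \ y;
res  = y - X * beta;

n = numel(y);        % number of observations
p = size(X, 2);      % number of parameters

% RMSE: divisor is observations minus parameters
rmse = sqrt(sum(res.^2) / (n - p));

% PRESS RMSE: leave-one-out residuals via leverage, divisor is observations
h         = diag(X * ((X' * X) \ X'));
pressRes  = res ./ (1 - h);
pressRmse = sqrt(sum(pressRes.^2) / n);
```

A PRESS RMSE much larger than the RMSE suggests that the model is fitting individual points too closely, which is the overfitting indication mentioned above.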

Two-Stage T^2

T^2 is a normalized sum of squared errors for all the response features models. You can see the basic formula on the Likelihood view of the Model Selection window.

The formula uses a blockdiag (block-diagonal) matrix in which Ci is the local covariance for test i. See the blockdiag explanation following.

A large T^2 value indicates that there is a problem with the response feature models.

-log L

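Negative log-likelihood. The likelihood is the probability of a set of observations given the value of some parameters. You want the likelihood to be large, so -log L should tend towards -infinity: a large negative value is good.

For n independent observations x_1, x_2, ..., x_n with probability distribution f(x, θ), the likelihood is:

$$
L(\theta) = \prod_{i=1}^{n} f(x_i, \theta)
$$

This is the basis of MLE. Taking logarithms gives:

$$
\log L(\theta) = \sum_{i=1}^{n} \log f(x_i, \theta)
$$

The form of -log L reported here assumes a normal distribution.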
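As an illustration of the normal case only, assume independent observations from a univariate normal distribution with mean μ and variance σ². These symbols are introduced here and this is not the toolbox's multivariate formula. The negative log-likelihood is then:

$$
-\log L(\mu, \sigma^2) = \frac{n}{2}\log\left(2\pi\sigma^2\right) + \frac{1}{2\sigma^2}\sum_{i=1}^{n}\left(x_i - \mu\right)^2
$$

The second term is a scaled sum of squared errors, which is why a better fit lowers -log L.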

You can view plots of -log L in the Model Selection window, see Likelihood View.

Validation RMSE

Root mean squared error between the two-stage model and the validation data for all tests.

The blockdiag term that appears under T^2 in the Pooled Statistics table denotes a block-diagonal matrix with one block per test, built from the local covariance Ci for test i.
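A block-diagonal matrix of this kind can be assembled in MATLAB with the blkdiag function. The sketch below uses three hypothetical 2-by-2 covariance matrices, one per test. The exact contents of each block in the toolbox calculation may include further terms, so treat this only as an illustration of the layout.

```matlab
% Hypothetical local covariance matrices for three tests
C1 = [4 1; 1 2];
C2 = [3 0; 0 1];
C3 = [5 2; 2 6];

% One block per test on the diagonal, zeros everywhere else
Sigma = blkdiag(C1, C2, C3);

disp(size(Sigma))   % 6-by-6 for three 2-by-2 blocks
```

Because the off-diagonal blocks are zero, quantities from different tests do not interact in the resulting matrix.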

Using the RMSE Explorer with Local Models

You can open the RMSE Explorer to view plots of the standard errors of all the tests, both overall and by response feature. This tool can help you quickly identify problem tests. You can navigate to a test of interest from the RMSE Explorer by double-clicking a point in the plot to select the test in the Model Browser local model view.

The plot displays one value of standard error per test, overall and for each response feature. As a best practice, first plot s_e against test number to see how the error is distributed and to locate any tests with much higher errors. Right-click to toggle the display of test numbers. Ideally, all the standard errors should be roughly the same value to satisfy the statistical assumptions for two-stage models. If these assumptions are not satisfied, error estimates for two-stage models may not be valid.

You can also use the X- and Y-axis factor drop-down lists to plot these standard errors against the global variables to examine the global distribution of error.

Removing Outliers and Updating Fits

Removing and Restoring Outliers

You can use the right-click context menus on all plots or the Outliers menu to remove and restore outliers.

Outliers Menu (Local Level).  

All the commands except Remove All Data and Copy Outliers are also available in the right-click context menus on all plots.

Updating Fits

When you remove an outlier from your local model, it refits immediately. Other dependent fits also need updates. You can choose when to update the other fits. Removing an outlier can affect several other models. Removing an outlier from a best local model changes all the response features for that two-stage model. The global models all change; therefore the two-stage model must be recalculated. For this reason the local model node returns to the local (house) icon and the response node becomes blank again. If the two-stage model has a datum model defined, and other models within the test plan are using a datum link model, they are similarly affected.

To update fits, either:

Outlier Selection Criteria

You can select outliers as those satisfying a condition on the value of some statistic (for example, residual > 3), or by selecting those points that fall in a region of the distribution of values of that statistic.

For example, assume that residuals are normally distributed and select those with p-value > 0.9. You can also select outliers using the values of model input factors.

The drop-down menu labeled Select using contains all the available criteria, shown in the following example.

The options available in this menu change depending on the type of model currently selected. The options are exactly the same as those found in the drop-down menus for the x- and y-axis factors of the scatter plots in the Model Browser (local level and global level views).

In the preceding example, the model selected is the knot response feature, so knot and Predicted knot appear in the criteria list, plus the global input factors; and it is a linear non-MLE model, so Cook's Distance and Leverage are also available.

The range of the selected criteria (for the current data) is indicated above the Value edit box, to give an indication of suitable values. You can type directly in the edit box. You can also use the up/down buttons on this box to change the value (incrementing by about 10% of the range).

Distribution.  You can use the Distribution drop-down menu to remove a proportion of the tail ends of the normal or t distribution. For example, you can select the residuals found in the tails of the distribution that make up 10% of the total area. If you had a vast data set, approximately 10% of the residuals would be selected as outliers.

Residuals found beyond the critical value in the distribution are selected as outliers. The chosen percentage is a measure of significance; that is, the probability of finding residuals beyond the critical value is less than 10%. The absolute value (the modulus) is used, so outliers are selected in both tails of the distribution.

The t distribution is used for limited degrees of freedom.
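The two-tailed selection can be sketched as follows for the normal case. The residual values and variable names are hypothetical, and the critical value is computed with the base MATLAB function erfcinv so that no extra toolbox is needed. This is an illustration of the idea, not the toolbox's own code.

```matlab
% Hypothetical studentized residuals for one test
r = [0.3; -1.9; 0.8; 2.4; -0.2; 1.1; -1.7; 0.05];

% Total area in both tails (10%)
alpha = 0.10;

% Two-sided critical value of the standard normal:
% P(|Z| > zCrit) = erfc(zCrit / sqrt(2)) = alpha
zCrit = sqrt(2) * erfcinv(alpha);

% Absolute value selects points in both tails
isOutlier = abs(r) > zCrit;
```

With these values the critical value is about 1.645, so the residuals -1.9, 2.4, and -1.7 are selected.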

If you select None in the Distribution drop-down menu, you can choose whether or not to use the absolute value. That is, you are selecting outliers using the actual values rather than a distribution. Using absolute value allows you to select using magnitude only without taking sign into account (for example, both plus and minus ranges). You can select No here if you are only interested in one direction: positive or negative values, above or below the value entered. For example, selecting only values of speed below 2000 rpm.

The Select using custom MATLAB file check box enables the adjacent edit box. Here you can choose a function file that selects outliers. Type the name of the file and path into the edit box, or use the browse button.

In this file you define a MATLAB function of the form:

function outIndices = funcname(Model, Data, Names)

Model is the current MBC model.

Data is the data used in the scatter plots. For example, if there are currently 10 items in the drop-down menus on the scatter plot and 70 data points, the data make up a 70 x 10 array.

Names is a cell array containing the strings from the drop-down menus on the scatter plot. These label the columns in the data (for example, spark, residuals, leverage, and so on).

The output, outIndices, must be an array of logical indices, the same size as one column in the input Data, so that it contains one index for each data point. Those points where index = 1 in outIndices are highlighted as outliers; the remainder are not highlighted.
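A minimal sketch of such a function file is shown below. The function name selectLargeResiduals, the column label 'Studentized residuals', and the threshold of 3 are all hypothetical; the actual labels depend on the entries in the scatter plot drop-down menus for your model.

```matlab
function outIndices = selectLargeResiduals(Model, Data, Names)
% Flag points whose studentized residual exceeds 3 in magnitude.
% Model is required by the calling convention but is not used here.

% Find the data column whose label matches the assumed name
col = find(strcmpi(Names, 'Studentized residuals'), 1);

if isempty(col)
    % Label not present for this model type: select nothing
    outIndices = false(size(Data, 1), 1);
    return
end

% Logical column vector with one entry per data point
outIndices = abs(Data(:, col)) > 3;
end
```

Returning an all-false vector when the label is missing keeps the output the required size for every model type.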

Local Level Tools

Local Level: Toolbar

This toolbar appears when a local node is selected in the model tree.

The eight icons on the left remain constant throughout the levels. They are for project and node management and for help. At this level the print icon is enabled because there are plots in this view. See Project Level: Toolbar for details on these buttons. In the example shown, the slider bar has been dragged to hide the Help button.

Local Level: Menus

File Menu.  Only the New (child node) and Delete (current node) functions change according to the node level currently selected. Otherwise the File menu remains constant.

See File Menu.

Window and Help Menus.  The Window and Help menus remain throughout the Model Browser, offering access to different windows, general help and context help.

See Window Menu and Help Menu.

Model Menu (Local Level).  

See also

View Menu (Local Level).  

Calculating Two-Stage Models and Response Features

The Response Features List at the bottom of the local view shows a list of all the response features calculated for the local model.

To calculate a two-stage model, click the Select button in the Response Features List pane to open the Model Selection window. You must complete this step before two-stage models can be calculated: Select forms a two-stage model from the local and global models. After calculating the two-stage model, you can compare the local fit and the two-stage fit on the local level plots.

The list view displays the number of parameters and observations, the value of any Box-Cox transformation (1 indicates no transform), and the values of RMSE and PRESS RMSE (linear models only) for each response feature model. For definitions of RMSE and PRESS RMSE, see Summary Table. For information on Box-Cox transforms, see Box-Cox Transformation.

Click New to add a new response feature model (or Delete to remove one). For more information see the Test Plans List Pane. The contents of this pane change in different views; it always contains the child nodes of the node selected in the Model Tree (and the New, Delete, and Select buttons). At the local level it contains a list of response features.

  


© 1984-2012 The MathWorks, Inc.