R2011b Documentation → Model-Based Calibration Toolbox
Introducing Matching Data to Designs |
We provide an example project to illustrate the process of matching experimental data to designs.
Experimental data is unlikely to be identical to the desired design points. You can use the Cluster Plot view in the Data Editor to compare the actual data collected with your experimental design points. Here you can select data for modeling. If you are interested in collecting more data, you can update your experimental design by matching data to design points to reflect the actual data collected. You can then optimally augment your design (using the Design Editor) to decide which data points it would be most useful to collect, based on the data obtained so far.
You can use an iterative process: make a design, collect some data, match that data with your design points, modify your design accordingly, then collect more data, and so on. You can use this process to optimize your data collection process in order to obtain the most robust models possible with the minimum amount of data.
To see the data matching functions, select File > Open Project and browse to the file Data_Matching.mat in the mbctraining directory.
Click the Spark Sweeps node in the model tree to change to the test plan view, as shown.
Here you can see the two-stage test plan with model types and inputs set up. The global model has an associated experimental design (which you could view in the Design Editor). You are going to use the Data Editor to examine how closely the data collected so far matches the experimental design.

Click the Select Data button in the toolbar.
The Data Editor appears.
You need a Cluster View to examine design and data points. Right-click a view in the Data Editor and select Current View > Cluster View.

In the Cluster Plot you can see colored areas containing points. These are "clusters" where closely matching design and data points have been selected by the matching algorithm.
Tolerance values (derived initially from a proportion of the ranges of the variables) are used to determine if any data points lie within tolerance of each design point. Data points that lie within tolerance of any design point are matched to that cluster. Data points that fall inside the tolerance of more than one design point form a single cluster containing all those design and data points. If no data points lie within tolerance of a design point, it remains unmatched and no cluster is plotted.
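The matching rule described above can be sketched in Python. This is only an illustration of the stated rule, not the toolbox's implementation: the function names, the dictionary representation of points, and all numeric values are assumptions made for the example. A data point is treated as within tolerance of a design point when the difference in every factor is at most that factor's tolerance, and a union-find structure merges design points that share a data point into one cluster.

```python
def within_tolerance(design, data, tol):
    """True if the data point is within tolerance of the design point in every factor."""
    return all(abs(design[k] - data[k]) <= tol[k] for k in tol)


def match_clusters(design_pts, data_pts, tol):
    """Return (clusters, unmatched_design, unmatched_data).

    Nodes are ("design", i) and ("data", j). A design point and a data point
    within tolerance of each other are joined; each connected group is a cluster.
    """
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path compression
            x = parent[x]
        return x

    for i, d in enumerate(design_pts):
        for j, p in enumerate(data_pts):
            if within_tolerance(d, p, tol):
                parent[find(("design", i))] = find(("data", j))

    groups = {}
    for node in list(parent):
        groups.setdefault(find(node), []).append(node)
    clusters = sorted(sorted(g) for g in groups.values())

    # Points that never joined a cluster stay unmatched.
    unmatched_design = [i for i in range(len(design_pts)) if ("design", i) not in parent]
    unmatched_data = [j for j in range(len(data_pts)) if ("data", j) not in parent]
    return clusters, unmatched_design, unmatched_data


# Hypothetical tolerances and points, chosen only to exercise the rule.
tol = {"SPEED": 100, "LOAD": 5}
design_pts = [
    {"SPEED": 1000, "LOAD": 20},
    {"SPEED": 1150, "LOAD": 20},
    {"SPEED": 3000, "LOAD": 60},
    {"SPEED": 5000, "LOAD": 80},
]
data_pts = [
    {"SPEED": 1075, "LOAD": 22},  # within tolerance of design points 0 and 1
    {"SPEED": 3040, "LOAD": 58},  # within tolerance of design point 2 only
    {"SPEED": 4000, "LOAD": 40},  # within tolerance of no design point
]

clusters, unmatched_design, unmatched_data = match_clusters(design_pts, data_pts, tol)
print(clusters)
print(unmatched_design, unmatched_data)
```

In this example the first data point pulls design points 0 and 1 into a single cluster, which corresponds to the overlapping shape you see in the Cluster Plot.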

Notice the shape formed by overlapping clusters. The example shown outlined in pink is a single cluster formed where a data point lies within tolerance of two design points.
Note that on this plot you can see other unselected points that appear to be contained within this cluster. You need to track points through other factor dimensions using the axis controls to see where points are separated beyond tolerance. You will do this in a later step of this tutorial, Understanding Clusters.
To edit tolerance values, select Tools > Tolerances.
The Tolerance Editor appears. Here you can change the size of clusters in each dimension. Observe that the LOAD tolerance value is currently 100. This accounts for the elongated shape (in the LOAD dimension) of the clusters in the current plot, because this tolerance value is a high proportion of the total range of this variable.

Click the LOAD edit box and enter 20, as shown. Click OK.
Notice the change in shape of the clusters in the Cluster Plot view.
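To see why a tolerance that is a high proportion of a variable's range stretches clusters in that dimension, you can express each tolerance as a fraction of the range. The short Python sketch below does this arithmetic. The factor ranges and the SPEED tolerance are hypothetical; only the LOAD tolerance values of 100 and 20 come from this tutorial.

```python
def tolerance_fractions(tol, ranges):
    """Return each tolerance as a fraction of the full range of its factor."""
    return {name: tol[name] / (hi - lo) for name, (lo, hi) in ranges.items()}


# Hypothetical factor ranges (low, high); not taken from the example project.
ranges = {"SPEED": (500.0, 6000.0), "LOAD": (0.0, 250.0)}

before = tolerance_fractions({"SPEED": 100.0, "LOAD": 100.0}, ranges)
after = tolerance_fractions({"SPEED": 100.0, "LOAD": 20.0}, ranges)

print("LOAD tolerance / range before:", before["LOAD"])
print("LOAD tolerance / range after: ", after["LOAD"])
```

Under these assumed ranges, the LOAD tolerance drops from a large fraction of the range to a small one, which is the kind of change that shrinks the clusters along the LOAD axis.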

Shift-click (or center-click) and drag to zoom in on an area of the plot, as shown. Double-click to return to the full-size plot.

Click a cluster to select it. Selected points or clusters are outlined in pink. If you click and hold, you can inspect the values of global variables at the selected points (or for all data and design points if you click on a cluster). You can use this information to help you decide on suitable tolerance values if you are trying to match points.
You need to ensure you are displaying a Cluster Information list view to select or exclude points. The Data Editor retains memory of previous data views and if you had a cluster plot in your saved settings then this plot is used.
If you do not already have a Cluster Information list view displayed, right-click the Cluster Plot view and select Split Vertically. A new view appears underneath the cluster plot. Right-click the new view and select Current Plot > Cluster Information.

Notice that the Cluster Information list view shows the details of all data and design points contained in the selected cluster. You use the check boxes here to select or exclude data or design points. Click different clusters to see a variety of points. The list view shows the values of global variables at each point, and which data and design points are within tolerance of each other. Your selections here determine which data will be used for modeling, and which design points will be replaced by actual data points.
If you are not interested in collecting more data, then there is no need to make sure the design is modified to reflect the actual data. All data (except those you exclude by clearing the check boxes) will be used for modeling.
However, if you want your new design (called Actual Design) to accurately reflect the data obtained so far, for example because you intend to collect more data, then the cluster matching is important. All data points with a selected check box will be added to the new Actual Design, except those in red clusters. The color of a cluster indicates the proportion of selected design and data points it contains, as follows:
Green clusters have equal numbers of selected design and selected data points. The data points will replace the design points in the Actual Design.
Blue clusters have more data points than design points. All the data points will replace the design points in the Actual Design.
Red clusters have more design points than data points. These data points will not be added to your design as the algorithm cannot choose which design points to replace, so you must manually make selections to deal with red clusters if you want to use these data points in your design. The example Cluster Information list view shows a selected red cluster with more design than data points.
Note that the color of all clusters is determined by the proportion of selected points they contain; excluded points (with cleared check boxes) have no effect. Your check box selections can change cluster color.
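The color rule can be written as a small Python function. This is a sketch of the rule as stated here, not toolbox code; the function name and the list-of-check-boxes representation are assumptions. The text does not say how a cluster with no selected points of either kind is treated, so the sketch simply applies the same comparison.

```python
def cluster_color(design_selected, data_selected):
    """Classify a cluster from the check box states of its points.

    Each argument is a list of booleans, one per point in the cluster;
    only points with a selected (True) check box are counted.
    """
    n_design = sum(design_selected)
    n_data = sum(data_selected)
    if n_data == n_design:
        return "green"  # equal numbers of selected design and data points
    if n_data > n_design:
        return "blue"   # more selected data than design points
    return "red"        # more selected design than data points


# Two design points within tolerance of one data point: a red cluster.
print(cluster_color([True, True], [True]))

# Clearing one design check box leaves one of each: the cluster turns green.
print(cluster_color([True, False], [True]))

# One design point with two data points: a blue cluster.
print(cluster_color([True], [True, True]))
```

Because only selected points are counted, changing a check box is enough to change the color of a cluster.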
If you don't care about the Actual Design (for example, if you do not intend to collect more data) and you are just selecting data for modeling, then you can ignore red clusters. The data points in red clusters are selected for modeling.
Right-click the Cluster Plot and select Viewer Options > Select Unmatched Data. Notice that the remaining unmatched data points appear in the Cluster Information list view. Here you can use the check boxes to select or exclude unmatched data in the same way as points within clusters.
Select a cluster, then use the drop-down menu to change the Y-Axis factor to INJ. Observe the selected cluster now plotted in the new factor dimensions of SPEED and INJ.
You can use this method to track points and clusters through the dimensions. This can give you a good idea of which tolerances to change in order to get points matched. Remember that points that do not form a cluster may appear to be perfectly matched when viewed in one pair of dimensions; you must view them in other dimensions to find out where they are separated beyond the tolerance value. You can use this tracking process to decide whether you want particular pairs of points to be matched, and then change the tolerances until they form part of a cluster.
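The same tracking can be expressed numerically. The Python sketch below, which is an illustration only, reports the factors in which a design point and a data point differ by more than the tolerance. The function name, the tolerances, and the point values are hypothetical; SPEED, LOAD, and INJ are used as factor names because they appear in this tutorial.

```python
def separating_factors(point_a, point_b, tol):
    """Return {factor: absolute difference} for factors separated beyond tolerance."""
    gaps = {}
    for factor, limit in tol.items():
        gap = abs(point_a[factor] - point_b[factor])
        if gap > limit:
            gaps[factor] = gap
    return gaps


# Hypothetical tolerances and points.
tol = {"SPEED": 100, "LOAD": 20, "INJ": 5}
design_pt = {"SPEED": 2000, "LOAD": 50, "INJ": 10}
data_pt = {"SPEED": 2010, "LOAD": 52, "INJ": 25}

# The pair looks matched when plotted in SPEED and LOAD, but not in INJ.
print(separating_factors(design_pt, data_pt, tol))
```

An empty result means the pair is within tolerance in every factor. A non-empty result names the tolerances you would need to increase for the pair to form a cluster.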
Clear the Equal Data and Design check box in the Cluster Plot view. You control what is plotted using these check boxes.

This removes the green clusters from view, as shown. These clusters are matched; you are more likely to be interested in unmatched points and clusters with uneven numbers of data and design points. Removing the green clusters allows you to focus on these points of interest. If you want your new Actual Design to accurately reflect your current data, your aim is to get as many data points matched up to design points as possible; that is, as few red clusters as possible.
Clear the check box for More Data than Design. You may also decide to ignore blue clusters, which contain more data points than design points. These design points will be replaced by all data points within the cluster. An excess of data points is unlikely to be a concern.
However, blue clusters may indicate that there was a problem with the data collection at that point, and you may want to investigate why more points than expected were collected.

Select one of the remaining red clusters. Both of these have two design points within tolerance of a single data point.
Choose one of the design points to match to the data point, then clear the check box of the other design point. The cleared design point remains unchanged in the design. The selected design point will be replaced by the matched data point.
Notice that the red cluster disappears. This is because your selection results in a cluster with an equal number of selected data and design points (a green cluster) and your current plot does not display green clusters.
Repeat for the other red cluster.
Now all clusters are green or blue. There are two remaining unmatched data points.
Clear the Unmatched Design check box to locate the unmatched data points. Then select the Unmatched Design check box again, because you need to see the design points to decide whether any are close enough to the data points to be matched.
Locate and zoom in on an unmatched data point. Select the unmatched data point and a nearby design point by clicking, then use the axis drop-down menus to track the candidate pair through the dimensions. Decide if any design points are close enough to warrant changing the tolerance values to match the point with a design point.
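One way to judge whether a design point is close enough is to scale the separation in each factor by that factor's tolerance and look at the largest ratio. The Python sketch below ranks design points for one unmatched data point on that basis. It is a hypothetical helper, not a toolbox function, and the tolerances and points are made-up values.

```python
def scaled_separation(data_pt, design_pt, tol):
    """Largest |difference| / tolerance over all factors; 1 or less means within tolerance."""
    return max(abs(data_pt[k] - design_pt[k]) / tol[k] for k in tol)


def nearest_design_candidates(data_pt, design_pts, tol, n=3):
    """Return up to n (scaled separation, design index) pairs, closest first."""
    ranked = sorted(
        (scaled_separation(data_pt, d, tol), i) for i, d in enumerate(design_pts)
    )
    return ranked[:n]


# Hypothetical tolerances, design points, and unmatched data point.
tol = {"SPEED": 100, "LOAD": 20, "INJ": 5}
design_pts = [
    {"SPEED": 2450, "LOAD": 55, "INJ": 10},
    {"SPEED": 4000, "LOAD": 60, "INJ": 18},
    {"SPEED": 2520, "LOAD": 90, "INJ": 17},
]
unmatched = {"SPEED": 2500, "LOAD": 60, "INJ": 18}

candidates = nearest_design_candidates(unmatched, design_pts, tol)
for ratio, index in candidates:
    print(f"design point {index}: worst factor is {ratio:.2f} x tolerance")
```

A ratio only slightly above 1 suggests that a modest tolerance increase would match the pair, while a large ratio suggests the data point is best left unmatched.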
Recall that you can right-click the Cluster View and select Viewer Options > Select Unmatched Data to display the remaining unmatched data points in the Cluster Information list view. Here you can use the check boxes to select or exclude these points. If you leave them selected, they will be added to the Actual Design.
These steps illustrate the process of matching data to designs, to select modeling data and to augment your design based on actual data obtained. Some trial and error is necessary to find useful tolerance values. You can select points and change plot dimensions to help you find suitable values. If you want your new Actual Design to accurately reflect your experimental data, you need to make choices to deal with red clusters. Select which design points in red clusters you want to replace with the data points. If you do not, then these data points will not be added to the new design.
When you are satisfied that you have selected all the data you want for modeling, close the Data Editor. At this point, your choices in the cluster plots will be applied to the data set and a new design called Actual Design will be created. All the changes are determined by your check box selections for data and design points.
All data points with a selected check box are selected for modeling. Data points with cleared check boxes are excluded from the data set. Changes are made to the existing design to produce the new Actual Design. All selected data will be added to your new design, except those in red clusters. Selected data points that have been matched to design points (in green and blue clusters) replace those design points.
All these selected data points become fixed design points (red in the Design Editor) and appear as Data in Design (pink crosses) when you reopen the Data Editor.
This means these points will not be included in clusters when matching again. These fixed points will also not be changed in the Design Editor when you add points, though you can unlock fixed points if you want. This can be very useful if you want to optimally augment a design, taking into account the data you have already obtained.
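The rules for creating the Actual Design can be summarized in a Python sketch. This is a reading of the rules stated in this tutorial, not the toolbox's code: the function name, the cluster representation (pairs of design and data index lists), and the example values are all assumptions.

```python
def build_actual_design(design_pts, data_pts, clusters, design_sel, data_sel):
    """Return the Actual Design as a list of {"point": ..., "fixed": bool} entries.

    clusters: list of (design_indices, data_indices) pairs.
    design_sel, data_sel: check box states, one boolean per point.
    """
    replaced = set()    # design points replaced by matched data
    added = []          # data points added as fixed design points
    clustered = set()   # data points that belong to any cluster

    for design_idx, data_idx in clusters:
        clustered.update(data_idx)
        sel_design = [i for i in design_idx if design_sel[i]]
        sel_data = [j for j in data_idx if data_sel[j]]
        if len(sel_design) > len(sel_data):
            continue  # red cluster: design unchanged, data not added
        replaced.update(sel_design)  # green or blue cluster
        added.extend(sel_data)

    # Selected unmatched data points are also added to the design.
    added.extend(
        j for j in range(len(data_pts)) if j not in clustered and data_sel[j]
    )

    actual = [
        {"point": design_pts[i], "fixed": False}
        for i in range(len(design_pts))
        if i not in replaced
    ]
    actual += [{"point": data_pts[j], "fixed": True} for j in sorted(added)]
    return actual


# Hypothetical one-factor example: 4 design points, 3 data points.
design_pts = [{"SPEED": 1000}, {"SPEED": 1150}, {"SPEED": 3000}, {"SPEED": 5000}]
data_pts = [{"SPEED": 1075}, {"SPEED": 3040}, {"SPEED": 4000}]
clusters = [([0, 1], [0]), ([2], [1])]  # data point 2 is unmatched

# With every check box selected, the first cluster is red.
all_selected = build_actual_design(
    design_pts, data_pts, clusters, [True] * 4, [True] * 3
)

# Clearing design point 1 turns that cluster green, so data point 0 replaces design point 0.
resolved = build_actual_design(
    design_pts, data_pts, clusters, [True, False, True, True], [True] * 3
)

print([entry["point"]["SPEED"] for entry in all_selected])
print([entry["point"]["SPEED"] for entry in resolved])
```

The `fixed` flag stands in for the fixed design points described above: only points that came from data are marked as fixed.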
See the reference section Matching Data to Designs for more information.
See also the reference section on all aspects of data handling in the toolbox, Data.
© 1984-2012 The MathWorks, Inc.