Data Brushing with the Variable Editor
Shared variables in linked figures are highlighted in all figures when data in one is brushed. They also highlight when you open the variables in the Variable Editor.
The Variable Editor also has a Data Brushing tool. It has no Data Linking tool, however, because in the Variable Editor, variables are always "live," and their data sources therefore respond immediately to any changes you make in the Variable Editor. This means that whenever you place it in data brushing mode, brush marks and changes to data values you make in the Variable Editor appear in linked plots.
If you have linked plots of matrix data with observations across rows and where each column represents a distinct, related quantity, brushing any observation—whether in a graph or the Variable Editor—highlights all observations in the same row, as you can see in the next image.

For more information about using the Variable Editor, see Viewing and Editing Workspace Variables with the Variable Editor in the MATLAB® Desktop Tools and Development Environment documentation and see the reference page for openvar.
A datatip is a small display associated with an axes that reads out individual data observation values from a 2-D or 3-D graph. You create datatips by clicking graphs with the Data Cursor tool from the figure toolbar. When you select this tool, you are in data cursor mode, signified by a hollow cross-hair cursor, in which you identify x-, y-, and z-values of data points you click. As with data points you brush, you can export such values to the workspace.
For descriptions of data cursor properties and how to use them, see Data Cursor — Displaying Data Values Interactively in the MATLAB Graphics documentation and see the reference page for datacursormode.
The default behavior of datatips is to simply display the XData, YData, and ZData values of the selected observations as text in a box. Sometimes this information is not helpful by itself, and you might want to replace or augment it with other information. You can modify this behavior to display other facts connected to observations. You customize datatip behavior by constructing a datatip text update function (in M-code) to construct text strings for display in datatips and then instructing data cursor mode to use your function instead of the default one.
Customize data cursor update functions to display information such as
Names associated with x-, y-, and z-values
Weights associated with x-, y-, and z-values
Differences in x-, y-, and z-values from the mean or their neighbors
Transformations of values (e.g., normalizations or to different units of measure)
Related variables
You can create datatip text update functions to display such information and change their behavior on the fly. You can even make the update function behave differently for distinct observations in the same graph if your update function or the code calling it can distinguish groups of them. The next section contains an example of coding and using a customized data cursor update function.
The extended example that follows begins by using datatips to explore the incidence of fatal traffic accidents tabulated for U.S. states, with respect to state populations. The example extends this analysis to brush, link, and map the data to discover spatial patterns in the data. Each section of the example has four or fewer steps. By executing them all, you gain insight into the data set and become familiar with useful graphical data exploration techniques.
Censuses of population and other national government statistics are valuable sources of demographic and socioeconomic data. An important aspect of census data is its geography, i.e., the regions to which a given set of statistics applies, and at what level of granularity. When exploring census data, you frequently need to identify what geographic unit any given observation represents.
This example uses datatips to show place names and statistics for individual observations. You pass place names and the data matrix to a custom text update function to enable this. The place names are for U.S. states and the District of Columbia. If all these names were placed as labels on the x-axis, they would be too small or too crowded to be legible, but they are readable one at a time as datatips.
The example also illustrates how sorting a data matrix by rows can enhance interpretation when the original ordering (in this case alphabetical by state) provides no special insight into relationships among observations and variables.
Datatips can present other information beyond x-, y- and z-values. Read through the example function labeldtips, which takes three more parameters than a default callback, and displays the following information:
Its y-value
Deviation from an expected y-value
Percent deviation from the expected y-value
The observation's label (state name)
Because it customizes datatips, the function must be an M-file that you invoke from the Command Window or from a script.
function output_txt = labeldtips(obj,event_obj,...
xydata,labels,xymean)
% Display an observation's Y-data and label for a datatip
% obj Currently not used (empty)
% event_obj Handle to event object
% xydata Entire data matrix
% labels State names identifying matrix row
% xymean Ratio of y to x mean (avg. for all obs.)
% output_txt Datatip text (string or string cell array)
% This datacursor callback calculates a deviation from the
% expected value and displays it, Y, and a label taken
% from the cell array 'labels'; the data matrix is needed
% to determine the index of the x-value for looking up the
% label for that row. X values could be output, but are not.
pos = get(event_obj,'Position');
x = pos(1); y = pos(2);
output_txt = {['Y: ',num2str(y,4)]};
ydev = round((y - x*xymean));
ypct = round((100 * ydev) / (x*xymean));
output_txt{end+1} = ['Yobs-Yexp: ' num2str(ydev) ...
'; Pct. dev: ' num2str(ypct)];
idx = find(xydata == x,1); % Find index to retrieve obs. name
% The find is reliable only if there are no duplicate x values
[row,col] = ind2sub(size(xydata),idx);
output_txt{end+1} = cell2mat(labels(row));
Copy this code into an M-file and save it as labeldtips.m in your working directory or somewhere on your MATLAB path.
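The comment in labeldtips warns that the find lookup is reliable only when x-values are not duplicated. A minimal sketch with toy data (the matrix, labels, and the column-restricted variant are illustrative assumptions, not part of the original function) shows why: the search covers the whole matrix, so a matching value in an earlier column can also be picked up.

```matlab
% Toy data: column 1 holds IDs, column 2 holds the plotted x-values
xydata = [3 10; 1 20; 2 3];
labels = {'Alpha'; 'Beta'; 'Gamma'};
x = 3;                                   % x-value of the clicked point (row 3, column 2)

% Whole-matrix search, as in labeldtips: first match in column-major order
idx = find(xydata == x, 1);
[rowAll, colAll] = ind2sub(size(xydata), idx);   % finds the ID in row 1, column 1

% Hypothetical variant: search only the column that was plotted
xcol = 2;
rowCol = find(xydata(:, xcol) == x, 1);          % finds row 3

disp(labels{rowAll})                     % label from the whole-matrix search
disp(labels{rowCol})                     % label from the column-restricted search
```

If the data matrix could contain the same number in more than one column, restricting the search to the plotted column may be the safer choice, at the cost of passing the column index to the callback.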
To use this update function, first declare it as a callback in a data cursor object:
hdt = datacursormode;
set(hdt,'UpdateFcn',{@labeldtips,hwydata,statelabel,usmean})
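hdt is the handle of a data cursor mode object for the figure you want to explore. Specify the callback as a cell array containing the function handle followed by the additional arguments to pass to it; MATLAB supplies the first two arguments (obj and event_obj) itself. The call to datacursormode puts the current figure in data cursor mode.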
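The way a cell-array callback receives its arguments can be imitated with a toy example. The sketch below is only an illustration of the calling convention: the callback cb and its two numeric arguments are hypothetical, and the feval line stands in for the dispatch that MATLAB performs internally when a datatip updates.

```matlab
% A cell-array callback: function handle first, extra arguments after it
cb = {@(obj, event_obj, a, b) a + b, 10, 32};

% Imitate the dispatch: source object and event data come first (empty
% placeholders here), followed by the remaining cells as extra arguments
result = feval(cb{1}, [], [], cb{2:end});
disp(result)
```

In the same way, labeldtips is called with obj and event_obj followed by hwydata, statelabel, and usmean.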
The following steps show how you load statistical data for U.S. states, plot some of it, and enter data cursor mode to explore the data:
Load U.S. state statistics from the National Highway Traffic Safety Administration and the Bureau of the Census and look at the variables:
load 'accidents.mat'
whos
  Name             Size      Bytes  Class
  datasources      3x1        2568  cell
  hwycols          1x1           8  double
  hwydata         51x17       6936  double
  hwyheaders       1x17       1874  cell
  hwyidx          51x1         408  double
  hwyrows          1x1           8  double
  statelabel      51x1        3944  cell
  ushwydata        1x17        136  double
  uslabel          1x1          86  cell
The data set has 51 observations for 17 variables. The workspace variables contain:
The state-by-state statistics; the double 51-by-17 matrix hwydata
The variable (column) names; the 1-by-17 text cell array hwyheaders
The state names; the 51-by-1 text cell array statelabel
Values for the entire United States for the 17 variables; the 1-by-17 matrix ushwydata
The label for the US values; the 1-by-1 cell array uslabel
Metadata describing data sources; the 3-by-1 cell array datasources
(Not required) To help you interpret graphs of it, the data matrix and labels have been presorted by rows to be in ascending order of total state population. The 51-by-1 vector hwyidx contains indices from the presorting (the data were originally in alphabetic order). You should not carry out this step now, but if you ever want to re-sort the rows of the data array and state labels alphabetically, you could do the following:
[hwydata hwyidx] = sortrows(hwydata,1);
statelabel = statelabel(hwyidx);
(The first column of the hwydata matrix contains Census Bureau state IDs that ascend in alphabetical order.)
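Conversely, if the rows ever end up in alphabetical order, a sketch along the following lines could return them to population order. It assumes that column 14 holds total state population, as the plotting step in this example uses it; popidx is a name introduced here for illustration.

```matlab
load accidents                               % demo data set used in this example

% Sort rows by total population (column 14) and keep the labels in step
[hwydata, popidx] = sortrows(hwydata, 14);
statelabel = statelabel(popidx);
```

Because the shipped data are already sorted this way, running the sketch on freshly loaded data should leave the row order unchanged.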
Plot a line graph of the population by state as x versus the number of traffic fatalities per state as y:
hf1 = figure;
plot(hwydata(:,14),hwydata(:,4));
xlabel(hwyheaders(14))
ylabel(hwyheaders(4))
Because the state observations are sorted by population size, the graph is monotonic in x. The larger a population a state has, the more variation in traffic accident fatalities it tends to show.

Compute the per capita rate of traffic fatalities for the entire United States; in the next part of this example, the data cursor update function uses this average to compute an expected value for each state you query:
usmean = ushwydata(4)/ushwydata(14)
usmean =
  1.5150e-004
The statistic shows that nationally, about 15 of every 100,000 people (roughly 150 per million) die in traffic accidents every year.
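Scaling the per capita rate makes it easier to read. The short check below uses the value of usmean displayed above; the names per100k and perMillion are introduced here for illustration.

```matlab
usmean     = 1.5150e-004;      % per capita rate as displayed above
per100k    = usmean * 1e5;     % deaths per 100,000 people
perMillion = usmean * 1e6;     % deaths per 1,000,000 people
fprintf('%.2f per 100,000; %.1f per million\n', per100k, perMillion);
```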
Use usmean to compute the smallest and largest expected values by multiplying it by the smallest and largest state populations, and draw a line connecting them:
line([min(hwydata(:,14)) max(hwydata(:,14))],...
[min(hwydata(:,14))*usmean max(hwydata(:,14)*usmean)],...
'Color','m');

Note: The magenta line is not a regression line; it is a trend line that plots the number of traffic deaths that a state of a given size would have if all states obeyed the national average.
You can now explore the graphed data with the example custom data cursor update function labeldtips (which must be on the MATLAB path or in the current directory). labeldtips displays state names and y-deviations.
Turn on data cursor mode and invoke the custom callback:
hdt = datacursormode;
set(hdt,'DisplayStyle','window');
% Declare a custom datatip update function to display state names
set(hdt,'UpdateFcn',{@labeldtips,hwydata,statelabel,usmean})
The data cursor 'window' display style sends datatip output to a small window that you can move anywhere within the figure. This display style is best suited to datatips that contain more text than just x-, y-, and z-values. The labeldtips callback remains active for that figure until you use set to replace it with another function (or empty, to restore the default data cursor behavior).
Click the right-most point on the blue graph.

The datatip shows that California has the largest population and the largest number of traffic fatalities, 4120. However, it had 1012, or 20%, fewer fatalities than predicted by the national average.
The next data point to the left depicts Texas. Click that data point or press the left arrow to show its datatip.

Texas had 3583 fatalities, which is 424 (13%) more than the expected value. To see results from other states, move the datatip by dragging the black square or using the left or right arrow to step it along the graph. If you know a little about U.S. geography, you might observe a pattern.
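The percentages in the two datatips can be checked from the figures quoted above. This sketch recovers the expected values from the rounded deviations, so it only approximates what labeldtips computes from the unrounded expected value; the variable names are illustrative.

```matlab
% Values quoted from the datatips: California, then Texas
y    = [4120 3583];                    % observed fatalities
ydev = [-1012 424];                    % observed minus expected
yexp = y - ydev;                       % expected fatalities under the national average
ypct = round(100 * ydev ./ yexp);      % percent deviation from expected
disp([yexp; ypct])
```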
The ninth column of hwydata, labeled "Fatalities per 100K Licensed Drivers," is related to population. Plot a histogram of this variable to see which states have fewer or more fatalities per driver. To do this, link the plots to their data, and brush either of them.
Open a new figure and plot a histogram of Fatalities per 100K Licensed Drivers in it:
hf2 = figure
hist(hwydata(:,9),5)
xlabel(hwyheaders(9))
Link both the line graph and the histogram to their data sources in hwydata:
linkdata(hf1)
linkdata(hf2)
You can also click the Data Linking tool on the two figures. The first figure links automatically; the histogram does not because linkdata cannot determine with certainty the YDataSource for histograms. The Linked Plot information bar on top of the histogram informs you: No Graphics have data sources. Cannot link plot: fix it.
Click fix it to open the Specify Data Source Properties dialog box. Type hwydata(:,9) into the YDataSource edit box and click OK.

The Linked Plot information bar displays the data source you identified. The histogram looks like this.

Now that you have linked both graphs to a common data set, you can brush portions of one to see the effect on the other.
It isn't necessary, but you might want to dock the plots in a figure group so you can see them side by side.
Select the Data Brushing tool on the histogram plot. Brush the three right-most bars in the histogram; they represent higher values that range from 25 to 48 fatalities per 100,000 drivers.

Notice which observations light up on the line graph. Not only are these states with smaller populations, they are also states with above-average numbers of traffic fatalities.
Click the line graph to make it the active figure and select its Data Brushing tool. Click all the observations you can that fall below the straight line average. You need to hold the Shift key down to make multiple selections, whether by clicking or dragging. You might want to zoom in on the left side of the graph to brush properly there. What do you see happening on the histogram?
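A non-interactive counterpart to this brushing step can be sketched with logical indexing. The code assumes the column layout used elsewhere in this example (column 4 for fatalities, column 9 for fatalities per 100K licensed drivers, column 14 for population); the names pop, deaths, and below are introduced here for illustration.

```matlab
load accidents                           % demo data set used in this example
usmean = ushwydata(4)/ushwydata(14);     % national per capita fatality rate
pop    = hwydata(:,14);                  % total state population
deaths = hwydata(:,4);                   % traffic fatalities
below  = deaths < pop*usmean;            % true for states under the national-average line

disp(statelabel(below))                  % names of those states

% Compare per-driver fatality rates (column 9) for the two groups
fprintf('Median rate below line: %.1f; on or above: %.1f\n', ...
        median(hwydata(below,9)), median(hwydata(~below,9)));
```

The logical vector below marks the same observations you would select by Shift-clicking every point under the magenta line.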
The hwydata matrix contains geographic location information in the form of latitude-longitude coordinates of a centroid for each state. You can make a crude map by generating a scatter plot of these coordinates, using longitude as x and latitude as y. If you link the scatter plot, you can brush all the plots at once.
To provide a context for the map, plot an outline map of the conterminous United States. Obtain the latitude and longitude coordinates required from the demo MAT-file usapolygon.mat:
hf3 = figure;
load usapolygon
patch(uslon,uslat,[1 .9 .8],'Edgecolor','none');
hold on

When projected into the figure, the map is distorted to fit the aspect ratio of the axes.
Map the centroid longitude and latitude as a scatter plot with filled circles. Plot a rectangle over part of the map, as follows:
scatter(hwydata(:,2),hwydata(:,3),36,'b','filled');
xlabel('Longitude')
ylabel('Latitude')
rectangle('Position',[-115,25,115-77,36-25],...
'EdgeColor',[.75 .75 .75])

The x- and y-limits change, shrinking the map, because the data matrix contains observations for Alaska and Hawaii, but the map outline file does not include these states.
Dock the map underneath the other two figures. Brush the map after turning on the Data Linking and Data Brushing tools for its figure. Drag across the gray rectangle with the Data Brushing tool to highlight just the southeastern and southwestern states. What you see should look like this.
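The same selection can be approximated without brushing by testing which centroids fall inside the rectangle. This sketch reuses the Position vector from the rectangle call above and assumes columns 2 and 3 hold longitude and latitude, as in the scatter call; lon, lat, pos, and inbox are illustrative names.

```matlab
load accidents                           % demo data set used in this example
lon = hwydata(:,2);                      % centroid longitude
lat = hwydata(:,3);                      % centroid latitude
pos = [-115, 25, 115-77, 36-25];         % [x y width height] of the gray rectangle

% True for states whose centroid lies inside the rectangle
inbox = lon >= pos(1) & lon <= pos(1)+pos(3) & ...
        lat >= pos(2) & lat <= pos(2)+pos(4);

disp(statelabel(inbox))                  % states the brush would pick up

% Compare fatalities per 100K licensed drivers (column 9) inside and outside
rate = hwydata(:,9);
fprintf('Mean rate inside box: %.1f; outside: %.1f\n', ...
        mean(rate(inbox)), mean(rate(~inbox)));
```

Centroids are a coarse stand-in for state boundaries, so a state that extends into the rectangle can still be excluded if its centroid lies outside.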

Data brushing and linking reveal that almost all the states with above-average traffic fatality rates are in the southern part of the U.S.
Using graphic data exploration, you have identified some intriguing regularities in this data. However, you have not identified any causes for the patterns you found. That will take more work with the data, and possibly additional data sets, along with some hypotheses and models.
© 1984-2008 The MathWorks, Inc.