MATLAB Examples

Create Geographic Bubble Chart from Tabular Data

Geographic bubble charts are a way to visualize data overlaid on a map. For data with geographic characteristics, these charts can provide much-needed context. In this example, you import a file into MATLAB® as a table and create a geographic bubble chart from the table variables (columns). Then you work with the data in the table to visualize aspects of the data, such as population size.

Import File as Table

Load the sample file counties.xlsx, which contains records of population and Lyme disease occurrences by county in New England. Read the data into a table using readtable.

counties = readtable('counties.xlsx');
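
Before plotting, it can help to confirm that the import produced the columns the chart needs. The following is a minimal sketch, assuming counties.xlsx is on the MATLAB path and that the location columns are imported as numeric data. The variable names needed, hasCoords, and isNum are illustrative only and are not part of the original example.

```matlab
% Read the sample file and confirm that the location columns exist.
counties = readtable('counties.xlsx');
needed = {'Latitude','Longitude'};   % columns used for bubble locations
hasCoords = ismember(needed, counties.Properties.VariableNames);
assert(all(hasCoords), 'Table is missing a location column.')

% Location data should be numeric before it is passed to geobubble.
isNum = isnumeric(counties.Latitude) && isnumeric(counties.Longitude);
```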

Create Basic Geographic Bubble Chart

Create a geographic bubble chart that shows the locations of counties in New England. Specify the table, counties, as the first argument, and use the 'Latitude' and 'Longitude' columns of the table to specify locations. The chart automatically sets the latitude and longitude limits of the underlying map, called the basemap, to include only those areas represented by the data. Assign the GeographicBubbleChart object to the variable gb so that you can use gb to modify the chart after it is created. The chart stores the table in its SourceTable property; the example uses head to display the first five rows of that stored table.

figure
gb = geobubble(counties,'Latitude','Longitude');
head(gb.SourceTable, 5)
ans =

  5x19 table

    FIPS     ANSICODE     Latitude    Longitude        CountyName         State      StateName      Population2010    HousingUnits2010     LandArea     WaterArea     Cases2010    Cases2011    Cases2012    Cases2013    Cases2014    Cases2015    Cases2014_1    Cases2015_1
    ____    __________    ________    _________    ___________________    _____    _____________    ______________    ________________    __________    __________    _________    _________    _________    _________    _________    _________    ___________    ___________

    9001    2.1279e+05    41.228      -73.367      'Fairfield County'     'CT'     'Connecticut'    9.1683e+05        3.6122e+05          1.6185e+09    5.4916e+08    331          305          225          443          437          427          437            427        
    9003    2.1234e+05    41.806      -72.733      'Hartford County'      'CT'     'Connecticut'    8.9401e+05        3.7425e+05          1.9039e+09    4.0213e+07    187          167          143          288          291          335          291            335        
    9005     2.128e+05    41.792      -73.235      'Litchfield County'    'CT'     'Connecticut'    1.8993e+05             87550          2.3842e+09    6.2166e+07     88          118           67          187          168          202          168            202        
    9007     2.128e+05    41.435      -72.524      'Middlesex County'     'CT'     'Connecticut'    1.6568e+05             74837          9.5649e+08    1.8068e+08    125          109           93          181          155          241          155            241        
    9009     2.128e+05     41.35        -72.9      'New Haven County'     'CT'     'Connecticut'    8.6248e+05          3.62e+05          1.5657e+09    6.6705e+08    240          249          213          388          459          474          459            474        

geobubble displays the data over a default basemap, which you can pan and zoom. To use another basemap, you must have an Internet connection or you must have previously downloaded the basemaps from MathWorks.
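
The map limits can also be changed programmatically rather than by panning and zooming. The sketch below assumes that geolimits accepts the chart object as its first argument and that the numeric limits shown, which are illustrative, suit this data set. The limits actually applied may differ from the request because the map preserves its aspect ratio. The commented Basemap line is subject to the Internet or download requirement described above.

```matlab
counties = readtable('counties.xlsx');
figure
gb = geobubble(counties,'Latitude','Longitude');

% Request limits of roughly 41-48 degrees N and 74-66 degrees W (illustrative values).
geolimits(gb,[41 48],[-74 -66])

% Query the limits that are actually in effect after the request.
[latlim,lonlim] = geolimits(gb);

% Switching basemaps needs Internet access or previously downloaded basemaps:
% gb.Basemap = 'colorterrain';
```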

Visualize County Populations on the Chart

Use bubble size (diameter) to indicate the relative populations of the different counties. Specify the Population2010 variable in the table as the value of the SizeVariable parameter. In the resulting geographic bubble chart, the bubbles have different sizes to indicate population. The chart includes a size legend that shows how bubble diameter corresponds to population.

figure
gb = geobubble(counties,'Latitude','Longitude',...
                        'SizeVariable','Population2010');

geobubble scales the bubble diameters linearly between the values specified by the SizeLimits property.
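
As an illustration of what linear scaling means here, the following sketch maps data values to diameters by hand. It is not the chart's internal code. The size limits, the width range of [5 20] (assumed here to match the default of the BubbleWidthRange property), and all variable names are assumptions made for the example.

```matlab
sizeLimits = [0 1000];    % hypothetical SizeLimits: smallest and largest data value
widthRange = [5 20];      % assumed bubble width range, in points
x = [0 500 1000];         % sample data values within the size limits

% Linear interpolation from the data range onto the width range.
frac = (x - sizeLimits(1)) ./ diff(sizeLimits);
d = widthRange(1) + frac .* diff(widthRange);
% d is [5 12.5 20]: the midpoint value gets the midpoint diameter.
```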

Visualize Lyme Disease Cases by County

Use bubble color to show the number of Lyme disease cases in a county for a given year. To display this type of data, the geobubble function requires the color data to be categorical. Initially, none of the columns in the table are categorical, but you can create one. For example, you can use the discretize function to create a categorical variable from the data in the Cases2010 variable. The new variable, named Severity, groups the data into three categories: Low, Medium, and High. Use this new variable as the ColorVariable parameter. These changes modify the table stored in the SourceTable property, which is a copy of the original table in the workspace, counties. Making changes to the table stored in the GeographicBubbleChart object avoids affecting the original data.

gb.SourceTable.Severity = discretize(counties.Cases2010,[0 50 100 500],...
                                 'categorical', {'Low', 'Medium', 'High'});
gb.ColorVariable = 'Severity';

Handle Undefined Data

When you plot the severity information, a fourth category appears in the color legend: undefined. This category can appear when the data you convert to categorical contains empty values or values that fall outside the bin edges. Here, the undefined bubble suggests that at least one county has more than 500 cases. Because the High category is intended to include all values above 100, raise the last bin edge (5000 instead of 500) so that it includes this data. This change eliminates the undefined category.

gb.SourceTable.Severity = discretize(counties.Cases2010,[0 50 100 5000],...
                                 'categorical', {'Low', 'Medium', 'High'});

Unlike the color variable, when geobubble encounters an undefined number (NaN) in the size, latitude, or longitude variables, it ignores the value.
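
The effect of the last bin edge can be reproduced without the chart. This sketch uses made-up case counts, not values from counties.xlsx, to show that a value above the last edge becomes undefined. Using Inf as the last edge is an alternative to picking a large number such as 5000; it is offered here as a suggestion, not as part of the original example.

```matlab
cases = [10 75 300 662];             % made-up counts; 662 exceeds the edge at 500
labels = {'Low','Medium','High'};

% With a last edge of 500, the value 662 falls outside every bin.
sev500 = discretize(cases,[0 50 100 500],'categorical',labels);
outOfRange = isundefined(sev500);    % true only for the 662 entry

% With an open-ended last bin, every nonnegative count gets a category.
sevInf = discretize(cases,[0 50 100 Inf],'categorical',labels);
anyUndefined = any(isundefined(sevInf));
```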

Choose Bubble Colors

Use a color gradient to represent the Low-Medium-High categorization. geobubble stores the colors as an m-by-3 list of RGB values in the BubbleColorList property.

gb.BubbleColorList = autumn(3);
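
To see which color each category receives, inspect the colormap rows directly. This short sketch only evaluates autumn(3). The statement that rows are assigned to categories in order is inferred from the behavior described in this example.

```matlab
colors = autumn(3);
% Rows run from red through orange to yellow:
%   1   0     0
%   1   0.5   0
%   1   1     0
% With categories ordered Low, Medium, High, the first row (red) goes to Low
% and the last row (yellow) goes to High.
```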

Reorder Bubble Colors

Change the color indicating high severity to be red rather than yellow. To change the color order, you can change the ordering of either the categories or the colors listed in the BubbleColorList property. For example, initially the categories are ordered Low-Medium-High. Use the reordercats function to change the categories to High-Medium-Low. The categories change in the color legend.

neworder = {'High','Medium','Low'};
gb.SourceTable.Severity = reordercats(gb.SourceTable.Severity,neworder);
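
The other option mentioned above, reordering the colors instead of the categories, can be sketched as follows. The categorical array sev is a small stand-in for the Severity variable, and the names sevReordered and flipped are illustrative. Apply only one of the two options, because using both reverses the mapping twice.

```matlab
% Stand-in for the Severity variable, with category order Low, Medium, High.
sev = categorical({'Low','High','Medium'},{'Low','Medium','High'});

% Option 1: reverse the category order, as in the code above.
sevReordered = reordercats(sev,{'High','Medium','Low'});

% Option 2: keep the category order and reverse the color rows instead.
flipped = flipud(autumn(3));   % rows are now yellow, orange, red
% With a chart object gb, this would be applied as:
% gb.BubbleColorList = flipped;
```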

Add Titles

When you display a geographic bubble chart with size and color variables, the chart displays a size legend and color legend to indicate what the relative sizes and colors mean. When you specify a table as an argument, geobubble automatically uses the table variable names as legend titles, but you can specify other titles using properties.

title 'Lyme Disease in New England, 2010'
gb.SizeLegendTitle = 'County Population';
gb.ColorLegendTitle = 'Lyme Disease Severity';

Refine Chart Data

Looking at the Lyme disease data, the trend appears to be that more cases occur in more heavily populated areas. The locations with the most cases per capita might be more interesting. Calculate the cases per 1000 people and display that value on the chart.

gb.SourceTable.CasesPer1000 = gb.SourceTable.Cases2010 ./ gb.SourceTable.Population2010 * 1000;
gb.SizeVariable = 'CasesPer1000';
gb.SizeLegendTitle = 'Cases Per 1000';
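
A two-row calculation shows why the per capita view can change the picture. The numbers below are rounded from the Fairfield County and Litchfield County rows of the table displayed earlier, so the results are approximate.

```matlab
% Cases2010 and Population2010 for Fairfield and Litchfield counties (rounded).
cases = [331; 88];
population = [916830; 189930];

% Same normalization as above: cases per 1000 residents.
casesPer1000 = cases ./ population * 1000;
% The second county has fewer cases but a higher rate per 1000 people.
```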

The bubble sizes now tell a different story than before. The areas with the largest populations tracked relatively well with the different severity levels. However, when looking at the number of cases normalized by population, it appears that the highest risk per capita has a different geographic distribution.