MATLAB Examples

Fuel Economy Analysis

This demo performs data mining on historical fuel economy data. The data covers various cars built from 2000 through 2012.

Import Data into Table

Import the data from Excel using a modified version of the function auto-generated by the Import Tool.

```
carData = importYearXLS(2007);
```

Table Summary

Display a basic statistical summary.

```
summary(carData(:,{'RatedHP','MPG', 'CO2'}))
```
```
Variables:

    RatedHP: 2595x1 double
        Values:

            min       76
            median    236
            max       631

    MPG: 2595x1 double
        Values:

            min       9.8
            median    24.8
            max       66.6

    CO2: 2595x1 double
        Values:

            min       131
            median    352
            max       878
            NaNs      257
```
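The summary reports 257 NaNs in CO2, so statistics on that variable need to skip missing entries. A minimal sketch of counting and skipping them, using a small made-up vector rather than the real data:

```matlab
% Synthetic stand-in for carData.CO2 (values are made up for illustration)
co2 = [131; NaN; 352; 878; NaN];

numMissing = sum(isnan(co2));      % count the NaN entries
meanCO2 = mean(co2, 'omitnan');    % mean over the non-missing values only
co2Clean = co2(~isnan(co2));       % keep only the non-missing entries
```

Without the `'omitnan'` flag, `mean` returns NaN as soon as one entry is missing.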

Visualize

Plot MPG versus Rated Horsepower

```
createMPGFigure(carData.RatedHP, carData.MPG);
```

Examine Grouping Effects of Categorical Data

```
% Convert Car-Truck and City-Highway to categoricals
carData.Car_Truck = categorical(carData.Car_Truck);
carData.City_Highway = categorical(carData.City_Highway);

% In order to extract all "cars":
carIDs = carData.Car_Truck == 'car';

% In order to extract "city" data for "trucks":
city_truckIDs = (carData.City_Highway == 'city' & carData.Car_Truck == 'truck');

% City versus Highway
cityIDs = carData.City_Highway == 'city';
highwayIDs = carData.City_Highway == 'highway';
```
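To see how these logical masks behave, here is a small self-contained sketch. The table `toy` and its values are made up for illustration and only mimic the column names of `carData`:

```matlab
% Toy table standing in for carData (rows are made up for illustration)
toy = table({'car'; 'truck'; 'car'; 'truck'}, ...
            {'city'; 'city'; 'highway'; 'highway'}, ...
            [22; 18; 35; 27], ...
            'VariableNames', {'Car_Truck', 'City_Highway', 'MPG'});
toy.Car_Truck = categorical(toy.Car_Truck);
toy.City_Highway = categorical(toy.City_Highway);

% Logical masks combine with & and index directly into table variables
isCityTruck = toy.City_Highway == 'city' & toy.Car_Truck == 'truck';
cityTruckMPG = toy.MPG(isCityTruck);

% categories and countcats list the levels and how often each occurs
levels = categories(toy.Car_Truck);
counts = countcats(toy.Car_Truck);
```

Comparing a categorical array with a character vector returns a logical array of the same size, which is why the masks can be combined and reused for every variable in the table.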

Distributions

Examine the distribution of MPG grouped by City or Highway

```
mpgDistribution(carData.MPG(cityIDs), carData.MPG(highwayIDs))
```

Grouped Visualizations

Scatter plot by group.

```
figure
gscatter(carData.RatedHP, carData.MPG, ...
    {carData.Car_Truck, carData.City_Highway}, ...
    '', '.', 10, 'on', 'Rated Horsepower', 'MPG')
```

Look at additional data: Engine Compression and CO2.

Then show a matrix of scatter plots by group

```
figure
gplotmatrix([carData.RatedHP, carData.Comp], [carData.MPG, carData.CO2], ...
    {carData.Car_Truck, carData.City_Highway}, ...
    '', '.', 10, 'on', '', {'Rated Horsepower', 'Compression'}, {'MPG', 'CO2'})
```

Grouped Statistics

Perform group statistics based on specified grouping variables.

```
varfun(@mean, carData, 'InputVariables', {'RatedHP', 'MPG'}, ...
    'GroupingVariables', {'City_Highway', 'Car_Truck'})
```
```
ans =

                     City_Highway    Car_Truck    GroupCount    mean_RatedHP    mean_MPG
                     ____________    _________    __________    ____________    ________

    city_car         city            car          672           253.17          22.693
    city_truck       city            truck        627           246.28          18.501
    highway_car      highway         car          671           251.09          35.542
    highway_truck    highway         truck        625           246.76          27.459
```
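The result of `varfun` is itself a table, so individual group statistics can be looked up with the same logical indexing used earlier. A minimal sketch on a made-up table (`toy` and its values are illustrative, not the real data):

```matlab
% Toy table with two grouping variables and one data variable (made-up values)
toy = table(categorical({'city'; 'city'; 'highway'; 'highway'; 'city'}), ...
            categorical({'car'; 'truck'; 'car'; 'truck'; 'car'}), ...
            [20; 18; 34; 27; 24], ...
            'VariableNames', {'City_Highway', 'Car_Truck', 'MPG'});

% One output row per group, with GroupCount and mean_MPG columns
g = varfun(@mean, toy, 'InputVariables', 'MPG', ...
    'GroupingVariables', {'City_Highway', 'Car_Truck'});

% Look up the mean MPG of the city/car group
isCityCar = g.City_Highway == 'city' & g.Car_Truck == 'car';
cityCarMean = g.mean_MPG(isCityCar);
```

The output variable name `mean_MPG` is formed from the function name and the input variable name, which matches the `mean_RatedHP` and `mean_MPG` columns in the output above.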

Analysis of Variance (ANOVA)

One-way, two-way, and N-way ANOVA are available (`anova1`, `anova2`, and `anovan`). Here we use N-way ANOVA with an interaction term.

```
anovan(carData.MPG, {carData.Car_Truck, carData.City_Highway}, ...
    'varnames', {'Veh. Type', 'MPG Type'}, ...
    'model', 'interaction');
```
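The call above only displays the ANOVA table. If the p-values are needed programmatically, `anovan` also returns them as its first output. A minimal sketch on a made-up balanced design (the MPG values are synthetic, and the `'display', 'off'` option is added here to suppress the table figure):

```matlab
% Synthetic 2x2 design with three replicates per cell (values are made up)
vehType = categorical(repmat({'car'; 'car'; 'truck'; 'truck'}, 3, 1));
mpgType = categorical(repmat({'city'; 'highway'; 'city'; 'highway'}, 3, 1));
mpg = [22; 35; 18; 27; 23; 36; 19; 28; 21; 34; 17; 26];

% p holds one p-value per term: two main effects, then their interaction
p = anovan(mpg, {vehType, mpgType}, ...
    'varnames', {'Veh. Type', 'MPG Type'}, ...
    'model', 'interaction', ...
    'display', 'off');
```

A small p-value for the interaction term would suggest that the city/highway difference is not the same for cars and trucks.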

Boxplots

Boxplots are an integral part of grouped statistics. They provide a useful visualization of grouping effects.

```
figure
boxplot(carData.MPG, {carData.Car_Truck, carData.City_Highway}, 'notch', 'on')
```

Extract Data for Curve Fitting

Create these variables for use in the Curve Fitting app.

```
RatedHPCity = carData.RatedHP(cityIDs);
MPGCity = carData.MPG(cityIDs);

% Use the App to develop a curve fit.
```

Curve Fitting

Equation:

```
MPG = b1 + b2 * 1/RatedHP
```
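Because this model is linear in the coefficients `b1` and `b2`, it can also be solved as an ordinary least-squares problem with the backslash operator, without any toolbox. This is a sketch of that alternative, not the method used in the demo. The data is synthetic, generated exactly from assumed coefficients 10 and 2500:

```matlab
% Synthetic data generated exactly from MPG = 10 + 2500/RatedHP (assumed values)
hp = [100; 150; 200; 300; 400];
mpg = 10 + 2500 ./ hp;

% Design matrix: a column of ones for b1 and a column of 1/RatedHP for b2
X = [ones(size(hp)), 1 ./ hp];
b = X \ mpg;                     % least-squares solution: b(1) = b1, b(2) = b2

mpgAt250 = b(1) + b(2) / 250;    % predicted MPG at 250 hp
```

With noise-free data the fit recovers the generating coefficients. On real data the same two lines give the least-squares estimates, provided rows with missing values are removed first.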

We can solve this using the Curve Fitting Tool.

```
cftool(carData.RatedHP, carData.MPG)
```

The following is a modified version of the auto-generated m-file from cftool.

```
cf = createMPGFit(carData.RatedHP, carData.MPG);
```
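The body of `createMPGFit` is not shown. As a rough idea of what such a function might do, the sketch below builds a linear-in-coefficients model with `fittype` and fits it with `fit` from Curve Fitting Toolbox. The data is synthetic and the coefficient names `a` and `b` are chosen to match the fit output shown later; the real generated function may differ.

```matlab
% Synthetic data (made up); in the demo these would be carData.RatedHP and carData.MPG
x = [100; 150; 200; 300; 400];
y = 10 + 2500 ./ x;

% Linear model with terms 1 and 1/x, that is y = a + b*(1/x)
ft = fittype({'1', '1/x'}, 'coefficients', {'a', 'b'});
cf = fit(x, y, ft);

coeffs = coeffvalues(cf);   % row vector [a b]
```

`fit` expects column vectors, so row-vector data would need to be transposed before the call.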

Plot Data and Model

The fit object returned by Curve Fitting Toolbox has a plot method for displaying the result graphically. We can choose to display the prediction bounds for the fit.

```
figure
hh = plot(cf, 'r', carData.RatedHP, carData.MPG, 'predobs', 0.95);
hh(2).LineWidth = 2;
for ii = [3 4]
    hh(ii).LineStyle = '-';
    hh(ii).Color = [0 0.5 0];
end
```
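The bounds drawn by the plot can also be computed numerically with `predint`. The sketch below uses synthetic noisy data and assumes that the `'predobs'` plot type corresponds to non-simultaneous bounds for a new observation; the query points in `xq` are arbitrary.

```matlab
% Synthetic noisy data (made up); stands in for carData.RatedHP and carData.MPG
x = [100; 150; 200; 250; 300; 350; 400; 450];
noise = [0.5; -0.3; 0.2; -0.4; 0.1; 0.3; -0.2; -0.1];
y = 10 + 2500 ./ x + noise;

ft = fittype({'1', '1/x'}, 'coefficients', {'a', 'b'});
cf = fit(x, y, ft);

xq = [120; 320];                                        % query points
yq = feval(cf, xq);                                     % fitted values at xq
bounds = predint(cf, xq, 0.95, 'observation', 'off');   % [lower upper] per row
```

Each row of `bounds` brackets the fitted value at the corresponding query point.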

Plot of Data and Model (for different groups)

We apply a similar modeling technique to the data for each combination of groups (Car-Truck and City-Highway).

```
% Model different combinations
modelMPG(carData, 'car', 'city')
modelMPG(carData, 'car', 'highway')
modelMPG(carData, 'truck', 'city')
modelMPG(carData, 'truck', 'highway')
```
```
ans =

     Linear model:
     ans(x) = a + b*1/x
     Coefficients (with 95% confidence bounds):
       a =       10.12  (9.528, 10.72)
       b =        2663  (2546, 2779)

ans =

     Linear model:
     ans(x) = a + b*1/x
     Coefficients (with 95% confidence bounds):
       a =       21.33  (20.58, 22.09)
       b =        3005  (2857, 3153)

ans =

     Linear model:
     ans(x) = a + b*1/x
     Coefficients (with 95% confidence bounds):
       a =       8.473  (7.579, 9.368)
       b =        2314  (2115, 2514)

ans =

     Linear model:
     ans(x) = a + b*1/x
     Coefficients (with 95% confidence bounds):
       a =       16.26  (15.11, 17.42)
       b =        2589  (2332, 2846)
```
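To compare the four fitted models, the reported coefficients can be evaluated at a common horsepower. The sketch below assumes the four results appear in the same order as the four `modelMPG` calls; the value of 200 hp is an arbitrary illustration.

```matlab
% Coefficients copied from the output above, assumed to be in call order:
% car/city, car/highway, truck/city, truck/highway
a = [10.12; 21.33; 8.473; 16.26];
b = [2663; 3005; 2314; 2589];

hp = 200;                 % illustrative horsepower value (assumption)
predMPG = a + b ./ hp;    % predicted MPG for each group at 200 hp
```

In every group the `b/hp` term shrinks as horsepower grows, so predicted MPG approaches the intercept `a` for very powerful engines.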