## Documentation |

This example shows how to use moving average filters and resampling to isolate the effect of periodic components of the time of day on hourly temperature readings, as well as remove unwanted line noise from an open-loop voltage measurement. The example also shows how to remove noise spikes from a signal and replace the missing sample data by using a median filter.

On this page… |
---|

Extracting Average Differences |

Smoothing is how we discover important patterns in our data while leaving out things that are unimportant (i.e. noise). We use filtering to perform this smoothing. The goal of smoothing is to produce slow changes in value so that it's easier to see trends in our data.

Sometimes when you examine input data you may wish to smooth the data in order to see a trend in the signal. In our example we have a set of temperature readings in Celsius taken every hour at Logan Airport for the entire month of January, 2011.

load bostemp days = (1:31*24)/24; plot(days, tempC), axis tight; ylabel('Temp (\circC)'); xlabel('Time elapsed from Jan 1, 2011 (days)'); title('Logan Airport Dry Bulb Temperature (source: NOAA)');

Note that we can visually see the effect that the time of day has upon the temperature readings. If you are only interested in the daily temperature variation over the month, the hourly fluctuations only contribute noise, which can make the daily variations difficult to discern. To remove the effect of the time of day, we would now like to smooth our data by using a moving average filter.

In its simplest form, a moving average filter of length N takes the average of every N consecutive samples of the waveform.

To apply a moving average filter to each data point, we construct our coefficients of our filter so that each point is equally weighted and contributes 1/24 to the total average. This gives us the average temperature over each 24 hour period.

hoursPerDay = 24; coeff24hMA = ones(1, hoursPerDay)/hoursPerDay; avg24hTempC = filter(coeff24hMA, 1, tempC); plot(days, [tempC avg24hTempC]); legend('Hourly Temp','24 Hour Average (delayed)','location','best'); ylabel('Temp (\circC)'); xlabel('Time elapsed from Jan 1, 2011 (days)'); title('Logan Airport Dry Bulb Temperature (source: NOAA)');

Note that the filtered output is delayed by about twelve hours. This is due to the fact that our moving average filter has a delay.

Any symmetric filter of length N will have a delay of (N-1)/2 samples. We can account for this delay manually.

fDelay = (length(coeff24hMA)-1)/2; plot(days, tempC, 'b', ... days-fDelay/24, avg24hTempC, 'g'); axis tight; legend('Hourly Temp','24 Hour Average','location','best'); ylabel('Temp (\circC)'); xlabel('Time elapsed from Jan 1, 2011 (days)'); title('Logan Airport Dry Bulb Temperature (source: NOAA)');

**Extracting Average Differences**

Alternatively, we can also use the moving average filter to obtain a better estimate of how the time of day affects the overall temperature. To do this, first, subtract the smoothed data from the hourly temperature measurements. Then, segment the differenced data into days and take the average over all 31 days in the month.

figure deltaTempC = tempC - avg24hTempC; deltaTempC = reshape(deltaTempC, 24, 31).'; plot(1:24, mean(deltaTempC)), axis tight; title('Mean temperature differential from 24 hour average'); xlabel('Hour of day (since midnight)'); ylabel('Temperature difference (\circC)');

**Weighted Moving Average Filters**

Other kinds of moving average filters do not weight each sample equally.

Another common filter follows the binomial expansion of (1/2,1/2)^n This type of filter approximates a normal curve for large values of n. It is useful for filtering out high frequency noise for small n. To find the coefficients for the binomial filter, convolve [1/2 1/2] with itself and then iteratively convolve the output with [1/2 1/2] a prescribed number of times. In this example, use five total iterations.

h = [1/2 1/2]; binomialCoeff = conv(h,h); for n = 1:4 binomialCoeff = conv(binomialCoeff,h); end figure fDelay = (length(binomialCoeff)-1)/2; binomialMA = filter(binomialCoeff, 1, tempC); plot(days, tempC, 'b', ... days-fDelay/24, binomialMA, 'g'); axis tight; legend('Hourly Temp','Binomial Weighted Average','location','best'); ylabel('Temp (\circC)'); xlabel('Time elapsed from Jan 1, 2011 (days)'); title('Logan Airport Dry Bulb Temperature (source: NOAA)');

Another filter somewhat similar to the Gaussian expansion filter is the exponential moving average filter. This type of weighted moving average filter is easy to construct and does not require a large window size.

You adjust an exponentially weighted moving average filter by an alpha parameter between zero and one. A higher value of alpha will have less smoothing.

alpha = 0.45; exponentialMA = filter(alpha, [1 alpha-1], tempC); plot(days, tempC, 'b', ... days-fDelay/24, binomialMA, 'g', ... days-1/24, exponentialMA,'r'); axis tight; legend('Hourly Temp', ... 'Binomial Weighted Average', ... 'Exponential Weighted Average','location','best'); ylabel('Temp (\circC)'); xlabel('Time elapsed from Jan 1, 2011 (days)'); title('Logan Airport Dry Bulb Temperature (source: NOAA)');

Zoom in on the readings for one day.

axis([3 4 -5 2]);

You'll note that by smoothing the data, the extreme values were somewhat clipped.

To track the signal a little more closely, you can use a weighted moving average filter that attempts to fit a polynomial of a specified order over a specified number of samples in a least-squares sense.

As a convenience, you can use the function `sgolayfilt` to implement a Savitzky-Golay smoothing filter. To use `sgolayfilt`, you specify an odd-length segment of the data and a polynomial order strictly less than the segment length. The `sgolayfilt` function internally computes the smoothing polynomial coefficients, performs delay alignment, and takes care of transient effects at the start and end of the data record.

cubicMA = sgolayfilt(tempC, 3, 7); quarticMA = sgolayfilt(tempC, 4, 7); quinticMA = sgolayfilt(tempC, 5, 9); plot(days, [tempC cubicMA quarticMA quinticMA]); legend('Hourly Temp','Cubic-Weighted MA', 'Quartic-Weighted MA', ... 'Quintic-Weighted MA','location','southeast'); ylabel('Temp (\circC)'); xlabel('Time elapsed from Jan 1, 2011 (days)'); title('Logan Airport Dry Bulb Temperature (source: NOAA)'); axis([3 5 -5 2]);

Sometimes it is beneficial to resample a signal in order to properly apply a moving average.

In our next example, we sampled the open-loop voltage across the input of an analog instrument in the presence of interference from 60 Hz AC power line noise. We sampled the voltage with a 1 kHz sampling rate.

load openloop60hertz fs = 1000; t = (0:numel(openLoopVoltage)-1) / fs; plot(t,openLoopVoltage); ylabel('Voltage (V)'); xlabel('Time (s)'); title('Open-loop Voltage Measurement');

Let's attempt to remove the effect of the line noise by using a moving average filter.

If you construct a uniformly weighted moving average filter, it will remove any component that is periodic with respect to the duration of the filter.

There are roughly 1000 / 60 = 16.667 samples in a complete cycle of 60 Hz when sampled at 1000 Hz. Let's attempt to "round up" and use a 17-point filter. This will give us maximal filtering at a fundamental frequency of 1000 Hz / 17 = 58.82 Hz.

plot(t,sgolayfilt(openLoopVoltage,1,17)); ylabel('Voltage (V)'); xlabel('Time (s)'); title('Open-loop Voltage Measurement'); legend('Moving average filter operating at 58.82 Hz', ... 'location','southeast');

Note that while the voltage is significantly smoothed, it still contains a small 60 Hz ripple.

We can significantly reduce the ripple if we resample the signal so that we capture a complete full cycle of the 60 Hz signal by our moving average filter.

If we resample the signal at 17 * 60 Hz = 1020 Hz, we can use our 17 point moving average filter to remove the 60 Hz line noise.

fsResamp = 1020; vResamp = resample(openLoopVoltage, fsResamp, fs); tResamp = (0:numel(vResamp)-1) / fsResamp; vAvgResamp = sgolayfilt(vResamp,1,17); plot(tResamp,vAvgResamp); ylabel('Voltage (V)'); xlabel('Time (s)'); title('Open-loop Voltage Measurement'); legend('Moving average filter operating at 60 Hz','location','southeast');

Sometimes your data can exhibit unwanted noise "spikes" that last for short durations of time. Moving average filters can be unduly influenced by large short-term deviations in the signal value.

We have corrupted our previous example's signal with several large spikes.

spikeSignal = zeros(size(openLoopVoltage)); spikeSignal(1:100:2000) = -6; noisyLoopVoltage = openLoopVoltage + spikeSignal; plot(t, noisyLoopVoltage) ylabel('Voltage (V)'); xlabel('Time (s)'); title('Open-loop Voltage Measurement (spikes added)');

Let's see what the signal looks like after resampling and using the Savitzky-Golay filter.

vResamp = resample(noisyLoopVoltage, fsResamp, fs); tResamp = (0:numel(vResamp)-1) / fsResamp; vAvgResamp = sgolayfilt(vResamp,1,17); plot(tResamp,vAvgResamp); ylabel('Voltage (V)'); xlabel('Time (s)'); title('Open-loop Noisy Voltage Measurement (spikes added)'); legend('Moving average filter operating at 60 Hz','location','southeast');

Note that resampling and applying our weighted moving average filter still removes the unwanted 60 Hz data and attenuates the spikes, however the spikes are spread out over the length of the filter.

A median filter replaces each element in the data with the median value over the length of the filter. We can use the filter to remove the spikes.

We will choose our filter order to be 3 (two samples more than the width of our spikes).

medfiltLoopVoltage = medfilt1(noisyLoopVoltage, 3); vResamp = resample(medfiltLoopVoltage, fsResamp, fs); tResamp = (0:numel(vResamp)-1) / fsResamp; vAvgResamp = sgolayfilt(vResamp,1,17); plot(tResamp,vAvgResamp); ylabel('Voltage (V)'); xlabel('Time (s)'); title('Open-loop Noisy Voltage Measurement (spikes added)'); legend('Median and moving average filtered','location','southeast');

For more information on filtering and resampling see the Signal Processing Toolbox.

Reference: M. Kendall and A. Stuart, "The Advanced Theory of Statistics", Griffin, 1983, Vol 3.

Was this topic helpful?