Discover MakerZone

MATLAB and Simulink resources for Arduino, LEGO, and Raspberry Pi

Learn more

Discover what MATLAB® can do for your career.

Opportunities for recent engineering grads.

Apply Today

Thread Subject:
Average calculation over non-uniform samples

Subject: Average calculation over non-uniform samples

From: Lior Hadad

Date: 22 Sep, 2010 21:39:05

Message: 1 of 18

Hello
I have a vector contains a time(= t) and a value( = y).
The sample rate is non-uniform and the value y between two samples is constant and equal to the first sample.
Is there any easy way of getting a windowed moving average of the vector?

e.g:
t = [5 61 100 101 102 105 2000 2002];
y = [1 0 1 2 5 0 1 0];
Explain: before t = 5 the value y is zero at t=5 the y change to 1 and stay constant till t<61 at t=61 the value change to zero till t<100 and so on…
So the signal behavior is like stairs(t,y);

I need to calculate the average y over a sliding window.
The vector is pretty huge(more than 1e6 samples) so efficiency and size need to be taken care.

Thanks

Subject: Average calculation over non-uniform samples

From: Walter Roberson

Date: 22 Sep, 2010 21:47:24

Message: 2 of 18

On 10-09-22 04:39 PM, Lior Hadad wrote:

> I have a vector contains a time(= t) and a value( = y).
> The sample rate is non-uniform and the value y between two samples is
> constant and equal to the first sample. Is there any easy way of getting
> a windowed moving average of the vector?

Based upon your example, I do understand the data structure, but I would ask
about the properties of the window. Is the window purely based upon the
current sample and the N-1 previous samples, or is the window centered on the
current sample and involves look-aheads ?

Also, what output representation do you want? The same kind of representation
could be used: providing you don't want something like an exponential die-off,
once you are N samples in to a constant block, the moving average would become
equal to the constant and would hold until the next threshold.

I gather from your example of time 2000 and 2002 that any given window might
include more than one transition. Is there a particular limit on the number of
transitions possible, other than the possibility of every sample in the window
being different?

Also, to cross-check just in case: the times are all integral, right?

Subject: Average calculation over non-uniform samples

From: Sean

Date: 22 Sep, 2010 22:07:04

Message: 3 of 18

"Lior Hadad" <lior_hadad@yahoo.com> wrote in message <i7dt1p$r28$1@fred.mathworks.com>...
> Hello
> I have a vector contains a time(= t) and a value( = y).
> The sample rate is non-uniform and the value y between two samples is constant and equal to the first sample.
> Is there any easy way of getting a windowed moving average of the vector?
>
> e.g:
> t = [5 61 100 101 102 105 2000 2002];
> y = [1 0 1 2 5 0 1 0];
> Explain: before t = 5 the value y is zero at t=5 the y change to 1 and stay constant till t<61 at t=61 the value change to zero till t<100 and so on…
> So the signal behavior is like stairs(t,y);
>
> I need to calculate the average y over a sliding window.
> The vector is pretty huge(more than 1e6 samples) so efficiency and size need to be taken care.
>
> Thanks

One way, though it will have to expand the entire vector:
%%%
t = [1 5 61 100 101 102 105 2000 2002];
y = [nan 1 0 1 2 5 0 1 0];

clear idx
idx(t) = 1;
idx = cumsum(idx);
wsz = 7; %window size

newy = y(idx);
averaged = smooth(newy,wsz,'moving');

Subject: Average calculation over non-uniform samples

From: Lior Hadad

Date: 22 Sep, 2010 23:23:24

Message: 4 of 18

Walter Roberson <roberson@hushmail.com> wrote in message <i7dtm1$3f8$1@canopus.cc.umanitoba.ca>...
> On 10-09-22 04:39 PM, Lior Hadad wrote:
>
> > I have a vector contains a time(= t) and a value( = y).
> > The sample rate is non-uniform and the value y between two samples is
> > constant and equal to the first sample. Is there any easy way of getting
> > a windowed moving average of the vector?
>
> Based upon your example, I do understand the data structure, but I would ask
> about the properties of the window. Is the window purely based upon the
> current sample and the N-1 previous samples, or is the window centered on the
> current sample and involves look-aheads ?

Example that may help, let’s assume that
window_size = 80,
start_point = -100
So the center of the window is: t= -100 + (80/2) = -60
At this point (win_center = -60) I will get an average of zero and none of the samples are inside the window.
The window start moving till win_end is equal to the first sample t=5 until this point all the output are the same and are equal to zero,
From that point every move of the window will give different result for example win_start = -75 -> win_end = 5 ->ouput = 0, win_start = -74 -> win_end = 6 ->ouput =1/80, win_start = -73 -> win_end = 7 ->ouput =2/80 … and so on over all time line,


>
> Also, what output representation do you want? The same kind of representation
> could be used: providing you don't want something like an exponential die-off,
> once you are N samples in to a constant block, the moving average would become
> equal to the constant and would hold until the next threshold.

The output vector need to contain a time and an average value, only changes in the average value will need to be in the vector
>
> I gather from your example of time 2000 and 2002 that any given window might
> include more than one transition. Is there a particular limit on the number of
> transitions possible, other than the possibility of every sample in the window
> being different?

no, there is no limit to the number of transition in a single window
>
> Also, to cross-check just in case: the times are all integral, right?

Yes

Subject: Average calculation over non-uniform samples

From: Lior Hadad

Date: 22 Sep, 2010 23:44:10

Message: 5 of 18

"Sean " <sean.dewolski@nospamplease.umit.maine.edu> wrote in message <i7dum8$ctl$1@fred.mathworks.com>...
> "Lior Hadad" <lior_hadad@yahoo.com> wrote in message <i7dt1p$r28$1@fred.mathworks.com>...
> > Hello
> > I have a vector contains a time(= t) and a value( = y).
> > The sample rate is non-uniform and the value y between two samples is constant and equal to the first sample.
> > Is there any easy way of getting a windowed moving average of the vector?
> >
> > e.g:
> > t = [5 61 100 101 102 105 2000 2002];
> > y = [1 0 1 2 5 0 1 0];
> > Explain: before t = 5 the value y is zero at t=5 the y change to 1 and stay constant till t<61 at t=61 the value change to zero till t<100 and so on…
> > So the signal behavior is like stairs(t,y);
> >
> > I need to calculate the average y over a sliding window.
> > The vector is pretty huge(more than 1e6 samples) so efficiency and size need to be taken care.
> >
> > Thanks
>
> One way, though it will have to expand the entire vector:
> %%%
> t = [1 5 61 100 101 102 105 2000 2002];
> y = [nan 1 0 1 2 5 0 1 0];
>
> clear idx
> idx(t) = 1;
> idx = cumsum(idx);
> wsz = 7; %window size
>
> newy = y(idx);
> averaged = smooth(newy,wsz,'moving');

smooth function is not part of my MATLAB package(MATLAB 7), so I cant test this solution

Subject: Average calculation over non-uniform samples

From: ImageAnalyst

Date: 23 Sep, 2010 02:29:19

Message: 6 of 18

I don't have smooth either. You can use conv() - it's the convolution
operation and if you pass it the right kernel, it can average forwards
and backwards, or just in one direction.

If you need to have your time's all perfectly uniform, then you can
use interp1() to make your new time samples uniformly sampled.

I didn't understand this at all: "the value y between two samples is
constant and equal to the first sample. " So let's take your data:
t = [5 61 100 101 102 105 2000 2002];
y = [1 0 1 2 5 0 1 0];
and let's take the 4th and 5th samples. So "the value y between the
two samples" would be 5-2=3. But then you say it "is constant and
equal to the first sample." In other words it should equal 1, which
is the y value of the first sample. Clearly this is not the case for
the 4th and 5th pair, nor for the 5th and 6th pair. So what's going
on? Do you want to clarify that (not that it makes any difference for
taking a moving average)?
-ImageAnalyst

Subject: Average calculation over non-uniform samples

From: Greg Heath

Date: 23 Sep, 2010 06:25:38

Message: 7 of 18

On Sep 22, 10:29 pm, ImageAnalyst <imageanal...@mailinator.com> wrote:
> I don't have smooth either.  You can use conv() - it's the convolution
> operation and if you pass it the right kernel, it can average forwards
> and backwards, or just in one direction.
>
> If you need to have your time's all perfectly uniform, then you can
> use interp1() to make your new time samples uniformly sampled.
>
> I didn't understand this at all:  "the value y between two samples is
> constant and equal to the first sample. "  So let's take your data:
> t = [5 61 100 101 102 105 2000 2002];
> y = [1 0 1 2 5 0 1 0];
> and let's take the 4th and 5th samples.  So "the value y between the
> two samples" would be 5-2=3.  But then you say it "is constant and
> equal to the first sample."  In other words it should equal 1, which
> is the y value of the first sample.  Clearly this is not the case for
> the 4th and 5th pair, nor for the 5th and 6th pair.  So what's going
> on?  Do you want to clarify that (not that it makes any difference for
> taking a moving average)?
> -ImageAnalyst

I think he means

y(time) = y(n) for t(n) <= time < t(n+1)

Hope this helps.

Greg

Subject: Average calculation over non-uniform samples

From: Lior Hadad

Date: 23 Sep, 2010 08:06:06

Message: 8 of 18

Greg Heath <heath@alumni.brown.edu> wrote in message <f5a8cf30-8856-44e7-8499-20fd53320447@u13g2000vbo.googlegroups.com>...
> On Sep 22, 10:29 pm, ImageAnalyst <imageanal...@mailinator.com> wrote:
> > I don't have smooth either.  You can use conv() - it's the convolution
> > operation and if you pass it the right kernel, it can average forwards
> > and backwards, or just in one direction.
> >
> > If you need to have your time's all perfectly uniform, then you can
> > use interp1() to make your new time samples uniformly sampled.
> >
> > I didn't understand this at all:  "the value y between two samples is
> > constant and equal to the first sample. "  So let's take your data:
> > t = [5 61 100 101 102 105 2000 2002];
> > y = [1 0 1 2 5 0 1 0];
> > and let's take the 4th and 5th samples.  So "the value y between the
> > two samples" would be 5-2=3.  But then you say it "is constant and
> > equal to the first sample."  In other words it should equal 1, which
> > is the y value of the first sample.  Clearly this is not the case for
> > the 4th and 5th pair, nor for the 5th and 6th pair.  So what's going
> > on?  Do you want to clarify that (not that it makes any difference for
> > taking a moving average)?
> > -ImageAnalyst
>
> I think he means
>
> y(time) = y(n) for t(n) <= time < t(n+1)
The formula is correct
>
> Hope this helps.
>
> Greg

Subject: Average calculation over non-uniform samples

From: Bruno Luong

Date: 23 Sep, 2010 09:14:05

Message: 9 of 18

Is this what you want?

% Data, f(x) = yi for x(i) <= x < x(i+1)
x = cumsum(rand(1,100)); % NOT equidistance
y = sin(x/5) + 0.1*randn(1,100);

% g(x) := Integral of f, is linear piecewise
breaks = [-inf x inf];
coefs = zeros(length(x)+1,2);
dx = diff(x);
coefs(2:end,1) = y;
coefs(3:end,2) = cumsum(y(1:end-1).*dx);
pp = mkpp(breaks,coefs);
g = @(x) ppval(pp,x);

% Compute the sliding average using g
xi = linspace(x(1),x(end));
win = 5; % window size
left = linspace(x(1),x(end)-win);
right = left + win;
ximid = 0.5*(left+right);
smooth = (g(right)-g(left))/win;

% Check
xmid = 0.5*(x(1:end-1)+x(2:end));
plot(xmid,y(1:end-1),'.r', ...
     ximid, smooth,'b');

% Bruno

Subject: Average calculation over non-uniform samples

From: Bruno Luong

Date: 23 Sep, 2010 10:59:07

Message: 10 of 18

"Bruno Luong" <b.luong@fogale.findmycountry> wrote in message
>
> % g(x) := Integral of f, is linear piecewise
> breaks = [-inf x inf];

ppval does not handle well left break with -Inf, so it is better to make this modification:

% g(x) := Integral of f, is linear piecewise
infinity = max(abs(x));
breaks = [-infinity x infinity];

...

Bruno

Subject: Average calculation over non-uniform samples

From: Bruno Luong

Date: 23 Sep, 2010 12:50:06

Message: 11 of 18

Sorry, make it:

>
> % g(x) := Integral of f, is linear piecewise
> infinity = realmax();
> breaks = [-infinity x infinity];
>
> ...

% Bruno

Subject: Average calculation over non-uniform samples

From: Lior Hadad

Date: 23 Sep, 2010 22:22:05

Message: 12 of 18

"Bruno Luong" <b.luong@fogale.findmycountry> wrote in message <i7fidt$ml4$1@fred.mathworks.com>...
> Sorry, make it:
>
> >
> > % g(x) := Integral of f, is linear piecewise
> > infinity = realmax();
> > breaks = [-infinity x infinity];
> >
> > ...
>
> % Bruno

Bruno,
Thank you very much for your help
Lior

Subject: Average calculation over non-uniform samples

From: Bruno Luong

Date: 24 Sep, 2010 10:14:03

Message: 13 of 18

"Lior Hadad" <lior_hadad@yahoo.com> wrote in message
> Bruno,
> Thank you very much for your help
> Lior

Lior, I must warn you that the cumulative way has limited precision when summing data on a long period of time, because the average is computed from the difference of two large numbers. If you want to avoid run into those issues, you might consider to split the data in overlapping smaller periods and apply the method on each of them. Though you should be still fine with 1e6 # of data.

Bruno

Subject: Average calculation over non-uniform samples

From: Greg Heath

Date: 24 Sep, 2010 13:39:28

Message: 14 of 18

On Sep 23, 4:06 am, "Lior Hadad" <lior_ha...@yahoo.com> wrote:
> Greg Heath <he...@alumni.brown.edu> wrote in message <f5a8cf30-8856-44e7-8499-20fd53320...@u13g2000vbo.googlegroups.com>...
> > On Sep 22, 10:29 pm, ImageAnalyst <imageanal...@mailinator.com> wrote:
> > > I don't have smooth either.  You can use conv() - it's the convolution
> > > operation and if you pass it the right kernel, it can average forwards
> > > and backwards, or just in one direction.
>
> > > If you need to have your time's all perfectly uniform, then you can
> > > use interp1() to make your new time samples uniformly sampled.
>
> > > I didn't understand this at all:  "the value y between two samples is
> > > constant and equal to the first sample. "  So let's take your data:
> > > t = [5 61 100 101 102 105 2000 2002];
> > > y = [1 0 1 2 5 0 1 0];
> > > and let's take the 4th and 5th samples.  So "the value y between the
> > > two samples" would be 5-2=3.  But then you say it "is constant and
> > > equal to the first sample."  In other words it should equal 1, which
> > > is the y value of the first sample.  Clearly this is not the case for
> > > the 4th and 5th pair, nor for the 5th and 6th pair.  So what's going
> > > on?  Do you want to clarify that (not that it makes any difference for
> > > taking a moving average)?
> > > -ImageAnalyst
>
> > I think he means
>
> > y(time) = y(n)      for        t(n) <= time < t(n+1)
>
> The formula is correct

I didn't disagree with your formula.

I was trying to clarify what the acompanying words meant.

Greg

Subject: Average calculation over non-uniform samples

From: Lior Hadad

Date: 24 Sep, 2010 21:14:05

Message: 15 of 18

"Bruno Luong" <b.luong@fogale.findmycountry> wrote in message <i7htlb$inr$1@fred.mathworks.com>...
> "Lior Hadad" <lior_hadad@yahoo.com> wrote in message
> > Bruno,
> > Thank you very much for your help
> > Lior
>
> Lior, I must warn you that the cumulative way has limited precision when summing data on a long period of time, because the average is computed from the difference of two large numbers. If you want to avoid run into those issues, you might consider to split the data in overlapping smaller periods and apply the method on each of them. Though you should be still fine with 1e6 # of data.
>
> Bruno

Bruno,
Can you explain in more detail the precision problem?
Lior

Subject: Average calculation over non-uniform samples

From: Bruno Luong

Date: 25 Sep, 2010 06:33:25

Message: 16 of 18


> Can you explain in more detail the precision problem?

The way my method works is following.

Given f(x) piecewise constant, I first compute the definite integral

g(x) := integral (xmostleft -> x) of f(t) dt

Assuming f(t) >= 0 so to illustrate the phenomenon. This function has value growing when x is large. In the code that translates to the CUMSUM(.) statement.

When the average needs to be calculated, we compute the average using this relationship:
average(x+window/2) = g(x+window) - g(x).

For large x, g(x+window) and g(x) are large but they difference is more or less in the magnitude of f(.). So the code subtract two large quantities, which is not good with the limited precision of floating point.

Here is an example where I have data point exponentially spaced, but constant and equal to pi.

% Data, f(x) = yi for x(i) <= x < x(i+1)
x = [0 1 100 100 1000 10000 100000 1000000]
y = [pi pi pi pi pi pi pi pi];

% g(x) := Integral of f, is linear piecewise
breaks = [-inf x inf];
coefs = zeros(length(x)+1,2);
dx = diff(x);
coefs(2:end,1) = y;
coefs(3:end,2) = cumsum(y(1:end-1).*dx);
pp = mkpp(breaks,coefs);
g = @(x) ppval(pp,x);

% function to compute the average on window of length 100
win = 100
average= @(x) (g(x+win)-g(x))/win;

% numerical error
average(0)-pi
average(1000000)-pi
%%%

So as I said, for *very long* time period, you might need to split them in smaller intervals and carry out the average on each of them separately.

% Bruno

Subject: Average calculation over non-uniform samples

From: Lior Hadad

Date: 25 Sep, 2010 20:17:05

Message: 17 of 18

"Bruno Luong" <b.luong@fogale.findmycountry> wrote in message <i7k53l$2nu$1@fred.mathworks.com>...
>
> > Can you explain in more detail the precision problem?
>
> The way my method works is following.
>
> Given f(x) piecewise constant, I first compute the definite integral
>
> g(x) := integral (xmostleft -> x) of f(t) dt
>
> Assuming f(t) >= 0 so to illustrate the phenomenon. This function has value growing when x is large. In the code that translates to the CUMSUM(.) statement.
>
> When the average needs to be calculated, we compute the average using this relationship:
> average(x+window/2) = g(x+window) - g(x).
>
> For large x, g(x+window) and g(x) are large but they difference is more or less in the magnitude of f(.). So the code subtract two large quantities, which is not good with the limited precision of floating point.
>
> Here is an example where I have data point exponentially spaced, but constant and equal to pi.
>
> % Data, f(x) = yi for x(i) <= x < x(i+1)
> x = [0 1 100 100 1000 10000 100000 1000000]
> y = [pi pi pi pi pi pi pi pi];
>
> % g(x) := Integral of f, is linear piecewise
> breaks = [-inf x inf];
> coefs = zeros(length(x)+1,2);
> dx = diff(x);
> coefs(2:end,1) = y;
> coefs(3:end,2) = cumsum(y(1:end-1).*dx);
> pp = mkpp(breaks,coefs);
> g = @(x) ppval(pp,x);
>
> % function to compute the average on window of length 100
> win = 100
> average= @(x) (g(x+win)-g(x))/win;
>
> % numerical error
> average(0)-pi
> average(1000000)-pi
> %%%
>
> So as I said, for *very long* time period, you might need to split them in smaller intervals and carry out the average on each of them separately.
>
> % Bruno


First I want to thank you for the useful information,
I have another question related to the non uniform vectors; maybe you can help me with this,
I have two vectors that are non-uniform sampled and I need to do a vector addition for those vectors,
The rule is as before: y = y(i) for x(i) <= x < x(i+1)

Vector1:
Magnitude: vec1_t = [5 7 10 20 21 22 23]
Sample_time: vec1_y = [0 1 0 2 0 1 0]

Vector2:
Magnitude : vec2_t = [5 6 9 18]
Sample_time : vec2_y = [0 4 1 0]

The Expected result:
vec1 + vec2 :
vec1_2_t = [5 6 7 9 10 18 20 21 22 23]
vec1_2_y = [0 4 5 2 1 0 2 0 1 0]

Subject: Average calculation over non-uniform samples

From: Bruno Luong

Date: 25 Sep, 2010 21:34:04

Message: 18 of 18

"Lior Hadad" <lior_hadad@yahoo.com> wrote in message <i7llc1$6bb$1@fred.mathworks.com>...
> "Bruno Luong" <b.luong@fogale.findmycountry> wrote in message <i7k53l$2nu$1@fred.mathworks.com>...
> >
> > > Can you explain in more detail the precision problem?
> >
> > The way my method works is following.
> >
> > Given f(x) piecewise constant, I first compute the definite integral
> >
> > g(x) := integral (xmostleft -> x) of f(t) dt
> >
> > Assuming f(t) >= 0 so to illustrate the phenomenon. This function has value growing when x is large. In the code that translates to the CUMSUM(.) statement.
> >
> > When the average needs to be calculated, we compute the average using this relationship:
> > average(x+window/2) = g(x+window) - g(x).
> >
> > For large x, g(x+window) and g(x) are large but they difference is more or less in the magnitude of f(.). So the code subtract two large quantities, which is not good with the limited precision of floating point.
> >
> > Here is an example where I have data point exponentially spaced, but constant and equal to pi.
> >
> > % Data, f(x) = yi for x(i) <= x < x(i+1)
> > x = [0 1 100 100 1000 10000 100000 1000000]
> > y = [pi pi pi pi pi pi pi pi];
> >
> > % g(x) := Integral of f, is linear piecewise
> > breaks = [-inf x inf];
> > coefs = zeros(length(x)+1,2);
> > dx = diff(x);
> > coefs(2:end,1) = y;
> > coefs(3:end,2) = cumsum(y(1:end-1).*dx);
> > pp = mkpp(breaks,coefs);
> > g = @(x) ppval(pp,x);
> >
> > % function to compute the average on window of length 100
> > win = 100
> > average= @(x) (g(x+win)-g(x))/win;
> >
> > % numerical error
> > average(0)-pi
> > average(1000000)-pi
> > %%%
> >
> > So as I said, for *very long* time period, you might need to split them in smaller intervals and carry out the average on each of them separately.
> >
> > % Bruno
>
>
> First I want to thank you for the useful information,
> I have another question related to the non uniform vectors; maybe you can help me with this,
> I have two vectors that are non-uniform sampled and I need to do a vector addition for those vectors,
> The rule is as before: y = y(i) for x(i) <= x < x(i+1)
>
> Vector1:
> Magnitude: vec1_t = [5 7 10 20 21 22 23]
> Sample_time: vec1_y = [0 1 0 2 0 1 0]
>
> Vector2:
> Magnitude : vec2_t = [5 6 9 18]
> Sample_time : vec2_y = [0 4 1 0]
>
> The Expected result:
> vec1 + vec2 :
> vec1_2_t = [5 6 7 9 10 18 20 21 22 23]
> vec1_2_y = [0 4 5 2 1 0 2 0 1 0]

pp1 = mkpp([vec1_t inf],vec1_y)
pp2 = mkpp([vec2_t inf],vec2_y)
vec_t_sum = union(vec1_t,vec2_t)
vec_y_sum = ppval(pp1,vec_t_sum)+ppval(pp2,vec_t_sum)

% Bruno

Tags for this Thread

No tags are associated with this thread.

What are tags?

A tag is like a keyword or category label associated with each thread. Tags make it easier for you to find threads of interest.

Anyone can tag a thread. Tags are public and visible to everyone.

Contact us