# How to compute the average path from scattered data (and its variance)?


I have a set of 2-dimensional vectors, each containing the positions of a robot during one experiment. The vectors have different sizes. I would like to compute a path that corresponds to the average of all the paths (and possibly its variance). The idea is to represent the result of all the experiments in a compact way.

I found a function, but I don't understand how it works. Is there a solution to my problem? I tried padding all the vectors so that they have the same size and then using mean(), but the results are poor.

Here is a plot of some of the paths I want to average:

##### 0 Comments

### Answers (4)

William Rose
on 27 Sep 2021

This is a good question which arises in slightly different forms in a wide array of problems. One example: how to average an ensemble of GPS position recordings, each of which corresponds to the same course. Another (from my work): find the average hip or knee angle (in 3D), over one stride, when you have a recording of a person walking for many strides on a treadmill, and the stride lengths and durations vary somewhat.

I assume the recordings do not all have the same number of points.

Resample each vector to have 1000 elements, using interp1(), then average them with mean().

This is probably the best approach in the absence of other information.
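A minimal sketch of the resample-then-average idea, assuming each path is stored as an N-by-2 matrix of [x y] positions in a cell array. The example data, the variable names (`paths`, `resampled`, `meanPath`, `varPath`), and the choice of normalized sample index as the interpolation parameter are illustrative assumptions, not something taken from the question.

```matlab
% Hypothetical example data: three paths with different numbers of samples.
% Each path is an N-by-2 matrix of [x y] positions.
paths = { [linspace(0,10,50)'  linspace(0,5,50)'], ...
          [linspace(0,10,80)'  linspace(0,5,80)' + 0.5], ...
          [linspace(0,10,65)'  linspace(0,5,65)' - 0.5] };

nResample = 1000;                      % common number of points per path
tq = linspace(0, 1, nResample)';       % normalized index: 0 = start, 1 = end
resampled = zeros(nResample, 2, numel(paths));

for k = 1:numel(paths)
    P = paths{k};
    t = linspace(0, 1, size(P, 1))';   % normalized index of the raw samples
    % interp1 interpolates each column of P (x and y) linearly
    resampled(:, :, k) = interp1(t, P, tq);
end

meanPath = mean(resampled, 3);         % nResample-by-2 average path
varPath  = var(resampled, 0, 3);       % per-point variance of x and of y
```

Interpolating over the normalized sample index treats "fraction of the recording completed" as the common axis. If the robot's speed varies a lot between trials, interpolating over normalized cumulative arc length instead may align the paths better.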

The most obvious "other information" would be a time stamp for each position. If your data is sampled at uniform intervals, then the array index is a time stamp. If you want the average position at each time, and if the recordings are of varying length, then pad at the beginning with the initial position, or pad at the end with the final position, to get vectors of uniform length, then average them. You said you did something like this and the results were bad. But I don't know how you padded. A different padding choice could help.
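Here is a rough sketch of the pad-at-the-end option, assuming uniform sampling and paths stored as N-by-2 matrices in a cell array. The tiny example paths and the variable names are made up for illustration.

```matlab
% Hypothetical example: two short paths of different length, [x y] per row.
paths = { [(0:4)'  zeros(5,1)], ...
          [(0:2)'  ones(3,1)] };

nMax = max(cellfun(@(P) size(P, 1), paths));   % length of the longest recording
padded = zeros(nMax, 2, numel(paths));

for k = 1:numel(paths)
    P = paths{k};
    nPad = nMax - size(P, 1);
    % Repeat the final position so the robot "waits" at its end point
    padded(:, :, k) = [P; repmat(P(end, :), nPad, 1)];
end

meanPath = mean(padded, 3);   % average position at each time step
```

Padding with the final position keeps the shorter paths parked where they ended. Padding with zeros (or any other constant) would instead pull the average toward that constant, which is one possible reason a padded average can look poor.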

Adam Danz
on 27 Sep 2021

Edited: Adam Danz
on 27 Sep 2021

It looks like the following assumptions can be made:

- The robot starts at the same location
- There is a uniform temporal sampling interval.

If this is true, why not just average the (x,y) coordinates across all trials?

Alternatively, you could compute the 2D density of (x,y) values using histcounts2 and then use those data to compute the path of highest density within a 2D grid. This approach would require lots of repetitions (more than what is shown in your sample image).
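One possible way to sketch the density idea is shown below. The synthetic data, the bin edges, and the "pick the densest y bin for each x bin" rule are all assumptions made for illustration; that rule only makes sense when the paths advance roughly monotonically in x.

```matlab
rng(0);                                   % reproducible synthetic data
nTrials = 200;
x = repmat(linspace(0, 10, 100)', 1, nTrials);
y = 0.5*x + 0.3*randn(size(x));           % hypothetical noisy paths along y = 0.5x

% 2D histogram of all (x,y) samples pooled over trials
xEdges = linspace(0, 10, 21);
yEdges = linspace(-2, 7, 46);
[N, xEdges, yEdges] = histcounts2(x(:), y(:), xEdges, yEdges);

xCenters = (xEdges(1:end-1) + xEdges(2:end)) / 2;
yCenters = (yEdges(1:end-1) + yEdges(2:end)) / 2;

% Rows of N correspond to x bins; take the y bin with the most counts in each
[~, iMax] = max(N, [], 2);
ridgeY = yCenters(iMax);                  % estimated y of highest density per x bin

figure;
imagesc(xCenters, yCenters, N'); axis xy; hold on
plot(xCenters, ridgeY, 'w', 'LineWidth', 2);
```

The ridge estimate is quantized to the bin centers, so finer bins need more repetitions to stay stable.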

##### 2 Comments

Adam Danz
on 28 Sep 2021

William Rose
on 28 Sep 2021

Since you said you do not care about time, it is OK to normalize all the paths to have the same number of steps. Here is a script that

- Generates 8 random paths, each with a different number of steps
- Computes 8 paths with the same number of steps, by interpolation
- Computes the mean ± SD path
- Plots the mean path and a 1-SD ellipse at 20 points along the path

The script generates the plots below. Each run produces different semi-random walks. Good luck.
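The attached script is not reproduced in this thread. The steps listed above could be sketched roughly as follows; this is an independent illustration, not the original script, and the random-walk parameters and variable names are assumptions.

```matlab
rng(1);                                   % reproducible example
nPaths = 8;  nPts = 200;
tq = linspace(0, 1, nPts)';               % common normalized index
X = zeros(nPts, nPaths);  Y = zeros(nPts, nPaths);

for k = 1:nPaths
    n = randi([60 120]);                  % different number of steps per path
    steps = [0.1 + 0.05*randn(n-1,1), 0.05*randn(n-1,1)];  % drift in +x plus noise
    raw = [0 0; cumsum(steps, 1)];        % n-by-2 path starting at the origin
    XY = interp1(linspace(0, 1, n)', raw, tq);   % resample to nPts points
    X(:, k) = XY(:, 1);  Y(:, k) = XY(:, 2);
end

meanX = mean(X, 2);   meanY = mean(Y, 2);     % mean path
sdX   = std(X, 0, 2); sdY   = std(Y, 0, 2);   % per-point SD of x and y

figure; hold on; axis equal
plot(X, Y, 'Color', [0.7 0.7 0.7]);           % interpolated individual paths
plot(meanX, meanY, 'k', 'LineWidth', 2);      % mean path

theta = linspace(0, 2*pi, 100);
for i = round(linspace(1, nPts, 20))          % 20 points along the path
    C = cov(X(i, :), Y(i, :));                % 2-by-2 covariance across paths
    [V, D] = eig(C);
    % Map the unit circle to the 1-SD ellipse; max() guards against
    % tiny negative eigenvalues caused by round-off
    ell = V * sqrt(max(D, 0)) * [cos(theta); sin(theta)];
    plot(meanX(i) + ell(1, :), meanY(i) + ell(2, :), 'r');
end
```

Using the full covariance matrix, rather than sdX and sdY alone, lets each ellipse tilt when the x and y deviations are correlated.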

##### 3 Comments

William Rose
on 29 Sep 2021

William Rose
on 29 Sep 2021

@Alberto Bacchin, here's another plot you can make, if you use the mean-of-the-normalized-paths approach (pathSimulateAndAverage.m). The plot shows the raw paths, the mean path, and the 90% and 95% confidence regions.

The script pathSimulateAndAverage.m makes the paths and the plot above. It calls plotFilledEllipse.m (attached). Thanks to @Star Strider for providing the basis of plotFilledEllipse().

The script calls chi2inv() from the Statistics and Machine Learning Toolbox. If you do not have that toolbox, replace chi2inv(ci,2) with 4.605, 5.991, or 9.210 for a desired confidence level (ci) of 0.90, 0.95, or 0.99, respectively.
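For two degrees of freedom the chi-square distribution has the closed-form CDF 1 - exp(-x/2), so its inverse can be computed without the toolbox for any confidence level. The sketch below checks the three constants above and shows how the value would scale a confidence ellipse, assuming the (x,y) deviations are roughly bivariate normal. The covariance matrix `C` is a made-up example.

```matlab
ci = [0.90 0.95 0.99];
chi2val = -2 * log(1 - ci);    % same as chi2inv(ci, 2), valid for 2 degrees of freedom only
disp(chi2val)                  % approximately 4.605  5.991  9.210

% Hypothetical covariance of (x,y) at one point along the path
C = [4 0; 0 1];
[V, D] = eig(C);

% Semi-axis lengths of the 95% confidence ellipse:
% sqrt(chi-square quantile * eigenvalue of the covariance)
semiAxes95 = sqrt(chi2val(2) * diag(D));

% Points on the 95% ellipse, centered at the origin
theta = linspace(0, 2*pi, 100);
ell95 = V * diag(semiAxes95) * [cos(theta); sin(theta)];
```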

##### 0 Comments
