# How to plot the average and std shade of 4 different datasets?

Afonso Silva on 25 Jul 2021
Commented: dpb on 28 Jul 2021
Hello, I'm trying to wrap my head around this problem, but I'm working with big datasets and I'm not too familiar with MATLAB's capabilities and functions.
I have 4 different datasets, with different (but similar) sizes, same starting and end points (x coordinate), but different values in the middle.
Let's say, something like this:
dataset1_x = [0,2,3,5,7,4,3,2]
dataset1_y = [30,40,50,66,55,45,40,30]
dataset2_x = [0,1,2,4,6,7,5,4,2]
dataset2_y = [30,42,48,57,59,60,55,45,32]
dataset3_x = [0,3,5,6,7,6,5,4,3,2]
dataset3_y = [30,35,40,45,50,55,64,50,40,35]
Also, as you can see from the example datasets, they represent a cycle, and I don't know if that's a problem. Actual graphic representation of my datasets below:
Instead of plotting each dataset, I would like to plot an average line, and add the standard deviation as a shade around it.
My first problem is the fact that the datasets are different. I thought about interpolating values, getting an approximated version of each dataset with the same size as the others, but I don't know how to do it consistently throughout each dataset.
My second problem would be how to plot it. But, if the first problem is solved, I think I could use this code:
x = 1 : 300;
curve1 = log(x);
curve2 = 2*log(x);
plot(x, curve1, 'r', 'LineWidth', 2);
hold on;
plot(x, curve2, 'b', 'LineWidth', 2);
x2 = [x, fliplr(x)];
inBetween = [curve1, fliplr(curve2)];
fill(x2, inBetween, 'g');
The idea would be to use the upper (average value + std) and lower (average value - std) limits as the 2 curves to fill in between.
Right? Or that wouldn't work?
Any help would be massively appreciated
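The fill idea in the question should carry over to a mean ± std band. Here is a minimal sketch, assuming the mean and standard deviation have already been computed on a common x grid. The `mu` and `sd` values below are made-up placeholders, not derived from the datasets, and the `FaceAlpha` and `EdgeColor` settings are optional styling choices.

```matlab
% Placeholder inputs (assumptions): a common grid x, a mean curve mu and a
% standard deviation sd, all row vectors of the same length.
x  = linspace(0, 7, 50);
mu = 30 + 4*x;
sd = 2 + 0.5*x;

upper = mu + sd;   % upper edge of the band
lower = mu - sd;   % lower edge of the band

figure
% Trace the upper edge left to right, then the lower edge right to left,
% so that the polygon closes on itself.
fill([x, fliplr(x)], [upper, fliplr(lower)], 'g', ...
     'FaceAlpha', 0.3, 'EdgeColor', 'none');
hold on
plot(x, mu, 'k', 'LineWidth', 2);   % mean line drawn on top of the band
hold off
```

Drawing the band first and the mean line second keeps the line visible on top of the shaded area.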

dpb on 25 Jul 2021
I'm rushed for time, so this isn't a complete solution, but here's the outline of how to proceed...
[mxx,ix1]=max(dataset1_x); % get the max of dataset x value and index to first
mnx=min(dataset1_x); % and the minimum x value
x=linspace(mnx,mxx); % compute an x vector over the range
y1=interp1(dataset1_x(1:ix1),dataset1_y(1:ix1),x,'pchip'); % interpolate the outbound region
plot(dataset1_x,dataset1_y)
hold on
...
Then continue with the return section: interpolate it at the values of x that fall within that section's own range, so it is sampled at matching points as well.
You can then do an average and stddev of those matching-length vectors to produce the average values overall.
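As a rough illustration of that averaging step, the sketch below interpolates only the outbound leg of the three example datasets onto one grid and then takes the mean and standard deviation across curves. The variable names `Yout`, `mu` and `sd` are introduced here for illustration, and the common range 0 to 7 is an assumption that holds for the example data only.

```matlab
dataset1_x = [0,2,3,5,7,4,3,2];       dataset1_y = [30,40,50,66,55,45,40,30];
dataset2_x = [0,1,2,4,6,7,5,4,2];     dataset2_y = [30,42,48,57,59,60,55,45,32];
dataset3_x = [0,3,5,6,7,6,5,4,3,2];   dataset3_y = [30,35,40,45,50,55,64,50,40,35];

x = linspace(0, 7);   % common grid (100 points by default)

% Index of the turning point (largest x) in each dataset
[~, i1] = max(dataset1_x);
[~, i2] = max(dataset2_x);
[~, i3] = max(dataset3_x);

% One row per dataset, all sampled at the same x values
Yout = [interp1(dataset1_x(1:i1), dataset1_y(1:i1), x, 'pchip');
        interp1(dataset2_x(1:i2), dataset2_y(1:i2), x, 'pchip');
        interp1(dataset3_x(1:i3), dataset3_y(1:i3), x, 'pchip')];

mu = mean(Yout, 1);     % average across datasets at each x
sd = std(Yout, 0, 1);   % sample standard deviation (N-1) at each x
```

With only three or four curves the standard deviation is a coarse estimate, so the band is best read as indicative.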
Putting the two pieces together at the same points overall could be done by using a fixed dx instead of a fixed number of points as in linspace. The complication with your datasets as above is that the ending position is not back at the origin, so there's nothing between the last return point and the origin.
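A fixed-step grid could look something like the sketch below, shown for the first example dataset only. The step `dx = 0.5` is an arbitrary assumption, and the names `xOut`, `xRet`, `xr` and `yr` are introduced for illustration. The return leg is flipped into increasing order before interpolating and is only evaluated over the range it actually covers.

```matlab
dataset1_x = [0,2,3,5,7,4,3,2];
dataset1_y = [30,40,50,66,55,45,40,30];

dx = 0.5;                                   % assumed step size
[mxx, ix1] = max(dataset1_x);               % turning point of the cycle

% Outbound leg: from the minimum x up to the turning point
xOut = min(dataset1_x):dx:mxx;
yOut = interp1(dataset1_x(1:ix1), dataset1_y(1:ix1), xOut, 'pchip');

% Return leg: flip so that x is increasing, then use the same step
xr = fliplr(dataset1_x(ix1:end));
yr = fliplr(dataset1_y(ix1:end));
xRet = xr(1):dx:xr(end);                    % stops where the data stop
yRet = interp1(xr, yr, xRet, 'pchip');
```

Because both grids share the same step and start on a multiple of it, every return-leg point lines up with an outbound point, while the stretch below the last return point is simply left empty.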
Now thinking about it, it might be simpler with a spline interpolant than with interp1; I believe (although I didn't check to confirm) that it will take the x vector as it is, whereas interp1 cannot have any duplicated values and must be ordered.
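The spline suggestion is left unverified in the thread. As far as I know, the documented requirement for both interp1 and spline is distinct sample points, so a full cycle that revisits x values may need different handling. One alternative, which is not from the thread, is to treat the cycle as a parametric curve: interpolate x and y separately against a parameter that always increases. The sketch below uses normalised cumulative path length as that parameter. The names `px`, `py`, `t`, `tq`, `xq` and `yq` are introduced here, and mixing x and y units in the path length is an assumption that may call for rescaling on real data.

```matlab
px = [0,2,3,5,7,4,3,2];             % dataset1_x from the question
py = [30,40,50,66,55,45,40,30];     % dataset1_y from the question

% Cumulative distance travelled along the curve, normalised to [0, 1].
% It is strictly increasing as long as consecutive points differ.
t = [0, cumsum(hypot(diff(px), diff(py)))];
t = t / t(end);

tq = linspace(0, 1, 200);           % common parameter grid for every dataset
xq = interp1(t, px, tq, 'pchip');   % x as a function of t
yq = interp1(t, py, tq, 'pchip');   % y as a function of t
```

Repeating this for each dataset on the same `tq` gives equal-length vectors for the whole cycle, which could then be averaged point by point.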
Hopefully that'll get you started...I gotta' run!
PS:
For coding ease, I'd suggest converting the sequentially named variables to a cell array so you can write looping expressions over the number of curves, instead of having to write out each variable name explicitly, which means duplicating the same code over and over...
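A sketch of that cell-array layout, applied here to the return leg of the three example datasets. The names `X`, `Y`, `Yret`, `muRet` and `sdRet` are introduced for illustration, and the common range 2 to 7 is an assumption that holds because all three example return legs end at x = 2.

```matlab
% One cell per curve instead of dataset1_x, dataset2_x, ...
X = {[0,2,3,5,7,4,3,2], [0,1,2,4,6,7,5,4,2], [0,3,5,6,7,6,5,4,3,2]};
Y = {[30,40,50,66,55,45,40,30], [30,42,48,57,59,60,55,45,32], ...
     [30,35,40,45,50,55,64,50,40,35]};

xq = linspace(2, 7);                      % grid shared by all return legs
Yret = zeros(numel(X), numel(xq));        % one row per curve

for k = 1:numel(X)
    [~, ik] = max(X{k});                  % turning point of curve k
    xr = fliplr(X{k}(ik:end));            % return leg, x made increasing
    yr = fliplr(Y{k}(ik:end));
    Yret(k,:) = interp1(xr, yr, xq, 'pchip');
end

muRet = mean(Yret, 1);                    % average return curve
sdRet = std(Yret, 0, 1);                  % spread across curves
```

Adding a fourth dataset then only means appending one more cell to `X` and `Y`; the loop body stays the same.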
dpb on 28 Jul 2021
Kewl...glad to be able to help.

R2021a
