Mean (with confidence bands) for signals sampled at different timepoints

I have ratings from multiple subjects, which I'd like to average and display with confidence bands to show cross-subject variability, more or less like in this random plot I found in some paper (where 3 separate conditions are shown):
Once I have the mean and confidence intervals at each timepoint, I figured I could then plot those either with the ciplot Exchange function, or with something like:
fill(x,y_CI, colours, 'FaceAlpha',0.2, 'LineStyle','-');
The problem is that each subject's rating curve is sampled at different time points; for instance, one subject might have ratings at [0.5s, 1.1s, ..], and another at [0.2s, 0.6s, ..].
I could manually define a linspace with 100ms-or-so bins and compute an average + CI for each bin. But this seems hacky, and I'm sure a simpler, more elegant way exists, although I found no built-in Matlab function for it. One answer to this question suggests defining and synchronising timetables, but this probably doesn't work in my case as the total number of ratings differs across subjects.
Any help would be greatly appreciated!

11 Comments

"one subject might have ratings at [0.5s, 1.1s, ..], and another at [0.2s, 0.6s, ..]"
I presume those are both from a time zero for each subject so one can't time-shift each to a new origin by subtracting first point...
Is the observation value at the origin something known so can set that value?
Is this a continuous measurement or ...???
About all one could hope to do is some sort of interpolation to put the signals on a common time basis; how much faith one could place in this is something we have no way to judge, but MATLAB has tools to do it in a trivially simple manner -- the timetable and retime.
However, I would add the caution that just because you can do something doesn't mean that you should.
Thanks for your reply!
"I presume those are both from a time zero for each subject so one can't time-shift each to a new origin by subtracting first point..." - you are correct. The time-shift has already been done, all subjects start at time=0.
"Is the observation value at the origin something known so can set that value?" - yes, sorry to not have mentioned. The starting value is always 50.
The y axis is a continuous measurement, yes (for all intents and purposes).
And as for the present dataset, I think interpolation would be fine. However, from the documentation it's not very clear to me what the difference would be between interpolating with retime as opposed to with synchronize.
OK, that's good enough description.
What I forgot to ask is what form the existing data are in, and where they live, as the starting point for processing. Just how/which tools/tricks to use can be significantly different, depending...
In general, I'd probably still go the timetable route; either synchronize or retime would be able to do the job of interpolating; the difference is that retime works within a given timetable while synchronize matches one to another.
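To make that distinction concrete, here is a minimal sketch. The two timetables, their variable names (s1, s2) and all values are made up for illustration; only timetable, retime and synchronize are real MATLAB functions. The grid is kept inside the time range of both signals because the interpolation methods extrapolate at the ends by default.

```matlab
% Two hypothetical subjects rated at irregular times (values are made up).
tt1 = timetable(seconds([0; 0.5; 1.1; 1.8]), [50; 55; 62; 60], ...
                'VariableNames', {'s1'});
tt2 = timetable(seconds([0; 0.2; 0.6; 1.5; 2.0]), [50; 48; 45; 52; 58], ...
                'VariableNames', {'s2'});

% Common 100 ms grid, kept within the range covered by both subjects.
newTimes = seconds(0:0.1:1.5)';

% retime: resample ONE timetable onto the new row times.
r1 = retime(tt1, newTimes, 'linear');

% synchronize: resample SEVERAL timetables onto the same row times
% and concatenate them side by side into one timetable.
both = synchronize(tt1, tt2, newTimes, 'linear');
```

With the same grid and method, both.s1 should match r1.s1; synchronize simply does the resampling and the merge in one call, so the differing number of ratings per subject is not an obstacle.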
I also realize I'm not sure about the comment above of " the total number of ratings differs across subjects" -- are we computing means by subject or across subjects? -- either is doable, but just need to know precisely what it is we're after here.
As per usual, it would be a lot easier to provide specific code if you could attach a couple of the sample files to have something to poke at that actually fits the real problem...
"What I forgot to ask is what form/where are the existing data for starting point to process? Just how/which tools/tricks to use can be significantly different, depending..." - Not sure I get that, sorry. I have the data as an indexed structure with fields, the starting point being e.g. song(1).subject(1).xTime(1), with y value stored in song(1).subject(1).ySlider(1). I attach the mat file here.
Each subject gave a different number of ratings for the same piece, e.g. one may have 90 samples and another 100. The mean we are computing should be across subjects, for any given window of time. I guess defining 100ms or narrower bins would be fine, I just want to prevent a lot of if and for clauses..
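One loop-light way to get the across-subject mean and band is to interpolate every subject onto a shared grid and then average along the subject dimension. The sketch below is only an illustration under stated assumptions: the data are synthetic stand-ins shaped like song(1).subject(k).xTime / .ySlider, linear interpolation is assumed to be acceptable, and the band is a normal-approximation 95% interval (1.96 standard errors).

```matlab
% Synthetic stand-in for song(1).subject(k): irregular times, ratings start at 50.
rng(0);                                        % reproducible made-up data
nSubj = 5;
subject = struct('xTime', cell(1, nSubj), 'ySlider', cell(1, nSubj));
for k = 1:nSubj
    n = 80 + randi(20);                        % different number of ratings per subject
    t = [0; cumsum(0.05 + 0.1*rand(n, 1))];    % strictly increasing times (s)
    y = [50; 50 + cumsum(3*randn(n, 1))];      % random-walk ratings
    subject(k).xTime   = t;
    subject(k).ySlider = min(max(y, 0), 100);  % keep inside the 0-100 range
end

% Common 100 ms grid, limited to the time span covered by every subject.
tEnd  = min(arrayfun(@(s) s.xTime(end), subject));
tGrid = (0:0.1:tEnd)';

% Interpolate each subject onto the grid (one column per subject).
Y = zeros(numel(tGrid), nSubj);
for k = 1:nSubj
    Y(:, k) = interp1(subject(k).xTime, subject(k).ySlider, tGrid, 'linear');
end

% Across-subject mean and 95% band (normal approximation, an assumption).
m   = mean(Y, 2);
sem = std(Y, 0, 2) / sqrt(nSubj);
lo  = m - 1.96*sem;
hi  = m + 1.96*sem;

% Shaded band plus mean line.
figure; hold on
fill([tGrid; flipud(tGrid)], [lo; flipud(hi)], 'b', ...
     'FaceAlpha', 0.2, 'LineStyle', 'none');
plot(tGrid, m, 'b', 'LineWidth', 1.5);
xlabel('Time (s)'); ylabel('Rating');
```

With only a handful of subjects, a t-based multiplier such as tinv(0.975, nSubj-1) (Statistics and Machine Learning Toolbox) would give a wider and more honest band than 1.96.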
Wowsers!!! What a mess!!!
What is the rating we're trying to do something with? And what does time have to do with ratings along with all the other variables?
How did you get this array of struct created -- where did the data for that come from? I'd probably go back to that point and build a more suitable data storage scheme that matched up with what was trying to do more easily instead.
These are all ratings of perceived tension in a musical piece that subjects listened to. The data was stored in JSON files that I converted to cell arrays using fread and jsondecode.
But in what way is my variable structure suboptimal, and what can be improved? I indeed had a feeling I was making things more complicated, but (since I'm no programmer!..) wasn't really sure how, and how I can store, manipulate and plot variables in a more straightforward way.
I think I would have to see how the data you retrieved were stored initially to be able to judge that -- and get an idea of just what it is that you really need to keep (do you really need all the stuff, or are you just interested in some subset but saved it all "just because"?).
So, they're "rating" the piece while listening to it, I gather? Which is why there are various times/ratings?
Sorry, was trying to keep this simple to aid with answering, but I evidently made this harder to understand instead:) Thanks a lot for your patience.
I attach one CSV (JSON) file here.
Basically, subjects were rating 2 different stimuli, each in 3 different conditions (paying attention to various aspects of the stimulus). For the purposes of my question, we can just take one subject, for any of the stimuli, in any one given condition.
OK, so I see from whence came the struct array -- the jsondecode routine does its best although it fails on some subfields it appears.
But, I'm totally at sea in knowing what any of it means (if anything) and what has bearing on what you think you want to try to do with it...
"we can just take one subject, for any of the stimuli, in any one given condition..."
There's only a "Condition" variable for structs whose rt ("real time" in some units, maybe?) is -1, which doesn't seem to fit the above supposition. There's something called "conditions" in a vector; what, if anything, does that have to do with the other, and which is the one you're after? And if it has to do with the vector, what does that correlate to, and how does it relate to anything else? It's an absolute mess that only its mother could love...
Haha, well at least this thread taught me a funny new expression in English, if nothing else :)
Don't worry about all other details, I'm just trying to get signals resampled or aligned that were originally sampled at different timepoints.
Using `synchronize` with spline gave me artefactual interpolations, with y values that go way outside of the 0-100 range. So I am now trying retime on each individual signal, to force it into a 100ms sampling rate.
"Using `synchronize` with spline gave me artefactual interpolations, with y values that go way outside of the 0-100 range."
Yes, not too surprising with a spline if the data have points of inflection; the spline fits a piecewise cubic through the points -- it will go through each point identically, but is not constrained in between. The cubic spline also matches first and second derivatives at the breakpoints, and that smoothness is exactly what lets it overshoot between samples.
I would have to see what the particular datasets you're looking at look like to have any specific recommendations other than the obvious(?) one of using 'linear' interpolation between the existing data.
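The overshoot is easy to reproduce on a small example. The sample values below are made up (plateaus with abrupt jumps inside a 0-100 range) and interp1 is used directly rather than retime or synchronize, so treat this as a sketch of the effect rather than of the actual data. 'pchip' is included as a shape-preserving alternative that stays within the range of the data while being smoother than 'linear'.

```matlab
% Made-up ratings with plateaus and abrupt jumps, all within 0-100.
t  = [0 1 2 3 4 5]';
y  = [50 50 100 100 0 0]';
tq = (0:0.05:5)';                          % 50 ms query grid

ySpline = interp1(t, y, tq, 'spline');     % smooth, but free to overshoot
yLinear = interp1(t, y, tq, 'linear');     % straight lines between samples
yPchip  = interp1(t, y, tq, 'pchip');      % shape-preserving piecewise cubic

% Report the range each method produces.
fprintf('spline: [%.1f, %.1f]\n', min(ySpline), max(ySpline));
fprintf('linear: [%.1f, %.1f]\n', min(yLinear), max(yLinear));
fprintf('pchip : [%.1f, %.1f]\n', min(yPchip),  max(yPchip));
```

The spline result leaves the 0-100 range around the jumps, whereas the linear and pchip results do not. Both 'linear' and 'pchip' are also accepted as method names by retime and synchronize.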

Answers (0)

Products

Release

R2019b

Asked:

on 11 Aug 2022

Commented:

dpb
on 14 Aug 2022
