How to use "retime" to resample data with NaN rows

I have two sets of data, each 1000 rows by 10 columns. Each set is stored in a timetable and includes categorical data. I would like to use "retime" to down-sample one set (from 1 second to 30 seconds) and up-sample the other (from 5 minutes to 30 seconds). Both sets contain a block of rows that are NaN, and I would like retime to skip those NaN rows. If I call retime directly on the data, the data gets interpolated.
Is there a way to do that?

 Accepted Answer

For down-sampling, pass retime a function handle to an aggregation function that can ignore NaNs, such as “median” called with the “omitnan” flag. For example, with dt set to the target time step (here, seconds(30)):
downsampledData = retime(data,'regular', @(x) median(x,'omitnan'),'TimeStep',dt); 
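To see what the "omitnan" flag does in practice, here is a minimal sketch on synthetic data. The timetable, the variable name Temp, and the placement of the NaNs are all made up for illustration, and only a numeric variable is used, since median would likely not apply to non-ordinal categorical variables.

```matlab
% Synthetic 1-second timetable (hypothetical data, not the asker's file).
t = datetime(2022,1,1,0,0,0) + seconds(0:119)';
Temp = (1:120)';
Temp(11:20) = NaN;   % partial gap inside the first 30-second bin
Temp(31:60) = NaN;   % the second 30-second bin is entirely missing
data = timetable(t, Temp);

dt = seconds(30);    % target time step
% Aggregate each 30-second bin with a median that ignores NaNs.
downsampledData = retime(data, 'regular', @(x) median(x,'omitnan'), 'TimeStep', dt);
disp(downsampledData)
```

In this sketch a bin that is only partly missing still gets a median from its valid samples, while a bin with no valid samples stays NaN rather than being filled in.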
For up-sampling, the main idea is to find the blocks of rows in the upsampled result (the 5-minute data resampled to 30 seconds) that should be all missing, remove those rows, and then recreate them as missing data. Most of the code goes into finding those blocks, but it is simple and fast. It might be just as easy to find each block's start and stop and loop over them, but the code below is vectorized.
The complication is that you want to interpolate up to the time of the first row in each block of missing data and resume interpolating immediately after the time of the last row in each block. That rules out using "next" or "previous" interpolation alone to mark rows in the upsampled data, because you need one at the end and the other at the beginning of each block.
Please check the code below:
% load data.mat
% upsample from 5 minutes to a 30-second time step, interpolating across missing data
dt = minutes(.5);
upsampledData = retime(data,'regular','nearest','TimeStep',dt);
% find blocks of missing data in original data
mask = isnan(data.Temp); % assumes entire rows are missing
starts = [mask(1); (diff(mask)>0)];
stops = [(diff(mask)<0); mask(end)]; % a block that runs to the last row stops there
% mark corresponding blocks in upsampled data
upsampledData.mask = zeros(height(upsampledData),1);
upsampledData.mask(data.Datetime(starts & ~stops)) = 1; % leave singleton blocks alone
upsampledData.mask(data.Datetime(stops & ~starts)) = -1;% leave singleton blocks alone
upsampledData.mask = cumsum(upsampledData.mask);
upsampledData.mask(data.Datetime(stops)) = 1; % mark singleton blocks as missing
% delete upsampled rows that should be missing, recreate them with missing data
upsampledData(upsampledData.mask==1,:) = [];
upsampledData.mask = [];
upsampledData = retime(upsampledData,'regular','fillwithmissing','TimeStep',dt);
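For comparison, the loop-based alternative mentioned above (find each block's start and stop, then loop over them) might look like the sketch below. This is not the answer's code: it swaps in linear interpolation over the valid rows only, then blanks out the upsampled rows that fall inside each missing block. The synthetic timetable is hypothetical; the names Datetime and Temp are reused only to mirror the code above, and rows are assumed to be entirely missing whenever Temp is NaN.

```matlab
% Synthetic 5-minute timetable (hypothetical data) with one interior gap.
Datetime = datetime(2022,1,1,0,0,0) + minutes(0:5:55)';
Temp = (1:12)';
Temp(5:7) = NaN;                 % rows at 00:20, 00:25 and 00:30 are missing
data = timetable(Datetime, Temp);

dt = seconds(30);
mask = isnan(data.Temp);         % assumes entire rows are missing

% Interpolate using only the valid rows, which temporarily bridges the gap.
upsampledData = retime(data(~mask,:), 'regular', 'linear', 'TimeStep', dt);

% First and last row index of each block of missing rows in the original data.
starts = find(diff([false; mask]) > 0);
stops  = find(diff([mask; false]) < 0);

% Set every upsampled row whose time falls inside a missing block to NaN.
for k = 1:numel(starts)
    inBlock = upsampledData.Datetime >= data.Datetime(starts(k)) & ...
              upsampledData.Datetime <= data.Datetime(stops(k));
    upsampledData.Temp(inBlock) = NaN;
end
```

One design difference to keep in mind: in this sketch the rows between the last valid sample and the first missing sample are interpolated across the gap, so it is only a reasonable substitute if that behavior is acceptable for your data.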

Release

R2022a
