Why is replacing datetimes in a large array slow?
I've got a large set of data for which I know the time at every 100th point. To get the other points, I'm trying to interpolate between the known times using linspace. While the linspace command itself seems to be quite fast, replacing the datetime values in my initialized array gets slower the larger the array is: doubling the array size also doubles the writing time.
A 30,000,000-point array takes 0.24 seconds to overwrite only 100 points. This seems way too much to me.
Why is the writing time proportional to the array size? And more importantly: how can I reduce this run time?
I have checked that running the linspace and writing to ans is sub-millisecond fast, so it really has to do with writing into a large array.
short_length = 300000;
random = rand(short_length,1)/1001; % random time shift
DAQ_PC_datetime_short = datetime('now') + seconds((0:0.001:0.001*(short_length-1))' + random); % generate fictive times
DAQ_PC_datetime = NaT(length(DAQ_PC_datetime_short)*100, 1); % initialize array
DAQ_PC_datetime(100:100:100*length(DAQ_PC_datetime_short)) = DAQ_PC_datetime_short; % set known values
% DAQ_PC_datetime = DAQ_PC_datetime'; % sizing
DAQ_PC_datetime(1:99) = DAQ_PC_datetime(100) + seconds(-0.099:0.001:-0.001); % extrapolate at start
for n = 100:100:100*length(DAQ_PC_datetime_short)-1 % interpolate in the middle
    tic
    DAQ_PC_datetime(n:n+100) = linspace(DAQ_PC_datetime(n), DAQ_PC_datetime(n+100), 101); % linear interpolation
    toc
end
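For comparison, a vectorized sketch of the same interpolation that avoids writing into the big datetime array chunk by chunk: it interpolates numeric second offsets with interp1 and builds the full datetime vector in a single call, so the large array is only written once. The helper names (known_idx, known_sec, all_idx, all_sec) are purely illustrative, and this is an untested sketch of the idea rather than code from the question.
% Sketch: interpolate on numeric seconds once, then construct the datetime array in one call
known_idx = (100:100:100*length(DAQ_PC_datetime_short))'; % positions of the known times
known_sec = seconds(DAQ_PC_datetime_short - DAQ_PC_datetime_short(1)); % known times as numeric second offsets
all_idx   = (1:100*length(DAQ_PC_datetime_short))'; % positions of all points
all_sec   = interp1(known_idx, known_sec, all_idx, 'linear', 'extrap'); % interpolate/extrapolate the offsets
DAQ_PC_datetime_vec = DAQ_PC_datetime_short(1) + seconds(all_sec); % single write into a datetime array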
17 Comments
Eike Blechschmidt
on 9 Aug 2021
I have a datastore that reads chunks of data from the hard disk. On top of that I have a second datastore which analyses the buffered data and returns parts of it (e.g. one period/cycle of a really long sine wave). For each period/cycle I do an analysis as shown above. So I allocate memory for the number of cycles and not for the amount of data I expect (around 1000x the number of cycles). Accordingly, I iterate over the total number of periods/cycles present in the data. I just saw that the code above does not match the timing I provided, as I separated the mean part of the datetime section into a separate call and stored it in a temporary variable. The tic was then placed after the read call and before the two datetime calls. I'm sorry, I'm on my phone so I could not write it as code again. I will post it tomorrow.
Eike Blechschmidt
on 10 Aug 2021
...
timing = zeros(1000000, 3); % preallocate timing array
while hasdata(ds)
    i = i + 1;
    [data, info] = read(ds);
    tic();
    T_sum = T_sum + sum(data.T); % accumulate (renamed from "sum" to avoid shadowing the built-in)
    num = num + numel(data.T);
    p(i) = mean(data.p);
    timing(i, 1) = toc();
    tic();
    val = mean(data.Timestamp(info.Max)); % info.Max changes every read and always contains just two indices
    timing(i, 2) = toc();
    tic();
    t(i) = val;
    timing(i, 3) = toc();
end
...
After running the code I just summed up all the timings to get the total time spent, as posted before.
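A minimal sketch of that summing step (assuming i is the final loop counter and timing is the matrix from the snippet above):
total_per_step = sum(timing(1:i, :), 1); % total seconds spent in each of the three timed sections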
Answers (0)