Efficiently Averaging Large Sets: Store now, Average later vs. Update now

I am wondering if it is more efficient to store large amounts of data and average later or to update the average as data is generated and reduce the storage requirement.
I am looping over the number of samples, S. All vectors and matrices are of unique dimension and are organized in cells. The result of each loop iteration is N new vectors and N new matrices.
Sooner or later, I need to average S column vectors N times, and S matrices N times.
I see two fundamental approaches to this problem:
1) Create N vectors and N matrices and update them every loop iteration to reflect the average:
$A_n = A_{n-1} + \frac{x_n - A_{n-1}}{n}$, where n is an indexing term, A represents the average, and $x_n$ represents the new entry.
So this would require storing 2N arrays and performing 2NS of these calculations.
2) Store NS vectors and NS matrices, and then call the MATLAB function mean() 2N times to recover the averages.
So this would require storing 2NS arrays, and the mean() function would be called 2N times.
My gut tells me 2NS calculations is a losing proposition, but I wonder if someone with more expertise than me could comment? Thank you!
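For concreteness, the two approaches can be sketched side by side as below. This is only a minimal sketch under assumptions: the update in approach 1 is taken to be the standard incremental mean, the matrices are assumed square, and S, dims, and every variable name are hypothetical stand-ins rather than the actual data.

```matlab
% Minimal sketch comparing both approaches on stand-in data.
% S, dims, and all variable names are hypothetical.
S = 1000;                 % number of samples (loop iterations)
dims = [3 5 8];           % assumed sizes, one per vector/matrix pair
N = numel(dims);

runVec = cell(1, N);  runMat = cell(1, N);      % approach 1: running averages
allVec = cell(1, N);  allMat = cell(1, N);      % approach 2: full storage
for k = 1:N
    runVec{k} = zeros(dims(k), 1);
    runMat{k} = zeros(dims(k));
    allVec{k} = zeros(dims(k), S);              % one column per sample
    allMat{k} = zeros(dims(k), dims(k), S);     % one page per sample
end

for n = 1:S
    for k = 1:N
        x = n * ones(dims(k), 1);               % stand-in for the real vector
        M = n * ones(dims(k));                  % stand-in for the real matrix
        % Approach 1: incremental mean, A_n = A_{n-1} + (x_n - A_{n-1}) / n
        runVec{k} = runVec{k} + (x - runVec{k}) / n;
        runMat{k} = runMat{k} + (M - runMat{k}) / n;
        % Approach 2: store now
        allVec{k}(:, n) = x;
        allMat{k}(:, :, n) = M;
    end
end

% Approach 2: average later, one mean() call per stored array
batchVec = cell(1, N);  batchMat = cell(1, N);
for k = 1:N
    batchVec{k} = mean(allVec{k}, 2);           % average across columns
    batchMat{k} = mean(allMat{k}, 3);           % average across pages
end
```

Both approaches should agree up to floating-point rounding. To see which is faster on real data, each half could be wrapped in tic/toc and run separately.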
  2 Comments
Aghamarsh Varanasi on 24 Mar 2020
Which of the approaches you mentioned to choose depends entirely on your design and other constraints. For example, if you are developing the algorithm for a system with memory limitations, you could use the first approach. If you want less computation, you can use the second approach.
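To put rough numbers on the memory side of that trade-off, here is a back-of-the-envelope sketch. It assumes double-precision data and square matrices, counts raw data only (no cell array overhead), and uses hypothetical values for S and dims.

```matlab
% Rough memory estimate for both approaches (hypothetical sizes).
S = 1000;                               % number of samples
dims = [3 5 8];                         % assumed sizes, one per vector/matrix pair
bytesPerDouble = 8;

elemsPerSample = sum(dims + dims.^2);   % vector plus matrix elements per sample
bytesRunning = bytesPerDouble * elemsPerSample;   % approach 1: averages only
bytesStored  = bytesRunning * S;                  % approach 2: every sample kept

fprintf('Approach 1: %d bytes, approach 2: %d bytes\n', bytesRunning, bytesStored);
```

With these stand-in sizes the stored data is S times larger than the running averages, so the second approach only becomes a memory problem once S or the dimensions grow large.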
Mark Rzewnicki on 25 Mar 2020
Ah, the dreaded "it depends." I figured as much, just looking for different insights. Thanks!

Answers (0)

Release

R2019a