|
ImageAnalyst <imageanalyst@mailinator.com> wrote in message <e9d4bece-d445-48ef-8088-593b134fd7cd@k4g2000yqb.googlegroups.com>...
> Short on memory? Just how large are your arrays? What is "quite
> long"? Hundreds of megabytes? If they're only a few thousand
> elements long I probably wouldn't worry about it - it'll be lightning
> fast the way you have it plus it's readable, intuitive, and
> understandable. (Sometimes I find the "one liner" solutions so
> cryptic and hard to understand that it's not worth it in terms of code
> readability.)
I agree with you: complex one-liner vectorizations aren't always the way to go. In this case, however, a cleverer method is probably needed.
I typically have 1 to 10 million elements in the data vector*. The problem is the weighting factors, which can span many orders of magnitude (for example w=1e16...1e22). Even if I normalize the weighting factors so that the smallest one is unity, I still get a million copies of the data element with the largest weight (w=1...1e6).
Depending on the distribution of the weighting factors, the poor man's duplicated vector above can then have up to 1e6*1e6 elements, and 1e6*1e6 loop evaluations are needed to calculate them. So it's a problem of both memory and number-crunching power.
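To put a rough number on the memory side, here is a back-of-the-envelope sketch. It assumes the worst case, where every one of the 1e6 data points gets the maximum weight of 1e6, and that the duplicated vector is stored as double (8 bytes per element). The variable names nmax and nbytes are made up for this illustration.

```matlab
nd     = 1e6;            % number of data points
wmagn  = 1e6;            % largest normalized weight
nmax   = nd*wmagn;       % worst-case length of the duplicated vector
nbytes = 8*nmax;         % a double takes 8 bytes
fprintf('worst case: %g elements, %g TB\n', nmax, nbytes/1e12);
```

A real weight distribution would give fewer elements than this upper bound, but probably still far more than fits in memory.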
Here is a somewhat more realistic (and more vectorized) example of what I'm trying to do. With this method I'd need to use nd = 1e6 and wmagn = 1e6, which is just not possible in terms of memory or CPU power!
nd=1e3; % number of data points
wmagn=1e3; % max weighting magnitude
d=rand(nd,1); % my data
w=randn(nd,1); % weighting factors
w=round((w-min(w))/(max(w)-min(w))*wmagn); % scale to integers 0...wmagn
% create a new data vector holding w(i) copies of d(i)
dnew=zeros(sum(w),1); % preallocate instead of growing in the loop
n = 1;
for i=1:length(d)
a=n+w(i);
dnew(n:a-1)=d(i); % empty range when w(i)==0
n=a;
end
hist(dnew);
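One possible way to avoid building dnew at all, sketched here under assumptions rather than as a tested solution for the real data: sort each data point into a bin once and sum the weights per bin. The sketch uses histc to get the bin index and accumarray to add up the weights. The names nbins, edges, bin, counts and centers are made up for this illustration, and the 10 equal-width bins between min(d) and max(d) are only meant to resemble the default of hist. The result can differ slightly from hist(dnew) if the smallest or largest data point has zero weight.

```matlab
nd = 1e3;                                    % number of data points
wmagn = 1e3;                                 % max weighting magnitude
d = rand(nd,1);                              % data
w = randn(nd,1);                             % weighting factors
w = round((w-min(w))/(max(w)-min(w))*wmagn); % same scaling as in the loop version

nbins = 10;                                  % assumed number of bins
edges = linspace(min(d), max(d), nbins+1);   % equal-width bin edges
[cnt, bin] = histc(d, edges);                % bin(i) = index of the bin holding d(i)
bin(bin == nbins+1) = nbins;                 % histc puts d == edges(end) in an extra bin
counts = accumarray(bin, w, [nbins 1]);      % sum of weights per bin
centers = (edges(1:end-1) + edges(2:end))/2; % bin centers for plotting
bar(centers, counts, 1);
```

Memory use here scales with nd rather than with sum(w). Since accumarray just sums whatever values it is given, the rounding to integer weights should not be needed either, so the raw weights (1e16...1e22) could presumably be passed in directly.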
*) The data I'm dealing with are, for example, particle energies from a kinetic plasma simulation. The w factors are statistical weights of the simulation particles and I'm trying to calculate weighted (energy) spectra.
|