"Integration" over three circular areas in an efficient way; parfor already in use

Question

Bene Died on 4 Feb 2016

1
Link

Direct link to this question

https://www.mathworks.com/matlabcentral/answers/266584-integration-over-three-circular-areas-in-an-efficient-way-parfor-already-in-use

Closed: MATLAB Answer Bot on 20 Aug 2021

I'm trying to simulate image simulation in lithographic systems and was lucky to find the book "Computational Lithography" which provides some m-files as a base.

Now I'm trying to implement my own code and optimize it for CPU-parallel computing. After hours of (more less) trial and error I thought, I could try to ask somebody who's more acknowledged.

The basic task: I have three circular functions

1.) the illumination aperture J

2.) the microscope objective pupil and P1

3.) its complex conjugate P2

All of them have an arbitrary function within the circle. For getting the transfer-function I have to shift the objective function and its conjugate over the entire grid and calcullate the product of the three functions while the illumination stays in the centre. Shown in the figure (source: http://www.iue.tuwien.ac.at/phd/kirchauer/node62.html):

The result I get is correct so far. The code is as follows:

disp('integrate TCC');
N = size(P, 2);
midway=round((N+1)/2);
% generate Illumination aperture
TCC = zeros(N^2, N^2);
startTime = cputime;
limit_x = N^2;
limit_y = N^2;
% get integration boundaries
Po_binary = im2bw(abs(P)+1,1);  % Make sure, that even if the pupil is complex or negative, the entire radius is determined
radius_Po = round(sum(Po_binary(:,round(size(Po_binary,1)/2-1)))/2)+1;
radius_Pc = radius_Po * S;
parfor x=1:limit_x
    P_1=P;
    P_2=conj(P);
    for y=1:limit_y
          % diffraction orders in X-direction
          index_m=mod(x-1,N)+1-midway;
          index_p=floor((x-1)/N)+1-midway;
          % diffraction orders in Y-direction
          index_n=mod(y-1,N)+1-midway;
          index_q=floor((y-1)/N)+1-midway;
          % First integration boundary condition: Po&Po^* have to have an overlap bis Pc
          if( sqrt(index_m^2+index_n^2) > (radius_Po + radius_Pc) & sqrt(index_p^2+index_q^2) > (radius_Po + radius_Pc) )
              TCC(x,y)=0;
          elseif( sqrt((index_m-index_p)^2+(index_n-index_q)^2) > 2*radius_Po)
              TCC(x,y)=0;
          else
          if (index_m>0)
              P_1=[P(index_m+1:N,1:N);zeros(index_m,N)];
          else
              P_1=[zeros(abs(index_m),N);P(1:N+index_m,1:N)];
          end
          if (index_p>0)
              P_1=[P_1(1:N,index_p+1:N),zeros(N,index_p)];
          else
              P_1=[zeros(N,abs(index_p)),P_1(1:N,1:N+index_p)];
          end
          if (index_n>0)
              P_2=[P(index_n+1:N,1:N);zeros(index_n,N)];
          else
              P_2=[zeros(abs(index_n),N);P(1:N+index_n,1:N)];
          end
          if (index_q>0)
              P_2=[P_2(1:N,index_q+1:N),zeros(N,index_q)];
          else
              P_2=[zeros(N,abs(index_q)),P_2(1:N,1:N+index_q)];
          end
          TCC(x,y)=sum(sum(J.*P_1.*P_2));
          end
       end
      %disp(x);
  end

You can see, that I simply shift the pupils and sum up the product. I was wondering weather there can be made any further optimizations to speed it up. The result is a 51^4 Matrix which takes roughly 3 Minutes to calculate.

Any suggestions would give me great pleasure!