Force even distribution over logical

Hi.
I have a logical matrix C (230x312 logical) that represents locations in a country along a grid of longitudes and latitudes. true represents locations that are within the country's borders and false represents locations that aren't within the country's borders.
C looks something like this (scaled down):
C = [0 0 0 0 1 0 0 1 1 0 ;...
0 1 1 0 0 0 1 1 0 1;...
0 1 1 1 1 1 1 0 0 0;...
0 1 1 1 1 1 1 1 1 0;...
1 1 1 1 1 1 1 1 1 1;...
0 1 1 1 1 1 1 1 0 0;...
0 0 1 1 1 0 0 1 0 0;...
0 0 0 0 1 0 0 0 0 0;...
0 0 0 0 0 1 0 0 0 0];
In total, there are about 40000 locations within the borders.
I also have vectors, for example D (1x40000 single), containing data, where the elements of D correspond to find(C). So if
vC = find(C);
then
D(i) %contains data that corresponds to vC(i)
There are various vectors like D that contain longitudes, latitudes, and other statistical data. I am trying to pick around 1000 to 3000 locations that are as evenly distributed across the country as possible. Because the country in C is not a perfect square,
locs = vC(1:floor(length(vC)./40):end);
does not return an even distribution.
idx = randperm(40000,1000);
locs = vC(idx);
returns slightly better results. However, because there are so many locations to choose from, there are almost always clusters of locations that are too close together. I also tried using randsample() with a weight for each location (based on the number of locations to its east/west and north/south). But, again because of the large number of locations to choose from, this does not return better results than randperm().
The only other thing I can think of would be to use a while loop and calculate the distance between each location in locs using the longitudes and latitudes and keep the loop going until all locations are at least x kilometers apart and at most y km apart. I'm pretty sure that this would take forever though.
So, if anyone has a better/faster idea - I'd be forever in your debt!
Thanks!
P.S. It might also be worth noting that the prime factors of the actual total number of locations are [2] and [23063].
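
The distance-based idea from the question (accept a location only if it is far enough from those already chosen) can be sketched without an open-ended while loop. The following is a minimal sketch, not a tested solution for the real data: lon, lat, and dmin_km are hypothetical names, the grid is synthetic, and distances use an equirectangular approximation, which is an assumption that is only reasonable for a country-sized region away from the poles. It enforces the minimum spacing only, not the maximum spacing.

```matlab
% Sketch: greedy minimum-distance thinning.
% lon, lat, dmin_km are hypothetical names; the grid below is synthetic.
[lonG, latG] = meshgrid(linspace(5, 15, 60), linspace(47, 55, 50));
lon = lonG(:);                           % stand-in for the real longitude vector
lat = latG(:);                           % stand-in for the real latitude vector
dmin_km = 60;                            % assumed minimum spacing in km

kmPerDeg = 111.2;                        % approx. km per degree of latitude
x = lon .* cosd(mean(lat)) .* kmPerDeg;  % equirectangular approximation
y = lat .* kmPerDeg;

order = randperm(numel(x));              % visit candidates in random order
keep = zeros(1, numel(x));               % preallocated list of accepted indices
nKeep = 0;
for k = order
    kept = keep(1:nKeep);
    d2 = (x(kept) - x(k)).^2 + (y(kept) - y(k)).^2;
    if all(d2 >= dmin_km^2)              % all([]) is true, so the first point is accepted
        nKeep = nKeep + 1;
        keep(nKeep) = k;
    end
end
keep = keep(1:nKeep);                    % indices into lon/lat (and so into vC and D)
```

Each candidate is compared only against the locations accepted so far, so the work grows roughly with (number of candidates) x (number of accepted locations) rather than with all pairs. The number of accepted locations is controlled indirectly through dmin_km.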

Accepted Answer

Marc Jakobi on 7 Nov 2015
Never mind, I just found a solution:
N = 1000; %number of locations in sample
R = nnz(C)/numel(C); %ratio polygon/square that contains polygon (C is the logical mask from the question)
Ns = N./R; %number of locations needed in square
pol = single(C);
pol(C) = (1:length(D)); %indexes of data: pol(vC(i)) = i
pol = pol'; %this is because of the way matrices are indexed in MATLAB (column-major) and necessary in my case
pol2 = pol(:); %convert to vector
dv = floor(numel(C)./Ns); %approximate step needed
Slocs = pol2(1:dv:end); %sample from square that contains polygon (evenly distributed)
locs = Slocs(Slocs ~= 0); %locs contains approximately N indices into D (and vC), evenly distributed across the polygon in C
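
To see how the result is used, here is a small self-contained sketch of the same strided sampling on the scaled-down C from the question. The placeholder data in D and the names Dsample, rowS, and colS are assumptions made for illustration. Note that locs holds indices into D (and vC), not linear indices into C.

```matlab
% Sketch: strided sampling on the scaled-down mask from the question.
C = logical([0 0 0 0 1 0 0 1 1 0; ...
             0 1 1 0 0 0 1 1 0 1; ...
             0 1 1 1 1 1 1 0 0 0; ...
             0 1 1 1 1 1 1 1 1 0; ...
             1 1 1 1 1 1 1 1 1 1; ...
             0 1 1 1 1 1 1 1 0 0; ...
             0 0 1 1 1 0 0 1 0 0; ...
             0 0 0 0 1 0 0 0 0 0; ...
             0 0 0 0 0 1 0 0 0 0]);
vC = find(C);                         % linear indices of in-border locations
D = single(1:numel(vC));              % placeholder data, one value per location

N = 10;                               % desired sample size (approximate)
R = nnz(C) / numel(C);                % fraction of the grid inside the borders
Ns = N / R;                           % samples needed over the whole grid
pol = zeros(size(C));
pol(C) = 1:numel(vC);                 % pol(vC(i)) = i, zero outside the borders
pol2 = reshape(pol', [], 1);          % flatten row by row
dv = floor(numel(C) / Ns);            % stride over the flattened grid
Slocs = pol2(1:dv:end);
locs = Slocs(Slocs ~= 0);             % indices into D and vC

Dsample = D(locs);                               % data at the sampled locations
[rowS, colS] = ind2sub(size(C), vC(locs));       % their grid positions in C
```

The same locs can index any other vector that is aligned with vC, such as the longitude and latitude vectors.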
