"Very" high dimensional convex hulls

I am trying to generate points that are uniformly distributed over a convex hull in a high dimensional space.
What I have as input is a set of N points in D dimensions. I want to sample uniformly over the convex hull of this set of points. The best solution that I have found so far is to use Matt Jacobson's "vert2lcon" (file 30892 on the FEX) to convert from a vertex representation to a linear constraint representation (i.e., inequalities defining the facets of the hull), and then to use Tim Benham's "cprnd" (file 34208 on the FEX) to sample uniformly over the hull.
This works great for a few dimensions; however, the hulls I am interested in are very high dimensional, usually somewhere between 30 and 60 dimensions. When I try to use vert2lcon on all D ~ 30 dimensions, MATLAB crashes. The reason is that vert2lcon constructs the convex hull of my points by calling "convhulln", which in turn calls "Qhull", and the crash report says the error happens inside the mex function for "Qhull". My guess is that it can't handle that many dimensions.
Any suggestions?
I have thought of trying to take a subset of the dimensions at a time, but I'm not sure that would work. (Consider the example of a sphere: if you take the one-dimensional orthogonal projections and compute their convex hulls, you get three segments parallel to the coordinate axes. The 3D convex hull of these convex hulls is an octahedron, so it definitely doesn't reproduce the sphere.)

7 Comments

It seems like the ideal solution would be one in which the convex hull does not explicitly need to be computed. One idea is to generate samples uniformly over a hypercube and throw out the ones that are outside of the convex hull without actually computing the hull itself. But since I don't know a priori which points are on the hull without constructing it, I'm not sure how to do this either.
One idea that I've come across is using the separating axis theorem. My naive approach would be: for each test point, (1) project it onto all N*(N-1)/2 vectors given by the differences of the N potential points defining the hull; (2) test whether the projection of the test point overlaps with the projection of the N hull points; (3) if it does, check the next vector; otherwise the point is outside of the hull, so break. The only problem is that for each point that falls inside the hull I will need to test all N*(N-1)/2 vectors (potential separating axes). That means that if I want 1e3 points inside the hull I will have to do at least 1e3*N*(N-1)/2 tests, so with N ~ 1000 that makes O(5e8) tests. If the projection and testing procedure is cheap enough maybe this can be practical, but I'm not sure yet how to do it efficiently. Ideas? Or am I barking up the wrong tree?
Maybe there's an intelligent way to pick which vectors to test (all you really need to test are the normals to the facets of the hull, but since I don't want to construct the hull explicitly...)?
Edit: to use the separating axis theorem you have to test the normals to the facets of the hull, but since I don't want to construct the hull, I don't have the facets. You can't project onto the edges as I suggested above; at the least, that will incorrectly classify points that are close to the hull but outside of it. I'm stuck again...
I found a post here http://stackoverflow.com/a/14142517/1181402 that is relevant. The answer about finding a separating hyperplane using SVM seems promising, but I don't understand how to use it, or how it could be any faster than what Matt J proposed below.
Here's what was said "The problem of finding a separating hyperplane can be transformed to a convex quadratic programming problem (see: SVM). I did this in python using cvxopt with less then 170 lines of code (including I/O)."
But how is the problem transformed into a convex QP problem?
I did this in python using cvxopt with less then 170 lines of code (including I/O)
Does this mean you already have a solution that you like? What is the status of the problem?
No, no, that was just what someone else posted at stackoverflow (I was quoting their post). I have no idea how they did it. I'm still just as stuck as ever. It just seemed that the method they suggested was promising perhaps.


 Accepted Answer

Matt J
Matt J on 29 May 2013
Edited: Matt J on 29 May 2013
One idea is to generate samples uniformly over a hypercube and throw out the ones that are outside of the convex hull without actually computing the hull itself. But, since I don't know which points are on the hull apriori without constructing it I'm not sure how to do this either.
You could do that by solving the QP (e.g., using lsqlin):
min d(w) = norm(V.'*w-y).^2
s.t. sum(w)=1
w>=0
where y is the point you want to test for inclusion in the convex hull. If the optimal d(w) is zero, then y is in the hull.
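For anyone wanting to try this outside MATLAB: the same membership test can be cast as a pure feasibility problem (find w >= 0 with sum(w) = 1 and V.'*w = y). Here is a Python/SciPy sketch of that reformulation; it is an equivalent test, not a translation of the lsqlin code above:

```python
import numpy as np
from scipy.optimize import linprog

def in_hull(V, y):
    """Is y in the convex hull of the rows of V?

    Feasibility LP: find w >= 0 with V.T @ w == y and sum(w) == 1.
    """
    n = V.shape[0]
    A_eq = np.vstack([V.T, np.ones((1, n))])  # stack V.T @ w = y and sum(w) = 1
    b_eq = np.concatenate([y, [1.0]])
    res = linprog(c=np.zeros(n), A_eq=A_eq, b_eq=b_eq,
                  bounds=[(0, None)] * n, method="highs")
    return res.status == 0                    # status 0 means a feasible w exists

# Example: unit square in 2D
V = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
print(in_hull(V, np.array([0.5, 0.5])))   # True: inside
print(in_hull(V, np.array([2.0, 2.0])))   # False: outside
```

As with the QP, boundary points are at the mercy of floating-point tolerances; only clearly interior or exterior points are classified reliably.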
You haven't mentioned how big N is so I can't know in advance how practical this would be for you.

4 Comments

To be honest, I'm still trying to figure out what a good value of N is. It needs to be at least 1e3, and could be as high as 1e6. But somewhere between 1e3 and 1e4 is probably about right.
I tried your idea on 3D data and it works like a charm! However, when I applied it to my actual 34-dimensional data, lsqlin was too slow (and the number of points one has to try before finding a point inside the convex hull is likely enormous in 34 dimensions). In my application it took ~10 sec per point, and in the test case I ran, 0 of 100 points were successful.
I'm not terribly familiar with the optimization toolbox or QP in general, is there a chance that another solver or some different options from the defaults would be faster?
If more details would be useful, in my application V.' is a 34-by-1000 matrix, so w is 1000-by-1 (obviously a lot of work for a solver).
How are you initializing lsqlin? I would recommend you find the closest V(i,:) to y,
[~,imin]=min(sum(bsxfun(@minus,V.',y).^2))
Then set w(imin)=1 and all other w(i) to 0.
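The same warm start, sketched in Python/NumPy terms with made-up V and y (just to make the indexing concrete):

```python
import numpy as np

V = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])  # rows are hull points
y = np.array([0.9, 0.2])                             # test point

imin = np.argmin(((V - y) ** 2).sum(axis=1))  # index of the nearest hull point
w0 = np.zeros(len(V))
w0[imin] = 1.0                                # all initial weight on that point
print(imin)   # 1: the point (1, 0) is closest to (0.9, 0.2)
```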
Unfortunately it took just as long with this modification. Here's the code, with fabricated data points in the columns of "c_temp":
nsamples = 1e2;
npts = 1e3;
ndims = 34;
c_temp = rand(ndims,npts);
%---set up QP problem---%
Aeq = ones(1,npts);
beq = 1;
lb = zeros(npts,1);
options = optimset('LargeScale','off'); %system is underdetermined, so a warning is thrown and it defaults to medium scale anyway
temp_samples = zeros(ndims,0);
N = 0;
while N < nsamples
%---sample hypercube---%
samples = rand(ndims,nsamples);
%---keep the ones inside of the hull without explicitly computing it---%
d = zeros(nsamples,1);
for i = 1:nsamples
y = samples(:,i);
[~,imin] = min(sum(bsxfun(@minus,c_temp,y).^2));
x0 = zeros(npts,1);
x0(imin) = 1;
tic
x = lsqlin(c_temp,y,[],[],Aeq,beq,lb,[],x0,options);
toc
d(i) = norm(c_temp*x-y).^2; %distance from closest point inside of the hull, C*x, to the test point, y.
end
in = (d < 1e-10); %use a tolerance: d is computed in floating point and is essentially never exactly zero
temp_samples = [temp_samples,samples(:,in)];
%---how many samples have we kept so far---%
N = size(temp_samples,2);
end
temp_samples = temp_samples(:,1:nsamples);


More Answers (2)

Here's another possibility based on linear programming and separating plane ideas. Define
A=bsxfun(@minus,V,y(:).'); %rows of V are the vertices; y is the test point
f=sum(A,1);
A separating hyperplane exists if there is a vector, x satisfying
A*x<=0
f*x<0
which can be found by running a linear program solver with objective f*x and constraint A*x<=0 until a feasible solution satisfying f*x<0 is reached. This would probably require just one iteration of LINPROG, starting at the initial feasible point x0=0 and using the 'active-set' algorithm option.
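A sketch of this test in Python with scipy.optimize.linprog, adding box bounds -1 <= x <= 1 so the LP stays bounded when y is outside the hull (bounds are also recommended in a later comment). This is my translation of the idea, not the original code:

```python
import numpy as np
from scipy.optimize import linprog

def outside_hull(V, y, tol=1e-9):
    """Separating-hyperplane test: y is strictly outside conv(rows of V)
    iff some x with A @ x <= 0 (rows of A are V[i] - y) achieves f @ x < 0,
    where f is the sum of the rows of A."""
    A = V - y
    f = A.sum(axis=0)
    res = linprog(c=f, A_ub=A, b_ub=np.zeros(len(V)),
                  bounds=[(-1.0, 1.0)] * V.shape[1], method="highs")
    return res.fun < -tol

V = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
print(outside_hull(V, np.array([2.0, 2.0])))   # True: a separating x exists
print(outside_hull(V, np.array([0.5, 0.5])))   # False: only x = 0 is feasible
```

For an interior point, the only x satisfying A @ x <= 0 is x = 0, so the optimum is exactly 0; for an exterior point, the normal of any separating hyperplane gives a feasible x with f @ x < 0.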

11 Comments

Oliver
Oliver on 31 May 2013
Edited: Oliver on 31 May 2013
The first constraint says that the projections onto x of all the vectors from the test point to the hull points need to be non-positive. I'm on board so far.
But I'm unsure what the second constraint, "f*x<0", means. What does your variable "f" represent physically?
Matt J
Matt J on 31 May 2013
Edited: Matt J on 31 May 2013
But I'm unsure what the meaning of the second constraint "f*x<0" means?
It's mainly to ensure that you find a solution x which is not zero. The point x=0 is always a trivial solution to A*x<=0 and does not tell you anything about separability.
Choosing f=sum(A,1) ensures that not only will x satisfy A*x<=0, but also that A(i,:)*x<0 (strict inequality) for at least one i, which should always be satisfied by any separating hyperplane as long as the convex hull is solid in R^D.
I'm assuming you know your convex hull to be solid, i.e., not confined to a hyperplane in R^D. Otherwise, the probability of landing in it when sampling from the larger hypercube will be zero.
Ahhh, I see. Brilliant!
Yes, I believe it is solid (though your comment made me double-check: am I correct in assuming that if the matrix containing the hull points has full rank, then it is solid?).
Your suggestion works brilliantly in 3D (I validated it against John D'Errico's "inhull" function from the FEX), and the speed appears adequate for 34D: I tested 1e6 points in 34D in about 10 min using a parfor loop and 4 cores. However, none of the 1e6 points was found to be contained in the hull. I suppose it is possible that the convex hull of the points is very small, or very close to being a plane or line (nearly coplanar or collinear points), but is there any way to test for that? (Something like being "almost" not full rank?)
am I correct in assuming that if the matrix containing the hull points has full rank then it is solid?
Not exactly. You need to test that the rank of this matrix is D:
Q=bsxfun(@minus,V(2:end,:), V(1,:));
The point being subtracted V(1,:) is arbitrary. It can be any of the hull points. BTW, there are occasionally issues with Matlab's RANK function, so you're taking your chances by using it. It uses funny thresholding strategies. I use a different rank calculation approach in VERT2LCON.
I suppose it is possible that the convex hull of the points is very small, or very close to being a plane or line (nearly coplanar or colinear points), but is there any way to test for that?
I think you need to work in a transformed space where your hypercube fits as snugly as possible around the hull. How about this: manipulate your hull samples as follows
X=bsxfun(@minus,V(2:end,:), V(1,:)).';
[r,idx,T]=lindep(X); %a sub-function in lcon2vert.m
if r<D, error 'Convex hull is not solid'; end
X=(T\X).';
cube_bounds=[min(X).',max(X).'];
Now, apply the sampling method to the transformed hull data, X, and using the hypercube bounded by the data in cube_bounds. You can untransform your samples at the end if you want,
samples=bsxfun(@plus, samples*T.', V(1,:));
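A rough Python equivalent of this transformation, substituting an SVD for the lindep sub-function (my substitution, not the original helper), and keeping all rows of V:

```python
import numpy as np

rng = np.random.default_rng(0)
D = 3
V = rng.random((50, D))                  # rows are hull points

X0 = V - V[0]                            # shift one hull point to the origin
U, s, Vt = np.linalg.svd(X0, full_matrices=False)
r = int((s > 1e-10 * s[0]).sum())        # numerical rank of the shifted cloud
if r < D:
    raise ValueError("Convex hull is not solid")
X = X0 @ Vt.T                            # rotate the cloud onto its principal axes
cube_bounds = np.column_stack([X.min(axis=0), X.max(axis=0)])  # snug box
# samples drawn inside this box map back via: samples @ Vt + V[0]
print(r, cube_bounds.shape)
```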
Why do you remove V(1,:), rather than just subtracting it off? I.e. why use:
X=bsxfun(@minus,V(2:end,:), V(1,:)).';
instead of:
X=bsxfun(@minus,V, V(1,:)).';
Presumably, one point won't change the hull very much...but it might if you happen to pick the right one.
You're right. You should leave V(1,:) in there.
I'm starting to have doubts that this hypercube idea is going to work, though :-(
The problem is that the ratio between the volume of a polyhedron and its bounding hypercube can get exponentially smaller in higher dimensions. You could need a vast number of samples to have any reasonable chance of landing in the polyhedron in 34D. You can test this by trying a solid unit simplex
V=eye(D+1,D)
for larger and larger D and with the unit hypercube bounding it. I'll bet you'll see your percentage of hits fall off exponentially fast. Here's more evidence
s=@(D) eye(D+1,D);
v=1;
for D=2:20;
[~,v(D)]=convhulln(s(D));
end
plot(v)
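The collapse can also be written down in closed form: the solid unit simplex in R^D has volume 1/D!, while its bounding unit hypercube has volume 1, so the expected hit rate of rejection sampling falls off factorially. A quick Python check:

```python
import math

# Hit probability of uniform unit-hypercube sampling against the unit simplex.
for D in (3, 10, 34):
    print(D, 1.0 / math.factorial(D))
```

At D = 34 the hit probability is on the order of 1e-39, so no feasible number of samples would land inside.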
I'm beginning to suspect the same thing, unfortunately. I tried it with and without the transformation, and both result in 0 out of 1e6 trials landing in the convex hull. Hmmm...
There is one strange thing, though. Just for kicks I tried to see if the method would work on points that I know a priori are inside the hull; in particular, I used the points that define the hull as my samples, and in 34D it said that none of the points that define the hull is inside of it! What is going on? I tried the same thing in 3D and found that, while most of the points that define the hull were correctly identified as belonging to it, not all of them were!
%---generate random points---%
ndims = 3;
npts = 1000;
V = rand(npts,ndims);
%---fake samples---%
nsamples = npts;
samples = V;
%---LP algorithm for finding separating hyperplane axis---%
in = false(nsamples,1);
options = optimset('algorithm','active-set','largescale','off','display','off');
C = V;
x0 = zeros(ndims,1);
for i = 1:nsamples
y = samples(i,:);
A = -bsxfun(@minus,y,C);
b = zeros(npts,1);
f = sum(A,1).';
x = linprog(f,A,b,[],[],[],[],x0,options);
if all(x == x0)
in(i) = true;
end
end
Matt J
Matt J on 1 Jun 2013
Edited: Matt J on 3 Jun 2013
in particular I used the points that define the hull as my samples, and in 34D it said that none of the points that define the hull are inside of it! What is going on?
Only points strictly in the interior of the polyhedron will fail the separation criterion f*x<0.
Also, one can only ever hope to distinguish between points that are either strictly in the interior of the polyhedron or strictly in the exterior. Boundary points can never be reliably tested by any method because floating point errors can make them look either slightly inside or slightly outside the polyhedron.
BTW, you should also be setting MaxIter=1. You probably also need some upper and lower bounds, e.g. lb=-ones(D,1), ub=-lb.


Taking random convex combinations of the vertices V(i,:), as in the following, definitely won't work?
W=rand(M,N);
W=bsxfun(@rdivide,W,sum(W,2));
samples=W*V; %rows of V are the vertices.
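The concentration effect discussed in the comments can be checked numerically: with N = 1000 vertices, the normalized-uniform weights all hover around 1/N, so every sample lands within a few hundredths of the centroid. A Python sketch with made-up data:

```python
import numpy as np

rng = np.random.default_rng(0)
N, D, M = 1000, 3, 2000
V = rng.random((N, D))                      # N vertices in D dimensions
W = rng.random((M, N))
W /= W.sum(axis=1, keepdims=True)           # M random convex-combination weights
samples = W @ V
centroid = V.mean(axis=0)
print(np.abs(samples - centroid).max())     # tiny versus a hull diameter of ~1
```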

6 Comments

Hi Matt. No, taking random convex combinations does not work. The samples are all concentrated tightly around the "centroid" of the convex hull. I think, fundamentally, the reason it doesn't work is that the convex hull is a general convex polytope, not a simplex. Here's some code to show what I mean:
%---generate random points---%
V = rand(20,3);
%---get convex hull---%
k = convhulln(V);
ktess = k;
k = unique(k(:));
%---sample random convex combinations---%
nsamples = 1e3;
nverts = numel(k);
u = rand(nsamples,nverts); %uniform samples
p = bsxfun(@rdivide,u,sum(u,2));
%---map to xyz coordinates---%
samples = p*V(k,:);
%---plot---%
figure
h = trisurf(ktess,V(:,1),V(:,2),V(:,3));
set(h,'edgecolor',[0.9 0 0.1],'edgealpha',0.5,'facecolor','r','facealpha',0.5)
hold on
plot3(samples(:,1),samples(:,2),samples(:,3),'k.')
axis equal tight vis3d
box off
grid off
xlabel('x')
ylabel('y')
zlabel('z')
Matt J
Matt J on 3 Jun 2013
Edited: Matt J on 3 Jun 2013
Can you elaborate on why it is important that the sampling be uniformly distributed? In R^30, it would take on the order of 1e30 samples to uniformly cover a solid convex hull. You are taking far fewer samples than that.
Oliver
Oliver on 3 Jun 2013
Edited: Oliver on 3 Jun 2013
The application has to do with mapping from one space to another. Each of the dimensions of my convex hull corresponds to a coefficient of a generalized Fourier series. Call these the "c" coefficients. In reality the space is infinite dimensional, but I'm looking at just the first ~30 independent, non-degenerate, lowest-order terms. The allowable values of the coefficients define a convex hull in the infinite-dimensional Fourier space (and all orthogonal projections are also convex, which is why the first 3 or 30 dimensions taken alone also define a convex hull).
I have another set of Fourier coefficients, call them the "t" coefficients, which, under certain assumptions, can be expressed as functions of the "c" coefficients. I am trying to look at the unrestricted convex hull of the "t" coefficients in their Fourier space and compare it to the convex hull of the "t" coefficients computed from the "c" coefficients. The latter hull is smaller and fully contained within the unrestricted hull. I want to quantify how much smaller it is and what it "looks" like.
My idea was to sample uniformly over the convex hull of the "c" coefficients in their Fourier space, calculate the corresponding "t" coefficients for each of these points, then compute the convex hull and compare. (For comparison I'm looking at low-dimensional orthogonal projections, but the mapping from c -> t requires many of the c coefficients; i.e., a single t coefficient may depend on the values of 30 c coefficients, which is why I need to sample over a 30-dimensional projection of the full infinite-dimensional c hull.)
So, to answer your question: in reality I don't need statistical uniformity for this to work. I just need to make sure that I've sampled from the entire c-hull. Taking random convex combinations does not accomplish this because all of the samples land near the center, so I don't get anything near the boundary. A uniform distribution would just be the most efficient way to fill the hull with points, but any other kind of distribution that puts some samples all over the hull would be acceptable.
Matt J
Matt J on 3 Jun 2013
Edited: Matt J on 3 Jun 2013
This convex hull is smaller and fully contained within the unrestricted hull. I want to quantify how much smaller and what it "looks" like.
Even if you could find the t-space convex hull, how would you quantify its size and shape? Is it of lower dimension than the c-space hull? CONVHULLN could normally be used to measure its volume, but if it is high-dimensional like the c-space hull, then you will have the same problem running it through CONVHULLN as in c-space.
On the other hand, if the set in t-space lies in a significantly lower-dimensional manifold, then the best approach might be to sample t-space instead. You could solve an inverse problem to see if a sample point in t-space maps from the c-space convex hull and discard it otherwise. To evaluate this approach, it would help to see the form of the c-->t mapping. Is it linear, for example?
The mapping is non-linear, and is of the form:
t(m) = sum(alpha(i,j,k).*c(i).*c(j).*c(k))
where t is the vector of t-coefficients, c is the vector of c-coefficients, and alpha is a constant depending on the indices i, j, k. In other words, each t coefficient is a linear combination of products of three c coefficients.
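For concreteness, here is a Python sketch of such a cubic map. The alpha values are random placeholders standing in for the problem's actual constants, and I've given alpha an explicit output index m, which the formula above leaves implicit:

```python
import numpy as np

rng = np.random.default_rng(0)
nc, nt = 6, 5                            # number of c and t coefficients
alpha = rng.random((nt, nc, nc, nc))     # placeholder constants
c = rng.random(nc)

# t(m) = sum over i, j, k of alpha(m,i,j,k) * c(i) * c(j) * c(k)
t = np.einsum('mijk,i,j,k->m', alpha, c, c, c)
print(t.shape)   # (5,)
```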
It is true, quantifying the size and shape is difficult. I was hoping that one might get a qualitative feel for their relative sizes by looking at the ratios of the volumes of the convex hulls of t and c, in various 3D orthogonal projections.
From your suggestions and our discussion, it appears that the best solution to this whole issue is to avoid dealing with 30+ dimensions in the first place. I think I've found a way to do that. The first 5 t coefficients depend only on 6 c coefficients, so I can sample from an orthogonal projection of c-space that is only 6-dimensional and follow the procedure above. This makes explicit construction of the convex hull feasible, and lets me use your vert2lcon function and cprnd to sample without doing rejection sampling. 6D still takes a while (2 days to do 1e6 samples on my machine), but it's definitely doable.
Thank you so much for your time and suggestions, they really helped me attack the problem from different angles, which ultimately I think was what I needed to do. I wish I could just accept our whole conversation as an answer, but being unable to, I just selected one of your (many) helpful answers.
OK, glad you found a solution!



Asked on 29 May 2013
