Is there a bug in Multidimensional Input to mhsample?

Question

2 votes

I was trying to sample from a multidimensional pdf using `mhsample` but was getting the error

Index exceeds matrix dimensions.
Error in mhsample (line 174)
    x0(acc,:) = y(acc,:); % preserves x's shape.

However when I edited line 174 to

x0(:,acc) = y(:,acc); % preserves x's shape.

The function appears to have worked.

Some sample code to try:

start = zeros(1,40);
delta = .5;
proppdf = @(x,y) unifpdf(y-x,-delta,delta);
proprnd = @(x) x + rand(1,40)*2*delta - delta;
pdf_log = @(x) sum(log(mvnpdf(x, zeros(1,40), eye(40))));
smpl = mhsample(start,100,'logpdf',pdf_log,'proppdf',proppdf, 'proprnd',proprnd);

Is this a bug in the multidimensional implementation of mhsample or have I missed something?

0 Comments
Show -2 older comments Hide -2 older comments

Sign in to comment.

Sign in to answer this question.

Follow Question

Answer 1

Tom Lane on 17 Dec 2014

Open in MATLAB Online

0 votes

It looks like there's something that could be improved. The error message was no help. I think you want to specify a multivariate proposal distribution, not the univariate uniform one you have. Maybe this:

proppdf = @(x,y) prod(unifpdf(y-x,-delta,delta),2);

1 Comment
Show -1 older comments Hide -1 older comments

Erik Hauri on 23 Jan 2018

Its not about proppdf - Robert suggests a row-column mixup. (:,acc) instead of (acc,:) which makes sense since acc is given in the 2nd column when working in only 1 dimension (start is a scalar).

Sign in to comment.

Answer 2

Erik Hauri on 23 Jan 2018

0 votes

I have observed the same. Taking the 2nd example in the documentation (sampling data from a standard normal distribution), after making your modification to mhsample - if start and delta are provided as row vectors, then the calculation works and results in x are given as column vectors.

0 Comments
Show -2 older comments Hide -2 older comments

Sign in to comment.

Answer 3

Don Mathis on 24 Jan 2018

Open in MATLAB Online

0 votes

In the posted code, the problem is that the proposal distribution function 'proppdf' does not return a scalar probability density as required. Instead it returns a vector.

According to the documentation (<http://www.mathworks.com/help/stats/mhsample.html)>,

" proppdf takes two arguments as inputs with the same type and size as start. ... The proposal distribution q(x,y) gives the probability density for choosing x as the next point when y is the current point."

In the posted code, ‘start’ has size [1,40]. If I pass two vectors of that size to proppdf I do not get the required scalar. Instead I get a vector:

>> rng(0);
>> start = rand(1,40);
>> proppdf(start, 2*start)
ans =
  Columns 1 through 17
     1     1     0     1     0     1     0     0     1     1     0     1     1     1     1     1     0
  Columns 18 through 34
     0     0     1     1     0     1     1     0     0     1     0     1     0     0     1     0     0
  Columns 35 through 40
     1     1     1     1     0     0

proppdf was defined like this:

proppdf = @(x,y) unifpdf(y-x,-delta,delta);

But unifpdf is a univariate pdf, and interprets “y-x” as a row vector of scalar x’s, not a single multivariate x.

Maybe you intended to combine the 40 components of x to get the multivariate uniform density. If the components are independent, that’s just the product. If we do that, we get valid sampling:

Here is an editied version of the code that draws 1000 samples and plots the chain:

start = zeros(1,40);
delta = .5;
proppdf = @(x,y) prod(unifpdf(y-x,-delta,delta)); 
proprnd = @(x) x + rand(1,40)*2*delta - delta; 
pdf_log = @(x) sum(log(mvnpdf(x, zeros(1,40), eye(40)))); 
smpl = mhsample(start,1000,'logpdf',pdf_log,'proppdf',proppdf, 'proprnd',proprnd);
figure;
plot(smpl);
title('1000 samples');

We can also verify that the proposal distribution was used correctly. We can see that no two consecutive points differed by more than 0.5 on any component, which seems to be the intent of the proposal distribution:

>> max(max(abs(diff(smpl))))
ans =
    0.5000

0 Comments
Show -2 older comments Hide -2 older comments

Sign in to comment.

Answer 4

Erik Hauri on 24 Jan 2018

Open in MATLAB Online

0 votes

No, I think the original intention (and this was my intention) was that x and y were vectors of scalars, and that proppdf should thus produce a vector of the same dimension as x. In the above application each element of the vector 'start' would be an initial value for the MH routine, with its corresponding values of y (and proppdf, proprnd, etc). In this way, proppdf (and thus unifpdf) should not produce a single scalar, but rather a vector of scalars corresponding the value of its associated array element in 'start'.

The code below I think shows that in unifpdf(y-x,A,B) the function unifpdf properly identifies the quantity (y-x) as the difference between elements of the vectors y and x.

y=1:0.1:1.9; >> x=0.1:0.1:1; >> A=0.1*ones(1,10); >> B=2:1:11; >> unifpdf(y-x,A,B)

ans =

    0.5263    0.3448    0.2564    0.2041    0.1695    0.1449    0.1266    0.1124    0.1010    0.0917

>> xx=x'; >> yy=y'; >> AA=A'; >> BB=B'; unifpdf(yy-xx,AA,BB)

ans =

Using prod(unifpdf(......)) produces a single scalar, not a vector as required for a multidimensional MH routine.

1 Comment
Show -1 older comments Hide -1 older comments

Don Mathis on 25 Jan 2018

Open in MATLAB Online

Ok, are you saying that you would like to run 40 univariate chains in a single call to mhsample? You're sampling from a univariate distribution? If so, here is how you can do that:

% Run concurrent univariate chains:
nsamples = 1000;
nchain = 40;                                        % Number of chains to run.
start = zeros(nchain,1);                            % A column vector of starting points.
delta = .5;
proppdf = @(x,y) unifpdf(y-x,-delta,delta);         % This must return a column vector.
proprnd = @(x) x + rand(nchain,1)*2*delta - delta;  % This must return a column vector.
pdf_log = @(x) log(normpdf(x, 0, 1));               % This must return a column vector.
smpl = mhsample(start,nsamples,'logpdf',pdf_log,'proppdf',proppdf, 'proprnd',proprnd, ...
                'nchain', nchain);                  % Pass 'nchain' to run multiple concurrent chains.
% smpl is a 3D array of size [nsamples nvars nchain]
size(smpl)
% Plot chains 13 and 25 on the same plot
figure;
hold on
plot(smpl(:,1,13),'r');
plot(smpl(:,1,25),'b');
legend('Chain 13','Chain 25')
% Plot all chains on the same plot
figure;
plot(squeeze(smpl));
title('All chains');

Some notes:

1. The 'start' argument needs to have one row per chain. Since we're sampling from a univariate distribution, 'start' has only one column.

2. Each of the 3 function handles must return a column vector (one value per chain).

3. You need to pass the 'nchain' argument to mhsample.

Do I understand correctly that this is what you're looking for?

Sign in to comment.

Answer 5

Don Mathis on 25 Jan 2018

Open in MATLAB Online

0 votes

We now have on this page examples of how to run a single multivariate chain , and how to run multiple univariate chains . Here is how you can run multiple multivariate chains:

%%Multiple multivariate chains
nsamples = 1000;                                        % Number of samples per chain.
nchain = 50;                                            % Number of chains to run.
dim = 3;                                                % Dimensionality of the random variable.
start = zeros(nchain,dim);                              % A matrix of starting points.
delta = .5;
proppdf = @(x,y) prod(unifpdf(y-x,-delta,delta),2);     % This must return a [nchain 1] vector.
proprnd = @(x) x + rand(nchain,dim)*2*delta - delta;    % This must return a [nchain dim] matrix.
pdf_log = @(x) log(mvnpdf(x, zeros(1,dim), eye(dim)));  % This must return a [nchain 1] vector.
smpl = mhsample(start,nsamples,'logpdf',pdf_log,'proppdf',proppdf, 'proprnd',proprnd, ...
                'nchain', nchain);                      % Pass 'nchain' to run multiple concurrent chains.
% smpl is a 3D array of size [nsamples nvars nchain]
size(smpl)
% Plot the components of chain 13 on a single plot
figure;
hold on
plot(smpl(:,:,13));

0 Comments
Show -2 older comments Hide -2 older comments

Sign in to comment.

Answer 6

Erik Hauri on 25 Jan 2018

0 votes

Ah OK - use nchain and dimension 'start' as a column vector rather than a row vector. This makes sense, thank you!

0 Comments
Show -2 older comments Hide -2 older comments

Sign in to comment.

Is there a bug in Multidimensional Input to mhsample?

0 Comments
Show -2 older comments Hide -2 older comments

Answers (6)

1 Comment
Show -1 older comments Hide -1 older comments

0 Comments
Show -2 older comments Hide -2 older comments

0 Comments
Show -2 older comments Hide -2 older comments

1 Comment
Show -1 older comments Hide -1 older comments

0 Comments
Show -2 older comments Hide -2 older comments

0 Comments
Show -2 older comments Hide -2 older comments

Categories

Products

Tags

Community Treasure Hunt

Is there a bug in Multidimensional Input to mhsample?

0 Comments Show -2 older comments Hide -2 older comments

Answers (6)

1 Comment Show -1 older comments Hide -1 older comments

0 Comments Show -2 older comments Hide -2 older comments

0 Comments Show -2 older comments Hide -2 older comments

1 Comment Show -1 older comments Hide -1 older comments

0 Comments Show -2 older comments Hide -2 older comments

0 Comments Show -2 older comments Hide -2 older comments

Categories

Products

Tags

See Also

Community Treasure Hunt

0 Comments
Show -2 older comments Hide -2 older comments

1 Comment
Show -1 older comments Hide -1 older comments

0 Comments
Show -2 older comments Hide -2 older comments

0 Comments
Show -2 older comments Hide -2 older comments

1 Comment
Show -1 older comments Hide -1 older comments

0 Comments
Show -2 older comments Hide -2 older comments

0 Comments
Show -2 older comments Hide -2 older comments