Thread Subject:
why mvnrnd works with a singular covariance matrix

Subject: why mvnrnd works with a singular covariance matrix

From: Farzin Zareian

Date: 11 May, 2010 17:09:05

Message: 1 of 7

Hello Matlab users,

I am puzzled as to how MATLAB generates random vectors drawn from a multivariate normal distribution with a mean vector MU and a singular covariance matrix SIGMA.

I am trying to generate a large number (say 2000) of random vectors with a target MU and SIGMA, which I develop from available data. Let's assume there are 7 variables and I only have 3 realizations from measured data. This means that the covariance matrix is not full rank (it is singular). I am puzzled to see that mvnrnd(MU,SIGMA,2000) still works and generates the random vectors.
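
For example (a minimal sketch with made-up data; any 3 observations of 7 variables give a rank-deficient covariance estimate):

    X = randn(3, 7);              % 3 hypothetical realizations of 7 variables
    MU = mean(X);                 % 1-by-7 sample mean
    SIGMA = cov(X);               % 7-by-7 sample covariance, rank at most 2
    rank(SIGMA)                   % returns 2, so SIGMA is singular
    R = mvnrnd(MU, SIGMA, 2000);  % still returns 2000-by-7 random vectors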

I guess there are routines embedded in mvnrnd.m that take care of the singularity issue.

thanks a bunch for your kind help
farzin

Subject: why mvnrnd works with a singular covariance matrix

From: Peter Perkins

Date: 11 May, 2010 17:49:17

Message: 2 of 7

On 5/11/2010 1:09 PM, Farzin Zareian wrote:
> I am trying to generate large number (say 2000) of random vectors with a
> target MU and SIGMA. I develop the MU and SIGMA from available data.
> Lets assume there are 7 variables, and I only have 3 realizations from
> measured data. This means that the covariance matrix is not full rank
> (is singular). I am puzzled to see mvnrnd(MU,SIGMA,2000) still works and
> generates the random vectors.
>
> I guess there are routines embedded in mvnrnd.m that takes care of the
> singularity issue.

Farzin, there are, and you can look at them simply by opening the
mvnrnd function in the editor:

"edit mvnrnd"

The short answer is that all MVNRND needs to do is find a matrix T such that
T*T' = S, where S is your desired covariance matrix. T need not be a Cholesky
factor.

Consider this simple example that starts from the opposite direction (in
a kind of mixed math/MATLAB notation):

x ~ N(0,1)
[y1; y2] = [1; 1]*x
    => [y1; y2] ~ MVN([0; 0], [1 1; 1 1])

and certainly that cov matrix [1 1; 1 1] is singular.
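
A quick numerical check of that, using an eigenvalue factorization to build one possible non-Cholesky T (just a sketch; the factorization mvnrnd uses internally may differ, but any T with T*T' = S works):

    S = [1 1; 1 1];                     % singular covariance
    [V, D] = eig(S);                    % spectral factorization of S
    T = V * sqrt(max(D, 0));            % T*T' = S, but T is not a Cholesky factor
    Z = randn(2000, 2) * T';            % zero-mean samples with covariance S
    cov(Z)                              % close to [1 1; 1 1]
    W = mvnrnd([0 0], S, 2000);         % mvnrnd handles the same singular case
    max(abs(W(:,1) - W(:,2)))           % essentially zero: samples satisfy y1 = y2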

However, whether what you generate using your (probably very) poor estimate
of the covariance is useful is debatable. Hope this helps.

Subject: why mvnrnd works with a singular covariance matrix

From: Roger Stafford

Date: 11 May, 2010 20:45:22

Message: 3 of 7

"Farzin Zareian" <zareian@uci.edu> wrote in message <hsc2vg$lj8$1@fred.mathworks.com>...
> Hello Matlab users,
>
> I am puzzled to know how matlab generates random vectors chosen from a multivariate normal distribution with a mean vector MU and a singular covariance matrix SIGMA.
>
> I am trying to generate large number (say 2000) of random vectors with a target MU and SIGMA. I develop the MU and SIGMA from available data. Lets assume there are 7 variables, and I only have 3 realizations from measured data. This means that the covariance matrix is not full rank (is singular). I am puzzled to see mvnrnd(MU,SIGMA,2000) still works and generates the random vectors.
>
> I guess there are routines embedded in mvnrnd.m that takes care of the singularity issue.
>
> thanks a bunch for your kind help
> farzin

  There is nothing inherently wrong with having singular covariance matrices, provided they are based on adequately large quantities of data. Such singularities simply mean that the jointly normal random variables involved are constrained by one or more linear equalities, as in Peter's example where y1-y2 = 0. Any 'mvnrnd' function worth its salt should be able to handle such covariance matrices, regardless of any singularities.

  However, from your description it sounds as though your data was very limited and therefore, as Peter indicates, the results are questionable. With three "realizations" I would go further and say the results would be almost meaningless. With seven variables you need a very large number of statistically valid observations in order to determine a reliable covariance matrix.

Roger Stafford

Subject: why mvnrnd works with a singular covariance matrix

From: Farzin Zareian

Date: 11 May, 2010 21:52:04

Message: 4 of 7

Peter and Roger

Many thanks for your instructive responses. After reading your comments and a few other sources, I can see your points and now have a better handle on the situation.

As my last question: what is the measure of reliability of a distribution fitted to data? As you mentioned, with three realizations it is meaningless to fit a seven-dimensional joint normal distribution. What is the minimum number of realizations needed to get meaningful results when the number of variables equals N?

many thanks
farzin

Subject: why mvnrnd works with a singular covariance matrix

From: Roger Stafford

Date: 12 May, 2010 02:05:20

Message: 5 of 7

"Farzin Zareian" <zareian@uci.edu> wrote in message <hscji4$s04$1@fred.mathworks.com>...
> Peter and Roger
>
> Many thanks for your instructive responses. After reading your comments and the few other sources I can see your points and now I have better handle on the situation.
>
> As my last question, what is the measure of reliability of the fitted distribution to data. As you mentioned, with three realizations it is meaningless to fit a seven dimensional joint normal distribution. How many realizations is the minimum for getting meaningful results for number of variables equal to N?
>
> many thanks
> farzin

  I was afraid you might ask that question. I can only answer it in a general way.

  To begin with, if you have a single random variable, consider the effect of using a number of independent observations to estimate its mean value, and then ask what one can expect the standard deviation to be between this estimate and the true mean. You will find that it is the original random variable's standard deviation divided by the square root of the number of observations. In other words, if you make a hundred independent observations, the accuracy of your mean estimate improves over that of a single observation only by a factor of ten. Something similar holds for sample estimates of the random variable's variance. And with more than one random variable, the situation is similar when estimating covariance values. It takes a lot of observations to make them accurate.
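
  A small simulation illustrating that square-root behavior (a sketch; the particular numbers are arbitrary):

    sigma = 1; Nobs = 100; Ntrials = 10000;
    sampleMeans = mean(sigma*randn(Nobs, Ntrials));  % one sample mean per column
    std(sampleMeans)            % roughly sigma/sqrt(Nobs) = 0.1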

  A second consideration is that as the number of variables increases, the assumption of joint normality becomes increasingly critical. Forget that assumption for a moment and imagine you have a standard two-dimensional chess board representing eight discrete values for each of two random variables. Clearly you will have to have enough samples to put a fairly large number of pairs in each square to make a good estimate of joint probability - a good-sized multiple of 64 at the very least. Now change this to a seven-dimensional chess board corresponding to your seven variables, if you can imagine such a thing. If you want to have several samples in each - what shall we call them - each seven-dimensional "square", you would need a healthy multiple of eight to the power seven, which would be some large multiple of two million samples. It is precisely to avoid such horrific requirements that statisticians are so eager to assume joint normality in so many situations. It greatly reduces the number of observations that would be necessary in their work.

  However, this reasoning should at least strike a note of caution to you that in a seven-variable situation, either you had better be awfully sure of joint normality, or the number of samples should be a good deal more than the above square-root-of-the-number-of-observations criterion would indicate.

  Now as you see I have done a lot of hand-waving over this question, but perhaps it will help you a little, even if it is unwelcome information.

  (By the way, I too was a graduate student at UCI in years past - in fact at the time they first opened.)

Roger Stafford

Subject: why mvnrnd works with a singular covariance matrix

From: Farzin Zareian

Date: 12 May, 2010 03:23:04

Message: 6 of 7

Hi Roger

It's great to be guided by a UCI alumnus. Many thanks for your detailed response.

I fully understand the situation and the large uncertainty in my estimates. I have no choice other than to assume that my seven variables are jointly normal.

I will try to dig into the literature to see if I can find a measure by which I can quantify the level of uncertainty in the estimates of the joint normal distribution parameters. Please drop me a line if anything comes to mind.

I will post my findings here for others' reference.

best
farzin

Subject: why mvnrnd works with a singular covariance matrix

From: Greg Heath

Date: 12 May, 2010 08:56:37

Message: 7 of 7

On May 11, 11:23 pm, "Farzin Zareian" <zare...@uci.edu> wrote:
> Hi Roger
>
> Its great to be guided by an UCI alumnus. Many thanks for your detailed response.
>
> I can totally understand the situation and the large level of uncertainty in my estimates. I have no choice other than assuming that my seven variables are joint normal.
>
> I will try to dig into the literature to see if I can find a measure by which I can quantify the level of uncertainty for the joint normal distribution parameters estimates. Please drop me a line if anything comes to your mind.
>
> I will post my findings here for other's reference.

Most of my successful simulations based on Gaussian
distributions tend to have an observation to estimated
parameter ratio that exceeds 13, i.e., for each
Gaussian

Nobs = ceil(r*Np) for r >~13

where

Np = n+n*(n+1)/2

Typically r ~ 30 tends to be sufficient for the amount
of precision that I require.
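
For the seven-variable case in this thread, that rule of thumb works out roughly as follows (a sketch using the formulas above):

    n  = 7;                  % number of variables
    Np = n + n*(n+1)/2;      % free parameters: 7 means + 28 covariances = 35
    ceil(13*Np)              % about 455 observations at r ~ 13
    ceil(30*Np)              % about 1050 observations at r ~ 30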

If you want to get something more reliable, go to
the statistics handbooks to find expressions for
the standard deviation of estimates for means,
variances, and correlation coefficients given
that the joint distribution is Gaussian.

Hope this helps.

Greg
