Asked by John on 12 Apr 2012

Hello,

How would you transform variables with uniform distribution [0,1] to variables with a normal normal distribution in Matlab?

Thank you

John

Answer by Kye Taylor on 12 Apr 2012

Accepted answer

I will assume that your uniform random variables are stored in an array just like the one created with the commands

X = rand(2000,1); % the number of points is arbitrary

To create a sample of random variables drawn from a normal distribution with parameters (mu,sigma) defined as

mu = 0; sigma = 1;

use the command

Y = mu + sqrt(2)*sigma*erfinv(2*X-1);

The right-hand side of the equation above is the inverse of the CDF of a Normal(mu,sigma) random variable, evaluated at X. See http://en.wikipedia.org/wiki/Inverse_transform_sampling
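One way to convince yourself that the formula above really is the inverse normal CDF is a round trip: push Y back through the normal CDF, written here with erf, and compare with X. This is only a sketch in base MATLAB; the names Xback and maxErr are made up for illustration.

```matlab
% Round-trip check of the inverse-CDF formula (base MATLAB only).
mu = 0; sigma = 1;
X = rand(2000,1);                                 % uniform samples on (0,1)
Y = mu + sqrt(2)*sigma*erfinv(2*X-1);             % inverse normal CDF applied to X
Xback = 0.5*(1 + erf((Y - mu)/(sqrt(2)*sigma)));  % normal CDF applied to Y
maxErr = max(abs(Xback - X));                     % expected to be rounding error only
```

If maxErr is not tiny, the parameters used in the two directions probably do not match.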

John on 12 Apr 2012

Hello Kye,

Thank you for your reply.

Yes - I think this is what I want to do.

Although, when I plot a histogram of Y it does not have the typical Normal shape?

What I am really doing is:

I have some data that I know follows a logNormal distribution.

I transformed the data to uniform using its CDF.

And now I'm trying to transform it to normal.

Here are my commands:

DT=load('Departure Times (hr).txt');

Y = cdf('Lognormal',DT,2.2268,0.36631);

So, if I get the mean and SD of my uniform variables and apply your formula will that transform it to Normal?

Thank you

John

Answer by Peter Perkins on 12 Apr 2012

John, presumably you know about the randn function, which generates standard normal values. So I guess that your original question really was, "if I have uniform values, how do I transform them," and not, "how do I generate normal random values."

You said "normal normal distribution". I can't tell if this is a typo, or if you mean "standard normal", i.e. N(mean=0, std=1). If you mean, "transform to the normal distribution that corresponds to the lognormal," then all this is kind of pointless, since you can just take the log of data drawn from a lognormal to transform it to normal. But if you really mean "transform to a standard normal", then

1) As a philosophical point (and I suspect that this is what Kye was getting at when he said, "The CDF of a random variable is NOT a uniform random variable"), you're doing a standard thing in a theoretically invalid context. If you draw values from a fully known LN(mu,sigma), and use cdf('logn',x,mu,sigma) to transform them, the exact distribution of the transformed values is indeed U(0,1). But (I assume) you don't know the true lognormal distribution that your data came from, and so the transformation you're doing is only approximate. What you end up with will be as if drawn from something only approximately standard uniform. But since the assumption of log-normality is an approximation anyway, ... People do do this, but you have to be careful. Ask yourself why you are making this transformation, and if there is another way to attack whatever problem you are trying to solve.

2) You seem to have the Statistics Toolbox. So to transform from lognormal to uniform, you can use logncdf (or use the cdf function as you did), and to transform from uniform to normal, you can use the norminv function (or use the icdf function). But you have to use the right parameters in each case. For the lognormal->uniform, you'll want to use the mu/sigma lognormal parameters as MATLAB defines them. For the uniform->normal transformation, you'll want to use the mu/sigma normal parameters of your target distribution (which are just 0 and 1, if you do mean "standard normal").

When Kye said, "... you are only evaluating the lognormal **density**", I think he meant "lognormal **cumulative probability**". Your use of the cdf function *is* the correct transformation, modulo the above caveats (1) about using estimated parameters. But you may find logncdf and norminv simpler than erfcinv and cdf/icdf.
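The two-step transformation described in point 2 might look roughly like the sketch below. It assumes the Statistics Toolbox is installed, borrows the lognormal parameters from John's cdf call, and uses lognrnd to generate synthetic data in place of his file, which is not available here. The names muLN, sigmaLN, Zdirect and maxDiff are made up for illustration.

```matlab
% Requires the Statistics Toolbox (lognrnd, logncdf, norminv).
rng(1);                                 % fixed seed, only for repeatability
muLN = 2.2268; sigmaLN = 0.36631;       % lognormal parameters from John's post
DT = lognrnd(muLN, sigmaLN, 5000, 1);   % synthetic stand-in for the data file

U = logncdf(DT, muLN, sigmaLN);         % lognormal -> uniform on (0,1)
Z = norminv(U, 0, 1);                   % uniform -> standard normal

% With known parameters the two steps collapse to standardizing log(DT).
Zdirect = (log(DT) - muLN)/sigmaLN;
maxDiff = max(abs(Z - Zdirect));        % expected to be rounding error only
```

The comparison with Zdirect illustrates the earlier remark that, for lognormal data, taking the log already gives you a normal variable. With parameters estimated from real data, U and Z are only approximately uniform and normal, as noted in point 1.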

John on 13 Apr 2012

Hello Peter and Kye,

Thank you for taking the time to respond in detail to my question. I appreciate the effort.

Briefly, what I am doing is modelling dependent random variables using a copula function. To do this, I believe the method is first to transform the random variables to a uniform distribution using their CDFs, and next to transform the uniform variables to normal variables using the inverse standard normal CDF. This then allows you to estimate the product normal distribution between the normal variables.

What Peter is suggesting is pretty much exactly what I want to do.

I used an automatic distribution fitting tool in Excel to find the distribution of the random variables in the first place, and that is where I got the shape parameters, mu, sigma, etc.

However, is there a way in MATLAB to transform the data to a uniform distribution without knowing its distribution in the first place?

Also, a Dagum distribution is the best fit for my data, but it is not a supported CDF in MATLAB. Lognormal is actually only the 10th best fit for my data. This is why I am asking if there is a method to transform to uniform without having to use a supported CDF.

Thank you

John

Peter Perkins on 13 Apr 2012

Lots of people using copulas use nonparametric marginals to transform to the unit hypercube. See this extended example

http://www.mathworks.com/products/statistics/demos.html?file=/products/demos/shipping/stats/nonparametricCDFdemo.html

that ships with the Statistics Toolbox. You may also find this section of the Statistics Toolbox User Guide helpful:

<http://www.mathworks.com/help/toolbox/stats/brklrj3.html#bqttfgl-1>
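As a rough illustration of the nonparametric idea, the sketch below replaces a fitted CDF with a rank-based empirical CDF, so no distribution family has to be chosen. This is a simple stand-in written in base MATLAB, not necessarily what the linked demo does, and the data are synthetic. The names data, ranks, U, Z and R are made up for illustration.

```matlab
% Synthetic dependent data standing in for real measurements (hypothetical).
n  = 1000;
z1 = randn(n,1);
z2 = 0.7*z1 + sqrt(1 - 0.7^2)*randn(n,1);   % correlated with z1
data = [exp(2.2 + 0.37*z1), z2.^3];         % two differently shaped marginals

% Rank-based empirical CDF: maps each column to values strictly inside (0,1).
[~, order] = sort(data, 1);
ranks = zeros(n,2);
for j = 1:2
    ranks(order(:,j), j) = (1:n)';
end
U = ranks/(n + 1);

% Uniform -> standard normal scores via the inverse normal CDF.
Z = sqrt(2)*erfinv(2*U - 1);

% Correlation of the normal scores, one ingredient of a Gaussian copula fit.
R = corrcoef(Z);
```

Dividing by n+1 rather than n keeps U away from 1, so erfinv never returns Inf. The sketch assumes there are no ties in the data; tied values would need a tie-handling rule for the ranks.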

Answer by Kye Taylor on 12 Apr 2012

Keep in mind that the histogram of a finite sample won't exactly match a Gaussian. Nevertheless, if you use the command above, you are guaranteed that Y will be sampled according to a normal distribution with parameters mu and sigma. Try adding more bins and more points to see the normal's shape:

mu = 0; sigma = 1;

X = rand(5000,1);

Y = mu + sqrt(2)*sigma*erfinv(2*X-1);

[n,xout] = hist(Y,50);

figure, hold on

plot(xout, n/sum(n)/(xout(2)-xout(1)), 'k.')

plot(xout, 1/sqrt(2*pi*sigma^2)*exp(-(xout-mu).^2/2/sigma^2));

legend('Observed density', 'Actual density');
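If the plot is hard to judge by eye, a few summary numbers give a rough cross-check. The sketch below uses base MATLAB only, with a larger sample than above. The names sampleMean, sampleStd and fracWithin1 are made up for illustration.

```matlab
% Numeric sanity check of the transformed sample (base MATLAB only).
mu = 0; sigma = 1;
X = rand(100000,1);                         % more points than above, to reduce noise
Y = mu + sqrt(2)*sigma*erfinv(2*X-1);

sampleMean  = mean(Y);                      % should be close to mu
sampleStd   = std(Y);                       % should be close to sigma
fracWithin1 = mean(abs(Y - mu) <= sigma);   % about 0.68 for a normal distribution
```

These are necessary rather than sufficient signs of normality, so they complement the density plot rather than replace it.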
