Thread Subject:
manipulate data to better fit a Gaussian Distribution

Subject: manipulate data to better fit a Gaussian Distribution

From: Francesco Perrone

Date: 19 Mar, 2013 10:15:07

Message: 1 of 7

Hi all,

I have a question concerning the normal distribution (with mu = 0 and sigma = 1).

Let's say that I first call randn or normrnd this way:

x = normrnd(0,1,[4096,1]); % x = randn(4096,1)

Now, to assess how well the values of x fit the normal distribution, I call

[a,b] = normfit(x);

and, for graphical support,

histfit(x)

Now to the core of the question: if I am not satisfied with how well x fits the given normal distribution, how can I optimize x so that it better fits the expected normal distribution with mean 0 and standard deviation 1? Sometimes, because of the small number of samples (4096 in this case), x fits the expected Gaussian really poorly, so I want to manipulate x (linearly or not, it does not really matter at this stage) in order to get a better fit.

I should mention that I have access to the Statistics Toolbox.

I thank you all in advance.
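
Editorial note: besides normfit and histfit, a goodness-of-fit test puts a number on "how well x fits". The following is only a minimal sketch, assuming the Statistics Toolbox is available; the seed is an arbitrary choice made for reproducibility. With no further arguments, kstest compares the sample against the standard normal distribution.

```matlab
rng(1);                           % arbitrary seed, only for reproducibility
x = randn(4096, 1);

[muHat, sigmaHat] = normfit(x);   % sample estimates of mu and sigma
[h, pValue, ksStat] = kstest(x);  % one-sample Kolmogorov-Smirnov test against N(0,1)

fprintf('mean = %.4f, std = %.4f\n', muHat, sigmaHat);
fprintf('KS statistic = %.4f, p = %.3f, reject = %d\n', ksStat, pValue, h);

qqplot(x);                        % sample quantiles against normal quantiles
```

For 4096 independent standard-normal samples, the sample mean itself scatters with a standard deviation of 1/sqrt(4096) = 1/64, so deviations of that order are expected and do not indicate a faulty generator.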

From: Torsten

Date: 19 Mar, 2013 10:42:06

Message: 2 of 7

Increase the number of sampling points (4096 in your example)
or
try another random number generator for a normally distributed random variable.

Best wishes
Torsten.

From: Francesco Perrone

Date: 19 Mar, 2013 10:50:07

Message: 3 of 7

That is quite a simplistic method.

Unfortunately, I cannot increase the number of samples, for reasons I will not explain in detail here (they come from the theory behind the code I am writing). Besides, which other random number generator could I use?

I do believe that there is a way to "force" the data to fit the expected normal distribution better.
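
Editorial note: the simplest linear way to "force" the sample is to standardize it, which makes the sample mean exactly 0 and the sample standard deviation exactly 1 (up to round-off). This is only a sketch of that one option, not necessarily what the poster had in mind; the variable name z is illustrative.

```matlab
x = randn(4096, 1);

% Linear map: shift to sample mean 0, scale to sample standard deviation 1.
z = (x - mean(x)) / std(x);

[muHat, sigmaHat] = normfit(z);   % muHat is 0 and sigmaHat is 1 up to round-off
fprintf('mean = %.2e, std = %.6f\n', muHat, sigmaHat);
```

A linear map cannot change the shape of the histogram (skewness, kurtosis), only its location and width, and the standardized values are no longer strictly independent because each one depends on the mean and standard deviation of the whole sample.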

Regards,
Francesco

From: Torsten

Date: 19 Mar, 2013 11:30:07

Message: 4 of 7

I'm not an expert in this area, but in my opinion every deterministic attempt to manipulate the data after they have been generated will weaken their randomness.
A random number generator always makes a compromise between performance and quality. If speed is not important for your application, there should be random number generators of higher quality than randn. A Google search should turn some up.
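
Editorial note: within MATLAB itself, the generator and the uniform-to-normal transform can be chosen through RandStream. The sketch below compares two transforms on the same underlying generator; the seed is arbitrary. It is meant as an illustration of how to switch, not as a claim that one choice fits better.

```matlab
N = 4096;

% Same Mersenne Twister generator, two different normal transforms.
s1 = RandStream('mt19937ar', 'Seed', 1);                                  % default transform (Ziggurat)
s2 = RandStream('mt19937ar', 'Seed', 1, 'NormalTransform', 'Inversion');  % inverse-CDF transform

x1 = randn(s1, N, 1);
x2 = randn(s2, N, 1);

fprintf('Ziggurat:  mean = %+.4f, std = %.4f\n', mean(x1), std(x1));
fprintf('Inversion: mean = %+.4f, std = %.4f\n', mean(x2), std(x2));
fprintf('Expected spread of the sample mean: %.4f\n', 1/sqrt(N));
```

Whatever the generator, the mean of N independent standard-normal samples has a standard deviation of 1/sqrt(N), so a different generator should not be expected to reduce the scatter seen at a fixed N.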

Best wishes
Torsten.

From: Torsten

Date: 19 Mar, 2013 12:46:06

Message: 5 of 7

Of course, if randomness of the numbers chosen does not matter,
you can proceed as follows:

1. Choose an equidistant grid of probabilities inside the interval (0,1) (e.g. p=[1/4 1/2 3/4]).
2. Calculate X=norminv(p,0,1)
3. Between X(i) and X(i+1), place 4096/(n-1) equidistant points where n is the length of the vector p (in this case n=3).
4. The collection of all these points will approximately follow a standard-normal distribution.
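
Editorial note: steps 1-4 above can be sketched as follows. The grid size n = 65 is a hypothetical choice, picked so that 4096/(n-1) is an integer; norminv requires the Statistics Toolbox.

```matlab
N = 4096;                 % total number of points wanted
n = 65;                   % grid size (hypothetical; N/(n-1) = 64 must be an integer)

p = (1:n) / (n + 1);      % step 1: equidistant probabilities strictly inside (0,1)
X = norminv(p, 0, 1);     % step 2: standard-normal quantiles at those probabilities

m = N / (n - 1);          % step 3: number of points per interval
t = (0:m-1)' / m;         % m equidistant fractions in [0,1), right endpoint excluded
x = zeros(N, 1);
for i = 1:n-1
    % fill the interval [X(i), X(i+1)) with m equidistant points
    x((i-1)*m + (1:m)) = X(i) + t * (X(i+1) - X(i));
end

[muHat, sigmaHat] = normfit(x);   % step 4: check how close the set is to N(0,1)
histfit(x);
```

Because no points are placed below X(1) or above X(n), the tails are cut off, so the sample standard deviation comes out somewhat below 1; a finer grid reduces that effect.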

Best wishes
Torsten.

From: Francesco Perrone

Date: 19 Mar, 2013 14:48:10

Message: 6 of 7

I would rather go for a least-squares fit, but I don't really know how to set it up in MATLAB.

Generally, I would generate a large reference sample that fits the expected normal distribution quite reliably:

x_ref = normrnd(0,1,[400000 1]);

Then there is the actual data I have:

x_act = normrnd(0,1,[4000 1]);

Once these two vectors are generated, I would call lsqcurvefit to minimize the error between the reference and the actual values. But I am stuck on how to implement it correctly.
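
Editorial note: one possible reading of this least-squares idea is to fit a linear map that brings the sorted sample as close as possible to the theoretical N(0,1) quantiles. The sketch below makes two substitutions that should be stated plainly: exact quantiles from norminv take the place of the random reference sample x_ref, and polyfit is used instead of lsqcurvefit because the model is linear. The plotting positions (k - 0.5)/N are an assumption.

```matlab
x_act = normrnd(0, 1, [4000 1]);
N = numel(x_act);

% Theoretical standard-normal quantiles, one per sample.
q = norminv(((1:N)' - 0.5) / N, 0, 1);

xs = sort(x_act);            % empirical quantiles of the actual data
c = polyfit(xs, q, 1);       % least-squares line: q is approximated by c(1)*xs + c(2)

x_fit = polyval(c, x_act);   % apply the same linear map to the unsorted data
[muHat, sigmaHat] = normfit(x_fit);
```

If the Optimization Toolbox is preferred, the same linear model could be passed to lsqcurvefit as lsqcurvefit(@(c,x) c(1)*x + c(2), [1 0], xs, q), which should give essentially the same coefficients.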

From: Torsten

Date: 19 Mar, 2013 15:15:19

Message: 7 of 7

And what restrictions should apply to the changes allowed in x_act during this fitting process?
If you don't define any restrictions, you are just left with the problem of placing 4000 points on (-Inf,Inf) so that their positions best resemble a standard normal distribution.
To study this problem, you don't need to generate the 4000 data points with normrnd; you can follow steps (1)-(4) of my suggestion above.
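
Editorial note: the unrestricted case described here can be illustrated by a rank-based replacement, in which each sample keeps its rank but is moved to the matching standard-normal quantile. This is a sketch of one such nonlinear manipulation under the assumption that plotting positions (k - 0.5)/N are acceptable; it is not taken from the thread.

```matlab
x = randn(4096, 1);
N = numel(x);

q = norminv(((1:N)' - 0.5) / N, 0, 1);  % one target value per rank

[~, order] = sort(x);                   % order(k) is the index of the k-th smallest sample
x_new = zeros(N, 1);
x_new(order) = q;                       % move the k-th smallest sample to the k-th quantile

histfit(x_new);
```

After this step the set of values is the same for every random draw; only the order in which they appear still comes from the random sample, which is the loss of randomness mentioned earlier in the thread.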

Best wishes
Torsten.
