Thread Subject:
How to generate correlated data based on the given real data

From: fatih

Date: 30 May, 2011 22:38:02

Message: 1 of 6

Hi,
I have a data set of size 1 x 15000. However, the data do not follow any standard distribution such as the Gaussian. Besides this, I have a correlation matrix. I want to generate correlated data based on that matrix, although I have only this single data set. If anyone can help me, it would be great.
Regards,

From: TideMan

Date: 30 May, 2011 23:31:12

Message: 2 of 6

Well, to generate random numbers fitting the distribution of your
data, you need to use an empirical CDF and bootstrapping. Search this
forum for empirical CDF and you'll find lots of references.
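
As a rough illustration of the empirical CDF and bootstrapping idea, here is a minimal sketch in Python using only the standard library. The names empirical_cdf and bootstrap_draw are hypothetical helpers, and the exponential sample merely stands in for the real 1 x 15000 data, which is not available here.

```python
import bisect
import random


def empirical_cdf(sample):
    """Return F, where F(x) is the fraction of sample values <= x."""
    ordered = sorted(sample)
    n = len(ordered)

    def cdf(x):
        return bisect.bisect_right(ordered, x) / n

    return cdf


def bootstrap_draw(sample, size, rng):
    """Draw `size` values from the sample with replacement."""
    return rng.choices(sample, k=size)


rng = random.Random(0)
# Placeholder data (assumption): stands in for the observed 1 x 15000 values.
load = [rng.expovariate(1.0) for _ in range(15000)]

F = empirical_cdf(load)
new_load = bootstrap_draw(load, 1000, rng)
```

Resampling with replacement reproduces the observed distribution without assuming any parametric form, but it can only return values that already occur in the data.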

As for correlation, you can use your generated random numbers in a
Markov chain, but I don't understand how you've gotten a correlation
matrix from only one variable. Maybe you can explain this?

From: Roger Stafford

Date: 30 May, 2011 23:49:05

Message: 3 of 6

  The notion of a single dimensional distribution possessing some kind of correlation doesn't make sense to me unless you have in mind autocorrelation - that is, a correlation between successive values of a single random variable. Even in that case the natural representation of such autocorrelation would be a vector, not a matrix.

  The idea behind ordinary correlation is that two or more coincident random variables are not entirely independent of one another. Correlation is a particular measure of this degree of dependence. It loses its significance if there is only one random variable under consideration.
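
  To make the distinction concrete, the following Python sketch (standard library only) computes both quantities. The function names are hypothetical and the estimators are the plain textbook ones: a single series yields a vector of autocorrelations indexed by lag, while k separate variables yield a k x k correlation matrix.

```python
import math


def autocorrelation(x, max_lag):
    """Sample autocorrelation of one series at lags 0..max_lag (a vector)."""
    n = len(x)
    mean = sum(x) / n
    centered = [v - mean for v in x]
    total = sum(c * c for c in centered)
    return [
        sum(centered[t] * centered[t + k] for t in range(n - k)) / total
        for k in range(max_lag + 1)
    ]


def pearson(a, b):
    """Pearson correlation between two equally long series."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b))
    var_a = sum((p - mean_a) ** 2 for p in a)
    var_b = sum((q - mean_b) ** 2 for q in b)
    return cov / math.sqrt(var_a * var_b)


def correlation_matrix(variables):
    """k x k matrix of pairwise correlations between k series."""
    return [[pearson(a, b) for b in variables] for a in variables]
```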

  Perhaps you can explain in greater detail what it is you are asking?

Roger Stafford

From: ImageAnalyst

Date: 31 May, 2011 00:36:32

Message: 4 of 6

I have one link handy - I think Roger originally gave it here and I
bookmarked it:

http://en.wikipedia.org/wiki/Inverse_transform_sampling
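
For readers who want to see the linked technique applied to empirical data, here is a minimal Python sketch (standard library only). It assumes a linearly interpolated empirical quantile function as the inverse CDF; empirical_quantile and inverse_transform_sample are hypothetical names, and the exponential sample is a placeholder for the real data.

```python
import random


def empirical_quantile(ordered, u):
    """Linearly interpolated quantile of a sorted sample at probability u in [0, 1]."""
    pos = u * (len(ordered) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(ordered) - 1)
    frac = pos - lo
    return ordered[lo] * (1 - frac) + ordered[hi] * frac


def inverse_transform_sample(sample, size, rng):
    """Push uniform random numbers through the empirical quantile function."""
    ordered = sorted(sample)
    return [empirical_quantile(ordered, rng.random()) for _ in range(size)]


rng = random.Random(0)
# Placeholder data (assumption): stands in for the observed values.
load = [rng.expovariate(1.0) for _ in range(15000)]
draws = inverse_transform_sample(load, 1000, rng)
```

Unlike plain resampling, the interpolation can return values between observed points, but never outside the observed range.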

From: fatih

Date: 31 May, 2011 08:20:19

Message: 5 of 6

Hi,
First of all, thank you very much for your help. My problem is that I need to do load modeling in a power system that has around 40 loads. The loads should be modeled according to a correlation matrix, which is already prescribed by the standards. However, I have only a single load data set, of size 1 x 15000. Using only that load data and the given correlation matrix, I want to generate the other load data in the system. I hope this is clearer. Thank you in advance.
Regards,

From: Tom Lane

Date: 31 May, 2011 16:58:16

Message: 6 of 6

> First of all, thank you very much for your help. My problem is that I need
> to do the load modeling in a power system. In that power system, there are
> around 40 loads. Furthermore, the loads should be modeled based on a
> correlation matrix. The correlation matrix is already given due to the
> standards. However, I have only one single load data which the size is
> 1*15000. Depending on only that load data and given correlation matrix, I
> want to generate other load data(s) in the system. I hope this is more
> clear. Thank you in advance.

You may want to look at the copularnd function. The general idea behind a
copula is to separate the marginal distribution of each variable from the
dependence (correlation) structure.

When you generate a matrix U from copularnd, each column is a sample from a
uniform distribution, but the columns are correlated. Then you can transform
each column to the distribution that you prefer. (Other posters have given
ideas about how to do that.)
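
The two steps described above can be sketched without any toolbox. The Python code below (standard library only) is a stand-in for a Gaussian copula, not a reproduction of copularnd itself: it correlates standard normals with a Cholesky factor, maps them to uniforms with the normal CDF, and then maps each uniform through the empirical quantile function of the data. All function names are hypothetical, the 2 x 2 matrix is an example, and the exponential sample is a placeholder for the real load data.

```python
import math
import random
from statistics import NormalDist


def cholesky(matrix):
    """Lower-triangular L with L * L^T = matrix (matrix must be positive definite)."""
    n = len(matrix)
    lower = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(lower[i][k] * lower[j][k] for k in range(j))
            if i == j:
                lower[i][j] = math.sqrt(matrix[i][i] - s)
            else:
                lower[i][j] = (matrix[i][j] - s) / lower[j][j]
    return lower


def gaussian_copula_uniforms(corr, n_rows, rng):
    """Rows of correlated values, each column uniform on [0, 1]."""
    lower = cholesky(corr)
    k = len(corr)
    phi = NormalDist().cdf
    rows = []
    for _ in range(n_rows):
        z = [rng.gauss(0.0, 1.0) for _ in range(k)]
        x = [sum(lower[i][j] * z[j] for j in range(i + 1)) for i in range(k)]
        rows.append([phi(v) for v in x])
    return rows


def empirical_quantile(ordered, u):
    """Linearly interpolated quantile of a sorted sample at probability u."""
    pos = u * (len(ordered) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(ordered) - 1)
    frac = pos - lo
    return ordered[lo] * (1 - frac) + ordered[hi] * frac


def correlated_loads(load, corr, n_rows, rng):
    """Correlated columns that each follow the empirical distribution of `load`."""
    ordered = sorted(load)
    uniforms = gaussian_copula_uniforms(corr, n_rows, rng)
    return [[empirical_quantile(ordered, u) for u in row] for row in uniforms]


rng = random.Random(1)
# Placeholder data and example matrix (assumptions, not from the thread).
load = [rng.expovariate(1.0) for _ in range(15000)]
corr = [[1.0, 0.8], [0.8, 1.0]]
loads = correlated_loads(load, corr, 2000, rng)
```

Note that this sketch generates every column afresh. It does not keep the observed series as one of the columns, which would need a further conditioning step that is not shown here.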

I assume you want each separate column to have roughly the same distribution
as your original 15000 values. If that's not the case, I'd need more
information about how you want to define the column distributions.

This may get you going in the right direction. You should be aware, though,
that the correlation matrix you provide will not be the same as the
correlation of U, nor the same as the correlation after transforming each
column. You'll have to decide if that's a problem for you.
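
For the Gaussian copula specifically, the size of that mismatch is known in closed form: if the underlying normals have correlation rho, the linear correlation between the corresponding uniform columns equals the Spearman rank correlation, (6/pi) * asin(rho/2). The short Python sketch below evaluates this and its inverse; the function names are hypothetical. Applying the inverse element by element to a 40 x 40 target is an assumption that should be checked, since the adjusted matrix is not guaranteed to stay positive definite.

```python
import math


def uniform_correlation(rho):
    """Correlation of the uniform columns when the normals have correlation rho."""
    return 6.0 / math.pi * math.asin(rho / 2.0)


def copula_parameter(target):
    """Normal correlation needed so the uniforms have correlation `target`."""
    return 2.0 * math.sin(math.pi * target / 6.0)


# A normal correlation of 0.8 gives slightly less than 0.8 between the uniforms.
shrunk = uniform_correlation(0.8)
adjusted = copula_parameter(0.8)
```

After each column is transformed to a non-normal marginal, the linear correlation changes again, whereas the rank correlation is unaffected by monotone transformations. Which of the two the standard's matrix refers to determines whether the mismatch matters.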

-- Tom
