Generate correlated samples with copulas: Problems/Errors by using "copulafit"

Question

rowJoe on 24 Jul 2015

https://www.mathworks.com/matlabcentral/answers/231061-generate-correlated-samples-with-copulas-problems-errors-by-using-copulafit

Edited: Shruti Sapre on 29 Jul 2015

exampleData.mat

Hello everybody,

I need to generate samples from real measured data to train an unsupervised machine learning algorithm. Inspired by the examples in the documentation (link: Page in Documentation), I would like to do this using copulafit.

In the following code, the variable thisData is an m-by-n matrix (m = 231 samples; n = 16 indicators, each with its own distribution). From these 231 measurement sets I would like to generate 2500 samples (variable "nSample") that preserve the dependency among the 16 distributions.

Error Message:

When I execute the code, the following error appears in the Command Window:

Error in copulafit (line 125)
[lowerBnd,upperBnd] = bracket1D(profileFun,lowerBnd,5); % 'upper', search ascending from 5

Error in SimulatingDependentRandomVariablesUsingCopulas3 (line 13)
[Rho,nu] = copulafit('t',u,'Method','ApproximateML')

My Code:

%%Show distributions of dataset
plotmatrix(thisData)
%%Transform the data to the copula scale (unit square) using a kernel estimator of the cumulative distribution function.
for i = 1:size(thisData,2)
    u(:,i) = ksdensity(thisData(:,i),thisData(:,i),'function','cdf');
end
%plotmatrix(u,'Direction','out')
%%Fit a 't' copula.
[Rho,nu] = copulafit('t',u,'Method','ApproximateML')
%%Generate a random sample from the t copula.
r = copularnd('t',Rho,nu,1000);
u1 = r(:,1);
v1 = r(:,2);
scatterhist(u1,v1,'Direction','out')
xlabel('u')
ylabel('v')
set(get(gca,'children'),'marker','.')
%%Transform the random sample back to the original scale of the data.
x1 = ksdensity(x,u1,'function','icdf');
y1 = ksdensity(y,v1,'function','icdf');
scatterhist(x1,y1,'Direction','out')
set(get(gca,'children'),'marker','.')

I appreciate your help and support - thank you very much.

Jonas

2 Comments

Tom Lane on 24 Jul 2015

You give the location of the error but not the text of the error. Could you add that? I got your code to run after correcting the variable names x and y.
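The variable-name correction mentioned above can be sketched as follows. This is a minimal illustration, not the exact code that was run: the synthetic thisData, the three-column size, and the names nVar and samples are assumptions made so the snippet is self-contained. The back-transformation uses the matching column of thisData where the original code used the undefined x and y.

```matlab
% Synthetic stand-in for thisData (assumption: the real matrix would come
% from exampleData.mat and has 16 columns).
rng(1);
thisData = randn(231, 3) * [1 0.5 0.2; 0 1 0.3; 0 0 1];
nSample = 2500;
nVar = size(thisData, 2);

% Transform each column to the copula scale with a kernel CDF estimate.
u = zeros(size(thisData));
for i = 1:nVar
    u(:,i) = ksdensity(thisData(:,i), thisData(:,i), 'function', 'cdf');
end

% Fit a t copula and draw nSample points on the copula scale.
[Rho, nu] = copulafit('t', u, 'Method', 'ApproximateML');
r = copularnd('t', Rho, nu, nSample);

% Transform every column back to the original scale, using the same
% column of thisData that produced it (this replaces the undefined x and y).
samples = zeros(nSample, nVar);
for i = 1:nVar
    samples(:,i) = ksdensity(thisData(:,i), r(:,i), 'function', 'icdf');
end
```

With the full 16-column data this still stops at copulafit if the estimate of Rho is rank-deficient, so it addresses only the undefined variables, not the error reported in the question.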

rowJoe on 27 Jul 2015

Edited: rowJoe on 27 Jul 2015

exampleData.mat

Hi Tom,

thank you very much for your comment. This is the error message:

Error using copulafit/approxProfileNLL_t (line 290)
The estimate of Rho has become rank-deficient.  You may have too few data, or strong dependencies among variables.
Error in copulafit>bracket1D (line 489)
oldnll = nllFun(bound);
Error in copulafit (line 125)
[lowerBnd,upperBnd] = bracket1D(profileFun,lowerBnd,5); % 'upper', search ascending from 5
Error in SimulatingDependentRandomVariablesUsingCopulas3 (line 14)
[Rho,nu] = copulafit('t',u,'Method','ApproximateML')

I have also attached the file "exampleData.mat", which contains an example of the matrix "thisData".

Answer 1

Shruti Sapre on 29 Jul 2015

Edited: Shruti Sapre on 29 Jul 2015

Hi Jonas,

I understand that you are receiving an error while using the “copulafit” function with your data.

This error is due to collinearities in the input data; in this case, due to the presence of duplicate columns. The rank of the matrix (14) is less than the number of columns in the matrix (16).

These singularities can be observed by computing the eigenvalues of the correlation matrix of "thisData" using the following commands:

>> [V,D] = eig(corr(thisData))
>> Eigenvalues = diag(D)
>> Eigenvalues(Eigenvalues < 10^(-16))

A couple of the eigenvalues are smaller than 10^-16, which is effectively zero at machine precision for a sample covariance/correlation matrix. In other words, to within machine precision, some columns of "thisData" are linear combinations of others.

Modifying the matrix so that its rank is equal to the number of columns/indicators/variables can help resolve your issue.
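One possible way to find which columns to drop is a QR factorization with column pivoting, sketched below. This is only an illustration under stated assumptions: the synthetic thisData (14 independent columns plus two duplicates) stands in for the real matrix, and the names X, tol, nIndep, keepCols, dropCols and reducedData are hypothetical. The tolerance follows the same convention that rank uses by default.

```matlab
% Synthetic stand-in for thisData: 14 independent columns plus 2 duplicates
% (assumption: the real data would be loaded from exampleData.mat instead).
rng(0);
base = randn(231, 14);
thisData = [base, base(:,1), base(:,3)];

% Center the columns so the rank matches that of corr(thisData).
X = bsxfun(@minus, thisData, mean(thisData, 1));

% Economy-size QR with column pivoting; p is a permutation vector.
[~, R, p] = qr(X, 0);

% Diagonal entries of R at or below this tolerance mark dependent columns.
d = abs(diag(R));
tol = max(size(X)) * eps(d(1));
nIndep = sum(d > tol);

keepCols = sort(p(1:nIndep));        % columns to keep
dropCols = sort(p(nIndep+1:end));    % columns flagged as redundant
reducedData = thisData(:, keepCols);
```

The ksdensity and copulafit steps could then be rerun on reducedData. Note that this check detects only exact linear dependence in the raw data; strongly but not exactly dependent columns would pass it and might still cause trouble in the fit.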

Hope this helps!

-Shruti
