factoran Error, "Some columns of the X matrix are constant."

I am trying to perform a factor analysis on a smoothed matrix of size 14300 by 238, with an m value of 143. However, MATLAB keeps giving me the error "Some columns of the X matrix are constant."
Initially I was trying to do it on a 143 by 238 matrix with 143 observations, and it gave me the error "The number of factors requested, M, is too large for the number of the observed variables."
One last error I have gotten while using this function is "The data X must have a covariance matrix that is positive definite."
Any help understanding what these errors are would be much appreciated!
Thanks!

1 Comment

It would be good if you uploaded your dataset in a *.mat file (using the paper clip icon), and the code that you are using to call factoran.


Answers (1)

You've asked questions about three errors:
  1. some columns of the X matrix are constant
  2. the number of factors requested, M, is too large for the number of the observed variables
  3. The data X must have a covariance matrix that is positive definite
I'm going to make a guess that you are using a technique -- factor analysis -- that you do not understand very well. That is always a bit risky when it comes to interpreting the results. I'm not an expert on factor analysis, either, but I have some understanding of why you are getting these errors.
(1) Straightforward. You have one or more columns in X where all the numbers are the same. A constant column has zero variance, so it cannot help explain variation in the data, and its correlation with the other variables is undefined. Just drop those variables (columns) from your analysis.
(2) I don't know the theoretical underpinnings here, but there is a maximum number of factors that can be found, depending on the number of variables and observations.
If n is your number of variables, and m is the number of factors you requested, then the number of degrees of freedom is
degrees_freedom = ((n-m)^2 - (n+m))/2
If that value is negative or if m > n, then you have tried to define too many factors.
(3) Your error about using a positive-definite matrix is a bit complicated to answer, but it boils down to the fact that the correlations among your variables need to be self-consistent. For example, suppose A and B have correlation coefficient 0.9 -- very high. And suppose that B and C are also correlated at 0.9. This implies that the correlation of A and C must be fairly high. It could not be 0.1, for example. So, implicit in your X matrix must be some variables that have inconsistent correlations like that.
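For (1) and (2), here is a minimal sketch of a pre-check you could run before calling factoran. The data matrix, the choice of m, and all variable names are made up for illustration, and I am assuming an exact-equality test for "constant"; factoran's own internal checks may differ in detail.

```matlab
% Hypothetical example data: 6 observations, 4 variables.
% Column 3 is constant (all zeros).
X = [1 2 0 4;
     2 1 0 3;
     3 5 0 1;
     4 3 0 6;
     5 8 0 2;
     6 7 0 5];

% (1) A column is constant when its maximum equals its minimum.
isConstant   = (max(X, [], 1) - min(X, [], 1)) == 0;
constantCols = find(isConstant);      % indices of the offending columns
Xclean       = X(:, ~isConstant);     % drop them before the analysis

% (2) Degrees of freedom for m factors and n remaining variables,
% using the formula quoted above.
n = size(Xclean, 2);
m = 1;                                % hypothetical number of factors
degreesFreedom = ((n - m)^2 - (n + m)) / 2;
tooManyFactors = (m > n) || (degreesFreedom < 0);
```

If tooManyFactors comes out true, reduce m (or add variables) before calling factoran on Xclean.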

3 Comments

Regarding (3), I don't think that's exactly right. If the X matrix has the values of the variables, then their mutual correlations are necessarily consistent. That is, although it is possible to construct a correlation matrix that is internally inconsistent, I don't think it is possible to construct a dataset that will produce such a correlation matrix (well, maybe if you allow missing data).
The complaint about a non-positive-definite matrix means that some of the variables can be almost perfectly predicted from the other variables. The simplest way for that to happen is if two of the variables have a correlation near +1 or -1. In that case, you must omit one of those variables from the analysis (either one, since they convey the same information). Unfortunately, in more complicated cases it can be pretty tough to figure out which variable(s) need to be dropped.
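To make that concrete, here is a rough sketch of how one might check for this situation. The data, the tolerance, and the 0.99 cutoff are arbitrary choices of mine, and this is not necessarily the test factoran itself performs. In this example the third variable is the exact sum of the first two, so the covariance matrix is singular even though no pair of variables is highly correlated.

```matlab
% Hypothetical data: 5 observations, 3 variables.
% Column 3 is the exact sum of columns 1 and 2.
x1 = [1; 1; 3; 5; 5];
x2 = [5; 1; 3; 1; 5];
X  = [x1, x2, x1 + x2];

% A covariance matrix is positive definite only if all of its
% eigenvalues are strictly positive (here: above a small tolerance).
C        = cov(X);
tol      = 1e-10;                     % assumed tolerance
isPosDef = min(eig(C)) > tol;

% Simple case: look for pairs of variables with |correlation| near 1.
R       = corrcoef(X);
offDiag = triu(abs(R), 1);            % upper triangle, diagonal excluded
[rowIdx, colIdx] = find(offDiag > 0.99);

% Harder case: count how many variables are linearly dependent on the rest.
numDependent = size(X, 2) - rank(C);
```

Here isPosDef is false and numDependent is 1, yet rowIdx is empty: the pairwise check alone does not reveal which variable to drop.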
Yes, I agree completely with your first paragraph. I misstated the condition.
I don't want to paste the proprietary code here, but lines 155-167 of factoran (in R2019b) are where the condition is calculated and enforced. (I have to admit that I do not fully understand the theory here.)
Thanks for both your replies.
After carefully looking at my data, I realized I had a few columns containing only zeros. When I went back and checked my preprocessing, I found a mistake there, and the FA ran fine once I fixed it.


Asked: on 18 Mar 2020

Commented: on 21 Mar 2020
