File Size: 11.39 KB File ID: #22558

Multiple Correspondence Analysis Based on the Burt Matrix.

by Antonio Trujillo-Ortiz

 

30 Dec 2008 (Updated 03 Jan 2009)

Code covered by BSD License  

multiple correspondence analysis, correspondence analysis, categorical analysis, graphical procedure


File Information
Description

The statistical fundamentals of Correspondence Analysis (CA) are presented in the CORRAN and MCORRAN1 m-files, which you can find on this FEX author's page. The extension of CA to more than two categorical variables is called Multiple Correspondence Analysis (MCA). CA and MCA are graphical techniques for representing the information in a two-way or higher-order multiway contingency table. Such tables contain the counts (frequencies) of items for a cross-classification of the categorical variables (Rencher, 2000).

Karl Pearson (1913) developed the antecedent of CA used by Procter & Gamble (Horst 1935). R.A. Fisher (1940) named the approach 'reciprocal averaging' because it reciprocally averages row and column percentages in table data until they are reconciled. Since reciprocal averaging was inefficient, Europeans such as Mosaier (1946) and Benzecri (1969) related table data with computer programs for principal component (factor) analysis. Burt (1953) developed MCA (homogeneity analysis) of a binary indicator matrix.

Here, MCA is applied to the Burt matrix (B), the matrix of all two-way cross-tabulations of the categorical variables. The Burt matrix has a square block on the diagonal for each variable (the frequencies for the categories in the corresponding variable) and a rectangular block off the diagonal for each pair of variables (a two-way contingency table for the corresponding pair of variables). The dual eigenanalysis, or Singular Value Decomposition (SVD), yields the principal inertias, which are the squares of the singular values.
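As an illustration only (this is not the code in mcorran2.m), the following NumPy sketch builds a Burt matrix from a small made-up data set of three binary variables and extracts its principal inertias through the SVD of the standardized residuals. The function names, the toy data, and the use of Python are assumptions made for this example.

```python
import numpy as np

def burt_matrix(data):
    """Build a Burt matrix from rows of integer category codes (one column per variable)."""
    data = np.asarray(data)
    blocks = []
    for q in range(data.shape[1]):
        levels = np.unique(data[:, q])                    # sorted category codes of variable q
        blocks.append((data[:, [q]] == levels).astype(float))  # n x J_q dummy columns
    Z = np.hstack(blocks)                                 # indicator matrix, n x J
    return Z.T @ Z                                        # Burt matrix, J x J

def burt_principal_inertias(B):
    """Principal inertias of a Burt matrix: squared singular values of the standardized residuals."""
    P = B / B.sum()                                       # correspondence matrix
    c = P.sum(axis=0)                                     # masses (row and column masses coincide)
    S = (P - np.outer(c, c)) / np.sqrt(np.outer(c, c))    # standardized residuals
    sv = np.linalg.svd(S, compute_uv=False)               # singular values, largest first
    return sv ** 2

# Hypothetical toy data: 6 observations, 3 binary variables.
data = [[0, 0, 1], [0, 1, 0], [1, 0, 1], [1, 1, 0], [0, 0, 0], [1, 1, 1]]
B = burt_matrix(data)
inertias = burt_principal_inertias(B)
print(B)
print(np.round(inertias, 4))
```

The diagonal blocks of B are diagonal matrices holding the category frequencies, and each off-diagonal block is the two-way table for one pair of variables, as described above.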

The so-called 'percentage of inertia problem' can be mitigated by using the adjusted inertias procedure, also known as eigenvalue correction. The adjusted inertias are calculated only for each singular value that satisfies the inequality >= 1/(number of variables). They are expressed as a percentage of the average off-diagonal inertia, which can be calculated directly from the off-diagonal tables in the Burt matrix. The adjusted solution not only considerably improves the measure of fit but also removes the inconsistency that arises when analysing the Burt matrix. This inconsistency is due to artificial dimensions added because one categorical variable is coded with several columns. As a consequence, the inertia (i.e., variance) of the solution space is artificially inflated and therefore the percentage of inertia explained by the first dimension is severely underestimated.
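The adjustment can be sketched in a few lines of Python. This is a minimal sketch, not the mcorran2.m code: it assumes the adjustment formulas as commonly stated following Greenacre (2006), namely adjusted inertia = (Q/(Q-1))^2 (sqrt(lambda) - 1/Q)^2 and average off-diagonal inertia = Q/(Q-1) (total inertia of B - (J-Q)/Q^2), with Q variables and J categories in total. The function name and the toy inertias are hypothetical, and the formulas should be checked against the reference before use.

```python
import math

def adjusted_inertias(burt_inertias, n_vars, n_cats):
    """Adjusted principal inertias for an MCA of a Burt matrix (sketch).

    burt_inertias: all nontrivial principal inertias of the Burt matrix
                   (squared singular values)
    n_vars: number of categorical variables, Q
    n_cats: total number of categories over all variables, J
    """
    q = n_vars
    threshold = 1.0 / q
    adjusted = []
    for lam in burt_inertias:
        sv = math.sqrt(max(lam, 0.0))          # singular value; guard against tiny negatives
        if sv >= threshold:                    # only dimensions with singular value >= 1/Q
            adjusted.append((q / (q - 1)) ** 2 * (sv - threshold) ** 2)
    total = sum(burt_inertias)                 # total inertia of the Burt matrix
    avg_off_diag = q / (q - 1) * (total - (n_cats - q) / q ** 2)
    percentages = [100.0 * a / avg_off_diag for a in adjusted]
    return adjusted, avg_off_diag, percentages

# Hypothetical toy inertias for Q = 3 binary variables (J = 6 categories).
adjusted, avg_off_diag, percentages = adjusted_inertias(
    [16 / 81, 16 / 81, 1 / 81], n_vars=3, n_cats=6
)
print(adjusted, avg_off_diag, percentages)
```

In this toy case the third dimension falls below the 1/Q threshold and is dropped, and each of the two retained dimensions accounts for about 25% of the average off-diagonal inertia.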

A complete explanation of the statistical fundamentals can be found in Greenacre (2006).

An MCA yields only row or column coordinates, and each point represents a category (attribute) of one of the variables.
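To make the idea of one point per category concrete, here is a hedged NumPy sketch that computes category coordinates from a Burt matrix. It is not taken from mcorran2.m: the function name and the small Burt matrix are invented for illustration, and the coordinates are scaled by the unadjusted singular values, which is an assumption of this example.

```python
import numpy as np

def burt_category_coordinates(B, n_dims=2):
    """Principal coordinates of the categories from a CA of the Burt matrix (sketch)."""
    B = np.asarray(B, dtype=float)
    P = B / B.sum()                                      # correspondence matrix
    c = P.sum(axis=0)                                    # category masses
    S = (P - np.outer(c, c)) / np.sqrt(np.outer(c, c))   # standardized residuals
    U, sv, _ = np.linalg.svd(S)                          # S is symmetric, so rows and columns agree
    standard = U[:, :n_dims] / np.sqrt(c)[:, None]       # standard coordinates
    return standard * sv[:n_dims], c                     # principal coordinates and masses

# Hypothetical Burt matrix for 3 binary variables observed on 6 cases.
B = np.array([
    [3, 0, 2, 1, 2, 1],
    [0, 3, 1, 2, 1, 2],
    [2, 1, 3, 0, 1, 2],
    [1, 2, 0, 3, 2, 1],
    [2, 1, 1, 2, 3, 0],
    [1, 2, 2, 1, 0, 3],
])
coords, masses = burt_category_coordinates(B)
print(np.round(coords, 4))   # one row per category, one column per dimension
```

Each row of the result is one category, so a pair-wise dimensions plot is a scatter of one column against another.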

Syntax: function mcorran2(X)
     
Input:
X - Data matrix (the Burt matrix). Size: categorical variables x categorical variables (more than two variables).
   
Outputs:
Complete Multiple Correspondence Analysis
The adjusted inertias table is given by default
Pair-wise Dimensions Plots. For the vertical and horizontal lines we use the hline.m and vline.m files kindly published on FEX by Brandon Kuczenski [http://www.mathworks.com/matlabcentral/fileexchange/1039]. For connecting lines to the origin we use the plot2org file published on FEX by Jos [http://www.mathworks.com/matlabcentral/fileexchange/11337].

Required Products: Statistics Toolbox
MATLAB release: MATLAB 7 (R14)
Zip File Content (Other Files): mcorran2/hline.m, mcorran2/mcorran2.m, mcorran2/plot2org.m, mcorran2/READMEmcorran2.txt, mcorran2/vline.m
Updates
31 Dec 2008

An appropriate format for citing this file was added.

03 Jan 2009

Text was improved.

Tag Activity for this File
Tag | Applied By | Date/Time
multiple correspondence analysis | Antonio Trujillo-Ortiz | 31 Dec 2008 09:13:41
burt matrix | Antonio Trujillo-Ortiz | 31 Dec 2008 09:13:41
crosstabulation analysis | Antonio Trujillo-Ortiz | 31 Dec 2008 09:13:41
contingency analysis | Antonio Trujillo-Ortiz | 31 Dec 2008 09:13:41
control design | Antonio Trujillo-Ortiz | 31 Dec 2008 09:13:41
 
