Be the first to rate this file! 11 Downloads (last 30 days) File Size: 4.96 KB File ID: #35442
image thumbnail

Connected Component Analysis on an Undirected Graph

by

 

03 Mar 2012 (Updated )

Connected component analysis on undirected graphs, with thresholding and connectivity constraints.

| Watch this File

File Information
Description

Tristan Ursell
April 2012
Connected component analysis on an undirected graph, with various thresholding and connectivity constraints.

[groups,orphans] = graph_analysis(W);
[groups,orphans] = graph_analysis(W,'field',value,...);
[groups,~] = graph_analysis(W,'field',value,...);

W is the N x N adjaceny matrix for a symmetric graph. Thus W should be a symmetric matrix, if it is not, this function will give an error. Self connections (i.e. diagonal elements) are not allowed and will be removed automatically. The values of the parameters below are applied as a union set, that is, all original elements must meet all of the conditions specified by the parameters to be included in a group of connected components.

If only W is given, then all components with W > 0 will be analyzed and grouped, with the default constraints.

'min_conn' (0 <= min_conn <= max_conn) specifies the minimum degree of connectivity (not including itself) for any element in W to be included in a group. The default 'min_conn' value is 1.

'max_conn' (min_conn <= max_conn <= N) specifies the maximum degree of connectivity for any element in W to be included in a group. The default 'max_conn' value is N.

'min_like' is the minimum likelihood value for an element in W to be included in any group. The default value is 0. The 'likelihood' is not necessarily a probability, and hence is not bounded between zero and one. However, for any two elements in W, the ratio of likelihoods should be equal to their ratio of probabilities.

'max_link' is the maximum number of linkages to search for in the network. This parameter is useful when you know that the network has some maximum number of connections between the elements in the network (e.g. if the graph has the property that any two two nodes are no more than N connections away, you can seg max_lin = N, to speed up the code). Choosing smaller values of 'max_link' can significantly speed the code. The default value is the size of the current sub-block.

'max_rank' (1 <= max_rank <= N) is the number of highest likelihood values to use in forming connected groups. For instance, if max_rank = 3, and a row of W is:

   1 3 6 4 4 5 6 8 0 3 1

then the matrix row will become:
 
   0 0 1 0 0 0 1 1 0 0 0
 
If a row of W is:

   1 3 6 4 4 6 6 6 0 3 1
 
with max_rank = 1,2,3 or 4, the row becomes:

   0 0 1 0 0 1 1 1 0 0 0
 
with max_rank = 5, the same row becomes:
 
   0 0 1 1 1 1 1 1 0 0 0

The default value of 'max_rank' is N.

'plot' with value 1 will generate plots of the grouping algorithm as it creates block diagonal groups in from top left to bottom right in W.

The output 'groups' is a structure array with fields:

groups(i).num_els = number of elements in group i.
groups(i).block = sub-block identity of group i.
groups(i).elements = elements of W that are in group i.
groups(i).degrees = degrees of connection for each element in group i.
orphans = elements of W that were not in any group, becasue they did not meet the constraints.

The number of distinct groups is length(groups).

Example with a block diagonal random matrix W:

W=blkdiag(rand(50,50),rand(100,100),rand(200,200),rand(300,300));
W=(W+W')/2;

figure;
imagesc(W)
axis equal tight
xlabel('Elements of W')
ylabel('Elements of W')
title('Random Block-Diagonal Adjacency Matrix')

%more inclusive connections, less inclusive probability
[groups,orphans]=graph_analysis(W,'min_like',0.95,'min_conn',3,'plot',1);

%less inclusive connections, more inclusive probability
[groups,orphans]=graph_analysis(W,'min_like',0.9,'min_conn',6,'plot',1);

Required Products MATLAB
MATLAB release MATLAB 7.9 (R2009b)
Tags for This File   Please login to tag files.
Please login to add a comment or rating.
Updates
05 Mar 2012

Improved description.

08 Mar 2012

fixed some typos in the directions

12 Apr 2012

added new features, improved plot handling, reduced memory load and fixed precision error.

27 Jul 2012

put 'modone' function inside of m-file as requested by multiple users.

30 Aug 2012

improved memory usage, so that 32 bit systems can handle larger arrays.

Contact us