# Implementing Breadth first search using connectivity matriX

17 views (last 30 days)
nitin arora on 14 May 2015
Answered: Shan Sham on 2 Mar 2018
I have a connectivity matrix in which each row corresponds to two connected nodes or one line segment. Some of these points are connected to each other resulting in several number of clusters. I want to find the number of clusters and number of nodes in each cluster. I know this can be done using Breadth-first search(BFS) or Depth first search(DFS) but the available bfs/dfs codes are specifically written for biographs. I can't use biographs and want to use only connectivity matrix to give me the information. connectivity matrix=[1 2; 2 3; 2 4; 3 4; 3 5;4 1; 2 6 ;2 7];

Walter Roberson on 16 May 2015
cmat = [1 2; 2 3; 2 4; 3 4; 3 5;4 1; 2 6 ;2 7];
numnode = max(cmat(:));
%for each pair (i,j) in the connectivity list, set adj(i,j) to 1
fromlist = cmat(:,1);
tolist = cmat(:,2);
fromtoidx = sub2ind([numnode,numnode], fromlist, tolist); %as linear indices
tofromidx = sub2ind([numnode,numnode], tolist, fromlist); %connections are bidirectional
You now have a matrix of 0's and 1's, with adj(i,j) = 1 meaning there is a link between i and j.
With this adjacency matrix, you can calculate transitions. consider (adj)^N where ^ is matrix power, adj^N being adj*adj*adj...*adj a total of N times. In that adj^N, if there is a non-zero entry at (i,j) then there is a way to get from i to j in N steps.
An advantage of this representation is that it allows you to perform your searches in parallel.
It has been a good 30 years since I read the algorithm to determine cut-edges from adjacency graphs. Scanning some literature it looks like perhaps it isn't trivial.
I think at this point it is worth confirming your definition of "cluster". Does a cluster occur exactly when there is a single cut-edge that partitions the graph into two pieces? If so, what if there are multiple cut-edges? Do you prefer to define in terms of cut-vertices? If there are two sub-graphs, then what is the maximum number of cut-edges permitted between the two sub-graphs before you determine that they are part of the same cluster?
I worked on data classification for several years. The data we were given did not have explicit connectivity information, just high-dimensional coordinates (e.g., concentration measurements for 500 different chemicals). The algorithms we used to cluster were never able to assign restricted adjacency graphs that could say that there was no connection between any two given vertices; they could only find probabilities that a vertex was part of a cluster with particular attributes, such as finding the hyperplane that best divided the vertices into two clusters.
In this forum we used to semi-regularly get questions about how to determine the "right" number of clusters to ask a routine such as k-means to divide the data into. The answer to that is that the best classification is always achieved by placing every node into its own cluster. For any lesser number of clusters to be considered "better" you needed to define a penalty function that encouraged fewer clusters -- and that penalty function had to be problem-dependent.
If you are lucky enough to have a situation with an actual adjacency list, then before we can tell you how many clusters there are, you need to give us a formal definition of how to tell whether whether two vertices are in different clusters. Because anything beyond there being a single cut-edge or single cut-vertex that partitions the graph is no longer obvious and is probably starting to get problem dependent.
##### 2 CommentsShowHide 1 older comment
Walter Roberson on 19 May 2015
tagging myself

Alfonso Nieto-Castanon on 20 May 2015
Edited: Alfonso Nieto-Castanon on 20 May 2015
If I am interpreting correctly, something like the following might be what you are after:
c = [1 2; 2 3; 2 4; 3 4; 3 5;4 1; 2 6 ;2 7]; % edges
n = max(c(:)); % number of nodes
C = sparse(c(:,1),c(:,2),1,n,n); % adjacency matrix
C = C|C'|speye(n); % forces symmetry
[idx,~,p] = dmperm(C); % magic
clusters = mat2cell(idx,1,diff(p)); % clusters
N = numel(clusters); % # of clusters
s = cellfun('length',clusters); % # of nodes per cluster

Shan Sham on 2 Mar 2018
Now only starts work with matlab. Am a mathematics student. I need a algorithm for the following Question. Let G and H be two directed graph with V(G)=V(H). Let S and T be a source and sink of the directed graph G. Adj(G) and Adj(H) denotes the adjacency matrices of G and H. I want to find out a path from a vertex of S to a vertex of T in such a way that the first arc must be in G and second arc in H and also the last arc must be in G. Suppose $P:x_1,x_2,\dots,x_i$ is a desired path. Then x_1 \in S and x_i \in T. {x_1 x_2, x_3 x_4,...,x_{i-1}x_i} \subseteq A(G) and {x_2 x_3, x_4 x_5,...,x_{i-2}x_{i-1}} \subseteq A(H). For the below graphs, S={1,2} and T={5,3}. Example of a such a desired path is * 2145* here 2 is in s and 5 is in T and 21,45 is in G and 14 is in H.
Help me to find a efficient algorithm using BFS.