Implementing Breadth first search using connectivity matriX

Question

0 votes

I have a connectivity matrix in which each row corresponds to two connected nodes or one line segment. Some of these points are connected to each other resulting in several number of clusters. I want to find the number of clusters and number of nodes in each cluster. I know this can be done using Breadth-first search(BFS) or Depth first search(DFS) but the available bfs/dfs codes are specifically written for biographs. I can't use biographs and want to use only connectivity matrix to give me the information. connectivity matrix=[1 2; 2 3; 2 4; 3 4; 3 5;4 1; 2 6 ;2 7];

0 Comments
Show -2 older comments Hide -2 older comments

Sign in to comment.

Sign in to answer this question.

Follow Question

Answer 1

Walter Roberson on 16 May 2015

Open in MATLAB Online

0 votes

cmat = [1 2; 2 3; 2 4; 3 4; 3 5;4 1; 2 6 ;2 7];
numnode = max(cmat(:));
adj = zeros(numnode,numnode);  %adjacency matrix
%for each pair (i,j) in the connectivity list, set adj(i,j) to 1
fromlist = cmat(:,1);
tolist = cmat(:,2);
fromtoidx = sub2ind([numnode,numnode], fromlist, tolist);  %as linear indices
adj(fromtoidx) = 1;
tofromidx = sub2ind([numnode,numnode], tolist, fromlist);  %connections are bidirectional
adj(tofromidx) = 1;

You now have a matrix of 0's and 1's, with adj(i,j) = 1 meaning there is a link between i and j.

With this adjacency matrix, you can calculate transitions. consider (adj)^N where ^ is matrix power, adj^N being adj*adj*adj...*adj a total of N times. In that adj^N, if there is a non-zero entry at (i,j) then there is a way to get from i to j in N steps.

An advantage of this representation is that it allows you to perform your searches in parallel.

It has been a good 30 years since I read the algorithm to determine cut-edges from adjacency graphs. Scanning some literature it looks like perhaps it isn't trivial.

I think at this point it is worth confirming your definition of "cluster". Does a cluster occur exactly when there is a single cut-edge that partitions the graph into two pieces? If so, what if there are multiple cut-edges? Do you prefer to define in terms of cut-vertices? If there are two sub-graphs, then what is the maximum number of cut-edges permitted between the two sub-graphs before you determine that they are part of the same cluster?

I worked on data classification for several years. The data we were given did not have explicit connectivity information, just high-dimensional coordinates (e.g., concentration measurements for 500 different chemicals). The algorithms we used to cluster were never able to assign restricted adjacency graphs that could say that there was no connection between any two given vertices; they could only find probabilities that a vertex was part of a cluster with particular attributes, such as finding the hyperplane that best divided the vertices into two clusters.

In this forum we used to semi-regularly get questions about how to determine the "right" number of clusters to ask a routine such as k-means to divide the data into. The answer to that is that the best classification is always achieved by placing every node into its own cluster. For any lesser number of clusters to be considered "better" you needed to define a penalty function that encouraged fewer clusters -- and that penalty function had to be problem-dependent.

If you are lucky enough to have a situation with an actual adjacency list, then before we can tell you how many clusters there are, you need to give us a formal definition of how to tell whether whether two vertices are in different clusters. Because anything beyond there being a single cut-edge or single cut-vertex that partitions the graph is no longer obvious and is probably starting to get problem dependent.

2 Comments
Show None Hide None

nitin arora on 19 May 2015

Edited: Walter Roberson on 19 May 2015

Open in MATLAB Online

Thanks Walter. I have a set of coordinates of object A and B in a given space. I found the distance of all points from each other, i.e A-A, A-B, B-B distance using pdist function. If the distance between two objects i(A or B) and j (A or B) is less than a threshold(manually set), then i connect i and j using a line segment on a figure. So, my definition of cluster is all the points that are connected by line segments. I am running 3 loops(A-A, A-B, B-B) for connecting i and j. In the same loop, I recorded i and j object index values in a connectivity matrix such that each row represents a connection between i and j. I wrote the following BFS(breadth first search) code but it has several limitations such as

Identifying the last node in a cluster: It does not detect 5 as a part of 1st cluster
Repeats some index(It repeats the index 4)

Please advice. I am really stuck and I am on a tight deadline to figure this out.

clc;
clear all;
close all;
conncomp=[1 2; 2 3; 2 4; 3 4; 3 5];
%// Find all possible node IDs
%//Nitin finds unique number in an array without repition
nodeIDs = unique(conncomp(:)); 
%// Flag that tells us if there are any nodes we should visit
nodeIDList = false(1,numel(nodeIDs));
%// Stores our list of clusters
clusterList = {};
%// Keeps track of how many clusters we have
counter = 1;
%// queue - initialize to empty-keeps track of the connected nodes
queue = [];
nodescluster=[];
while (~all(nodeIDList))    
    % Find any node and make it currently working node
      currentnode = find(nodeIDList == false, 1);
      % Initialize our queue to contain this node and add this node to
      % cluster
      queue=currentnode;
      nodescluster=currentnode;
      %// While our queue is not empty; iteration stops when queue is empty
      %and all the nodes have been visited
      while (~isempty(queue))
          % Grab the current node from previous iteration and remove from
          % queue
           queue(1)=[];    
           if (nodeIDList(currentnode))
              continue;
           end
          %// Mark the current node as visited 
          nodeIDList(currentnode) = true;
   %// Remove those already visited
          visitedNodes = nodeIDList(true);
         UnvistitedNodes = nodeIDList(false);
          %// Retrieve all nodes connected to the current node
          connectedNodes = conncomp(conncomp(:,1) == currentnode, 2);
          queue = [queue,connectedNodes.'];
         nodescluster = [nodescluster,connectedNodes.'];
          %move to one of these connected nodes and set it as current node
          if(numel(queue)~= 0)
          currentnode=queue(1);
          end
      end
      %// Add connected components to its own cluster
      clusterList{counter} = nodescluster;
      counter = counter + 1;
  end
celldisp(clusterList)

Walter Roberson on 19 May 2015

tagging myself

Sign in to comment.

Answer 2

Alfonso Nieto-Castanon on 20 May 2015

Edited: Alfonso Nieto-Castanon on 20 May 2015

Open in MATLAB Online

0 votes

See help dmperm

If I am interpreting correctly, something like the following might be what you are after:

 c = [1 2; 2 3; 2 4; 3 4; 3 5;4 1; 2 6 ;2 7]; % edges
 n = max(c(:));                             % number of nodes
 C = sparse(c(:,1),c(:,2),1,n,n);           % adjacency matrix
 C = C|C'|speye(n);                         % forces symmetry
 [idx,~,p] = dmperm(C);                     % magic
 clusters = mat2cell(idx,1,diff(p));        % clusters
 N = numel(clusters);                       % # of clusters
 s = cellfun('length',clusters);            % # of nodes per cluster

0 Comments
Show -2 older comments Hide -2 older comments

Sign in to comment.

Answer 3

Shan Sham on 2 Mar 2018

0 votes

Now only starts work with matlab. Am a mathematics student. I need a algorithm for the following Question. Let G and H be two directed graph with V(G)=V(H). Let S and T be a source and sink of the directed graph G. Adj(G) and Adj(H) denotes the adjacency matrices of G and H. I want to find out a path from a vertex of S to a vertex of T in such a way that the first arc must be in G and second arc in H and also the last arc must be in G. Suppose $P:x_1,x_2,\dots,x_i $ is a desired path. Then x_1 \in S and x_i \in T. {x_1 x_2, x_3 x_4,...,x_{i-1}x_i} \subseteq A(G) and {x_2 x_3, x_4 x_5,...,x_{i-2}x_{i-1}} \subseteq A(H). For the below graphs, S={1,2} and T={5,3}. Example of a such a desired path is * 2145* here 2 is in s and 5 is in T and 21,45 is in G and 14 is in H.

Help me to find a efficient algorithm using BFS.

0 Comments
Show -2 older comments Hide -2 older comments

Sign in to comment.

Implementing Breadth first search using connectivity matriX

0 Comments
Show -2 older comments Hide -2 older comments

Answers (3)

2 Comments
Show None Hide None

0 Comments
Show -2 older comments Hide -2 older comments

0 Comments
Show -2 older comments Hide -2 older comments

Categories

Products

Tags

Community Treasure Hunt

Implementing Breadth first search using connectivity matriX

0 Comments Show -2 older comments Hide -2 older comments

Answers (3)

2 Comments Show None Hide None

0 Comments Show -2 older comments Hide -2 older comments

0 Comments Show -2 older comments Hide -2 older comments

Categories

Products

Tags

See Also

Community Treasure Hunt

0 Comments
Show -2 older comments Hide -2 older comments

2 Comments
Show None Hide None

0 Comments
Show -2 older comments Hide -2 older comments

0 Comments
Show -2 older comments Hide -2 older comments