Monday, December 6, 2010

Optimal Vertex Cover for a Tree (Linear-Time-Algorithm)

A vertex cover of an undirected graph G is a set of vertices such that each edge in G is adjacent to at least on vertex from the vertex cover.


An optimal vertex cover of a graph G is a minimum-size vertex cover of G.


The problem of finding an optimal vertex cover of a given graph is called the vertex-cover problem. This problem is a NP-Hard problem, meaning there is no polynomial-time algorithm that has been found to solve the problem. For some instances of the problem it is impossible to find an algorithm that finds a correct answer quickly enough before we turn old and die :) or before the universe runs out of fuel and disappears.


However, any instance of this problem for which the graph G is a tree is easily solvable in linear-polynomial time, meaning we can find a solution very quickly. Finding an optimal vertex cover for a tree can be solved in linear-time. Today I am going to provide an example of an algorithm that solves this problem in linear-time.


I am going to provide the actual code in C++, but before that let's quickly go over the algorithm and explain why it works as I say it does.


I am going to start by introducing a definition and a theorem.
First, a matching M for a graph G is a set of edges such that no vertex in G is adjacent to more than one edge in M. Each vertex in G is adjacent to either one edge in M or zero edges in M. Meaning each vertex in the matching M is matched to only one vertex in M.


A maximal matching M for a graph G is a matching in G such that every edge that is not in M is adjacent to at least one edge that is in M.


Then I present a theorem that says: The size of a maximal matching of a every graph G is a lower bound on the size of the optimal vertex cover of G. Meaning, an optimal vertex cover of a given graph can never have a size smaller than the graph's maximal matching size. The proof for this theorem is pretty evident so I am not going to prove it.


I found out, as so many others have done before me, that the size of a maximal matching in a tree is equals to the size of the optimal vertex cover in a tree.


My algorithm does the following:
Starting from any arbitrary node in a given tree T, and making this node the root of the tree for all purposes, do a full walk of the tree. Every time you return from a child node, check if the child node and the parent of the node have been matched. If none of them have been matched then go ahead and match them and add the edge between them to the matching, and add the parent node to the vertex cover. When the full walk returns to the root, we have constructed a vertex cover with size equal to a maximal matching of the tree. We constructed a maximal matching by adding each on of the edges that we added when we matched a child node and its parent.


See that each edge of the tree that was not added to the matching is necessarily adjacent to an edge in the matching (because if it wasn't then we would have added the edge to the matching and one of its adjacent nodes to the vertex cover, the highest hierarchy node of the two).


Since the algorithm returns a vertex cover with size equals to a maximal matching of the tree, then we can go ahead and accept this vertex cover as optimal, since no vertex cover can be smaller than that.


All right, here the code:



#include <vector>
using namespace std;


void dfsTree(vector<vector<int>> & G, 
vector<bool> & visited, 
vector<bool> & matched, 
vector<int> & minCover,
int current)
{
visited[current] = true;


for(int i = 0; i < G[current].size(); i++)
{
int childNode = G[current][i];
if( !visited[childNode] )
{
dfsTree(G, visited, matched, minCover, childNode);


if(!matched[childNode] && !matched[current])
{
//add edge to matching by adding 
                                //both of its adjacent vertices
matched[current] = true;
matched[childNode] = true;


//push father node into vertex cover
minCover.push_back(current);
}
}
}
}


vector<int> GetTreeMinVertexCover (vector<vector<int>> & G)
{
vector<bool> visited(G.size(), false);
vector<bool> matched(G.size(), false);
vector<int> minCover;


int start = 0; // any arbitrary node... really
dfsTree(G, visited, matched, minCover, start);


return minCover;
}

-------------------------------


So yeah all you have to do is provide GetTreeMinVertexCover with a tree in the form of an adjacency list and done.


So I prepared a couple of runs with different cases:



vector<int> commaSeparatedStringToIntVector(string str)
{
vector<int> vec;


while(str.length() > 0)
{
int pos = str.find(",");
if (pos!=string::npos)
{
string strNum = str.substr(0, pos);
vec.push_back(atoi(strNum.c_str()));
str = str.substr(pos+1);
}
else
{
vec.push_back(atoi(str.c_str()));
break;
}
}


return vec;
}


void Run1()
{
vector<string> adj;
adj.push_back("1");
adj.push_back("0,2");
adj.push_back("1,3");
adj.push_back("2,4");
adj.push_back("3,5");
adj.push_back("4");
vector<vector<int>> tree;

for(int i = 0; i < adj.size(); i++)
{
tree.push_back( commaSeparatedStringToIntVector(adj[i]) );
}

vector<int> minCover = GetTreeMinVertexCover(tree);
printf("minCoverSize = %d\n", minCover.size());
for(int i = 0; i < minCover.size(); i++)
{
printf("%d ", minCover[i]);
}
printf("\n-----------------\n");
}


void Run2()
{
vector<string> adj;
adj.push_back("1,2");//0
adj.push_back("0,3,4,5");//1
adj.push_back("0,6,7");//2
adj.push_back("1");//3
adj.push_back("1,8,9");//4
adj.push_back("1");//5
adj.push_back("2,10");//6
adj.push_back("2");//7
adj.push_back("4");//8
adj.push_back("4,11");//9
adj.push_back("6");//10
adj.push_back("9");//11
vector<vector<int>> tree;

for(int i = 0; i < adj.size(); i++)
{
tree.push_back( commaSeparatedStringToIntVector(adj[i]) );
}

vector<int> minCover = GetTreeMinVertexCover(tree);
printf("minCoverSize = %d\n", minCover.size());
for(int i = 0; i < minCover.size(); i++)
{
printf("%d ", minCover[i]);
}
printf("\n-----------------\n");
}


int main ()
{
Run1();
Run2();
}

---------------------


I added a little function that parses a comma-separated list of integers into a vector of integers to help myself out creating those vector of vector of integers.


The output this gives looks like this:




I guess you can use this code any time. If you find any bugs crawling around blame it on me and only on me of course :) , but I doubt this simple piece of code could have any serious mistakes, but it is certainly vulnerable to my human errors.


QED?

2 comments:

  1. Hello, Mario!

    First of all, thank you! I was thinking about this problem and your post surely helped me.

    The theorem you stated is correct, yet I think you didn't clarify the fact that the maximal matching is not unique -- actually, even a maximum matching needs not to be unique (although its size/cost is unique, due to the definition of "maximum").

    More importantly, the size of a minimum vertex cover in a tree is equal to the size of a maximum matching in the same graph -- but there may be a lot of maximal matchings in the graph that are not maximum, and thus do not "reach" this optimal size.
    (I do realize, though, that this is what you intended to state.)
    Also, an interesting fact is that this equality is true not only for trees, but actually for any bipartite graph (that is, a graph that can have its vertices coloured with as few as 2 colours in such a manner that no neighbors share the same colour). (Any tree is bipartite.)

    These things said, and with my sincere apologies for being kind of picky about the Graph Theory aspects, the fact is that these statements do justify the algorithm (as you know). That is, the important thing, which you presented correctly, is that looking for a minimum vertex cover in a tree is the same as looking at a maximum matching.

    The "killer" fact, I think, to prove your algorithm correct, is that whenever we find a vertex cover that is the same size as that of a given matching, the vertex cover is minimum and the matching is maximum -- which follows directly from your "lower bound" theorem. (Sometimes such observations are called "Duality Theorems" regarding the mincover-maxmatching duality.)

    Your algorithm begins with an empty matching and successively "enlarges it", simultaneously building a set of vertices with the same size. At the end of the algorithm, the set of vertices is a vertex cover, and has the same size of the matching. Thus, the correctness of your algorithm holds (via the discussion right above)! :-)

    Also, I haven't checked your code very carefully, but it seems to me that it implements your algorithm correctly :-)

    I'm sory for stating so many obvious things. I just wanted to look more carefully into your solution, and I'm glad to find that it is correct.

    Thanks again for sharing your idea, and I hope my comment is still of interest 18 months after your original post.

    ReplyDelete
  2. hey , i have implement the same logic in java. And thats work correctly for all cases . But i could not identify the bugs which i am getting as NZEC after 12 successful cases in java. Please can you identify that .....


    package programming;

    import java.io.*;
    import java.util.*;
    public class PT07xspojVertexCover {

    private static Reader in;
    private static PrintWriter out;
    private static int n;
    private static boolean[][] graph;

    public static void main(String[] args) throws Exception {
    // TODO Auto-generated method stub

    in = new Reader ();
    out = new PrintWriter (System.out, true);
    n = in.nextInt ();

    int[] arr = new int[n + 1];


    int m = n - 1;

    graph = new boolean[n + 1][n + 1];


    // making of adjacency matrix.
    for (int i = 0; i < m; i++) {
    int x = in.nextInt();
    int y = in.nextInt();

    graph[x][y] = true;
    graph[y][x] = true;
    arr[x]++;
    arr[y]++;
    }

    int res = maxBpm(graph);
    out.println (res);

    //in.close();
    }

    // find bipartite matching.
    public static boolean bipaprtiteM(boolean[][] graph2, int u,
    boolean[] seen, int[] matchR) {
    // TODO Auto-generated method stub

    for (int v = 1; v <= n; v++) {

    if (graph2[u][v] && !seen[v]) {

    seen[v] = true;
    seen[u] = true;

    if (matchR[v] == 0
    || bipaprtiteM(graph2, matchR[v], seen, matchR)) {
    matchR[v] = u;
    matchR[u] = v;
    return true;
    }
    }
    }
    return false;
    }

    // find maximum no. of matching.
    public static int maxBpm(boolean[][] graph2) {
    // TODO Auto-generated method stub
    int matchR[] = new int[n + 1];

    for (int i = 1; i <= n; i++) {
    matchR[i] = 0;
    }

    int result = 0;
    for (int u = 1; u <= n; u++) {
    boolean seen[] = new boolean[n + 1];
    if (matchR[u] == 0 && bipaprtiteM(graph2, u, seen, matchR)) {
    //System.out.print(u + " ");
    result++;
    }
    }
    // System.out.print(result);
    return result;
    }

    }

    //here i have written the fast inputting code....

    ReplyDelete