Minimum Spanning Tree Algorithms

Given an undirected, connected graph G=(V,E), one might be concerned with finding a subset ST of edges from E that “span” the graph by ensuring that the graph remains connected. If we further require that the total weight of the edges in ST is minimized, then we are interested in finding a minimum spanning tree (MST). PRIM’S ALGORITHM shows how to construct an MST from such a graph by using a greedy approach in which each step of the algorithm makes forward progress toward a solution without reversing earlier decisions. PRIM’S ALGORITHM grows a spanning tree T one edge at a time until it spans every vertex, and the resulting spanning tree is provably minimum.

It selects an arbitrary start vertex s∈V to belong to a growing set S, and it ensures that T forms a tree of edges rooted at s. PRIM’S ALGORITHM is greedy in that it incrementally adds edges to T until an MST is computed. The intuition behind the algorithm is that the edge (u,v) with lowest weight between u∈S and v∈V–S must belong to an MST. When such an edge (u,v) with lowest weight is found, it is added to T and the vertex v is added to S.

The algorithm uses a priority queue to store the vertices v∈V–S with an associated priority equal to the lowest weight of some edge (u,v) where u∈S. Extracting the vertex with the smallest priority then yields the next edge to add without rescanning every edge that crosses from S to V–S.
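To make the growth of S concrete, here is a minimal, self-contained sketch. It is not the solution developed in this post: it swaps the decreaseKey-based queue for std::priority_queue with lazy deletion (stale entries are skipped when popped). The names AdjList and primLazy are assumptions made for this illustration only.

```cpp
#include <functional>
#include <queue>
#include <utility>
#include <vector>

// Hypothetical adjacency list: adj[u] holds (neighbor, weight) pairs.
using AdjList = std::vector<std::vector<std::pair<int, int>>>;

// Returns the total MST weight of a connected, undirected graph and fills
// pred[] so that (pred[v], v) is the tree edge that brought v into S.
long long primLazy(const AdjList &adj, std::vector<int> &pred)
{
    const int n = static_cast<int>(adj.size());
    pred.assign(n, -1);
    std::vector<bool> inS(n, false);   // membership in the growing set S
    long long total = 0;
    if (n == 0) return total;

    // Min-heap of candidate edges stored as (weight, (from, to)).
    using Item = std::pair<int, std::pair<int, int>>;
    std::priority_queue<Item, std::vector<Item>, std::greater<Item>> pq;
    pq.push(Item(0, std::make_pair(-1, 0)));   // start vertex s = 0

    while (!pq.empty())
    {
        Item top = pq.top();
        pq.pop();
        int w = top.first;
        int u = top.second.first;
        int v = top.second.second;
        if (inS[v]) continue;          // stale entry: v joined S earlier
        inS[v] = true;                 // lowest-weight edge crossing from S
        pred[v] = u;
        total += w;
        for (const auto &e : adj[v])
        {
            if (!inS[e.first])
            {
                pq.push(Item(e.second, std::make_pair(v, e.first)));
            }
        }
    }
    return total;
}
```

The lazy variant can hold up to one queue entry per edge direction, so it trades some memory for not needing a decreaseKey operation.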

Solution
The C++ solution below relies on a binary heap to provide the implementation of the priority queue that is central to PRIM’S ALGORITHM. Ordinarily, using a binary heap would be inefficient because of the check in the main loop for whether a particular vertex is a member of the priority queue (an operation not supported by binary heaps). However, vertices are only ever removed from the priority queue as the algorithm proceeds and are never reinserted, so we need only maintain a status array inQueue[] that is updated whenever a vertex is extracted from the priority queue.

In another implementation optimization, we maintain an external array key[] that records the current priority key for each vertex in the queue, which again eliminates the need to search the priority queue for a given vertex identifier.

/**
 * Prim’s Algorithm implementation with binary heap
 *
 * Given undirected graph, compute MST starting from an arbitrarily
 * selected vertex (vertex 0). Encoding of MST is done using 'pred' entries.
 */

#include <limits>
#include <vector>
#include "BinaryHeap.h"
#include "Graph.h"

using namespace std;

void mst_prim (Graph const &graph, vector<int> &pred)
{
    // initialize pred[] and key[] arrays. Start with arbitrary
    // vertex s=0. Priority Queue PQ contains all v in G.
    const int n = graph.numVertices( );
    pred.assign(n, -1);
    vector<int> key(n, numeric_limits<int>::max( ));
    key[0] = 0;
    BinaryHeap pq(n);
    vector<bool> inQueue(n, true);

    for (int v = 0; v < n; v++)
    {
        pq.insert(v, key[v]);
    }

    while (!pq.isEmpty( ))
    {
        // extract the vertex with the smallest key and mark it as removed
        int u = pq.smallest( );
        inQueue[u] = false;

        // Process all neighbors of u to find if any edge beats best distance
        for (VertexList::const_iterator ci = graph.begin(u); ci != graph.end(u); ++ci)
        {
            int v = ci->first;

            if (inQueue[v])
            {
                int w = ci->second;
                if (w < key[v])
                {
                    pred[v] = u;
                    key[v] = w;
                    pq.decreaseKey(v, w);
                }
            }
        }
    }
}
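Since the MST is encoded only through pred[], a small helper can turn that array into an explicit edge list. The following is a sketch; the name mstEdges is an assumption for illustration, and it assumes that -1 marks the start vertex (the value pred[] is initialized with above).

```cpp
#include <utility>
#include <vector>

// Collects the tree edge (pred[v], v) for every vertex that has a predecessor.
// The start vertex keeps pred == -1 and contributes no edge.
std::vector<std::pair<int, int>> mstEdges(const std::vector<int> &pred)
{
    std::vector<std::pair<int, int>> edges;
    for (int v = 0; v < static_cast<int>(pred.size()); v++)
    {
        if (pred[v] != -1)
        {
            edges.push_back(std::make_pair(pred[v], v));
        }
    }
    return edges;
}
```

For a connected graph with n vertices, the returned list should contain n-1 edges.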

Consequences
For dense graphs, the priority queue can be implemented instead with a Fibonacci heap. This improves the performance to O(E+V*log V), a significant speedup over the binary heap implementation.

Analysis
The initialization phase of PRIM’S ALGORITHM inserts each vertex into the priority queue (implemented by a binary heap) for a total cost of O(V log V). The decreaseKey operation in PRIM’S ALGORITHM requires O(log q) performance, where q is the number of elements in the queue, which never exceeds |V|. It can be called at most 2*|E| times since each vertex is removed once from the priority queue and each undirected edge in the graph is visited exactly twice. Thus the total performance is O((V+2*E)*log V) or O((V+E)*log V).

Single-Source Shortest Path (Dijkstra’s Algorithm Implementation in C++)

Suppose you want to fly a private plane on the shortest path from Saint Johnsbury, VT to Waco, TX. Assume you know the distances between the airports for all pairs of cities and towns that are reachable from each other in one nonstop flight of your plane. The best-known algorithm to solve this problem, Dijkstra’s Algorithm, finds the shortest path from Saint Johnsbury to all other airports, although the search may be halted once the shortest path to Waco is known.

Dijkstra’s Algorithm conceptually operates in greedy fashion by expanding a set of vertices, S, for which the shortest path from s to every vertex v∈S is known, but only using paths that include vertices in S. Initially, S equals the set {s}. To expand S, Dijkstra’s Algorithm finds the vertex v∈V–S whose distance to s is smallest, and follows v’s edges to see whether a shorter path exists to another vertex. After processing v2, for example, the algorithm determines that the distance from s to v3 is really 17 through the path <s,v2,v3>. Once S expands to equal V, the algorithm completes.

Input/Output

Input

A directed, weighted graph G=(V,E) and a source vertex s∈V. Each edge e=(u,v) has an associated positive weight in the graph. The quantity n represents the number of vertices in G.

Output

Dijkstra’s Algorithm produces two computed arrays. The primary result is the array dist[] of values representing the distance from source vertex s to each vertex in the graph. Note that dist[s] is zero. The secondary result is the array pred[], which can be used to rediscover the actual shortest paths from vertex s to each vertex in the graph.
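As an illustration of how pred[] can be used to rediscover a path, the sketch below walks the predecessor links backward from a target vertex and then reverses the result. The function name shortestPath is hypothetical, and the code assumes that pred[s] is -1 and that unreachable vertices also keep pred equal to -1.

```cpp
#include <algorithm>
#include <vector>

// Returns the vertices on the path s -> ... -> t, or an empty vector when
// t is not reachable from s.
std::vector<int> shortestPath(const std::vector<int> &pred, int s, int t)
{
    std::vector<int> path;
    for (int v = t; v != -1; v = pred[v])
    {
        path.push_back(v);          // collected in reverse order: t first
    }
    std::reverse(path.begin(), path.end());
    if (path.empty() || path.front() != s)
    {
        path.clear();               // the chain did not lead back to s
    }
    return path;
}
```

The walk costs time proportional to the number of vertices on the path, so recovering any single path from pred[] takes at most O(V) steps.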

Assumptions

The edge weights are positive (i.e., greater than zero); if this assumption is not true, then dist[u] may contain invalid results. Even worse, Dijkstra’s Algorithm may loop forever if a cycle exists whose sum of all weights is less than zero.

Solution

As Dijkstra’s Algorithm executes, dist[v] represents an upper bound on the length of the shortest path from the source s to v: it is the length of the shortest path found so far that uses only vertices visited within the set S. Also, for each v∈S, dist[v] is correct. Fortunately, Dijkstra’s Algorithm does not actually compute and store the set S. It initially constructs a set containing the vertices in V, and then it removes vertices one at a time from the set to compute proper dist[v] values; for convenience, we continue to refer to this ever-shrinking set as V–S. Dijkstra’s Algorithm terminates when all vertices are either visited or are shown to not be reachable from the source vertex s.

In the C++ solution shown below, a binary heap stores the vertices in the set V–S as a priority queue because, in constant time, one can locate the vertex with smallest priority (where the priority is determined by the vertex’s distance from s). Additionally, when a shorter path from s to v is found, dist[v] is decreased, requiring the heap to be modified. Fortunately, the decreaseKey operation on priority queues represented using binary heaps can be performed in O(log q) time, where q is the number of vertices in the binary heap, which will always be less than or equal to the number of vertices, n.
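The BinaryHeap class itself is not shown in this post, so the following is only a sketch of how such a heap might support decreaseKey in O(log q) time: it keeps a position index so that a vertex can be located in constant time and then sifted up. The class name IndexedMinHeap and its internals are assumptions; only the operation names insert, smallest, decreaseKey, and isEmpty mirror the calls used in the listings.

```cpp
#include <utility>
#include <vector>

// Min-heap over vertex ids 0..n-1 with a position index (hypothetical class).
class IndexedMinHeap
{
public:
    explicit IndexedMinHeap(int n) : pos_(n, -1), key_(n, 0) {}

    bool isEmpty() const { return heap_.empty(); }

    void insert(int id, int priority)
    {
        key_[id] = priority;
        pos_[id] = static_cast<int>(heap_.size());
        heap_.push_back(id);
        siftUp(pos_[id]);
    }

    // Removes and returns the id with the smallest priority.
    // Precondition: the heap is not empty.
    int smallest()
    {
        int top = heap_.front();
        swapAt(0, static_cast<int>(heap_.size()) - 1);
        heap_.pop_back();
        pos_[top] = -1;
        if (!heap_.empty()) siftDown(0);
        return top;
    }

    // Lowers the priority of an id still in the heap; ignored otherwise.
    void decreaseKey(int id, int newPriority)
    {
        if (pos_[id] == -1 || newPriority >= key_[id]) return;
        key_[id] = newPriority;
        siftUp(pos_[id]);
    }

private:
    std::vector<int> heap_;  // heap_[i] is the id stored in heap slot i
    std::vector<int> pos_;   // pos_[id] is the slot of id, or -1 if absent
    std::vector<int> key_;   // key_[id] is the current priority of id

    void swapAt(int i, int j)
    {
        std::swap(heap_[i], heap_[j]);
        pos_[heap_[i]] = i;
        pos_[heap_[j]] = j;
    }

    void siftUp(int i)
    {
        while (i > 0)
        {
            int parent = (i - 1) / 2;
            if (key_[heap_[parent]] <= key_[heap_[i]]) break;
            swapAt(i, parent);
            i = parent;
        }
    }

    void siftDown(int i)
    {
        const int n = static_cast<int>(heap_.size());
        while (true)
        {
            int left = 2 * i + 1, right = left + 1, best = i;
            if (left < n && key_[heap_[left]] < key_[heap_[best]]) best = left;
            if (right < n && key_[heap_[right]] < key_[heap_[best]]) best = right;
            if (best == i) break;
            swapAt(i, best);
            i = best;
        }
    }
};
```

Because pos_[id] is set to -1 on extraction, the same index doubles as a membership test, which is one way to avoid searching the heap for a vertex.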

//Dijkstra’s Algorithm with priority queue implementation

#include <limits>
#include <vector>
#include "BinaryHeap.h"
#include "Graph.h"

using namespace std;

/** Given directed, weighted graph, compute shortest distance to vertices
 * (dist) and record predecessor links (pred) for all vertices. */

void singleSourceShortest(Graph const &g, int s, vector<int> &dist, vector<int> &pred)
{
    // initialize dist[] and pred[] arrays. Start with vertex s by setting
    // dist[] to 0. Priority Queue PQ contains all v in G.
    const int n = g.numVertices( );
    pred.assign(n, -1);
    dist.assign(n, numeric_limits<int>::max( ));
    dist[s] = 0;
    BinaryHeap pq(n);

    for (int u = 0; u < n; u++)
    {
        pq.insert (u, dist[u]);
    }

    // find vertex in ever-shrinking set, V-S, whose dist[] is smallest.
    // Recompute potential new paths to update all shortest paths
    while (!pq.isEmpty( ))
    {
        int u = pq.smallest( );

        // if the closest remaining vertex was never reached, every vertex
        // still in the queue is unreachable from s, so we can stop.
        if (dist[u] == numeric_limits<int>::max( ))
        {
            break;
        }

        // For neighbors of u, see if newLen (best path from s->u + weight
        // of edge u->v) is better than best path from s->v. If so, update
        // in dist[v] and re-adjust binary heap accordingly. Compute in
        // long to avoid overflow error.
        for (VertexList::const_iterator ci = g.begin(u); ci != g.end(u); ++ci)
        {
            int v = ci->first;
            long newLen = dist[u];
            newLen += ci->second;

            if (newLen < dist[v])
            {
                pq.decreaseKey (v, newLen);
                dist[v] = newLen;
                pred[v] = u;
            }
        }
    }
}
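When no decreaseKey operation is available, a common alternative is lazy deletion: push a fresh entry whenever dist[v] improves and skip outdated entries when they are popped. The sketch below uses std::priority_queue this way. It is a different technique from the listing above, and the names AdjList and dijkstraLazy are assumptions made for this illustration; it also stores distances as long long rather than int.

```cpp
#include <functional>
#include <limits>
#include <queue>
#include <utility>
#include <vector>

// Hypothetical adjacency list: adj[u] holds (neighbor, weight) pairs.
using AdjList = std::vector<std::vector<std::pair<int, int>>>;

void dijkstraLazy(const AdjList &adj, int s,
                  std::vector<long long> &dist, std::vector<int> &pred)
{
    const long long INF = std::numeric_limits<long long>::max();
    const int n = static_cast<int>(adj.size());
    dist.assign(n, INF);
    pred.assign(n, -1);
    dist[s] = 0;

    using Item = std::pair<long long, int>;   // (distance, vertex)
    std::priority_queue<Item, std::vector<Item>, std::greater<Item>> pq;
    pq.push(Item(0, s));

    while (!pq.empty())
    {
        Item top = pq.top();
        pq.pop();
        long long d = top.first;
        int u = top.second;
        if (d > dist[u]) continue;            // stale entry, already improved

        for (const auto &e : adj[u])
        {
            int v = e.first;
            long long newLen = d + e.second;  // d is finite here
            if (newLen < dist[v])
            {
                dist[v] = newLen;
                pred[v] = u;
                pq.push(Item(newLen, v));
            }
        }
    }
}
```

Unreachable vertices are never pushed, so they keep the INF distance and a pred of -1.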

Consequences

Arithmetic error also may occur if the sum of the individual edge weights exceeds numeric_limits<int>::max() (although the individual values do not). To avoid this situation, the computed newLen uses a long data type, which helps only on platforms where long is wider than int.
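The width of long varies by platform (it is 32 bits with 64-bit Windows compilers, for example), whereas long long is guaranteed to be at least 64 bits. The following sketch isolates the overflow-safe comparison using long long; the helper names pathLength and improves are hypothetical and do not appear in the listings.

```cpp
#include <limits>

// Sums a tentative distance and an edge weight in a 64-bit type, so the
// result is exact even when it exceeds the largest int.
long long pathLength(int distU, int weight)
{
    return static_cast<long long>(distU) + weight;
}

// True when the path through u is strictly shorter than the current dist[v].
bool improves(int distU, int weight, int distV)
{
    return pathLength(distU, weight) < distV;
}
```

With this comparison, an unreached vertex whose distance is still numeric_limits<int>::max() can never appear to improve a neighbor, because the widened sum stays above every int value.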

Analysis

In the implementation of Dijkstra’s Algorithm, the loop that constructs the initial priority queue performs the insert operation V times, resulting in performance O(V log V). In the remaining while loop, each edge is visited once, and thus decreaseKey is called no more than E times, which contributes O(E log V) time. Thus, the overall performance is O((V+E) log V).

The C++ implementation below is simpler since it avoids the use of a binary heap. The efficiency of this version is determined by considering how fast the smallest dist[] value in V–S can be retrieved. The while loop is executed n times, since S grows one vertex at a time. Finding the smallest dist[u] in V–S inspects all n vertices. Note that each edge is inspected exactly once in the inner loop within the while loop. Thus, the total running time of this version is O(V²+E).

//Implementation of Dijkstra’s Algorithm for dense graphs

#include <limits>
#include <vector>
#include "Graph.h"

using namespace std;

void singleSourceShortest(Graph const &graph, int s, vector<int> &dist, vector<int> &pred)
{
    // initialize dist[] and pred[] arrays. Start with vertex s by setting
    // dist[] to 0.
    const int n = graph.numVertices( );
    pred.assign(n, -1);
    dist.assign(n, numeric_limits<int>::max( ));
    vector<bool> visited(n);
    dist[s] = 0;

    // find vertex in ever-shrinking set, V-S, whose dist value is smallest
    // Recompute potential new paths to update all shortest paths
    while (true)
    {
        // find shortest distance so far in unvisited vertices
        int u = -1;
        int sd = numeric_limits<int>::max( ); // assume not reachable

        for (int i = 0; i < n; i++)
        {
            if (!visited[i] && dist[i] < sd)
            {
                sd = dist[i];
                u = i;
            }
        }
        if (u == -1)
        {
            break; // no more progress to be made
        }

        // For neighbors of u, see if length of best path from s->u + weight
        // of edge u->v is better than best path from s->v.
        visited[u] = true;

        for (VertexList::const_iterator ci = graph.begin(u); ci != graph.end(u); ++ci)
        {
            int v = ci->first; // the neighbor v
            long newLen = dist[u]; // compute as long
            newLen += ci->second; // sum with (u,v) weight

            if (newLen < dist[v])
            {
                dist[v] = newLen;
                pred[v] = u;
            }
        }
    }
}
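To see when the array-based version is the better choice, it may help to substitute typical edge counts into the two bounds derived in the analysis, O((V+E) log V) for the binary heap and O(V²+E) for the array scan. The sparse and dense cases below are illustrative assumptions, not measurements:

```latex
\begin{align*}
\text{sparse, } E = O(V):\quad
  & O((V+E)\log V) = O(V \log V)
  && \text{vs.} \quad O(V^2 + E) = O(V^2) \\
\text{dense, } E = \Theta(V^2):\quad
  & O((V+E)\log V) = O(V^2 \log V)
  && \text{vs.} \quad O(V^2 + E) = O(V^2)
\end{align*}
```

Under these assumptions the binary heap wins on sparse graphs, while the array scan saves a log V factor on dense ones.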

We can further optimize to remove all of the C++ standard template library objects, as shown below. By reducing the overhead of the supporting classes, we realize impressive performance benefits, as discussed in the “Comparison” section.

/**
 * Optimized Dijkstra’s Algorithm for dense graphs
 *
 * Given int[][] of edge weights in raw form, compute shortest distance to
 * all vertices in graph (dist) and record predecessor links for all
 * vertices (pred) to be able to recreate these paths. An edge weight of
 * INF means no edge. Suitable for Dense Graphs Only.
 */

#include <limits>

using namespace std;

void singleSourceShortestDense(int n, int ** const weight, int s, int *dist, int *pred)
{
    // initialize dist[] and pred[] arrays. Start with vertex s by setting
    // dist[] to 0. All vertices are unvisited.
    bool *visited = new bool[n];
    for (int v = 0; v < n; v++)
    {
        dist[v] = numeric_limits<int>::max( );
        pred[v] = -1;
        visited[v] = false;
    }

    dist[s] = 0;

    // find shortest distance from s to all unvisited vertices. Recompute
    // potential new paths to update all shortest paths. Exit if u remains -1.
    while (true)
    {
        int u = -1;
        int sd = numeric_limits<int>::max( );
        for (int i = 0; i < n; i++)
        {
            if (!visited[i] && dist[i] < sd)
            {
                sd = dist[i];
                u = i;
            }
        }
        if (u == -1)
        {
            break;
        }

        // For neighbors of u, see if length of best path from s->u + weight
        // of edge u->v is better than best path from s->v. Compute using longs.
        visited[u] = true;
        for (int v = 0; v < n; v++)
        {
            if (v == u) continue; // skip the vertex itself
            int w = weight[u][v];
            long newLen = dist[u];
            newLen += w;
            if (newLen < dist[v])
            {
                dist[v] = newLen;
                pred[v] = u;
            }
        }
    }
    delete [] visited;
}