# Blog Archives

## Single-Source Shortest Path (Dijkstra’s Algorithm Implementation in C++)

Suppose you want to fly a private plane on the shortest path from Saint Johnsbury, VT to Waco, TX. Assume you know the distances between the airports for all pairs of cities and towns that are reachable from each other in one nonstop flight of your plane. The best-known algorithm to solve this problem, Dijkstra’s Algorithm, finds the shortest path from Saint Johnsbury to all other airports, although the search may be halted once the shortest path to Waco is known.

Dijkstra’s Algorithm conceptually operates in greedy fashion by expanding a set of vertices, *S*, for which the shortest path from *s* to every vertex *v* ∈ *S* is known, but only using paths that include vertices in *S*. Initially, *S* equals the set {*s*}. To expand *S*, Dijkstra’s Algorithm finds the vertex *v* ∈ *V−S* whose distance to *s* is smallest, and follows *v*’s edges to see whether a shorter path exists to another vertex. After processing *v₂*, for example, the algorithm determines that the distance from *s* to *v₃* is really 17 through the path <*s*, *v₂*, *v₃*>. Once *S* expands to equal *V*, the algorithm completes.

**Input/Output**

**Input**

A directed, weighted graph *G = (V, E)* and a source vertex *s* ∈ *V*. Each edge *e = (u, v)* has an associated positive weight in the graph. The quantity *n* represents the number of vertices in *G*.

**Output**

Dijkstra’s Algorithm produces two computed arrays. The primary result is the array dist[] of values representing the distance from source vertex *s* to each vertex in the graph. Note that dist[s] is zero. The secondary result is the array pred[], which can be used to rediscover the actual shortest paths from vertex *s* to each vertex in the graph.

**Assumptions**

The edge weights are positive (i.e., greater than zero); if this assumption is not true, then dist[u] may contain invalid results. Even worse, Dijkstra’s Algorithm will loop forever if a cycle exists whose sum of all weights is less than zero.

**Solution**

As Dijkstra’s Algorithm executes, dist[v] represents the length of the shortest path found so far from the source *s* to *v* using only vertices visited within the set *S*. Also, for each *v* ∈ *S*, dist[v] is correct. Fortunately, Dijkstra’s Algorithm does not actually compute and store the set *S*. It initially constructs a set containing the vertices in *V*, and then it removes vertices one at a time from the set to compute proper dist[v] values; for convenience, we continue to refer to this ever-shrinking set as *V−S*. Dijkstra’s Algorithm terminates when all vertices are either visited or are shown not to be reachable from the source vertex *s*.

In the C++ solution shown below, a binary heap stores the vertices in the set *V−S* as a priority queue because, in constant time, one can locate the vertex with smallest priority (where the priority is determined by the vertex’s distance from *s*). Additionally, when a shorter path from *s* to *v* is found, dist[v] is decreased, requiring the heap to be modified. Fortunately, the decreaseKey operation on priority queues represented using binary heaps can be performed on average in O(log *q*) time, where *q* is the number of vertices in the binary heap, which will always be less than or equal to the number of vertices, *n*.

```cpp
// Dijkstra's Algorithm with priority queue implementation
#include <limits>
#include <vector>
#include "BinaryHeap.h"
#include "Graph.h"
using namespace std;

/** Given directed, weighted graph, compute shortest distance to vertices
 *  (dist) and record predecessor links (pred) for all vertices. */
void singleSourceShortest(Graph const &g, int s,
                          vector<int> &dist, vector<int> &pred) {
    // initialize dist[] and pred[] arrays. Start with vertex s by setting
    // dist[] to 0. Priority Queue PQ contains all v in G.
    const int n = g.numVertices();
    pred.assign(n, -1);
    dist.assign(n, numeric_limits<int>::max());
    dist[s] = 0;
    BinaryHeap pq(n);
    for (int u = 0; u < n; u++) { pq.insert(u, dist[u]); }

    // find vertex in ever-shrinking set, V-S, whose dist[] is smallest.
    // Recompute potential new paths to update all shortest paths.
    while (!pq.isEmpty()) {
        int u = pq.smallest();

        // For neighbors of u, see if newLen (best path from s->u + weight
        // of edge u->v) is better than best path from s->v. If so, update
        // dist[v] and re-adjust binary heap accordingly. Compute in
        // long to avoid overflow error.
        for (VertexList::const_iterator ci = g.begin(u); ci != g.end(u); ++ci) {
            int v = ci->first;
            long newLen = dist[u];
            newLen += ci->second;
            if (newLen < dist[v]) {
                pq.decreaseKey(v, newLen);
                dist[v] = newLen;
                pred[v] = u;
            }
        }
    }
}
```

**Consequences**

Arithmetic error may also occur if the sum of the individual edge weights exceeds numeric_limits<int>::max() (although the individual values do not). To avoid this situation, the computed newLen uses a long data type.

**Analysis**

In the implementation of Dijkstra’s Algorithm, the loop that constructs the initial priority queue performs the insert operation *V* times, resulting in performance O(*V* log *V*). In the remaining while loop, each edge is visited once, and thus decreaseKey is called no more than *E* times, which contributes O(*E* log *V*) time. Thus, the overall performance is O((*V* + *E*) log *V*).

The C++ implementation below is simpler since it avoids the use of a binary heap. The efficiency of this version is determined by considering how fast the smallest dist[] value in *V−S* can be retrieved. The while loop is executed *n* times, since *S* grows one vertex at a time. Finding the smallest dist[u] in *V−S* inspects all *n* vertices. Note that each edge is inspected exactly once in the inner loop within the while loop. Thus, the total running time of this version is O(*V*² + *E*).

```cpp
// Implementation of Dijkstra's Algorithm for dense graphs
#include <limits>
#include <vector>
#include "Graph.h"
using namespace std;

void singleSourceShortest(Graph const &graph, int s,
                          vector<int> &dist, vector<int> &pred) {
    // initialize dist[] and pred[] arrays. Start with vertex s by setting
    // dist[] to 0.
    const int n = graph.numVertices();
    pred.assign(n, -1);
    dist.assign(n, numeric_limits<int>::max());
    vector<bool> visited(n);
    dist[s] = 0;

    // find vertex in ever-shrinking set, V-S, whose dist value is smallest.
    // Recompute potential new paths to update all shortest paths.
    while (true) {
        // find shortest distance so far in unvisited vertices
        int u = -1;
        int sd = numeric_limits<int>::max();   // assume not reachable
        for (int i = 0; i < n; i++) {
            if (!visited[i] && dist[i] < sd) {
                sd = dist[i];
                u = i;
            }
        }
        if (u == -1) { break; }   // no more progress to be made

        // For neighbors of u, see if length of best path from s->u + weight
        // of edge u->v is better than best path from s->v.
        visited[u] = true;
        for (VertexList::const_iterator ci = graph.begin(u);
             ci != graph.end(u); ++ci) {
            int v = ci->first;          // the neighbor v
            long newLen = dist[u];      // compute as long
            newLen += ci->second;       // sum with (u,v) weight
            if (newLen < dist[v]) {
                dist[v] = newLen;
                pred[v] = u;
            }
        }
    }
}
```

We can further optimize to remove all of the C++ standard template library objects, as shown below. By reducing the overhead of the supporting classes, we realize impressive performance benefits, as discussed in the “Comparison” section.

```cpp
/**
 * Optimized Dijkstra's Algorithm for dense graphs
 *
 * Given int[][] of edge weights in raw form, compute shortest distance to
 * all vertices in graph (dist) and record predecessor links for all
 * vertices (pred) to be able to recreate these paths. An edge weight of
 * INF means no edge. Suitable for dense graphs only.
 */
#include <limits>
using namespace std;

void singleSourceShortestDense(int n, int ** const weight, int s,
                               int *dist, int *pred) {
    // initialize dist[] and pred[] arrays. Start with vertex s by setting
    // dist[] to 0. All vertices are unvisited.
    bool *visited = new bool[n];
    for (int v = 0; v < n; v++) {
        dist[v] = numeric_limits<int>::max();
        pred[v] = -1;
        visited[v] = false;
    }
    dist[s] = 0;

    // find shortest distance from s to all unvisited vertices. Recompute
    // potential new paths to update all shortest paths. Exit if u remains -1.
    while (true) {
        int u = -1;
        int sd = numeric_limits<int>::max();
        for (int i = 0; i < n; i++) {
            if (!visited[i] && dist[i] < sd) {
                sd = dist[i];
                u = i;
            }
        }
        if (u == -1) { break; }

        // For neighbors of u, see if length of best path from s->u + weight
        // of edge u->v is better than best path from s->v. Compute using longs.
        visited[u] = true;
        for (int v = 0; v < n; v++) {
            int w = weight[u][v];
            if (v == u) continue;
            long newLen = dist[u];
            newLen += w;
            if (newLen < dist[v]) {
                dist[v] = newLen;
                pred[v] = u;
            }
        }
    }
    delete [] visited;
}
```

###### Related articles

- Dijkstra’s Algorithm (woodscompscitutorials.wordpress.com)
- Practice (graph searching) Dijkstra algorithm (codekhal.wordpress.com)
- How to record all the shortest paths from a source vertex to a destination vertex (stackoverflow.com)
- Dijkstra algorithm, application to problem RoboCourier I3 (codekhal.wordpress.com)
- SSSP (Single Source Shortest Path) on weighted graph (mohamed0essam.wordpress.com)
- Graph Theory in Everyday Life (gavinleroux.wordpress.com)

## Linear Time Sorting – Bucket or Bin Sort

Assume that the keys of the items that we wish to sort lie in a small fixed range and that there is only one item with each value of the key. Then we can sort with the following procedure:

**1. **Set up an array of “bins” – one for each value of the key – in order,

**2. **Examine each item and use the value of the key to place it in the appropriate bin.

Now our collection is sorted and it only took n operations, so this is an O(n) operation. However, note that it will only work under very restricted conditions. To understand these restrictions, let’s be a little more precise about the specification of the problem and assume that there are m values of the key. To recover our sorted collection, we need to examine each bin. This adds a third step to the algorithm above,

**3. **Examine each bin to see whether there’s an item in it.

which requires m operations. So the algorithm’s time becomes:

T(n) = c₁n + c₂m

and it is strictly O(n + m). If m ≤ n, this is clearly O(n). However if m >> n, then it is O(m). An implementation of bin sort might look like:

BUCKETSORT( array A, int n, int M)

1 // Pre-condition: for 1 ≤ i ≤ n, 0 ≤ a[i] < M

2 // Mark all the bins empty

3 **for **i ← 1 **to **M

4 **do **bin[i] ← Empty

5 **for **i ← 1 **to **n

6 **do **bin[A[i]] ← A[i]

If there are *duplicates*, then each bin can be replaced by a *linked list*. The third step then becomes:

**3. **Link all the lists into one list.

We can add an item to a linked list in O(1) time. There are n items requiring O(n) time. Linking a list to another list simply involves making the tail of one list point to the other, so it is O(1). Linking m such lists obviously takes O(m) time, so the algorithm is still O(n + m).

###### Related articles

- Bucket Sort – buckets unchanged after sorting (daniweb.com)
- All sorts of SORTING techniques (aishwaryr.wordpress.com)
- Linear Time Sorting, Counting Sort (alikhuram.wordpress.com)
- My sort of Bucket List (imperfectionismybeauty.com)

## Linear Time Sorting, Counting Sort & Radix Sort

The lower bound of sorting algorithms implies that if we hope to sort numbers faster than O(n log n), we cannot do it by making comparisons alone. Is it possible to sort without making comparisons?

The answer is yes, but only under very restrictive circumstances. Many applications involve sorting small integers (e.g. sorting characters, exam scores, etc.). We present three algorithms based on the theme of speeding up sorting in special cases, by not making comparisons.

Counting sort assumes that the numbers to be sorted are in the range 1 to k, where k is small. The basic idea is to determine the rank of each number in the final sorted array.

The rank of an item is the number of elements that are less than or equal to it. Once we know the ranks, we simply copy numbers to their final position in an output array.

The question is how to find the rank of an element without comparing it to the other elements of the array? The algorithm uses three arrays.

A[1..n] holds the initial input, B[1..n] holds the sorted output, and C[1..k] is an array of integers.

C[x] is the rank of x in A, where x ∈ [1..k].

The algorithm is remarkably simple, but deceptively clever. The algorithm operates by first constructing C. This is done in two steps.

First we set C[x] to be the number of elements of A that are equal to x. We can do this by initializing C to zero, and then, for each j from 1 to n, incrementing C[A[j]] by 1.

Thus, if A[j] = 5, then the 5th element of C is incremented, indicating that we have seen one more 5. To determine the number of elements that are less than or equal to x, we replace C[x] with the sum of the elements in the subarray C[1..x]. This is done by just keeping a running total of the elements of C.

C[x] now contains the rank of x. This means that if x = A[j] then the final position of A[j] should be at position C[x] in the final sorted array.

Thus, we set B[C[x]] = A[j]. Notice we need to be careful if there are duplicates, since we do not want them to overwrite the same location of B. To do this, we decrement C[A[j]] after copying.

COUNTING-SORT( array A, array B, int k)

1 **for **i ← 1 **to **k

2 **do **C[i] ← 0 *(k times)*

3 **for **j ← 1 **to **length[A]

4 **do **C[A[j]] ← C[A[j]] + 1 *(n times)*

5 *// C[i] now contains the number of elements = i*

6 **for **i ← 2 **to **k

7 **do **C[i] ← C[i] + C[i − 1] *(k times)*

8 *// C[i] now contains the number of elements ≤ i*

9 **for **j ← length[A] **downto **1

10 **do **B[C[A[j]]] ← A[j]

11 C[A[j]] ← C[A[j]] − 1 *(n times)*

There are four (un-nested) loops, executed k times, n times, k − 1 times, and n times, respectively, so the total running time is Θ(n + k) time. If k = O(n), then the total running time is Θ(n).

Counting sort is not an in-place sorting algorithm, but it is stable. Stability is important because data are often carried with the keys being sorted. Radix sort (which uses counting sort as a subroutine) relies on it to work correctly. Stability is achieved by running the loop down from n to 1 and not the other way around.

**Radix Sort**

The main shortcoming of counting sort is that it is useful only for small integers, i.e., 1..k where k is small. If k were a million or more, the size of the rank array would also be a million. Radix sort provides a nice workaround for this limitation by sorting numbers one digit at a time.

```
576    49[4]    9[5]4    [1]76    176
494    19[4]    5[7]6    [1]94    194
194    95[4]    1[7]6    [2]78    278
296 ⇒  57[6] ⇒  2[7]8 ⇒  [2]96 ⇒  296
278    29[6]    4[9]4    [4]94    494
176    17[6]    1[9]4    [5]76    576
954    27[8]    2[9]6    [9]54    954
```

Here is the algorithm that sorts A[1..n] where each number is d digits long.

RADIX-SORT( array A, int n, int d)

1 **for **i ← 1 **to **d

2 **do **stably sort A w.r.t. the i-th lowest-order digit

###### Related articles

- Top 15 Data Structures and Algorithm Interview Questions for Java programmer – Answers (javarevisited.blogspot.com)
- C++ – Gnome Sort (bradenlenz.com)
- All sorts of SORTING techniques (aishwaryr.wordpress.com)
- An update on radix sort (attractivechaos.wordpress.com)
- Find max element in sorted array – O(log n) (cs.stackexchange.com)
- A quick note on radix sort (attractivechaos.wordpress.com)
- Linear Time Sorting – Bucket or Bin Sort (alikhuram.wordpress.com)
- Linear Time Sorting, Counting Sort (alikhuram.wordpress.com)
- Radix (quenta.org)