9.5 Shortest-Paths Problems

9.5.1 Shortest-Paths on Unweighted Graphs

If you have an unweighted graph, the shortest path between two vertices is the smallest number of edges you have to pass to get from one of the vertices to the other.

If you agument the breadth-first search algorithm to remember which vertex a visited vertex came from, if will give you the shortest path between the start vertex and any other vertex. However, things become sligthly more complicated if the graph is weighted.

9.5.2 Shortest-Paths on Weighted Graphs

On a road map, a road connecting two towns is typically labeled with its distance. We can model a road network as a directed graph whose edges are labeled with real numbers. These numbers represent the distance (or other cost metric, such as travel time) between two vertices. These labels may be called weights, costs, or distances, depending on the application. Given such a graph, a typical problem is to find the total length of the shortest path between two specified vertices. This is not a trivial problem, because the shortest path may not be along the edge (if any) connecting two vertices, but rather may be along a path involving one or more intermediate vertices.

For example, in Figure #DistExamp, the cost of the path from AA to BB to DD is 15. The cost of the edge directly from AA to DD is 20. The cost of the path from AA to CC to BB to DD is 10. Thus, the shortest path from AA to DD is 10 (rather than along the edge connecting AA to DD). We use the notation 𝐝(A,D)=10\mathbf{d}(A, D) = 10 to indicate that the shortest distance from AA to DD is 10. In the figure, there is no path from EE to BB, so we set 𝐝(E,B)=∞\mathbf{d}(E, B) = \infty. We define 𝐰(A,D)=20\mathbf{w}(A, D) = 20 to be the weight of edge (A,D)(A, D), that is, the weight of the direct connection from AA to DD. Because there is no edge from EE to BB, 𝐰(E,B)=∞\mathbf{w}(E, B) = \infty. Note that 𝐰(D,A)=∞\mathbf{w}(D, A) = \infty because the graph is directed. We assume that all weights are positive.

Example graph for shortest-path definitions.

9.5.3 Single-Source Shortest Paths

We will now present an algorithm to solve the single-source shortest paths problem. Given Vertex SS in Graph 𝐆\mathbf{G}, find a shortest path from SS to every other vertex in 𝐆\mathbf{G}. We might want only the shortest path between two vertices, SS and TT. However in the worst case, finding the shortest path from SS to TT requires us to find the shortest paths from SS to every other vertex as well. So there is no better algorithm (in the worst case) for finding the shortest path to a single vertex than to find shortest paths to all vertices. The algorithm described here will only compute the distance to every such vertex, rather than recording the actual path. Recording the path requires only simple modifications to the algorithm.

Computer networks provide an application for the single-source shortest-paths problem. The goal is to find the cheapest way for one computer to broadcast a message to all other computers on the network. The network can be modeled by a graph with edge weights indicating time or cost to send a message to a neighboring computer.

For unweighted graphs (or whenever all edges have the same cost), the single-source shortest paths can be found using a simple breadth-first search. When weights are added, BFS will not give the correct answer.

One approach to solving this problem when the edges have differing weights might be to process the vertices in a fixed order. Label the vertices v0v_0 to vnβˆ’1v_{n-1}, with S=v0S = v_0. When processing Vertex v1v_1, we take the edge connecting v0v_0 and v1v_1. When processing v2v_2, we consider the shortest distance from v0v_0 to v2v_2 and compare that to the shortest distance from v0v_0 to v1v_1 to v2v_2. When processing Vertex viv_i, we consider the shortest path for Vertices v0v_0 through viβˆ’1v_{i-1} that have already been processed. Unfortunately, the true shortest path to viv_i might go through Vertex vjv_j for j>ij > i. Such a path will not be considered by this algorithm. However, the problem would not occur if we process the vertices in order of distance from SS. Assume that we have processed in order of distance from SS to the first iβˆ’1i-1 vertices that are closest to SS; call this set of vertices 𝐒\mathbf{S}. We are now about to process the ii th closest vertex; call it XX.

A shortest path from SS to XX must have its next-to-last vertex in SS. Thus,

𝐝(S,X)=minUβˆˆπ’(𝐝(S,U)+𝐰(U,X)) \mathbf{d}(S, X) = \min_{U \in \mathbf{S}}(\mathbf{d}(S, U) + \mathbf{w}(U, X))

In other words, the shortest path from SS to XX is the minimum over all paths that go from SS to UU, then have an edge from UU to XX, where UU is some vertex in 𝐒\mathbf{S}.

This solution is usually referred to as Dijkstra’s algorithm. It works by maintaining a distance estimate 𝐃(X)\mathbf{D}(X) for all vertices XX in 𝐕\mathbf{V}. The elements of 𝐃\mathbf{D} are initialized to the value ∞\infty (positive infinity). Vertices are processed in order of distance from SS. Whenever a vertex vv is processed, 𝐃(X)\mathbf{D}(X) is updated for every neighbor XX of VV. Here is an implementation for Dijkstra’s algorithm. At the end, array D will contain the shortest distance values.

// Compute shortest path distances from s
function Dijkstra(G, s):
    visited = new Set()
    D = new Map()
    for each v in G.vertices():
        D.put(v, ∞)  // Initialise distances
    D.put(s, 0)  // The distance from s to s is 0

    repeat G.vertxCount() times:  // Process the vertices
        v = minVertex(G, D, visited)  // Find next-closest vertex
        visited.add(v)
        if D.get(v) == ∞:
            return D  // Vertex v is unreachable
        for each e in G.outgoingEdges(v):
            w = e.end
            dist = D.get(v) + e.weight
            if dist < D.get(w): // If the new distance is shorter...
                D.put(w, dist)  // ...update w with the new distance
    return D

There are two reasonable solutions to the key issue of finding the unvisited vertex with minimum distance value during each pass through the main for loop. The first method is simply to scan through the list of |𝐕||\mathbf{V}| vertices searching for the minimum value, as follows:

// Find the unvisited vertex with the smalled distance
function minVertex(G, D, visited):
    minV = null
    for each v in G.vertices():
        if v not in visited:
            if minV is null or D.get(v) < D.get(minV):
                minV = v
    return minV

Because this scan is done |𝐕||\mathbf{V}| times, and because each edge requires a constant-time update to D, the total cost for this approach is Θ(|𝐕|2+|𝐄|)=Θ(|𝐕|2)\Theta(|\mathbf{V}|^2 + |\mathbf{E}|) = \Theta(|\mathbf{V}|^2), because |𝐄||\mathbf{E}| is in O(|𝐕|2)O(|\mathbf{V}|^2).

An alternative approach is to store unprocessed vertices in a min-heap ordered by their distance from the processed vertices. The next-closest vertex can be found in the heap in Θ(log|𝐕|)\Theta(\log |\mathbf{V}|) time. Every time we modify 𝐃(X)\mathbf{D}(X), we could reorder XX in the heap by deleting and reinserting it. This is an example of a priority queue with priority update. To implement true priority updating, we would need to store with each vertex its position within the heap so that we can remove its old distances whenever it is updated by processing new edges. A simpler approach is to add the new (always smaller) distance value for a given vertex as a new record in the heap. The smallest value for a given vertex currently in the heap will be found first, and greater distance values found later will be ignored because the vertex will already be marked as visited. The only disadvantage to repeatedly inserting distance values in this way is that it will raise the number of elements in the heap from Θ(|𝐕|)\Theta(|\mathbf{V}|) to Θ(|𝐄|)\Theta(|\mathbf{E}|) in the worst case. But in practice this only adds a slight increase to the depth of the heap. The time complexity is Θ((|𝐕|+|𝐄|)log|𝐄|)\Theta((|\mathbf{V}| + |\mathbf{E}|) \log |\mathbf{E}|), because for each edge that we process we must reorder the heap. We use the KVPair class to store key-value pairs in the heap, with the edge weight as the key and the target vertex as the value. here is the implementation for Dijkstra’s algorithm using a heap.

// Dijkstra's shortest-paths: priority queue version
function DijkstraPQ(G, s):
    visited = new Set()
    D = new Map()
    for (V v : G.vertices())
        D.put(v, ∞)  // Initialize distance
    D.put(s, 0)  // The distance from s to s is 0

    agenda = new PriorityQueue()
    agenda.add(new KVPair(0, s))  // Initial vertex

    while not agenda.isEmpty():
        v = agenda.removeMin().value
        if v not in visited:
            visited.add(v)
            if D.get(v) == ∞:
                return D  // Vertex v is unreachable
            for each e in G.outgoingEdges():
                w = e.end
                dist = D.get(v) + e.weight
                if dist < D.get(w): // If the new distance is shorter...
                    D.put(w, dist)  // ...update w with the new distance...
                    agenda.add(new KVPair(dist, w))  // ...and add it to the agenda

Using minVertex to scan the vertex list for the minimum value is more efficient when the graph is dense, that is, when |𝐄||\mathbf{E}| approaches |𝐕|2|\mathbf{V}|^2. Using a heap is more efficient when the graph is sparse because its cost is Θ((|𝐕|+|𝐄|)log|𝐄|)\Theta((|\mathbf{V}| + |\mathbf{E}|) \log |\mathbf{E}|). However, when the graph is dense, this cost can become as great as Θ(|𝐕|2log|𝐄|)=Θ(|𝐕|2log|𝐕|)\Theta(|\mathbf{V}|^2 \log |\mathbf{E}|) = \Theta(|\mathbf{V}|^2 \log |\mathbf{V}|).

Now you can practice using Dijkstra’s algorithm.