9 Graphs

Graphs provide the ultimate in data structure flexibility. A graph consists of a set of nodes, and a set of edges where an edge connects two nodes. Trees and lists can be viewed as special cases of graphs.

Graphs are used to model both real-world systems and abstract problems, and are the data structure of choice in many applications. Here is a small sampling of the types of problems that graphs are routinely used for.

  1. Modeling connectivity in computer and communications networks.
  2. Representing an abstract map as a set of locations with distances between locations. This can be used to compute shortest routes between locations such as in a GPS routefinder.
  3. Modeling flow capacities in transportation networks to find which links create the bottlenecks.
  4. Finding a path from a starting condition to a goal condition. This is a common way to model problems in artificial intelligence applications and computerized game players.
  5. Modeling computer algorithms, to show transitions from one program state to another.
  6. Finding an acceptable order for finishing subtasks in a complex activity, such as constructing large buildings.
  7. Modeling relationships such as family trees, business or military organizations, and scientific taxonomies.

The rest of this module covers some basic graph terminology. The following modules will describe fundamental representations for graphs, provide a reference implementation, and cover core graph algorithms including traversal, topological sort, shortest paths algorithms, and algorithms to find the minimal-cost spanning tree. Besides being useful and interesting in their own right, these algorithms illustrate the use of many other data structures presented throughout the course.

A graph 𝐆=(𝐕,𝐄)\mathbf{G} = (\mathbf{V}, \mathbf{E}) consists of a set of vertices 𝐕\mathbf{V} and a set of edges 𝐄\mathbf{E}, such that each edge in 𝐄\mathbf{E} is a connection between a pair of vertices in 𝐕\mathbf{V}. The number of vertices is written |𝐕||\mathbf{V}|, and the number of edges is written |𝐄||\mathbf{E}|. |𝐄||\mathbf{E}| can range from zero to a maximum of |𝐕|2βˆ’|𝐕||\mathbf{V}|^2 - |\mathbf{V}|.

Note: Some graph applications require that a given pair of vertices can have multiple or parallel edges connecting them, or that a vertex can have an edge to itself. However, the applications discussed here do not require either of these special cases. To simplify our graph API, we will assume that there are no duplicate edges.

A graph whose edges are not directed is called an undirected graph, as shown in part (a) of the following figure. A graph with edges directed from one vertex to another (as in (b)) is called a directed graph or digraph. A graph with labels associated with its vertices (as in (c)) is called a labeled graph. Associated with each edge may be a cost or weight. A graph whose edges have weights (as in (c)) is said to be a weighted graph.

Some types of graphs.

An edge connecting Vertices aa and bb is written (a,b)(a, b). Such an edge is said to be incident with Vertices aa and bb. The two vertices are said to be adjacent. If the edge is directed from aa to bb, then we say that aa is adjacent to bb, and bb is adjacent from aa. The degree of a vertex is the number of edges it is incident with. For example, Vertex ee below has a degree of three.

In a directed graph, the out degree for a vertex is the number of neighbors adjacent from it (or the number of edges going out from it), while the in degree is the number of neighbors adjacent to it (or the number of edges coming in to it). In (c) above, the in degree of Vertex 1 is two, and its out degree is one.

A sequence of vertices v1,v2,...,vnv_1, v_2, ..., v_n forms a path of length nβˆ’1n-1 if there exist edges from viv_i to vi+1v_{i+1} for 1≀i<n1 \leq i < n. A path is a simple path if all vertices on the path are distinct. The length of a path is the number of edges it contains. A cycle is a path of length three or more that connects some vertex v1v_1 to itself. A cycle is a simple cycle if the path is simple, except for the first and last vertices being the same.

An undirected graph is a connected graph if there is at least one path from any vertex to any other. The maximally connected subgraphs of an undirected graph are called connected components. For example, this figure shows an undirected graph with three connected components.

A graph with relatively few edges is called a sparse graph, while a graph with many edges is called a dense graph. A graph containing all possible edges is said to be a complete graph. A subgraph 𝐒\mathbf{S} is formed from graph 𝐆\mathbf{G} by selecting a subset 𝐕s\mathbf{V}_s of 𝐆\mathbf{G}’s vertices and a subset 𝐄s\mathbf{E}_s of 𝐆\mathbf{G} ’s edges such that for every edge eβˆˆπ„se \in \mathbf{E}_s, both vertices of ee are in 𝐕s\mathbf{V}_s. Any subgraph of VV where all vertices in the graph connect to all other vertices in the subgraph is called a clique.n

A graph without cycles is called an acyclic graph. Thus, a directed graph without cycles is called a directed acyclic graph or DAG.

A free tree is a connected, undirected graph with no simple cycles. An equivalent definition is that a free tree is connected and has |𝐕|βˆ’1|\mathbf{V}| - 1 edges.