# Graphs

In this lesson we consider directed graphs (without weights) represented with adjacency lists. We first present how to encode them in an easy way in C++ and then show the implementation of several basic graph algorithms.

## Graph representation

In the following, we assume that directed graphs are represented through
**adjacency lists**: For each node, we have an (unordered) list of its
succesors. For simplicity, we assume that the nodes (also known as
*vertices*) are identified with integers from 0 up to $n-1$, where $n$ is the
total number of nodes in the graph. Consequently, nodes and graphs can simply
be declared like this:

```
using node = int; // nodes are represented by indices from 0
using AdjacencyList = vector<node>; // the adjacency list of one node (arcs to successor nodes)
using Graph = vector<AdjacencyList>; // the complete adjacency list for all nodes in the graph
```

The outter vector of the `Graph`

type stores, for each node,
an adjacency list. This adjacency list is stored as a vector of
nodes. As an example, the graph

Layout:

can be defined as

```
Graph G = {{2, 3}, {4, 2}, {3}, {2}, {1, 3, 5}, {0, 1}};
```

Observe that we do not add contraints on the order of the nodes in the adjacency lists —even though this could make sense in some settings. However, we do not allow repetitions (this would mean multiple arcs) nor self-loops.

This representations has the following interesting properties:

- The number of nodes in a graph
`G`

is`G.size()`

. - The out-degree of node
`u`

in a graph`G`

is`G[u].size()`

. - One can iterate through all nodes
`u`

in a graph`G`

using a`for (node u = 0; u < G.size(); ++u)`

loop. - One can iterate through all succesors
`v`

of a node`u`

in a graph`G`

using a`for (node v : G[u])`

loop. - It is possible to set attributes or properties of the nodes through additional vectors.

In the following, $n$ denotes the number of nodes of a directed graph, and $m$ denotes its number of arcs. Therefore, the size of the graph is $\O(m+n)$. Consequently, algorithms of cost $\O(m+n)$ are linear time algorithms.

All given algorithms are given for directed graphs, but work all the same in directed graphs when each edge is given as two symmetric arcs. As an example, the undirected graph

Layout:

can be defined as

```
Graph G = {{1, 2}, {2, 0, 3}, {0, 1, 4}, {1, 4}, {3, 2}};
```

In the following, some algorithms need the notion of an unknown natural or node
value. To do so, we shall use the following `None`

constant:

```
const int None = -1;
```

## Simple traversal

The following code snippet shows how to traverse each node in a directed graph and how to traverse its succesors in order to print them.

```
void simple_traversal (const Graph& G) {
int n = G.size(); // let n be the number of nodes in G
for (node u = 0; u < n; ++u) { // for each node u in G
cout << u << ":" << endl; // print u:
for (node v : G[u]) { // for each neighbor v of u in G
cout << " " << v; // print v
}
cout << endl;
} }
```

Applying this code to the first sample graph above would produce this output:

```
0: 2 3
1: 4 2
2: 3
3: 2
4: 1 3 5
5: 0 1
```

## Depth-first search ordering (recursive)

**Depth-first search** (**DFS**) is an algorithm for traversing or searching a graph.
It starts from each node in the graph and explores as far as possible along
each branch before backtracking.

The following code returns a DFS ordering of the nodes of a given graph
`G = (V,E)`

. This DFS implementation uses recursion and needs to keep track of
the subset `S ⊆ V`

of nodes that have already been **visited**. Starting from an
empty ordering in a list `L`

, it adds each new visited node to the end of `L`

and recursively visits all its unvisited succesors.

The subset of visited nodes `S`

is represented as a vector of booleans that
tells, for each of the `n`

possible nodes, whether they are in `S`

or not.
Using a `set<int>`

for `S`

would be overkill. The returned DFS ordering is a
list of nodes. For efficiency reasons, one could prefer using a vector of
nodes, as the only necessary operation is to push nodes at its end.

```
/** Returns a list of nodes with a DFS ordering of a graph. */
list<node> dfs_ordering (const Graph& G) {
int n = G.size(); // number of nodes of G
list<node> L; // list with the DFS ordering
vector<bool> S(n, false); // subset of visited nodes
for (node u = 0; u < n; ++u) {
dfs_from(G, u, S, L);
}
return L;
}
void dfs_from (const Graph& G, node u, vector<bool>& S, list<node>& L) {
if (not S[u]) {
S[u] = true;
L.push_back(u);
for (node v : G[u]) {
dfs_from(G, v, S, L);
} } }
```

Applying this code to the first sample graph above produces
`[0, 2, 3, 1, 4, 5]`

.

The running time is $\O(m+n)$, which is linear in the size of the graph.

**🤔 Exercice:** How would you modify this program to check
if an undirected graph is connected? (An undirected graph
is connected if there is a path between any pair of its nodes.)

## Depth-first search ordering (iterative)

One may change the previous `dfs_from()`

function to work in a iterative rather
than recursive way. In this case, a stack of nodes `s`

is used to keep track
of the nodes that have already been found and their order.

```
void dfs_from (const Graph& G, node u, vector<bool>& S, list<node>& L) {
if (not S[u]) {
stack<node> s;
s.push(u);
while (not s.empty()) {
node v = s.top();
s.pop();
if (not S[v]) {
S[v] = true;
L.push_back(v);
for (node w : G[v]) if (not S[w]) {
s.push(w);
} } } } }
```

Applying this code to the first sample graph above produces ```
[0, 3, 2, 1, 4,
5]
```

. You should think why this DFS ordering is different from the DFS ordering
in the recursive implementation and why both are valid.

The running time is still $\O(m+n)$.

## Breadth-first search ordering

**Breadth-first search** (**BFS**) is an algorithm for traversing or searching
a graph. It starts from each node of the graph and explores the succesor
nodes first, before moving to the next level of succesors.

The following code returns a BFS ordering of the nodes of a given graph. The
BFS implementation uses a queue of nodes. It also needs to keep track of the
subset of nodes `Q ⊆ V`

that have already **enqueued** (🧠remember: *enqueued*
for BFS and *visited* for DFS).

```
/** Returns a list of nodes with a BFS ordering of a graph. */
list<node> bfs_ordering (const Graph& G) {
int n = G.size(); // number of nodes in G
list<node> L; // list with the BFS ordering
vector<bool> Q(n, false); // subset of enqueued nodes
for (node u = 0; u < n; ++u) {
bfs_from(G, u, Q, L);
}
return L;
}
void bfs_from (const Graph& G, int u, vector<bool>& Q, list<node>& L) {
if (not Q[u]) {
queue<node> q;
q.push(u);
Q[u] = true;
while (not q.empty()) {
int v = q.front();
q.pop();
L.push_back(v);
for (node w : G[v]) if (not Q[w]) {
q.push(w);
Q[w] = true;
} } } }
```

Applying this code to the above sample graph produces `[0, 2, 3, 1, 4, 5]`

.

The running time is $\O(m+n)$.

**🤔 Exercice:** How would you modify this program to check
if an undirected graph is connected?

## Minimum distance from a source node

The following code performs a breadth-first search on an undirected graph in
order to compute the minimum distance (number of hops) from a given source
node to all other nodes in the graph. The input is the graph `G`

and the
source node `s`

(with `0 ≤ s < G.size`

), and the returned value is a vector
with the distances from `s`

for all nodes in the graph. In the case that some
node is not reachable from `s`

, the computed distance is the special value
`None`

(defined as `-1`

through a constant).

The implementation is based on BFS. It starts from `s`

and runs until all
reachable nodes have been enqueued. For each node, we keep its distance from
`s`

in `d`

, which is initialized to `None`

values, to reflect unreachableness.
We do not need to use a subset `Q`

to tell whether a node has been
enqueued or not because `d`

can be used for that same purpouse.
Its running time is $\O(m+n)$.

```
/** Returns the minimum distance of each node of a graph from a source node s. */
vector<int> minimum_distance (const Graph& G, node s) {
int n = G.size(); // number of nodes in G
vector<int> d(n, None); // for each node, its distance from s
d[s] = 0;
queue<node> q;
q.push(s);
while (not q.empty()) {
node v = q.front();
q.pop();
for (node w : G[v]) if (d[w] == None) {
d[w] = d[v] + 1;
q.push(w);
} }
return d;
}
```

Applying this code to the initial sample graph from source node `1`

produces
`[3, 0, 1, 2, 1, 2]`

.

You may think how to modify this function to get the minimum distance between a source node and a target node.

**🤔 Exercice:** How would you modify this program to compute
the minimum distance from a source node `s`

to a target node `t`

?

## Topological ordering

A **topological ordering** of an acyclic directed graph is a linear ordering
of its nodes such that for every directed edge $(u,v)$ from node $u$ to node
$v$, $u$ comes before $v$ in the ordering.

Given a directed acyclic graph, the following algorithm returns a topological ordering if its nodes. To do so, at step 1, we count the in-degree of all nodes. Then, at step 2, we place all nodes with in-degree zero in some container. In this implementation, the chosen container is a stack, but any other fast data structure would do. Finally, at step 3, the nodes are extracted from the container and placed at the end of the ordering, as they have in-degree zero. Moreover, since they can be cosidered “done”, they can discount one unit to the in-degree of their succesors and, when their in-degree becomes zero, they are placed in the container, as all their dependencies have been fulfilled.

```
/** Returns a list of nodes with a topological ordering of a directed acyclic graph. */
list<node> topological_ordering(const Graph& G) {
int n = G.size();
// Step 1: compute the in-degree of each node
vector<int> indeg(n, 0); // for each node, its in-degree
for (int u = 0; u < n; ++u) {
for (node v : G[u]) {
++indeg[v];
} }
// Step 2: place all nodes with in-degree zero in some container.
stack<int> s; // container with nodes with in-degree zero
for (node u = 0; u < n; ++u) {
if (indeg[u] == 0) {
s.push(u);
} }
// Step 3: process work from the container.
list<node> L; // list with the topological ordering
while (not s.empty()) {
node u = s.top();
s.pop();
L.push_back(u);
for (node v : G[u]) {
if (--indeg[v] == 0) {
s.push(v);
} } }
return L;
}
```

As an exemple, the previous algorithm ran on the next DAG produces
the `[1, 3, 0, 2, 5, 4]`

topological ordering.

Layout:

The running time is $\O(m+n)$.

**🤔 Exercice:** How would you modify this program to check
that the given directed graph is really acyclic?

## Strongly-connected components

A **strongly-connected component** of a directed graph is a maximal subset of
nodes of the graph such that any pair of them is connected through some
directed path. Therefore, the nodes of a graph can be partitioned into
several strongly connected components.

The following code computes the strongly-connected components of a graph. More
precisely, given an undirected graph `G`

, it returns the identifier of the
strongly connected component of each node, as a number between zero and
$C-1$, where $C$ is the total number of strongly connected components.

You can get an explanation of this algorithm in these slides.

```
/** Returns the strongly connected component identifier of each nodes of a graph. */
vector<int> strongly_connected_components (const Graph& G) {
// Reverse the graph.
Graph Gr = reverse(G);
// Get a DFS post-ordering L of the nodes and reverse it.
list<node> L; // list of vertices in post visit order
vector<bool> S(n, false); // subset of visited nodes
for (node u = 0; u < n; ++u) {
dfs1(G, u, S, L);
}
L.reverse();
// Perform DFS in G according to the order of the nodes in L.
int c = 0; // current number of strongly connected component
vector<int> scc(n, None); // for each node, number of its strongly connected component
for (node u : L) {
dfs2(G, u, scc, ++c);
}
return scc;
}
Graph reverse(const Graph& G) {
int n = G.size();
Graph Gr(n);
for (node u = 0; u < n; ++u) {
for (node v : G[u]) {
Gr[v].push_back(u);
} }
return Gr;
}
void dfs1 (const Graph& G, node u, vector<bool>& S, list<node>& L) {
if (not S[u]) {
S[u] = true;
for (node v : G[u]) {
dfs1(G, v, S, L);
}
L.push_back(u);
} }
void dfs2 (const Graph& G, node u, vector<int>& scc, int c) {
if (scc[u] == None) {
scc[u] = c;
for (node v : G[u]) {
dfs2(G, v, scc, c);
} } }
```

Applying this code to the first sample graph produces `[1, 2, 0, 0, 2, 2]`

.
These strong connected components are shown in different colors below:

Layout:

The running time is $\O(m+n)$.

# Fòrum

**Lliçons.jutge.org**

Jordi Petit, Jordi Cortadella

Universitat Politècnica de Catalunya, 2020

Prohibit copiar. Tots els drets reservats.

*No copy allowed. All rights reserved.*