Paths, Connectivity, and Trees
Algorithmically Speaking - #12: An introduction to foundational concepts in the graph theory field.
Hello there, and welcome to a new edition of Algorithmically Speaking!
As is usually the case with all fields of knowledge, it is virtually impossible to progress and push the limits if one lacks a theoretical basis supporting more significant discoveries. It is often the case that particularly challenging problems don’t need particularly challenging solutions but clever applications of basic definitions.
In today’s discussion, I want to present some of these fundamental concepts in graph theory, hoping that they will help you lay the foundations for understanding more complex topics in future editions.
In order, this is our agenda for today:
🎯 Basic Definitions — an introduction to fundamental definitions of graph theory.
🥨 Paths and Connectivity — an introduction to one of the most valuable applications of graphs, finding paths.
🌳 Trees — an introduction to a specific type of graph, trees.
This article is part of a chapter on Fundamental Definitions of graph theory in my upcoming book, The Competitive Programmer Graphs Handbook. You can read all the details here:
You can support the book by preordering your digital, DRM-free copy, with early and frequent updates, including all future editions.
🎯 Basic Definitions
A graph G is a mathematical object consisting of two sets, V and E. V is a finite set of objects called vertices. E is a set of subsets of V with cardinality two; each is called an edge.
Edges in graphs generally indicate symmetric relationships between their endpoints. In some cases, the phenomenon being modeled requires representing asymmetric relationships between the graph’s vertices, and for this, directed graphs are used.
A directed graph comprises vertices V and a set of directed edges E′. Each e′ ∈ E′ is an ordered pair (u, v), meaning that u and v functions are not interchangeable. By definition, the edge is said to leave vertex u and enter vertex v. The underlying graph of a directed graph G is obtained by removing the direction constraints from each edge. As a clarification, whenever it is not specified that a graph is directed, it will be assumed to be an undirected graph.
The degree of a vertex in an undirected graph G is the number of edges incident to that vertex. For a vertex u, the degree of u equals the number of edges e = (x, y) such that x = u or y = u. In directed graphs, the degree concept differs slightly because the edges have direction. The in-degree of a vertex u refers to the number of edges that enter u. The number of edges that leave the vertex is called the out-degree.
Proposition 1:
Let G = < V, E > be an undirected graph. The number of vertices with odd degrees is even.
For an in-depth proof of this proposition, check this post:
Often, it is necessary to analyze specific parts of a graph rather than the entire graph. Let G1 = < V1, E1 > and G2 = < V2, E2 >. We say that G1 is a subgraph of G2 if V1 ⊆ V2, E1 ⊆ E2, and for every edge (u, v) ∈ E1, it holds that u ∈ V1 and v ∈ V1. Additionally, if G1 is a subgraph of G2, G1 is said to be a spanning subgraph if it contains all the vertices of G2.
Let G = < V, E > be a graph and S ⊆ V. The induced subgraph G[S] is the graph whose vertex set is S, whose edges are those that belong to E and have both endpoints in S.
🥨 Paths and Connectivity
One of the most common operations in graphs involves traversing a sequence of vertices where an edge connects each pair of consecutive vertices. This phenomenon is commonly seen in real life when following a sequence of hyperlinks on the WEB or traveling by plane from one place to another with several layovers.
With this idea in mind, a path in an undirected graph G = < V, E > is defined as a sequence P of vertices (v1, v2, . . . , vk−1, vk) that satisfies the property that each consecutive pair (vi, vi+1) is connected by an edge in G. P is the path from v1 to vk. A path is simple if all the vertices it includes are distinct. A cycle is a path v1, v2, . . . , vk−1, vk with k > 2, where the first k − 1 vertices are distinct and v1 = vk; in other words, the sequence of vertices ends where it began.
All these concepts can be easily extended to directed graphs with the following adjustment: every pair of consecutive vertices (vi, vi+1) must correspond to a directed edge in the graph, meaning the sequence of vertices must respect the direction constraints associated with G.
An undirected graph G is said to be connected if, for every pair of vertices (u, v), there exists a path from u to v. The definition of connectivity in directed graphs is somewhat different because two vertices (u, v) can have a path starting at u and ending at v. Still, no path starting at v and ending at u. Therefore, a directed graph is strongly connected if, for every pair of vertices (u, v), a path exists from u to v and from v to u.
Sometimes, it is desired to know if there is a path between a pair of vertices in a graph and to find the shortest route. Therefore, for this post, the distance between two vertices (u, v) is defined as the minimum number of edges in a path from u to v. A special symbol such as ∞ can indicate no path between some pair of vertices.
🌳 Trees
If an undirected graph is connected and has no cycles, it is called a tree. In their simplicity, trees are the most basic type of connected graph, as removing any edge causes the graph to become disconnected.
When considering a tree's structure, it is convenient to designate a vertex r as the root. Visually, this idea is similar to taking the tree by the vertex r and letting it hang so that all other vertices are suspended below r by the force of gravity.
Formally, all edges are oriented to form paths starting at r. For every other vertex v, the vertex directly preceding v in its path from r is designated as the parent of v. Conversely, a vertex v is called the child of a vertex w if v is the parent of w. In the most general case, w is said to be a descendant of v if v is on the path from r to w; in turn, v is called an ancestor of w. Finally, a vertex x is defined as a leaf if it has no descendants.
Commonly, when referring to trees where the root is not defined, these are called free trees.
Thinking about trees with a root can often lead to answering specific questions more straightforwardly and intuitively. For example, if one wanted to count the number of edges in a tree, it is sufficient to note that every vertex, except the root, has an edge connecting it to its parent, which holds for every tree. Thus, this observation serves as a proof of the following proposition:
Proposition 2:
Every tree with n vertices has n − 1 edges.
Something much stronger than the above holds.
Proposition 3:
Let G be an undirected graph with n vertices. Any two of the following propositions imply the third:
G is connected.
G contains no cycles.
G has n − 1 edges.
This post does not present rigorous proof of the previous propositions but feel free to leave a comment below or reach out if you know how.
The next edition will discuss the connectivity problem, graph traversals, and connected components. In the meantime, here are some complimentary posts that you might find interesting as well:
Stay tuned.
And that’s it for today! If you are finding this newsletter valuable, consider doing any of these:
1) ✉️ Subscribe to the newsletter — if you aren’t already, consider becoming a free or paid subscriber.
2) 💛 Get early access to my book— thanks to all of you who have contributed to support my writing. The book I'm writing covers this and similar topics in more depth.
3) 🍻 Read with your friends — this newsletter grows because readers share. I don’t use social media too much to advertise my writing. I’m counting on you!
Have a great day! ☀️
Alberto
Oh by the way, I’d love to work on the proof of why trees are defined like this (that question was a challenge problem I posed in one of my posts and nobody replied)
I like the build up to the last propositions and the way you introduced them, I wish I had written about trees like this