Graph Theory Basics

A graph is a mathematical structure for representing pairwise relationships. Graph ML extends machine learning to graph-structured data such as social networks, molecules, knowledge bases, and citation networks.

Graph Definition

A graph $G = (V, E)$ consists of:

  • $V = {v_1, \ldots, v_n}$: a set of $n$ nodes (vertices).
  • $E \subseteq V \times V$: a set of $m$ edges.

An edge $(u, v) \in E$ connects nodes $u$ and $v$.

Node features: $X \in \mathbb{R}^{n \times d}$, where row $X_v$ is the feature vector of node $v$.

Edge features: $e_{uv} \in \mathbb{R}^{d_e}$ for each edge $(u,v)$.

Graph features: a single vector describing the whole graph.

Types of Graphs

Undirected: $(u,v) \in E \Leftrightarrow (v,u) \in E$. Symmetric relationships. Examples: social friendships, molecular bonds.

Directed: edges have direction $(u \to v)$. Examples: web links, citations, Twitter follows.

Weighted: each edge has a scalar weight $w_{uv} \in \mathbb{R}$. Examples: road distances, interaction strengths.

Bipartite: nodes split into two disjoint sets $U$ and $V$; edges only between sets. Examples: user-item interactions (recommender systems), author-paper networks.

Multigraph: multiple edges allowed between the same pair of nodes.

Hypergraph: edges (hyperedges) can connect more than two nodes.

Heterogeneous graph: multiple types of nodes and edges. Examples: knowledge graphs (entities and relation types), bibliographic networks (papers, authors, venues).

Dynamic graph: graph evolves over time; nodes and edges appear and disappear.

Degree

Degree $d_v$: number of edges incident to node $v$.

\[d_v = \sum_{u \in V} A_{vu}\]

In-degree / out-degree for directed graphs.

Degree distribution: $P(d = k)$ = fraction of nodes with degree $k$. Power-law degree distributions ($P(k) \propto k^{-\alpha}$, $\alpha \approx 2$–$3$) characterize scale-free networks (social networks, web).

Graph Properties

Path: sequence of nodes $v_0, v_1, \ldots, v_k$ where each $(v_{i-1}, v_i) \in E$.

Shortest path length $d(u,v)$: minimum number of edges between $u$ and $v$.

Diameter: maximum shortest path over all pairs. $\max_{u,v} d(u,v)$.

Clustering coefficient of node $v$:

\[C_v = \frac{\text{number of triangles through } v}{\text{number of triplets centered at } v} = \frac{|\{(u,w): (v,u),(v,w),(u,w) \in E\}|}{\binom{d_v}{2}}\]

Measures how connected a node’s neighbors are to each other.

Connected component: maximal subset of nodes where every pair is reachable via paths. A graph is connected if it has a single component.

Strongly connected component (directed graphs): maximal subset where every node can reach every other.

Common Graph Structures

Tree: connected acyclic graph. $n$ nodes, $n-1$ edges.

Star: one hub node connected to all others.

Complete graph $K_n$: every pair of nodes connected. $n(n-1)/2$ edges.

Erdos-Renyi $G(n,p)$: each possible edge included independently with probability $p$. Random graph model; lacks community structure and power-law degrees.

Barabasi-Albert (preferential attachment): new nodes attach to existing nodes proportional to their degree. Produces scale-free power-law degree distributions.

Stochastic Block Model (SBM): nodes belong to communities; intra-community edges are denser than inter-community. Standard model for community structure.

Graph Isomorphism

Two graphs $G$ and $G’$ are isomorphic if there exists a bijection $\phi: V \to V’$ that preserves edges. Deciding graph isomorphism is not known to be in P or NP-complete.

Weisfeiler-Leman (WL) graph isomorphism test: iteratively aggregate node labels from neighbors; update node labels via hashing. Two graphs are deemed non-isomorphic if their label multisets differ. GNNs are at most as powerful as the 1-WL test (fundamental limitation).