Chapter 7: SciPy Graphs

SciPy Graphs, which almost everyone means scipy.sparse.csgraph when they say that.

This is not a general-purpose graph library like NetworkX (which is for more flexible, higher-level graph stuff). Instead, scipy.sparse.csgraph is a very efficient, low-level collection of graph algorithms that work directly on sparse matrices (usually in CSR or CSC format).

It’s designed for speed and memory efficiency when your graph is:

  • Large (thousands to millions of nodes)
  • Sparse (most possible edges don’t exist → adjacency matrix has mostly zeros)

Think road networks, social connections (sparse!), pixel neighborhoods in image segmentation, finite element meshes, recommender systems (users × items), etc.

Current status (February 2026): Still lives in scipy.sparse.csgraph Latest SciPy version around 1.17.0 (released early 2026) No major breaking changes in recent years — very stable

Core Idea — Graphs as Sparse Matrices

In csgraph, a graph with N nodes is represented as an N × N sparse adjacency matrix:

  • G[i,j] = weight/cost/distance from node i to node j
  • If no edge → G[i,j] = 0 (or very large number for some algos)
  • Undirected graph → symmetric matrix
  • Directed graph → asymmetric

The module only accepts sparse formats (csr_array, csc_array, coo_array, etc.) or sometimes dense (but converts internally).

Most Important Functions — What You’ll Actually Use

Here’s the practical shortlist (from docs & real usage in 2026):

What you want to do Main function(s) Input graph directed? Returns what? Typical use case
Check connected components connected_components Yes/No n_components, labels Find clusters/islands in data
Shortest paths (all pairs) floyd_warshall, johnson Yes Distance matrix (dense!) All-pairs distances (small–medium graphs)
Shortest paths from single source dijkstra, bellman_ford Yes distances, predecessors Navigation, routing (Dijkstra most common)
Minimum spanning tree minimum_spanning_tree No MST as csr sparse matrix Network design, clustering
Depth-first / breadth-first traversal tree depth_first_tree, breadth_first_tree Yes Tree as sparse matrix Traversal order, spanning tree
Shortest path tree shortest_path (convenience wrapper) Yes Method choice + distances/predecessors One function for most shortest-path needs
Laplacian matrix laplacian No Sparse Laplacian Spectral clustering, graph cuts

Let’s Do Real Examples — Hands-On (Jupyter style)

Always start like this:

Python

Example 1 — Connected Components (very common first step)

Imagine we have 6 cities, some connected by roads.

Python

→ Nodes 0,1,2,3 in one group; 4,5 in another.

Example 2 — Shortest Path from One City to All Others (Dijkstra)

Python

For all-pairs (small graph):

Python

Example 3 — Minimum Spanning Tree (MST)

Useful for wiring networks with minimal cable, clustering, etc.

Python

→ Returns a tree (acyclic connected graph) with minimal total weight.

Example 4 — Graph Laplacian (spectral things)

Python

→ Used in spectral clustering, graph partitioning, diffusion, etc.

When to Use csgraph vs NetworkX?

Criterion Use scipy.sparse.csgraph Use NetworkX
Graph size Thousands–millions of nodes Hundreds–thousands (slower on big)
Memory efficiency Excellent (sparse only) Higher overhead
Speed of core algos Very fast (compiled Cython/Fortran) Slower (pure Python mostly)
You already have sparse matrix Yes → perfect fit Need to convert
Need rich graph features No (just algos) Yes (attributes, drawing, centrality…)
Visualization No built-in Yes (draw, spring_layout…)

Most scientific/ML people use csgraph for heavy lifting (shortest paths, connectivity on huge sparse data) and NetworkX for exploration/visualization/smaller graphs.

Teacher’s Final Tips (2026 edition)

  • Always use CSR format for most csgraph functions (fastest)
  • For directed graphs → set directed=True
  • If graph has no weights → use 1 for edges or just connectivity
  • Missing edge = 0 (but some algos like Dijkstra treat 0 as no edge — watch out!)
  • For huge graphs → prefer dijkstra with single source over Floyd-Warshall
  • Docs are excellent: https://docs.scipy.org/doc/scipy/reference/sparse.csgraph.html Tutorial bits: https://docs.scipy.org/doc/scipy/tutorial/csgraph.html (word ladder example is classic)

What kind of graph problem are you thinking about?

  • Road/city shortest paths?
  • Social network components?
  • Building MST for clustering?
  • Image segmentation (pixel graph)?
  • Something from your data?

Tell me and we’ll build a more specific, realistic example together (maybe even with fake data that looks like yours). 🚀

You may also like...

Leave a Reply

Your email address will not be published. Required fields are marked *