Adjacency Matrix vs Edge List: One Question Decides Which You Need

June 10, 20269 min read
dsaalgorithmsinterview-prepgraphs
Adjacency Matrix vs Edge List: One Question Decides Which You Need
TL;DR
  • Adjacency matrix gives O(1) edge lookup at O(V²) space cost; use it for Floyd-Warshall, dense graphs, and repeated existence checks
  • Edge list stores only the edges at O(E) space; use it for Kruskal's, Bellman-Ford, and any algorithm that iterates over all edges
  • The deciding question: does your algorithm check whether a specific edge exists, or does it loop over every edge in the graph?
  • Sparse graphs (E = O(V)) give the edge list a 500x space advantage; the matrix overhead is only manageable when E approaches V²
  • Adjacency list handles the middle case: neighbor traversal for BFS, DFS, and Dijkstra at O(V + E) space
  • Common interview mistake: building a matrix for Kruskal's wastes O(V²) space when O(E) was available, and signals poor complexity awareness

You know there are three ways to represent a graph. You probably reach for the adjacency list by default, which is fine, relatable, correct for most things. But the adjacency matrix vs edge list trade-off is where real complexity costs hide. Choosing wrong costs you 500x memory or turns an O(E) algorithm into O(V²). Your interviewer will notice. They will not say anything. They will just write it down.

One question decides it before you write a line: does your algorithm need to check if a specific edge exists, or does it need to iterate over every edge? The answer points you to the right structure before you touch code.

What Each Representation Actually Stores

An adjacency matrix is a V-by-V 2D array. matrix[i][j] is 1 (or the edge weight) if an edge exists between vertex i and vertex j, and 0 otherwise. You are paying for every possible edge, including the ones that don't exist.

An edge list is exactly what it sounds like: an array of pairs. Each entry is (u, v) for an unweighted graph, or (u, v, weight) for a weighted one. That's the whole structure. No preprocessing, no nesting, no pretending edges exist when they don't.

# Adjacency matrix (4 vertices, 0-indexed) matrix = [ [0, 1, 0, 1], [1, 0, 1, 0], [0, 1, 0, 1], [1, 0, 1, 0], ] # Edge list (same graph) edges = [(0,1), (0,3), (1,2), (2,3)]

A 1000-vertex graph with 2000 edges takes 1,000,000 cells in a matrix and 2,000 entries in an edge list. That's a 500x difference in memory for the exact same graph. One of these is a data structure. The other is a cry for help.

Memory comparison: adjacency matrix allocates 1,000,000 cells for a sparse graph with 2,000 edges; edge list allocates 2,000 entries with zero waste

Complexity at a Glance

OperationAdjacency MatrixEdge List
SpaceO(V²)O(E)
Edge lookup: does (u, v) exist?O(1)O(E)
All neighbors of a vertexO(V)O(E)
Iterate over all edgesO(V²)O(E)
Add an edgeO(1)O(1) amortized
Add a vertexO(V²) to resizeO(1)

The matrix wins on exactly one operation: edge lookup. Everything else either ties or goes to the edge list. It's the specialist that's incredible at one thing and kind of a disaster at everything else.

Tweet: "No mom it's not a messy pile of clothes on my chair, it's an L1 cache for fast random access to my frequently used clothes in O(1) time."

The adjacency matrix, basically. O(1) lookup. Costs you the whole room.

When the Matrix Is Right

The matrix pays off when your algorithm needs to answer "does edge (u, v) exist?" repeatedly and quickly. Floyd-Warshall is the canonical example.

def floyd_warshall(dist): n = len(dist) for k in range(n): for i in range(n): for j in range(n): if dist[i][k] + dist[k][j] < dist[i][j]: dist[i][j] = dist[i][k] + dist[k][j] return dist

The matrix is structurally required here. The algorithm reads and writes dist[i][j] directly across O(V³) iterations. You build the initial matrix from your graph, then run those updates in-place. Adapting an edge list to this adds overhead for no reason, because you'd just end up rebuilding the matrix anyway.

Other situations where the matrix earns its place:

  • Dense graphs. When E approaches V², the matrix's space overhead shrinks relative to adjacency list overhead (pointers, dynamic arrays). At full density it's actually more compact.
  • Repeated edge queries. If your algorithm constantly asks whether two specific vertices are connected, O(1) lookup beats scanning an edge list every time.
  • Small graphs. When V is small (under a few hundred), the V² penalty is trivial. Reach for the simplest structure.

When the Edge List Is Right

The edge list wins when your algorithm processes all edges, possibly multiple times, and never needs neighbor lookups. This describes more algorithms than you'd think.

Kruskal's algorithm is the textbook case. You sort all edges by weight, then greedily pick the smallest edge that doesn't form a cycle. The algorithm never asks "who are the neighbors of vertex 5?" It only asks "what's the next cheapest edge?"

def kruskals(num_vertices, edges): edges.sort(key=lambda e: e[2]) # sort by weight parent = list(range(num_vertices)) def find(x): while parent[x] != x: parent[x] = parent[parent[x]] x = parent[x] return x mst = [] for u, v, weight in edges: pu, pv = find(u), find(v) if pu != pv: parent[pu] = pv mst.append((u, v, weight)) return mst

Building a matrix first wastes O(V²) space and O(V²) time scanning for edges to sort. The sort is O(E log E). Nothing here benefits from a matrix. You'd be allocating a million cells to eventually look at two thousand of them.

Bellman-Ford is the same story. It relaxes every edge V-1 times:

def bellman_ford(num_vertices, edges, source): dist = [float('inf')] * num_vertices dist[source] = 0 for _ in range(num_vertices - 1): for u, v, weight in edges: # iterate all edges if dist[u] + weight < dist[v]: dist[v] = dist[u] + weight return dist

The inner loop scans all E edges on every pass. An edge list gives you O(E) per pass. A matrix costs O(V²) per pass. On a sparse graph where E = O(V), that's a V-times slowdown for no benefit.

Dense or Sparse: That's the Deciding Factor

The inflection point is how many edges your graph has.

  • Sparse graph: E is O(V) or O(V log V). Road networks, dependency graphs, most real-world graphs. Use an edge list.
  • Dense graph: E is close to V². Complete graphs, feature similarity graphs in ML. Use a matrix.
V = 1000, sparse (E = 2000):
  Matrix:    1,000,000 cells
  Edge list: 2,000 entries
  Ratio: 500x

V = 1000, dense (E = 400,000):
  Matrix:    1,000,000 cells
  Edge list: 400,000 entries
  Ratio: 2.5x  ← matrix overhead is manageable

The edge list's advantage compounds as graphs get sparser. Most interview graphs and most real-world graphs are sparse. The matrix is the right choice less often than people reach for it.

A Worked Example: Choosing Before You Code

Say you're asked to find all edges in a graph with a weight below some threshold.

With a matrix:

def cheap_edges_matrix(matrix, threshold): n = len(matrix) result = [] for i in range(n): for j in range(i + 1, n): # O(V²) even if 99% of edges are 0 if 0 < matrix[i][j] < threshold: result.append((i, j, matrix[i][j])) return result

With an edge list:

def cheap_edges_list(edges, threshold): return [(u, v, w) for u, v, w in edges if w < threshold] # O(E)

Same result. The edge list version is O(E). The matrix version is O(V²) regardless of how sparse the graph is. On a sparse graph with 1000 vertices and 2000 edges, the matrix does 500,000 comparisons while the edge list does 2,000. You get the same answer and pay 250 times more for it.

What About the Adjacency List?

The adjacency list sits between these two and handles most interview problems well. It wins for algorithms that traverse neighbors, giving you O(degree) lookup and O(V + E) space. BFS, DFS, and Dijkstra all belong here.

The practical split across all three:

  • Fast edge lookup, matrix DP, dense graph: adjacency matrix
  • Neighbor traversal, BFS, DFS, Dijkstra: adjacency list
  • Iterate all edges, MST, Bellman-Ford, sparse graph: edge list

For a deeper look at where adjacency lists and matrices trade off, see Adjacency List vs Adjacency Matrix.

Converting Formats Costs O(V²) One Way

Sometimes an algorithm specifies one format and you need another. These conversions are worth knowing cold.

# Edge list to adjacency matrix def to_matrix(num_vertices, edges): matrix = [[0] * num_vertices for _ in range(num_vertices)] for u, v in edges: matrix[u][v] = 1 matrix[v][u] = 1 # undirected return matrix # Adjacency matrix to edge list def to_edge_list(matrix): n = len(matrix) edges = [] for i in range(n): for j in range(i + 1, n): if matrix[i][j]: edges.append((i, j, matrix[i][j])) return edges

Matrix to edge list costs O(V²), no avoiding it. You have to scan every cell to find the nonzero ones. This is part of why starting with the right representation matters: you never pay the full scan to rebuild it later.

Where Interviews Go Wrong

The most common mistake is building an adjacency matrix for a Kruskal's or Bellman-Ford problem. You get the right answer. You also signal that you didn't think about the memory model before writing code, which is exactly the kind of thing that ends up in an interviewer's notes.

Drake meme: disapproves O(nlogn) in 10 lines; approves O(n!) in 9 lines

Choosing the matrix when the edge list fits. Technically works. Definitely noticed.

Choosing the right representation is itself a signal about how you think about complexity. Interviewers aren't just checking that the output is correct. They're watching whether you reason about the structure before you code it.

The second mistake is the opposite: reaching for an edge list when you need to check edge existence repeatedly. Scanning an E-length list every time you want to know if (u, v) exists turns O(1) into O(E). That gets slow fast on dense graphs. For those problems, the matrix is the right call, and saying so out loud shows you know the trade-off.


If you want to practice recognizing which graph representation a problem needs before you touch code, SpaceComplexity runs voice-based mock interviews where you have to verbally justify your data structure choices, not just produce the right answer.

For the algorithms that use each representation in practice, see Kruskal's Algorithm, Bellman-Ford Algorithm, and Floyd-Warshall Algorithm. For a deeper dive into the adjacency list itself, see What Is an Adjacency List?.

Further Reading