Tree Edge vs Cross Edge: How DFS Classifies Every Edge

June 11, 20268 min read
dsaalgorithmsinterview-prepgraphs
Tree Edge vs Cross Edge: How DFS Classifies Every Edge
TL;DR
  • DFS edge classification assigns one of four types to every edge based on the color of the destination node during traversal
  • Back edges are the only edge type that signals a cycle in directed graphs. No back edges means the graph is a DAG.
  • Forward and cross edges only exist in directed graphs. Undirected graphs produce tree and back edges only.
  • Timestamp intervals (disc and fin) either fully nest or are completely disjoint, never partially overlapping. This is the proof behind why topological sort works.
  • Tarjan's SCC algorithm uses back edges to bind nodes into the same component. Cross and forward edges always connect different SCCs.

You've written DFS a hundred times. You know it visits nodes. What you probably didn't know: it's been quietly assigning labels to every single edge along the way. DFS edge classification divides every edge in the graph into one of four types, determined in a single pass by checking the state of the destination node. That label reveals whether a graph has a cycle, whether it can be topologically sorted, and why Tarjan's algorithm works the way it does. Most engineers who've used DFS have never seen this framework. Once you do, a surprising amount of graph algorithm theory clicks into place.

How DFS Builds Its Spanning Tree

Run DFS on any graph. Some edges take you to a node you haven't seen yet. Those edges form a tree called the DFS spanning tree. Every other edge connects nodes already in that tree, and how you classify those non-tree edges depends entirely on the relationship between the two endpoints.

The cleanest way to track that relationship is the three-color scheme from CLRS:

  • White: not yet discovered
  • Gray: discovered and currently on the DFS stack (still being processed)
  • Black: fully finished

When DFS walks an edge from u to v, the color of v tells you exactly what kind of edge you're looking at.

Four Labels From a Single Pass

def dfs(graph, u, color, disc, fin, timer): color[u] = 1 # gray timer[0] += 1 disc[u] = timer[0] for v in graph[u]: if color[v] == 0: # white: tree edge dfs(graph, v, color, disc, fin, timer) elif color[v] == 1: # gray: back edge pass else: # black: forward or cross edge pass color[u] = 2 # black timer[0] += 1 fin[u] = timer[0]

One function. One pass. Four edge types. Here's what each one means.

Tree Edge

v is white. You haven't touched it yet. DFS discovers v through this edge, so it joins the spanning tree. Every DFS traversal produces these. They're the edges DFS actually walks.

Back Edge

v is gray. Gray means v is currently on the DFS stack, which means v is an ancestor of u. You've found a path from a descendant back up to its own ancestor.

Back edges are the only edges that indicate cycles in directed graph DFS. Find one, a cycle exists. Find none, the graph is a DAG. That's the entire cycle detection algorithm for directed graphs.

PM delegates to TL delegates to Dev delegates to LLM to "implement it"

The DFS spanning tree for most software organizations. Add an edge from the LLM pointing back to the PM and you've got a back edge. Also a cycle. Also a bug.

Forward Edge

v is black, and disc[v] > disc[u]. v was discovered after u started, so it's a descendant of u in the DFS tree. But u reached v through a shortcut instead of the original tree path. Forward edges only appear in directed graphs.

Cross Edge

v is black, and disc[v] < disc[u]. v was completely finished before u even started. No ancestor-descendant relationship exists between them. Cross edges also only appear in directed graphs.

Tracing One Graph End to End

Take this directed graph with five nodes:

Edges: 0→1, 0→2, 1→3, 2→3, 3→4, 4→1

0 ──→ 1 ──→ 3 ──→ 4
│         ↑       │
└──→ 2 ───┘       │
          ←───────┘  (4→1 creates the cycle)

DFS starting at 0:

Visit 0  (gray, disc=1)
  Visit 1  (gray, disc=2)  # tree edge 0→1
    Visit 3  (gray, disc=3)  # tree edge 1→3
      Visit 4  (gray, disc=4)  # tree edge 3→4
        See 1, which is GRAY  # BACK EDGE 4→1  (cycle: 1→3→4→1)
      Finish 4  (black, fin=5)
    Finish 3  (black, fin=6)
  Finish 1  (black, fin=7)
  Visit 2  (gray, disc=8)  # tree edge 0→2
    See 3, which is BLACK, disc[3]=3 < disc[2]=8  # CROSS EDGE 2→3
  Finish 2  (black, fin=9)
Finish 0  (black, fin=10)

Edge 4→1 is a back edge because 1 is gray when 4 sees it. Edge 2→3 is a cross edge because 3 is black and its discovery time predates 2's. If the graph also had a direct edge 0→4, that would be a forward edge: 4 is black when 0 checks it, and disc[4]=4 > disc[0]=1, so 4 is already a descendant reached by the longer path.

Diagram showing the DFS spanning tree with all four edge types color-coded

Undirected Graphs Only Get Two of These

Undirected graphs can only have tree edges and back edges. Forward edges and cross edges cannot exist.

The reason is symmetry. If v is black with no ancestor-descendant relationship to u, then v finished before u was discovered. But v is a neighbor of u, so when v was being processed, it would have seen u. If u was white then, v would have discovered it, making u a descendant. If u was gray, the edge (v, u) would already have been classified as a back edge from v's side. No cross edges slip through. Same argument eliminates forward edges.

Edge TypeDirected GraphUndirected Graph
TreeYesYes
BackYesYes
ForwardYesNever
CrossYesNever

This table comes up more often in interviews than you'd expect.

Why Interviews Actually Care About This

Cycle Detection

A directed graph has a cycle if and only if DFS finds a back edge. When your code tracks gray nodes and spots a neighbor that's still gray, you've found a cycle. The check is one condition in an inner loop.

For undirected graphs it's simpler: any non-tree edge is a back edge, so any non-tree edge means a cycle. The one trap: skip the node you just came from, or every tree edge looks like a back edge pointing to the parent.

Full breakdown in Graph Cycle Detection.

Topological Sort

A DAG has no cycles, which means DFS finds no back edges. A directed graph can be topologically sorted if and only if DFS produces no back edges.

When DFS finishes a node, push it to a stack. Reverse finish order is a valid topological order. The proof: for every edge (u, v), if no back edge exists, then v finishes before u. Reversing finish times gives an order where u always comes before v.

More on this in Topological Sort Algorithm.

Strongly Connected Components

Both Tarjan's and Kosaraju's algorithms are built on the DFS tree structure. Tarjan's uses "low-link" values: the lowest discovery time reachable from each node through the DFS tree and back edges. When a node's low-link equals its own discovery time, it's the root of an SCC.

The SCC structure of a directed graph is exactly what back edges create. Cross and forward edges connect different SCCs. Back edges are what bind nodes into the same one.

Details in Tarjan's Algorithm and Strongly Connected Components.

The Proof Interviewers Want When They Ask "Why?"

This is where the timestamp intervals come in. When someone asks you to prove why topological sort works, or to explain how Tarjan's low-link computation is correct, they want you to talk about discovery and finish times.

  • disc[v] < disc[u] < fin[u] < fin[v] means v is an ancestor of u (back edge scenario)
  • disc[u] < disc[v] < fin[v] < fin[u] means v is a descendant of u (tree or forward edge)
  • disc[v] < fin[v] < disc[u] < fin[u] means v finished before u started (cross edge)

These timestamp intervals either nest or are disjoint. They never partially overlap. That's the parenthesis theorem, provable by contradiction from how DFS processes nodes, and it's exactly what interviewers are fishing for.

Tweet: 2026 software interview - do the wordle, show me how your computer is organized, 30s monkeytype, show me ur favourite YouTube video

The 2026 FAANG interview is not coming to save you. They still ask you to prove why topological sort works.

The Classification Rule

When DFS is at u and sees edge to v:

v is WHITE  →  tree edge    (u discovers v)
v is GRAY   →  back edge    (v is u's ancestor; cycle!)
v is BLACK  →  check timestamps:
    disc[v] > disc[u]  →  forward edge  (v is u's descendant)
    disc[v] < disc[u]  →  cross edge    (no relationship)

Undirected graphs: v is GRAY = back edge, v is BLACK = skip (already classified)

Memorize the color check. Everything else follows from it.

If you want to test how well you can articulate this under pressure, SpaceComplexity runs voice-based mock interviews where you walk through graph algorithm reasoning out loud, the same way a real interview expects you to. Pattern knowledge is necessary. Explaining it fluently is what gets you the offer.

Further Reading