Longest Common Subsequence: The Classic DP Problem, Explained
- Longest common subsequence (LCS) preserves relative order but allows skipped characters; a substring must be contiguous
- The recurrence has two cases: matching characters extend the diagonal (dp[i-1][j-1]+1); mismatches take max of the cell above and to the left
- Time complexity is O(m×n) to fill the table; space is O(m×n) by default, reducible to O(n) with a two-row rolling array
- Reconstruction traces backward from dp[m][n]: record matched characters, follow the larger neighbor on mismatches, then reverse
- LCS unlocks a family of problems: edit distance, shortest common supersequence, and minimum deletions all derive from the same table with minor recurrence tweaks
- The most common interview mistake is using the wrong cell for the match case (diagonal, not above/left) or mixing 1-based dp indices with 0-based string indices
Two strings. A 2D table. One of the most satisfying moments in DSA when it finally clicks. LCS is the 2D DP problem worth understanding from scratch, because the recurrence, once derived, unlocks a whole family of follow-up problems. Edit distance, diff tools, DNA alignment. The reason git diff shows red and green lines. They all start here.
Subsequence Is Not Substring
This distinction exists specifically to trip you up in interviews. Memorize it now.
A substring must be contiguous. "ABC" is a substring of "ABCDE" because those characters sit next to each other.
A subsequence only has to preserve relative order. You can skip characters as long as what remains appears left-to-right. "ACE" is a subsequence of "ABCDE": A at index 0, C at index 2, E at index 4. You skipped B and D. That is fine.
"AEC" is not a subsequence of "ABCDE" because E appears after C in the original string. Order is everything. Adjacency is not.
The longest common subsequence of two strings is the longest sequence that qualifies as a subsequence of both at once.
A Concrete Example
Take s1 = "ABCB" and s2 = "BCB". What do they share?
- "B" is common. Length 1.
- "CB" is common: in s1 at indices 2,3; in s2 at indices 1,2. Length 2.
- "BCB" is common: in s1 at indices 1,2,3; in s2 at indices 0,1,2. Length 3.
No common subsequence of length 4 exists since s2 only has 3 characters. The LCS is "BCB" with length 3.
You could brute-force this. Generate all 2^4 = 16 subsequences of s1, all 2^3 = 8 subsequences of s2, find the longest match. For strings of length n that is O(2^n) work. At n=50 that is 1,125,899,906,842,624 comparisons. Your interviewer will not wait.
Why This Problem Has DP Written All Over It
Two conditions signal that a problem wants dynamic programming. LCS checks both boxes.
Overlapping subproblems. The LCS of "ABCB" and "BCB" depends on the LCS of "ABC" and "BC", which depends on "AB" and "B", and so on. A naive recursive solution recomputes the same subproblems dozens of times. A table computes each one once and looks it up.
Optimal substructure. The LCS of two full strings can always be derived from smaller prefixes. If you see two matching characters at the end, the optimal solution extends the LCS of everything before them. If the last characters do not match, the optimal solution drops the last character from one string or the other and takes whichever gives the longer LCS.
This gives the recurrence. Let dp[i][j] be the length of the LCS of the first i characters of s1 and the first j characters of s2.
if s1[i-1] == s2[j-1]:
dp[i][j] = dp[i-1][j-1] + 1
else:
dp[i][j] = max(dp[i-1][j], dp[i][j-1])
Every cell depends on the cell diagonally above-left (a match extension) or the cells directly above and to the left (the two mismatch options). Base cases are zero: an empty string has no common subsequence with anything.
Walking the Table
Fill in the full table for s1 = "ABCB", s2 = "BCB". Rows are characters of s1, columns are characters of s2.
| B | C | B | ||
|---|---|---|---|---|
| 0 | 0 | 0 | 0 | |
| A | 0 | 0 | 0 | 0 |
| B | 0 | 1 | 1 | 1 |
| C | 0 | 1 | 2 | 2 |
| B | 0 | 1 | 2 | 3 |
Row by row:
- A row. 'A' does not appear in "BCB" anywhere, so every cell stays 0. A boring row. It sets the mood.
- B row, column B. s1[1]='B' matches s2[0]='B'. Take the diagonal (dp[1][0] = 0) plus 1. Cell = 1.
- B row, column C. 'B' vs 'C', mismatch. Take max(above=0, left=1) = 1.
- B row, column B. 'B' vs 'B', match. Diagonal is dp[1][1] = 0, so cell = 1. The earlier match does not automatically carry forward here.
- C row, column C. 'C' vs 'C', match. Diagonal is dp[2][1] = 1, so cell = 2.
- B row, column B (last cell). 'B' vs 'B', match. Diagonal is dp[3][2] = 2, so cell = 3.
The answer is dp[4][3] = 3, the bottom-right cell.
Reading the Sequence Back
The table gives you the length. To reconstruct the actual characters, trace backward from dp[m][n]:
- If s1[i-1] == s2[j-1], this character is in the LCS. Record it, move to dp[i-1][j-1].
- Otherwise, move to whichever neighbor is larger: dp[i-1][j] or dp[i][j-1]. Break ties either way.
- Stop when you reach row 0 or column 0.
Tracing back through the table:
- dp[4][3] = 3, s1[3]='B' == s2[2]='B'. Record 'B'. Move to dp[3][2].
- dp[3][2] = 2, s1[2]='C' == s2[1]='C'. Record 'C'. Move to dp[2][1].
- dp[2][1] = 1, s1[1]='B' == s2[0]='B'. Record 'B'. Move to dp[1][0].
- dp[1][0] = 0. Stop.
Reverse the recorded characters: "BCB". Correct.
Time and Space Complexity
Filling the table visits every cell in the (m+1) x (n+1) grid exactly once, doing O(1) work per cell. Time complexity is O(m x n). Space is also O(m x n) for the table.
Can You Cut the Space Down?
You do not need the entire table to compute the length. Each row only looks at the row directly above it and the cell to its left. Two arrays are enough.
def lcs_length(s1: str, s2: str) -> int: m, n = len(s1), len(s2) prev = [0] * (n + 1) curr = [0] * (n + 1) for i in range(1, m + 1): for j in range(1, n + 1): if s1[i - 1] == s2[j - 1]: curr[j] = prev[j - 1] + 1 else: curr[j] = max(prev[j], curr[j - 1]) prev, curr = curr, [0] * (n + 1) return prev[n]
Space drops to O(n). If n > m, swap the strings first to keep the rolling array as small as possible.
Reconstruction is not directly possible with this optimization. If you need the actual sequence and cannot afford O(mn) space, Hirschberg's algorithm does it in O(mn) time with only O(m + n) space. In practice, interview problems ask for the length.
Where LCS Fits in a Larger Pattern
LCS is not a standalone curiosity. Once you have the recurrence, a whole family of problems opens up with small modifications to the table logic.
Edit distance. The minimum number of insertions and deletions to transform s1 into s2 is m + n - 2 x LCS(s1, s2). Add substitutions and you get Levenshtein distance (LeetCode 72), which needs a slightly adjusted recurrence but the same table structure.
Shortest Common Supersequence. The shortest string containing both s1 and s2 as subsequences has length m + n - LCS(s1, s2) (LeetCode 1092). Every shared character appears once instead of twice.
Minimum deletions to make strings equal. Delete characters from each string until they match. That is just m + n - 2 x LCS.
Solve LCS cleanly and this whole family is one or two recurrence adjustments away.
Interviewers use LCS to test two things specifically: whether you can identify 2D DP state, and whether you can derive the recurrence without help. The most common mistake is getting the diagonal wrong for the match case. The off-by-one when translating from dp table indices (1-based lengths) to string character indices (0-based) is a close second. Everyone writes that bug at least once. Now you know to check it first.
Before your next mock, try explaining the recurrence out loud before touching the keyboard. That is the part that separates a candidate who understands the pattern from one who memorized the solution. SpaceComplexity runs voice-based mock interviews with rubric-graded feedback on exactly this kind of explanation, so you can practice articulating the derivation under realistic conditions.
For more on why DP applies and how to recognize these patterns in new problems, the dynamic programming framework guide covers the general method. For the top-down memoized approach (which some find easier to derive first), memoization covers the caching mechanics. The bottom-up DP post goes deeper on the tabulation approach used here, including the space-reduction tricks.
One Structure, Every Language
All implementations below solve LeetCode 1143: given two strings, return the length of their longest common subsequence.
Python 3
def longest_common_subsequence(text1: str, text2: str) -> int: m, n = len(text1), len(text2) dp = [[0] * (n + 1) for _ in range(m + 1)] for i in range(1, m + 1): for j in range(1, n + 1): if text1[i - 1] == text2[j - 1]: dp[i][j] = dp[i - 1][j - 1] + 1 else: dp[i][j] = max(dp[i - 1][j], dp[i][j - 1]) return dp[m][n]
Python 2
def longest_common_subsequence(text1, text2): m, n = len(text1), len(text2) dp = [[0] * (n + 1) for _ in range(m + 1)] for i in range(1, m + 1): for j in range(1, n + 1): if text1[i - 1] == text2[j - 1]: dp[i][j] = dp[i - 1][j - 1] + 1 else: dp[i][j] = max(dp[i - 1][j], dp[i][j - 1]) return dp[m][n]
JavaScript
function longestCommonSubsequence(text1, text2) { const m = text1.length, n = text2.length; const dp = Array.from({ length: m + 1 }, () => new Array(n + 1).fill(0)); for (let i = 1; i <= m; i++) { for (let j = 1; j <= n; j++) { if (text1[i - 1] === text2[j - 1]) { dp[i][j] = dp[i - 1][j - 1] + 1; } else { dp[i][j] = Math.max(dp[i - 1][j], dp[i][j - 1]); } } } return dp[m][n]; }
TypeScript
function longestCommonSubsequence(text1: string, text2: string): number { const m = text1.length, n = text2.length; const dp: number[][] = Array.from({ length: m + 1 }, () => new Array(n + 1).fill(0)); for (let i = 1; i <= m; i++) { for (let j = 1; j <= n; j++) { if (text1[i - 1] === text2[j - 1]) { dp[i][j] = dp[i - 1][j - 1] + 1; } else { dp[i][j] = Math.max(dp[i - 1][j], dp[i][j - 1]); } } } return dp[m][n]; }
Java
public int longestCommonSubsequence(String text1, String text2) { int m = text1.length(), n = text2.length(); int[][] dp = new int[m + 1][n + 1]; for (int i = 1; i <= m; i++) { for (int j = 1; j <= n; j++) { if (text1.charAt(i - 1) == text2.charAt(j - 1)) { dp[i][j] = dp[i - 1][j - 1] + 1; } else { dp[i][j] = Math.max(dp[i - 1][j], dp[i][j - 1]); } } } return dp[m][n]; }
C++
int longestCommonSubsequence(string text1, string text2) { int m = text1.size(), n = text2.size(); vector<vector<int>> dp(m + 1, vector<int>(n + 1, 0)); for (int i = 1; i <= m; i++) { for (int j = 1; j <= n; j++) { if (text1[i - 1] == text2[j - 1]) { dp[i][j] = dp[i - 1][j - 1] + 1; } else { dp[i][j] = max(dp[i - 1][j], dp[i][j - 1]); } } } return dp[m][n]; }
C
int longestCommonSubsequence(char* text1, char* text2) { int m = strlen(text1), n = strlen(text2); int* dp = calloc((m + 1) * (n + 1), sizeof(int)); for (int i = 1; i <= m; i++) { for (int j = 1; j <= n; j++) { int idx = i * (n + 1) + j; if (text1[i - 1] == text2[j - 1]) { dp[idx] = dp[(i - 1) * (n + 1) + (j - 1)] + 1; } else { int up = dp[(i - 1) * (n + 1) + j]; int left = dp[i * (n + 1) + (j - 1)]; dp[idx] = up > left ? up : left; } } } int result = dp[m * (n + 1) + n]; free(dp); return result; }
Go
func longestCommonSubsequence(text1 string, text2 string) int { m, n := len(text1), len(text2) dp := make([][]int, m+1) for i := range dp { dp[i] = make([]int, n+1) } for i := 1; i <= m; i++ { for j := 1; j <= n; j++ { if text1[i-1] == text2[j-1] { dp[i][j] = dp[i-1][j-1] + 1 } else if dp[i-1][j] > dp[i][j-1] { dp[i][j] = dp[i-1][j] } else { dp[i][j] = dp[i][j-1] } } } return dp[m][n] }
Rust
pub fn longest_common_subsequence(text1: String, text2: String) -> i32 { let s1: Vec<u8> = text1.bytes().collect(); let s2: Vec<u8> = text2.bytes().collect(); let (m, n) = (s1.len(), s2.len()); let mut dp = vec![vec![0i32; n + 1]; m + 1]; for i in 1..=m { for j in 1..=n { if s1[i - 1] == s2[j - 1] { dp[i][j] = dp[i - 1][j - 1] + 1; } else { dp[i][j] = dp[i - 1][j].max(dp[i][j - 1]); } } } dp[m][n] }
Swift
func longestCommonSubsequence(_ text1: String, _ text2: String) -> Int { let a = Array(text1), b = Array(text2) let m = a.count, n = b.count var dp = [[Int]](repeating: [Int](repeating: 0, count: n + 1), count: m + 1) for i in 1...m { for j in 1...n { if a[i - 1] == b[j - 1] { dp[i][j] = dp[i - 1][j - 1] + 1 } else { dp[i][j] = max(dp[i - 1][j], dp[i][j - 1]) } } } return dp[m][n] }
Kotlin
fun longestCommonSubsequence(text1: String, text2: String): Int { val m = text1.length val n = text2.length val dp = Array(m + 1) { IntArray(n + 1) } for (i in 1..m) { for (j in 1..n) { dp[i][j] = if (text1[i - 1] == text2[j - 1]) { dp[i - 1][j - 1] + 1 } else { maxOf(dp[i - 1][j], dp[i][j - 1]) } } } return dp[m][n] }
C#
public int LongestCommonSubsequence(string text1, string text2) { int m = text1.Length, n = text2.Length; int[,] dp = new int[m + 1, n + 1]; for (int i = 1; i <= m; i++) { for (int j = 1; j <= n; j++) { if (text1[i - 1] == text2[j - 1]) { dp[i, j] = dp[i - 1, j - 1] + 1; } else { dp[i, j] = Math.Max(dp[i - 1, j], dp[i, j - 1]); } } } return dp[m, n]; }
Ruby
def longest_common_subsequence(text1, text2) m, n = text1.length, text2.length dp = Array.new(m + 1) { Array.new(n + 1, 0) } (1..m).each do |i| (1..n).each do |j| dp[i][j] = if text1[i - 1] == text2[j - 1] dp[i - 1][j - 1] + 1 else [dp[i - 1][j], dp[i][j - 1]].max end end end dp[m][n] end
PHP
function longestCommonSubsequence(string $text1, string $text2): int { $m = strlen($text1); $n = strlen($text2); $dp = array_fill(0, $m + 1, array_fill(0, $n + 1, 0)); for ($i = 1; $i <= $m; $i++) { for ($j = 1; $j <= $n; $j++) { if ($text1[$i - 1] === $text2[$j - 1]) { $dp[$i][$j] = $dp[$i - 1][$j - 1] + 1; } else { $dp[$i][$j] = max($dp[$i - 1][$j], $dp[$i][$j - 1]); } } } return $dp[$m][$n]; }