LFU Cache: One Clever Integer Keeps the Whole Thing O(1)

- LFU cache evicts the key with the lowest access count; frequency ties break by recency (LRU within each bucket).
- Strict O(1) get and put require three components: a key-to-node HashMap, a freq-to-DLL HashMap, and one
minFreqinteger. - The
minFreqinvariant only increments after agetempties its bucket and always resets to 1 on a newput, so it never scans backward. - Every node must store its own key so eviction can delete it from
keyToNodein O(1) without a reverse lookup. - Sentinel head and tail nodes in each doubly linked list eliminate null checks and edge cases throughout.
- LFU beats LRU when access patterns are stable and skewed (CDN assets, hot DB queries); use LRU when the hot set shifts quickly.
- The sticky-key problem means high-frequency historical keys resist eviction; LFUDA (dynamic aging) is the production fix.
LRU evicts the thing you forgot about. LFU evicts the thing nobody cares about. That distinction is small in theory and enormous in production. Getting an LFU cache implementation right in strict O(1) is where the real engineering lives.
A Least Frequently Used cache tracks access counts and evicts the item hit the fewest times. When the cache fills, out goes the cold key. If two keys tie on frequency, the less recently used one loses. The mental model is simple: a popularity contest where the item nobody votes for gets cut.
When do you reach for LFU instead of LRU? When your access patterns are stable and skewed. CDN assets, hot database queries, popular API routes. A few keys get millions of hits; most get almost none. LRU can evict a key that was accessed a million times yesterday just because nothing touched it in the last ten minutes. LFU doesn't make that mistake.
The hard part is doing all of this in strict O(1). Let's build it.

Frequency-based eviction: the philosophy applies anywhere space is limited and some items clearly matter more.
Why Naive LFU Cache Implementations Break Down
The obvious implementation is a min-heap keyed by frequency. Put and get both require O(log n) heap operations. Fine for small caches, too slow for anything real.
The next instinct: scan all entries on eviction and find the minimum. That's O(n). Laughably slow.
The O(1) solution comes from a 2010 paper by Ketan Shah, Anirban Mitra, and Dhruv Matani. The core insight: stop sorting, start bucketing. Group keys by frequency level and track one integer that always points at the minimum. That integer is what makes eviction O(1).
Three Components, One Invariant
Three components carry the whole structure:
keyToNode (HashMap): maps each cached key to its node. O(1) lookup, O(1) insert, O(1) delete.
freqToList (HashMap of doubly linked lists): maps each frequency level to a DLL of all nodes at that frequency, ordered most-recently-used at the head to least-recently-used at the tail.
minFreq (integer): the current minimum frequency among all cached items. This is what makes eviction O(1). You always know exactly which bucket to evict from.
Each node stores five things: its key, its value, its current frequency, and prev/next pointers for its DLL position.
The node must store its own key. This is the subtlety most people miss on a first attempt. When you evict the LFU item from the DLL tail, you pull the node out. Then you need to remove it from keyToNode. Without the key stored inside the node, that deletion requires a reverse lookup, which is not O(1).

The whole structure at a glance. minFreq is the only reason eviction doesn't require a scan.
The Layout
keyToNode:
"a" → NodeA
"b" → NodeB
"c" → NodeC
"d" → NodeD
freqToList:
freq=1: HEAD ↔ [NodeD] ↔ TAIL ← MRU to LRU
freq=2: HEAD ↔ [NodeC] ↔ [NodeB] ↔ TAIL
freq=5: HEAD ↔ [NodeA] ↔ TAIL
minFreq = 1 ← eviction target: NodeD (tail of freq=1)
Each DLL bucket is ordered by recency. Newer accesses land at the head. When two nodes share the same frequency, the tail node (least recently used) is evicted first. So within each bucket you're running a mini LRU.
What Happens on get("b")
NodeB lives in the freq=2 bucket.
Step 1: Remove NodeB from freq=2
Before: HEAD ↔ [NodeC] ↔ [NodeB] ↔ TAIL
After: HEAD ↔ [NodeC] ↔ TAIL
Step 2: freq=2 is not empty (NodeC remains), so minFreq stays at 1.
Step 3: Increment NodeB.freq to 3. Insert at head of freq=3.
freq=3: HEAD ↔ [NodeB] ↔ TAIL
No scanning. Three pointer swaps.

Three pointer swaps. minFreq doesn't move because freq=2 still has NodeC. The whole operation is O(1).
What Happens on put("e") When Cache Is Full
Step 1: Evict tail of freqToList[minFreq=1] → NodeD
Remove NodeD from keyToNode.
Step 2: freq=1 list is now empty. That's fine:
we're about to set minFreq = 1 again.
Step 3: Create NodeE with freq=1.
Add to head of freq=1 list.
Add to keyToNode.
Set minFreq = 1.
A new key always starts at freq=1, which is the global minimum by definition. So minFreq resets to 1 on every new insert, no search required.

NodeD gets voted off the island. NodeE walks in and immediately becomes the new least-popular item. minFreq = 1, no questions asked.
Why minFreq Is Always Correct
This invariant is what makes the algorithm work.
After a get: you promoted a node from frequency f to f+1. If f was minFreq and the f bucket is now empty, the new minimum must be f+1. Increment minFreq by one. It cannot be anything lower, because every other cached node has frequency >= f, and the just-promoted node is the only thing that moved.
After a put of a new key: minFreq = 1. Always. The new key is the least frequent thing in the cache by definition.
minFreq only moves up after a get and always resets to 1 after a new put. It never scans backward. The variable is always tight.
Every Operation Is Strictly O(1)
| Operation | Time | Space |
|---|---|---|
get(key) | O(1) | O(1) |
put(key, value), key exists | O(1) | O(1) |
put(key, value), new key, no eviction | O(1) | O(1) |
put(key, value), new key, eviction needed | O(1) | O(1) |
| Space (total) | O(capacity) |
These are strict O(1), not amortized. Every single operation is a constant number of HashMap lookups and doubly-linked list pointer swaps. Nothing in the critical path grows with n.
The DLL uses sentinel (dummy) head and tail nodes. Every insertion goes right after the sentinel head. Every deletion is a local pointer swap. No null checks, no edge cases for empty lists.
One Structure, Every Language
from collections import defaultdict class Node: __slots__ = ('key', 'val', 'freq', 'prev', 'next') def __init__(self, key=0, val=0): self.key = key self.val = val self.freq = 1 self.prev = self.next = None class DLL: def __init__(self): self.head = Node() self.tail = Node() self.head.next = self.tail self.tail.prev = self.head self.size = 0 def push(self, node): node.next = self.head.next node.prev = self.head self.head.next.prev = node self.head.next = node self.size += 1 def remove(self, node): node.prev.next = node.next node.next.prev = node.prev self.size -= 1 def pop_tail(self): if self.size == 0: return None node = self.tail.prev self.remove(node) return node class LFUCache: def __init__(self, capacity: int): self.capacity = capacity self.size = 0 self.min_freq = 0 self.key_to_node: dict = {} self.freq_to_list: dict = defaultdict(DLL) def _touch(self, node: Node) -> None: freq = node.freq self.freq_to_list[freq].remove(node) if self.freq_to_list[freq].size == 0 and freq == self.min_freq: self.min_freq += 1 node.freq += 1 self.freq_to_list[node.freq].push(node) def get(self, key: int) -> int: if key not in self.key_to_node: return -1 node = self.key_to_node[key] self._touch(node) return node.val def put(self, key: int, value: int) -> None: if self.capacity == 0: return if key in self.key_to_node: node = self.key_to_node[key] node.val = value self._touch(node) return if self.size == self.capacity: evicted = self.freq_to_list[self.min_freq].pop_tail() del self.key_to_node[evicted.key] self.size -= 1 node = Node(key, value) self.key_to_node[key] = node self.freq_to_list[1].push(node) self.min_freq = 1 self.size += 1
When LFU Beats LRU
LFU shines whenever a stable minority of keys absorbs most of the load. Web traffic follows a Zipf distribution: the top 1% of URLs often account for 50%+ of requests. LRU can evict those hot keys during a burst of new traffic. LFU cannot, because their counts are orders of magnitude higher than anything new.
Concrete use cases:
- CDN edge caches. Static assets like a homepage logo or a popular JS bundle get hit billions of times. They should stay pinned in the edge cache regardless of recent activity. LFU keeps them; LRU might evict them during a traffic spike of new URLs.
- Database query caches. A handful of dashboard queries run millions of times per day. LFU keeps them hot even when less frequent queries arrive in bursts.
- Autocomplete suggestion caches. The most popular search prefixes deserve permanent residence. Their frequency count dwarfs anything a new user might type.
- In-process caches for read-heavy APIs. When most reads are concentrated on a few resource IDs, LFU ensures those IDs are never accidentally evicted.
The flip side: LFU is the wrong tool when access patterns shift quickly. A key that was popular last week but irrelevant today has a high accumulated count and resists eviction. For rapidly changing hot sets, LRU or a hybrid like LFU with Dynamic Aging (LFUDA) handles it better.
Recognizing the Pattern
The clearest signal is the eviction criterion: if the problem says evict the item used least often (not least recently), you're building an LFU cache.
A secondary signal: the problem tracks counts, not timestamps. You are counting accesses, not recording when they happened.
Example 1: LeetCode 460
The problem statement is direct. You have a cache of fixed capacity. get and put must both run in O(1). Eviction goes to the key with the lowest frequency; ties break by recency.
The path to the solution:
- You need O(1) eviction. That rules out sorting and scanning.
- Frequencies change on every
getandput. You need to move a key between frequency groups in O(1). - You need to know which group has the minimum frequency in O(1).
Each of those requirements maps to one component: the DLL for O(1) group operations, the freq-to-list HashMap for O(1) group lookup, and the minFreq integer for O(1) minimum identification.
Example 2: "Design a hot-item buffer for a music streaming service"
Problem: you have an in-memory buffer of capacity k that caches song data. Songs played more often should stay in the buffer longer. Design get and cache operations.
This is LFU with domain renaming. "Played more often" = frequency. "Stay longer" = resist eviction. The same three-component structure applies directly: a key-to-node map for O(1) song lookup, a freq-to-DLL map grouping songs by play count, and a minFreq pointer to evict the least-played song.
The tell: you want to sort by a count that changes on every access, but sorting is too slow. Stop sorting. Start bucketing.
Old Hits Never Die (and That's a Bug)
One edge case worth knowing before an interview brings it up. A key that was very popular in the past accumulates a high frequency count and becomes nearly impossible to evict, even if nobody touches it anymore.
Think of it as the cache equivalent of a one-hit wonder that won't leave the charts. The song had 10 million plays in 2019 and zero plays since, but its count still beats every new release. If your cache runs for days and access patterns shift, old hot keys will crowd out new ones.
The fix in production is LFU with Dynamic Aging (LFUDA): the eviction score decays over time, so a key that was popular six hours ago doesn't block a key that's popular now. For a coding interview, the basic structure is sufficient. Mentioning LFUDA as a production consideration shows you understand the tradeoffs and have thought past the happy path.
What to Remember
- LFU evicts the key with the lowest access count; ties break by recency (LRU within each bucket).
- O(1) for every operation:
keyToNodeHashMap,freqToListHashMap of DLLs, and oneminFreqinteger. minFreqonly moves up after aget(if the old bucket empties), and always resets to 1 after aputof a new key. It never needs a backward scan.- Each node must store its own key so eviction can remove it from
keyToNodein O(1). - Sentinel head and tail nodes in each DLL eliminate null checks throughout.
- Use LFU when access patterns are stable and skewed. Use LRU when the hot set changes quickly.
- The "sticky key" problem is real in long-running production caches. LFUDA (dynamic aging) is the standard mitigation.
If you want to practice explaining this structure out loud under pressure, SpaceComplexity runs voice-based mock interviews where you walk through exactly these kinds of design decisions. The rubric will catch whether you articulate the minFreq invariant clearly, not just whether your code compiles.
If you're still building pattern recognition for when to apply dynamic programming versus greedy versus caching, the post on building a dynamic programming framework covers how to identify overlapping subproblems and choose the right tool.