What Is Reference Counting? How Python Decides When to Free Memory

June 6, 202610 min read
dsaalgorithmsinterview-preppython
What Is Reference Counting? How Python Decides When to Free Memory
TL;DR
  • Reference counting tracks how many pointers aim at each heap object; the runtime frees the object the instant that count hits zero.
  • Python's sys.getrefcount() always returns one more than expected because the function call itself creates a temporary reference.
  • Circular references defeat reference counting; CPython supplements it with a generational cyclic GC in the gc module.
  • Swift ARC, C++ std::shared_ptr, and Rust Rc<T>/Arc<T> are all reference counting under different names, each with a weak-pointer escape hatch for cycles.
  • In multithreaded code, every counter update requires atomic operations, which is one reason CPython's GIL exists.
  • Interviewers expect both mechanisms: reference counting as the fast common path and the cyclic GC as the fallback for cycles.

You write x = "hello". Python allocates memory on the heap and hands you a reference. You move on. You write more code. You close the file and go touch grass.

Python does not move on. Python is taking notes.

The answer to "when does that memory get freed?" is reference counting. Python keeps a running tally of how many things point to each object. The moment the tally hits zero, the memory disappears. No background sweep. No waiting. Gone.

Every Object Gets a Tally Mark

Reference counting is a memory management strategy where every heap-allocated object carries an internal counter: the number of live references currently pointing to it.

The counter goes up when a new reference is created and down when one is destroyed. At zero, the object is freed immediately. No garbage collection cycle. No latency spike. The runtime just frees it right then.

That immediacy is the whole appeal. If you want predictable memory behavior, reference counting delivers it. Objects disappear the moment nobody needs them, which is the memory management equivalent of a very tidy roommate.

You Can Watch It Happen Live

Python exposes the reference count directly through sys.getrefcount(). You can watch the counter move in real time:

import sys x = "hello" print(sys.getrefcount(x)) # 2, not 1 y = x print(sys.getrefcount(x)) # 3 y = None print(sys.getrefcount(x)) # 2 again

First thing everyone notices: why 2 instead of 1? You only have one variable. Passing the object to getrefcount itself creates a temporary reference for the duration of that call. The function argument counts. You observed the object, and the observation left a mark. It's the Heisenberg uncertainty principle for memory.

When you do y = x, Python does not copy the string. It copies the reference and increments the count. Both x and y point to the same bytes in memory. Set y = None and that reference is gone, count back down. If it had hit zero, the allocator would have freed the string immediately.

Under the hood, every Python object starts with a C struct field called ob_refcnt, a Py_ssize_t stored in the object header. The C macros Py_INCREF and Py_DECREF manipulate it. The entire mechanism lives in a few hundred lines of C in Include/object.h. It is not magic. It is just addition and subtraction.

References Come From More Places Than You Think

The counter does not only move on =. A surprising number of operations create or destroy references, and if you do not know about them, the counts will constantly surprise you:

  • Calling a function with an object as an argument (the parameter binding is a new reference)
  • Appending to a list (the list holds a reference to the element)
  • Storing in a dictionary value
  • Closing over a variable in a lambda or nested function
  • Returning from a function (the caller gains a reference as the local frame loses one)

In CPython, nearly every bytecode instruction involves at least one reference count adjustment. The interpreter is counting constantly. The cost is small per operation, but it is real and it is everywhere.

This is also why closures can keep objects alive longer than you expect. That lambda you defined three stack frames ago? If it closed over x, that is still a reference. The string is not going anywhere until the lambda does.

Cycles: The Part Where It Breaks

Reference counting has one structural weakness. Everybody knows it. It still trips people up.

Suppose object A holds a reference to object B, and object B holds a reference back to A. Both counts stay at 1. Even if every external reference to both objects is gone, A keeps B's count at 1 and B keeps A's count at 1. They are unreachable from any running code. They will never be freed. They are just sitting there in memory, holding hands, refusing to leave.

class Node: def __init__(self): self.other = None a = Node() b = Node() a.other = b # b's count: 2 b.other = a # a's count: 2 del a # a's count: 1 (b.other still holds it) del b # b's count: 1 (a.other still holds it) # Both unreachable. Neither freed. Memory leak.

CPython deals with this by running a separate cyclic garbage collector alongside reference counting. The cyclic GC, controlled through the gc module, periodically hunts down reference cycles and collects them. It uses a generational approach: objects start in generation 0 and get promoted to older generations as they survive collections. Older generations are checked less often, because most objects die young.

Reference counting is mandatory. The cyclic GC is supplemental and can be disabled (though almost nobody does). The fast common path is reference counting. The fallback for cycles is the GC. Together they handle the full problem, and the split is intentional: most objects never form cycles, so most objects get freed instantly.

C++, Swift, and Rust Are Doing the Same Thing

Python is not unusual here. Reference counting shows up across the language ecosystem under different names.

C++: std::shared_ptr

C++ has no garbage collector. std::shared_ptr is the standard library's reference-counted smart pointer:

#include <memory> auto p = std::make_shared<int>(42); // count: 1 auto q = p; // count: 2 q.reset(); // count: 1 // p goes out of scope: count: 0, memory freed

The shared pointer carries a control block with two counters: one for strong references, one for weak references. To break cycles, C++ provides std::weak_ptr. A weak pointer does not increment the strong count, so two objects can reference each other without trapping memory.

Swift: ARC

Swift uses Automatic Reference Counting at the language level. The compiler inserts retain and release calls at compile time, not at runtime, so there is no GC thread and no stop-the-world pause. Every retain and release is a concrete instruction in your binary, placed by the compiler exactly where the type system says a reference begins or ends. Swift does not trust you to do this manually. It turns out that was the right call.

Swift's weak and unowned keywords break cycles. weak produces an optional that becomes nil when the object deallocates. unowned assumes the referenced object will outlive the reference and skips the nil check.

Rust: Rc<T> and Arc<T>

Rust's default ownership model avoids reference counting entirely. Single ownership means single deallocation, no counter needed. But when you genuinely need shared ownership, Rust provides Rc<T> for single-threaded use and Arc<T> for multithreaded:

use std::rc::Rc; let a = Rc::new(5); let b = Rc::clone(&a); // count: 2 drop(b); // count: 1 // a drops at end of scope: count: 0

Arc<T> uses atomic operations on the counter to make it safe across threads. Atomic operations are slower than plain integer math, so Arc<T> has a measurable overhead over Rc<T>. Both provide Weak<T> for cycle-breaking.

The Cost Is Real

Reference counting trades one kind of problem for another.

Space. Every reference-counted object needs extra bytes for the counter. In CPython, ob_refcnt is a Py_ssize_t, so 8 bytes on 64-bit systems. For small Python objects, that is proportionally significant. In Rust's Arc<T>, the allocation carries both a strong count and a weak count: at least 16 bytes of overhead per allocation.

Time. Every reference creation or destruction requires a counter update. In a multithreaded context, that update must be atomic, which costs cache line synchronization across CPU cores.

This is one reason CPython's Global Interpreter Lock exists. Single-threaded reference counting avoids atomic operations entirely. The GIL keeps everything single-threaded. Your counts are safe. Your multithreaded Python performance is not. Everyone made their peace with this, mostly.

The comparison against tracing garbage collectors:

ApproachLatencyThroughputCycles
Reference countingLow, deterministicSlightly lower (per-op overhead)Cannot handle alone
Tracing GC (mark-and-sweep)Unpredictable pausesHigher steady stateHandles naturally

Reference counting wins on latency. Objects are freed the moment the count hits zero, so memory pressure responds immediately. Tracing GCs let garbage accumulate until the next collection cycle, causing memory spikes and pause times. That tradeoff matters in latency-sensitive systems, which is partly why Swift and Rust's smart pointers chose reference counting over GC.

Where This Shows Up in Interviews

You will not be asked to implement a reference counter from scratch. But the concept shows up in several real interview contexts.

Python memory questions. Knowing CPython is reference-counted explains why del x often frees memory immediately, why closures keep objects alive longer than expected, and why circular data structures can leak memory when the cyclic GC has not run.

C++ smart pointer questions. At companies that use C++, distinguishing shared_ptr, unique_ptr, and weak_ptr is expected. shared_ptr is reference counting. unique_ptr is exclusive ownership with no counter at all. Using shared_ptr everywhere when unique_ptr would do is a real design smell, and interviewers know it.

"How does Python's garbage collector work?" A complete answer covers both mechanisms: reference counting as the primary path, with the cyclic GC as the fallback. Knowing why reference counting alone is insufficient separates a thorough answer from a surface-level one.

LRU cache and manual lifecycle management. When you implement an LRU cache in an interview, you are manually managing object lifetimes. A node gets created, linked, unlinked, and evicted. Understanding "who holds a reference to this?" helps you reason about pointer management deliberately instead of guessing.

Mutable vs. immutable objects. Reference counting interacts with Python's mutability model. Mutable and immutable objects in Python behave differently partly because of how the runtime counts and reuses references. Connecting those dots in an interview shows real language depth.

If you want to practice explaining memory management concepts out loud under pressure, SpaceComplexity runs voice-based mock interviews with rubric-based feedback on exactly this kind of systems thinking. The gap between knowing something and explaining it clearly in an interview is larger than it sounds.

For more on how Python's memory model fits into the broader picture, see how garbage collection works and stack vs. heap memory.

Key Takeaways

  • Reference counting tracks how many references point to each heap-allocated object; the object is freed the moment that count reaches zero
  • Python's sys.getrefcount() always returns one more than you expect because the function call itself is a reference
  • The structural weakness is circular references: CPython supplements reference counting with a generational cyclic GC in the gc module to handle them
  • Swift ARC, C++ std::shared_ptr, and Rust Rc<T>/Arc<T> are all reference counting under different names, each with a cycle-breaking weak pointer variant
  • The cost is per-operation: every assignment and function call involves a counter update, and in multithreaded contexts that requires atomic operations

Further Reading