What Is a Circular Buffer? Fixed Memory, O(1) Operations, Every Time

You have a server logging incoming requests. Traffic spikes. Your logger can't process events as fast as they arrive, so it buffers them. Fine. Except you can't let the buffer grow forever, because memory isn't infinite and your ops team will absolutely page you at 3am when you OOM the production box.

You want to keep the last N events and quietly discard older ones. A dynamic array resizes under load and that's the opposite of what you want. A linked list chases pointers on every read, which is slow in ways we'll get to. Neither option is it.

A circular buffer solves this in O(1) time and fixed O(n) space, with zero allocation after initialization. It's the data structure behind audio streams, network packet queues, Linux kernel ring buffers, and a handful of coding interview problems that look easy until the interviewer raises an eyebrow and says "what happens when it's full?"

Ring Buffer Mechanics: An Array That Wraps Around

A circular buffer is a fixed-size array with two cursors:

tail: where the next write lands
head: where the next read comes from

When either cursor reaches the end of the array, it wraps to index 0. Slot 0 is logically adjacent to slot capacity - 1. The array is a ring. You never move data. You just move the cursors.

The entire trick is one line:

index = (index + 1) % capacity

That modulo operation replaces what would otherwise be a linear element shift on every dequeue. Instead of dragging everything left by one slot, you just advance the head cursor and let the tail lap it eventually.

Concrete walkthrough. Capacity 4, starting empty:

Initial:   [ _, _, _, _ ]
            head=0, tail=0, count=0

enqueue(A): [ A, _, _, _ ]
             head=0, tail=1, count=1

enqueue(B): [ A, B, _, _ ]
             head=0, tail=2, count=2

enqueue(C): [ A, B, C, _ ]
             head=0, tail=3, count=3

dequeue():  returns A
            [ _, B, C, _ ]
             head=1, tail=3, count=2

enqueue(D): [ _, B, C, D ]
             head=1, tail=0, count=3   ← tail wrapped!

enqueue(E): [ E, B, C, D ]
             head=1, tail=1, count=4   ← full

Slot 0 was reused without moving anything. That's the payoff.

Ring buffer state diagram showing 8-slot circular structure with head and tail cursors

The Bug That Catches Everyone Once

When head == tail, what does that mean?

Empty: you dequeued everything and head chased tail to the same spot.

Full: you enqueued until tail lapped head, landing on the same spot.

Both states produce head == tail. One means you're out of data. The other means you have exactly capacity elements. Pick the wrong interpretation and your dequeue returns garbage on an empty buffer, or silently drops data on a full one. This is the kind of bug that passes all your happy-path tests and explodes on the one Tuesday your traffic spikes.

Programmer meme about everything being dead when the bug finally hits production

When you realize head==tail has two meanings and you only handled one of them.

Three standard fixes exist:

Count variable. Track how many elements are stored. is_empty() checks count == 0, is_full() checks count == capacity. One extra integer. No ambiguity. This is what you want.

Leave one slot empty. Define full as (tail + 1) % capacity == head. That condition can never equal the empty condition (head == tail), because the sentinel slot separates them. You waste one slot permanently but need no extra bookkeeping.

Boolean flag. Track whether the last operation was a read or write. If head == tail after a write, you're full. After a read, empty. Works, but adds branching everywhere.

The count approach wins for readability. Use it unless you have a specific reason not to.

Build a Circular Buffer in Python

class CircularBuffer:
    def __init__(self, capacity: int):
        self.data = [None] * capacity
        self.capacity = capacity
        self.head = 0
        self.tail = 0
        self.count = 0

    def enqueue(self, value) -> bool:
        if self.count == self.capacity:
            return False
        self.data[self.tail] = value
        self.tail = (self.tail + 1) % self.capacity
        self.count += 1
        return True

    def dequeue(self):
        if self.count == 0:
            return None
        value = self.data[self.head]
        self.head = (self.head + 1) % self.capacity
        self.count -= 1
        return value

    def peek(self):
        if self.count == 0:
            return None
        return self.data[self.head]

    def is_full(self) -> bool:
        return self.count == self.capacity

    def is_empty(self) -> bool:
        return self.count == 0

Every method is a handful of arithmetic operations and an array access. Nothing allocates. Nothing shifts. The whole thing fits in a screenful and you can hold it in your head.

What Happens When It's Full

Some use cases don't want to reject writes. Kernel logs, audio capture, network packet captures, your ops team's request logger from the first paragraph. They all work the same way: when the buffer is full, drop the oldest entry and accept the new one. Newer data matters more than older data.

def enqueue_overwrite(self, value) -> None:
    if self.is_full():
        self.head = (self.head + 1) % self.capacity
        self.count -= 1
    self.data[self.tail] = value
    self.tail = (self.tail + 1) % self.capacity
    self.count += 1

Advancing head before writing evicts the oldest element without touching any other slot. The buffer never rejects a write. The interviewer almost always asks "what happens when the buffer is full?" and the answer has two modes: blocking (return false) and overwriting (evict oldest). Knowing both without hesitation separates a complete answer from a partial one.

O(1) Everywhere, and One Non-Obvious Win

Operation	Time
enqueue	O(1)
dequeue	O(1)
peek	O(1)
is_empty / is_full	O(1)

Space is O(n) where n is the capacity you chose at construction. Fixed. No resizing, no node allocation per element, no garbage collector pressure from objects that live for exactly one event cycle.

The less-obvious win is cache performance. A circular buffer lives in a single contiguous array. Reads and writes hit sequential or near-sequential memory addresses. A linked list queue has the same O(1) Big-O, but each node is a heap allocation at an arbitrary address. Once the list grows large enough that nodes stop sharing cache lines, traversal degrades significantly. The circular buffer avoids this entirely. For a deeper look at why contiguous memory matters, see spatial vs temporal locality.

Where This Shows Up in Interviews

Programmer meme about waiting to study circular buffers until the interview starts

"Any day now I'll review what happens when tail wraps past head."

Moving Average from Data Stream (LeetCode 346)

The textbook circular buffer problem. Given a stream of integers, compute the average of the last k values after each insertion.

class MovingAverage:
    def __init__(self, size: int):
        self.buffer = [0] * size
        self.size = size
        self.count = 0
        self.tail = 0
        self.window_sum = 0

    def next(self, val: int) -> float:
        if self.count == self.size:
            self.window_sum -= self.buffer[self.tail]
        else:
            self.count += 1
        self.buffer[self.tail] = val
        self.window_sum += val
        self.tail = (self.tail + 1) % self.size
        return self.window_sum / self.count

No separate head pointer needed here because we always overwrite the oldest element. The tail position is both where we write and, when full, what we're about to evict. Maintain a running window_sum and subtract before overwriting. Each next() call is O(1).

Design a Queue With a Fixed Array

A common follow-up to "implement a queue." The naive approach uses a list and pop(0), which is O(n) because it shifts every element left. The right answer is a circular buffer: fixed-capacity, O(1) for both ends, no shifting ever.

LeetCode 622 (Design Circular Queue) is this exact problem.

Sliding Window Problems

The sliding window technique maintains a window of elements over an array or stream. A fixed-size circular buffer is the natural backing structure when you need both the oldest and newest element accessible in O(1). You often don't implement it explicitly in interview code, but knowing the model helps you reason clearly about what operations need to be constant-time and why.

The Monotonic Deque

The monotonic deque pattern (used in sliding window maximum) adds an ordering invariant on top of ring mechanics. Elements are stored in decreasing order so the front always holds the window maximum. You remove from both ends: stale elements that left the window fall off the front, elements that can never be the maximum fall off the back before a new element is added. The ring wrapping is identical to what's above. The invariant is what changes.

Four Bugs That Show Up in Interviews

Forgetting the modulo. Writing self.tail += 1 instead of self.tail = (self.tail + 1) % self.capacity. This looks fine on small inputs and throws an index error the moment a cursor reaches the last slot.

Off-by-one in the full condition. If you use the "waste one slot" approach, the full condition is (self.tail + 1) % self.capacity == self.head, not self.tail == self.head. The extra + 1 is the sentinel slot. Forget it and you'll incorrectly claim the buffer is full one element early.

Mixing up head and tail conventions. Some implementations define head as the last-written slot rather than the next-to-read slot. Neither convention is wrong. But you have to be consistent within a single implementation. Using one definition in enqueue and the other in dequeue is a fun way to spend twenty minutes in a debugging session.

Not initializing count. Initialize it to zero. Obvious, but easy to miss under interview pressure. The resulting bug is subtle: a full buffer that claims it has space, or an empty buffer returning stale data from a previous run.

The Interview Checklist

A circular buffer is a fixed-size array using (index + 1) % capacity to wrap cursors instead of shifting elements
head tracks the next read, tail tracks the next write; they advance independently
The full-vs-empty ambiguity requires a count variable, a wasted slot, or a boolean flag
All operations run in O(1) time with no allocation; contiguous memory gives better cache behavior than a linked list
Blocking mode rejects writes when full; overwriting mode evicts the oldest element instead
It shows up directly in LeetCode 346 and 622, and implicitly in any fixed-window streaming problem

The data structure itself is simple. The real test is handling the edge cases and the "what if it's full?" follow-up without hesitation. That only comes from practicing under actual interview conditions, not reading implementations. SpaceComplexity runs voice-based DSA mock interviews that push exactly these follow-ups in real time, so you're not encountering them for the first time when it counts.