Monolith vs Microservices: The System Design Interview Guide

In 2023, Amazon's Prime Video team published a blog post about their video quality monitoring service. The team had built it as a distributed serverless system on AWS. Step Functions for orchestration. S3 for passing video frames between components. Lambda functions for each analysis step. Textbook microservices, with all the right AWS-native pieces.

Then they hit a hard scaling wall at 5% of expected load. So they rewrote it as a single process. Cut costs by 90%. Response time dropped from 1.2 seconds to 89 milliseconds.

The comments section had feelings. Half the replies were "this is an embarrassment for AWS." The other half were "finally, someone said it." Both camps were correct.

What the Prime Video team actually demonstrated wasn't that microservices are bad. It was that the monolith vs microservices decision has a real cost structure, and that cost only pays off when you genuinely need what distribution provides. Most system design interviews test whether you understand exactly this distinction.

One Process or Many?

A monolith is a single deployable unit. All your business logic, data access, and interface layer share the same process and typically the same database. Inter-component calls are in-process function calls. One build artifact, one deployment, one rollback.

[Figure: Monolith architecture diagram]

A microservices architecture splits that unit into independently deployable services. Each service owns a specific business capability, runs in its own process, and owns its own data store.

[Figure: Microservices architecture diagram]

The fundamental difference isn't team structure or deployment frequency. It's whether components call each other across a network or in memory.

In-process function calls take nanoseconds. Same-region network calls take 1 to 10 milliseconds. A request that chains sequentially through five microservices adds 5 to 50ms of network overhead before any business logic runs. That is not rounding error. It shows up in your p99s well before it shows up in your sprint planning meetings.
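The arithmetic behind that overhead is simple enough to sketch. This is a back-of-the-envelope model, not a benchmark: it assumes the calls are strictly sequential and uses the 1 to 10 ms per-call range quoted above. The function name is made up for illustration.

```python
def network_overhead_ms(hops: int, per_call_ms: float) -> float:
    """Network time for a request that makes `hops` sequential calls."""
    return hops * per_call_ms


HOPS = 5  # a request that chains through five services

best = network_overhead_ms(HOPS, 1.0)    # every call at the fast end of the range
worst = network_overhead_ms(HOPS, 10.0)  # every call at the slow end of the range

print(f"{HOPS} sequential hops: {best:.0f} to {worst:.0f} ms of network time")
```

Calls that can run in parallel cost roughly the slowest call rather than the sum, which is one reason call-graph shape matters as much as service count.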

Data Consistency Is the Hardest Trade-off

In a monolith, you have ACID transactions. You can update an Order row and a Payment row atomically in a single database transaction. If the payment fails, the order rolls back. Free and automatic.
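A minimal sketch of that guarantee, using Python's built-in sqlite3 module with an in-memory database. The table layout and the place_order helper are hypothetical; the point is only that both rows commit together or not at all.

```python
import sqlite3


def place_order(conn: sqlite3.Connection, order_id: int, amount: int, payment_ok: bool) -> bool:
    """Insert an order and its payment atomically; return True if committed."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute("INSERT INTO orders (id, amount) VALUES (?, ?)", (order_id, amount))
            if not payment_ok:
                raise RuntimeError("payment declined")  # simulated failure
            conn.execute(
                "INSERT INTO payments (order_id, amount) VALUES (?, ?)", (order_id, amount)
            )
        return True
    except RuntimeError:
        return False


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, amount INTEGER)")
conn.execute("CREATE TABLE payments (order_id INTEGER PRIMARY KEY, amount INTEGER)")

place_order(conn, 1, 500, payment_ok=True)
place_order(conn, 2, 700, payment_ok=False)  # the order row is rolled back too

order_ids = [row[0] for row in conn.execute("SELECT id FROM orders ORDER BY id")]
print(order_ids)
```

Order 2 never becomes visible, because its insert was undone when the payment step failed inside the same transaction.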

In microservices, each service owns its own database. There is no cross-service transaction. If your Order Service succeeds but your Payment Service fails, you've created an order with no payment. If the sequence reverses, the customer gets charged and receives nothing. These are the kinds of bugs that generate very memorable support tickets.

The industry standard response is the Saga pattern: decompose a business transaction into local transactions per service, each paired with a compensating action if the overall flow needs to roll back.

Two approaches:

Choreography: each service publishes an event on completion; the next service reacts. Good for simple linear flows (3 to 4 steps). The risk is that global state becomes invisible. Nobody has the full picture, and debugging requires reconstructing a timeline from scattered event logs.
Orchestration: a central coordinator directs each step explicitly. Better for complex flows with branching logic. The coordinator is a single point of failure you now have to operate.
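The orchestration variant can be sketched in a few lines. This is an in-memory illustration under simplifying assumptions: the step names are hypothetical, each "service call" is a plain function, and a real coordinator would also persist its progress so it can resume after a crash.

```python
from typing import Callable, List, Tuple

# (name, action, compensating action); all names here are hypothetical.
Step = Tuple[str, Callable[[], None], Callable[[], None]]


def run_saga(steps: List[Step], log: List[str]) -> bool:
    """Run steps in order; on failure, compensate completed steps in reverse."""
    completed: List[Step] = []
    for step in steps:
        name, action, _ = step
        try:
            action()
            log.append(f"done:{name}")
            completed.append(step)
        except Exception:
            log.append(f"failed:{name}")
            for done_name, _, compensate in reversed(completed):
                compensate()
                log.append(f"compensated:{done_name}")
            return False
    return True


def noop() -> None:
    pass


def fail_payment() -> None:
    raise RuntimeError("card declined")  # simulated downstream failure


log: List[str] = []
ok = run_saga(
    [
        ("create_order", noop, noop),
        ("reserve_stock", noop, noop),
        ("charge_payment", fail_payment, noop),
    ],
    log,
)
print(ok, log)
```

Compensation runs in reverse order of completion, and the failed step itself is not compensated because its local transaction never committed.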

There's also the dual-write problem: you can't atomically write to your database and publish to a message queue without two-phase commit. The Outbox pattern solves this by writing the event as a row in an outbox table within the same local transaction, then reading from it asynchronously. It's not complicated once you know it. But you need to know it.
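A minimal sketch of the Outbox pattern, again with sqlite3 standing in for the service's database. The table names, event shape, and the publish callback are assumptions for illustration; in production the relay would be a separate worker or a change-data-capture process.

```python
import json
import sqlite3
from typing import Callable

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, status TEXT)")
conn.execute(
    "CREATE TABLE outbox (id INTEGER PRIMARY KEY AUTOINCREMENT, "
    "payload TEXT, published INTEGER DEFAULT 0)"
)


def create_order(order_id: int) -> None:
    """Write the business row and the event row in one local transaction."""
    event = json.dumps({"type": "OrderCreated", "order_id": order_id})
    with conn:  # both inserts commit together or not at all
        conn.execute("INSERT INTO orders (id, status) VALUES (?, 'created')", (order_id,))
        conn.execute("INSERT INTO outbox (payload) VALUES (?)", (event,))


def relay_outbox(publish: Callable[[dict], None]) -> int:
    """Publish unpublished events in insertion order; return how many were sent."""
    rows = conn.execute(
        "SELECT id, payload FROM outbox WHERE published = 0 ORDER BY id"
    ).fetchall()
    for row_id, payload in rows:
        publish(json.loads(payload))  # stand-in for a message-queue client
        with conn:
            conn.execute("UPDATE outbox SET published = 1 WHERE id = ?", (row_id,))
    return len(rows)


sent: list = []
create_order(42)
first_pass = relay_outbox(sent.append)
second_pass = relay_outbox(sent.append)  # nothing left to publish
print(sent, first_pass, second_pass)
```

If the relay crashes between publishing and marking the row, the event is sent again on the next pass, so this gives at-least-once delivery and consumers need to be idempotent.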

This entire category of problems doesn't exist in a monolith. In microservices, it's mandatory infrastructure. See Eventual vs Strong Consistency for the consistency model behind these choices, and Message Queue vs Pub/Sub for how services communicate asynchronously.

Independent Scaling Requires Independent Workloads

The pitch for microservices is "scale what needs scaling." This is real, but only when your services have genuinely different resource profiles.

If your search service handles 100x the traffic of your billing service, isolating them makes sense. Provision 100 search instances and 1 billing instance. In a monolith, you over-provision billing along with everything else.
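A toy capacity model makes the over-provisioning concrete. Every number below is hypothetical (per-instance throughput, traffic, and memory footprints are invented for the example), and the model ignores everything except request rate and memory.

```python
import math


def instances_needed(rps: float, rps_per_instance: float) -> int:
    """Instances required to serve `rps`, with a floor of one."""
    return max(1, math.ceil(rps / rps_per_instance))


CAPACITY_RPS = 100                           # hypothetical throughput of one instance
load_rps = {"search": 10_000, "billing": 100}  # search sees 100x billing's traffic
memory_gb = {"search": 2, "billing": 6}      # hypothetical footprint per component

# Separate services: each component is sized for its own load.
separate = {name: instances_needed(rps, CAPACITY_RPS) for name, rps in load_rps.items()}
separate_memory = sum(separate[name] * memory_gb[name] for name in load_rps)

# Monolith: every instance carries both components and is sized for total load.
monolith_instances = instances_needed(sum(load_rps.values()), CAPACITY_RPS)
monolith_memory = monolith_instances * sum(memory_gb.values())

print(separate, separate_memory, monolith_instances, monolith_memory)
```

With symmetric load the two totals converge, which is the case where the split buys nothing.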

If your services are roughly symmetric in load, you're paying the full operational cost of microservices without the scaling payoff.

The Prime Video case is the canonical illustration. Every analysis component had to run, in sequence, for every second of every stream. There was no component to scale independently. The distributed architecture added S3 round-trips, Lambda cold starts, and Step Functions state transitions for a workload that was inherently sequential. Cost went up. Latency went up. Reliability went down. That is a textbook failure of the tool-for-the-job analysis. See Horizontal vs Vertical Scaling for when each scaling strategy applies.

Team Structure Is Not Optional Infrastructure

Conway's Law: organizations produce systems that mirror their communication structures.

Amazon used this deliberately. The Bezos API Mandate (around 2002) required every team to expose its data and functionality through service interfaces and to communicate only through those interfaces. No shared databases, no cross-team function calls. Every interface had to be designed as if it could be exposed externally. That organizational constraint is what pushed Amazon toward a service-oriented architecture, the precursor of what is now called microservices. The two-pizza rule gave each service a real owner who ran it in production.

In practice, microservices require team-per-service ownership. A 5-person startup running 20 microservices will spend most of its engineering capacity managing infrastructure instead of building product. Four of those five people are effectively on call for the infrastructure that serves the infrastructure.

Stack Overflow runs 6,000 requests per second and 2 billion page views per month from 9 on-premise web servers with a 50-person engineering team. A monolith. Deploys to production in 4 minutes, multiple times a day. No service mesh. No distributed tracing requirement just to read a stack trace. They treat that as a feature, not a limitation.

Netflix runs 700+ microservices handling 15 billion API calls daily. They built an entire discipline (chaos engineering) for distributed failure and a library (Hystrix) for circuit breaking. Neither was optional. Both were prerequisites.

Observability Goes from Annoying to Mandatory

In a monolith, a stack trace is complete. The entire request lifecycle is visible in one place, in sequence, with a timestamp you trust.

In microservices, a failed request might have touched twelve services. Without distributed tracing (OpenTelemetry, Jaeger, Zipkin) and centralized log aggregation, debugging means manually correlating logs across a dozen services with different timestamps. If you are lucky, the services agree on the same time zone. They do not always agree on the same time zone. Industry surveys put the debugging overhead for microservices at roughly 35% higher than monoliths.
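The core idea behind tracing is a shared request identifier that every service attaches to its log records. The sketch below uses plain dictionaries rather than a real tracing library; the service names, field names, and timestamps are hypothetical.

```python
from typing import Dict, List

# Log records as they might arrive at a central aggregator, out of order.
LOGS: List[Dict] = [
    {"service": "payments", "trace_id": "req-7", "ts": 12.040, "msg": "card declined"},
    {"service": "gateway", "trace_id": "req-7", "ts": 12.001, "msg": "POST /checkout"},
    {"service": "gateway", "trace_id": "req-8", "ts": 12.010, "msg": "GET /search"},
    {"service": "orders", "trace_id": "req-7", "ts": 12.015, "msg": "order created"},
]


def timeline(records: List[Dict], trace_id: str) -> List[Dict]:
    """Collect one request's records across services, ordered by timestamp."""
    matching = [r for r in records if r["trace_id"] == trace_id]
    return sorted(matching, key=lambda r: r["ts"])


steps = [f'{r["service"]}: {r["msg"]}' for r in timeline(LOGS, "req-7")]
print(steps)
```

Sorting by timestamp only works when service clocks are reasonably synchronized. Tracing systems avoid depending on that by recording parent and child span relationships alongside the trace ID.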

The tooling is mature. It's also expensive to operate, and it still doesn't give you what a monolith gives you for free: a complete local stack trace.

When to Use Monolith vs Microservices

The right question isn't "monolith or microservices?" It's "do I have a concrete problem that requires what distribution provides?" This sounds obvious. It is apparently not obvious to most people.

Signal	Points toward
Team under ~50 engineers	Monolith or modular monolith
Domain not yet fully understood	Monolith first
Components have 10x+ different load profiles	Microservices
Security or compliance isolation required	Microservices
Multiple teams need independent deployment cadences	Microservices
Strong consistency required across entities	Monolith
Startup finding product-market fit	Monolith
200+ engineers, multiple independent product teams	Microservices

Martin Fowler's position: start with a monolith, extract services when you have evidence of where the scaling or team friction boundaries actually are. Wrong service boundaries are expensive to fix because you've already built separate databases, separate pipelines, and separate teams around the wrong cuts.

The modular monolith is the serious middle ground. Single deployable unit, strongly enforced internal module boundaries. Shopify's core is 2.8 million lines of Ruby split into 37 components. They enforce boundaries with a static analysis tool (Packwerk) that fails PRs violating module APIs. They processed 173 billion requests on Black Friday 2024. They extract services only when they have a proven reason: storefront rendering for scaling, credit card vaulting for PCI compliance.
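The idea of enforcing module boundaries in CI can be illustrated with a toy checker. This is not how Packwerk works internally and it is written in Python rather than Ruby; the module names, the ALLOWED map, and the list of cross-module references are all invented for the example.

```python
from typing import Dict, List, Set, Tuple

# Which modules each module may depend on (hypothetical declaration).
ALLOWED: Dict[str, Set[str]] = {
    "checkout": {"payments", "catalog"},
    "payments": set(),
    "catalog": set(),
}

# (source module, referenced module) pairs, as a static analyzer might report them.
REFERENCES: List[Tuple[str, str]] = [
    ("checkout", "payments"),
    ("catalog", "checkout"),   # not declared: catalog may not depend on checkout
    ("payments", "payments"),  # references within a module are always fine
]


def violations(
    refs: List[Tuple[str, str]], allowed: Dict[str, Set[str]]
) -> List[Tuple[str, str]]:
    """Return cross-module references that are not declared as allowed."""
    return [
        (src, dst)
        for src, dst in refs
        if src != dst and dst not in allowed.get(src, set())
    ]


bad = violations(REFERENCES, ALLOWED)
print(bad)
```

A CI job would fail the build whenever the returned list is non-empty, which is what keeps the boundaries from eroding over time.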

What Interviewers Are Actually Scoring

Most candidates default to microservices without asking a single contextual question. Interviewers notice. It's the architectural equivalent of saying "blockchain" in 2018: the buzzword arrived without the reasoning behind it, and the interviewer has heard it 40 times that week.

Architectural reasoning maturity shows up in five specific behaviors:

Context-first reasoning. Before choosing anything, ask: what's the team size, expected scale, consistency requirements, and operational maturity? The architecture follows from the constraints. Candidates who pick microservices before asking these questions are signaling that they've learned the vocabulary, not the reasoning.

Boundary quality. Weak candidates list services by noun: user-service, order-service, payment-service. Strong candidates justify why those boundaries are correct: these specific components have different scaling profiles, different deployment cadences, or different security requirements. If two services call each other on every request, the boundaries are wrong and the candidate should know why.

Consistency reasoning. Stating "database per service" without addressing distributed transactions is a red flag. Know the Saga pattern. Know why two-phase commit is avoided in practice (participants hold locks until the coordinator announces the outcome, so a coordinator failure after the prepare phase leaves them blocked). The CAP theorem is the underlying model.

Operational honesty. Microservices require CI/CD infrastructure per service, distributed tracing, service discovery, secret management, and health monitoring for every service. Candidates who treat these as automatic are telling you they haven't operated a distributed system.

The modular monolith as a legitimate answer. At senior levels, "start modular monolith, extract at proven boundaries using the Strangler Fig pattern" is the most defensible answer for most interview prompts. The Strangler Fig works by putting a routing facade in front of your monolith and extracting one slice at a time. Zero downtime. Instant rollback per slice. The monolith shrinks as services grow.
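The routing facade at the heart of the Strangler Fig pattern can be sketched as a prefix router. The handler functions, paths, and class name are hypothetical; a real facade would usually be a reverse proxy or API gateway rather than application code.

```python
from typing import Callable, Dict

Handler = Callable[[str], str]


def monolith(path: str) -> str:
    return f"monolith handled {path}"


def storefront_service(path: str) -> str:
    return f"storefront-service handled {path}"


class Facade:
    """Routes extracted path prefixes to new services, everything else to the monolith."""

    def __init__(self, fallback: Handler) -> None:
        self._fallback = fallback
        self._routes: Dict[str, Handler] = {}

    def extract(self, prefix: str, handler: Handler) -> None:
        self._routes[prefix] = handler  # send this slice to the new service

    def rollback(self, prefix: str) -> None:
        self._routes.pop(prefix, None)  # send this slice back to the monolith

    def handle(self, path: str) -> str:
        # Longest matching prefix wins; unmatched paths stay on the monolith.
        for prefix in sorted(self._routes, key=len, reverse=True):
            if path.startswith(prefix):
                return self._routes[prefix](path)
        return self._fallback(path)


facade = Facade(monolith)
before = facade.handle("/storefront/home")
facade.extract("/storefront", storefront_service)
after = facade.handle("/storefront/home")
untouched = facade.handle("/admin/orders")
facade.rollback("/storefront")
reverted = facade.handle("/storefront/home")
print(before, after, untouched, reverted, sep="\n")
```

Rollback here is a routing change only. It assumes the monolith still contains the old code path and that both sides can read the same data, which is the hard part in practice.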
