The Phone Screen Is Not a Smaller Onsite

- Phone screens filter dealbreakers, not talent: one medium problem, no silence, no fundamental gaps, and the margin for error is wider than most candidates expect.
- The onsite measures consistent depth across four to six equal-weight rounds, not peak brilliance in one.
- Follow-up constraint shifts decide the onsite, not the initial solution; reasoning out loud through the modification is the primary product.
- Amazon bar raisers hold veto power and ask one question: would hiring this person raise the team's average, not just meet it?
- Behavioral answers need data: "cut p99 latency from 400ms to 120ms" is evidence; "improved the system" is a story.
- Each stage answers a different question: are there dealbreakers, is the depth consistent, does this hire raise the bar? Know which one you are answering before you walk in.
Most candidates prep for a phone screen by doing the same thing they would do for the onsite, just for fewer days. Nobody warns you this is wrong. You find out when it hurts.
The phone screen, the onsite loop, and the bar raiser round are three separate instruments measuring three different things. Treating them as one continuous ladder of difficulty is why engineers who sail through the phone screen get ambushed in round four. They trained for the wrong exam.
The Phone Screen Has One Job: Filter the Obvious No's
The phone screen exists to eliminate people who would waste the team's time. Sounds harsh, but it's just economics. Flying a candidate across the country or blocking five engineers for a full day costs real money and real scheduling pain. The screen's question is not "are you good enough?" but "is there a clear reason to stop here?"
Most phone screens give you one medium problem, sometimes two easier ones. The interviewer is scanning for dealbreakers: Can you write code that runs? Do you understand basic complexity? Do you communicate while you work, or does the line go dead for fifteen minutes? (It goes dead more often than people admit.) Roughly 25 to 35 percent of candidates fail simple scripting problems at this stage. Not hard problems. Simple ones. The phone screen catches that before anyone buys a plane ticket.
The margin for error is wider than you'd expect. You can miss an edge case. You can take a hint. You can write slightly verbose code. What you cannot do is reveal a fundamental gap in technical basics or go completely silent. Either of those is an immediate stop. Everything else is recoverable.
Interviewing.io data makes the gap concrete. Top-quartile phone screeners produced candidates who were twice as likely to receive an offer after the onsite, 50 percent versus 25 percent, despite both groups having similar phone screen pass rates. The best screeners weren't finding people who were better at phone screens. They were detecting a genuinely different signal: the kind of reasoning that holds up under deeper pressure. Phone screens and onsites are testing different things.
The Onsite Loop Is a Multi-Dimensional Endurance Test
Once you pass the screen, the evaluation framework changes entirely. The onsite isn't a harder phone screen. It's a multi-dimensional assessment where consistency across rounds matters as much as any single peak.
A typical loop runs four to six rounds: two or three coding, a system design round at mid-level and above, one behavioral. At Google and Meta, each interviewer submits an independent write-up with equal weight at the hiring committee. There's no warm-up round. No low-stakes slot to coast through. Round one is as live as round six. The interviewer in the first slot isn't there to ease you in. They're there to evaluate you.
One great round does not carry a bad one. A candidate who scores brilliantly in round two and visibly fades in round five is signaling something real about reliability. Consistent and strong enough beats brilliant and volatile. Onsite prep should involve practicing sustained performance across multiple problems back-to-back, not just solving individual hard problems in isolation. The cognitive fitness required for a five-hour loop is not the same as fitness for a single 45-minute problem. You can be sharp for 45 minutes and then fall apart at hour three. The onsite exposes this.
The target also shifts. In the phone screen, a correct working solution is roughly sufficient. In the onsite, it is not. One Google interviewer put it directly: "I've passed more people that arrive at the optimal solution without coding it than I do people who arrive at the optimal solution and code it." The thought process is what's being evaluated. You're not shipping code. You're producing evidence of how you reason.
Spending five minutes understanding and narrating the problem before writing anything isn't wasted time. It's the primary product of the round.
Follow-Up Questions Are Where the Onsite Is Actually Decided
Phone screens rarely have follow-up variants. Onsites are built around them.
The pattern works like this. You solve the initial problem. The interviewer removes an assumption: "That assumed the data fits in memory. What if it doesn't?" Or they scale it: "Now imagine this runs across a billion records." Or they add a business requirement: "What if we need reads to be real-time but writes can lag?"
Each layer is a trap, and also an opportunity. Candidates who memorized solutions collapse here because they can't explain what they coded. Candidates who understand the mechanism can adapt. The difference shows immediately.
The follow-up is where the round is decided. Your first solution proves you can code. The follow-up proves you understand what you coded.
Google deliberately designs onsite problems to resemble familiar patterns while testing different concepts. When a constraint changes, the right response isn't to answer faster. Slow down, say "okay, what that changes is...", and reason out loud toward the modification. An interviewer watching you discover that your current approach breaks under the new constraint is seeing exactly what they need to see.
Candidates who immediately guess an answer without acknowledging what changed are signaling that they don't fully own the solution. Confident wrong guesses are worse than careful right thinking. The interviewer already knows the answer. They want to watch you find it.
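To make the memory follow-up concrete, here is a minimal sketch of what "okay, what that changes is..." can look like in code. The problem itself (return the k largest values) is an assumption picked for illustration, not one from any specific interview, and `read_numbers` is a hypothetical stand-in for a data source too large to load at once. The first function is the kind of answer that passes the initial problem. The second is what you reason toward once the "fits in memory" assumption is taken away.

```python
import heapq
from typing import Iterable, Iterator, List


def top_k_in_memory(values: List[int], k: int) -> List[int]:
    """First answer: assumes the whole input fits in memory."""
    if k <= 0:
        return []
    # Sorting everything costs O(n log n) time and O(n) extra space.
    return sorted(values, reverse=True)[:k]


def top_k_streaming(values: Iterable[int], k: int) -> List[int]:
    """Follow-up answer: one pass, holding at most k values at a time."""
    if k <= 0:
        return []
    heap: List[int] = []  # min-heap of the k largest values seen so far
    for value in values:
        if len(heap) < k:
            heapq.heappush(heap, value)
        elif value > heap[0]:
            # The new value beats the smallest of the current top k.
            heapq.heapreplace(heap, value)
    return sorted(heap, reverse=True)


def read_numbers(n: int) -> Iterator[int]:
    """Hypothetical source that yields values one at a time."""
    for i in range(n):
        yield (i * 7919) % 10007


if __name__ == "__main__":
    print(top_k_streaming(read_numbers(1_000_000), 5))
```

What to say out loud matters more than the code: the sort needs every value at once, so it breaks under the new constraint, while the heap only ever needs k values, so memory drops from O(n) to O(k) and time becomes O(n log k).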
The Bar Raiser Round Is a Different Game
The bar raiser is an Amazon practice. Every onsite loop includes one interviewer who is not the hiring manager and not from the target team, and who has veto power. If they vote no, the hire does not happen, regardless of what every other interviewer said. You will not be told which round this is. You will not be warned. You find out when the offer does or doesn't come.
In practice, bar raisers rarely announce an explicit veto. Their technique is Socratic. After the interviews, all interviewers debrief in real time. The bar raiser presents a summary of the risks they see and uses directed questions to move the room toward a clear decision. They don't override the group. They engineer consensus. The veto is the background fact that gives them standing to do it.
The bar raiser's mandate is specific: is this candidate better than 50 percent of current Amazonians doing the same role? Not "would they do okay?" but "would hiring them raise the team's average?" You don't need to raise the bar in every competency. You need to raise it in at least one dimension, and not lower it in any other. One genuinely strong signal plus no red flags clears the bar. One below-median dimension can block the hire even when everything else is strong.
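The rule in that paragraph is simple enough to write down. This is only a sketch of the logic as described here, not Amazon's actual process: the dimension names, the numeric scale, and the idea of a per-dimension "median" score are all assumptions made for illustration.

```python
from typing import Dict


def clears_bar(candidate: Dict[str, float], median: Dict[str, float]) -> bool:
    """True if the candidate raises at least one dimension and lowers none.

    `median` holds a hypothetical score for the typical person currently in
    the role; `candidate` holds the same dimensions for the candidate.
    """
    raises_one = any(candidate[d] > median[d] for d in median)
    lowers_none = all(candidate[d] >= median[d] for d in median)
    return raises_one and lowers_none


# Hypothetical dimensions on a made-up 1-5 scale.
median = {"coding": 3, "design": 3, "ownership": 3}

one_strong_signal = {"coding": 3, "design": 3, "ownership": 4}
one_weak_dimension = {"coding": 5, "design": 5, "ownership": 2}
merely_meets = {"coding": 3, "design": 3, "ownership": 3}

print(clears_bar(one_strong_signal, median))   # True
print(clears_bar(one_weak_dimension, median))  # False
print(clears_bar(merely_meets, median))        # False
```

The middle case is the one that surprises people: two outstanding dimensions do not offset one that falls below the median.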
Bar raiser rounds lean heavily on behavioral questions and Amazon's Leadership Principles. The interviewer selects one or two LPs and presses hard, following up on follow-ups, peeling the story back until the specifics either hold or collapse. This is a depth check: is this something you actually did, or a polished summary that falls apart under pressure?
Three behaviors consistently fail bar raisers. Blaming teammates for past failures. Giving vague answers about team dynamics. Narrating accomplishments with no data. "We improved the system" is a story. "We cut p99 latency from 400ms to 120ms across six services in one quarter" is evidence. Every behavioral answer needs a number attached to the result. Most engineers prep behavioral stories the night before and rely on memory and vibes. Bar raisers are trained to detect exactly that.
For a deeper breakdown of how veto power works in the debrief room, the Amazon Bar Raiser post covers the mechanics in detail.
How to Actually Adjust Your Strategy for Each Stage
For the phone screen: Keep it clean. Solve a medium problem completely rather than start a hard one and trail off. Communicate enough to show you're not silent, but don't narrate so heavily that you run out of time. Clear beats impressive when impressive risks a messy partial solution.
For the onsite: Treat consistency as the primary metric. Your fourth round counts the same as your first. The practical prep is to simulate sustained performance: run three or four problems back-to-back without a long break. When follow-up variants arrive, deliberately slow down. Pausing to say "let me think about what this changes" reads as stronger than an immediate guess, because it is.
For the bar raiser: Prepare every behavioral story with a concrete result. The structure is: what was the problem, what specifically did you do, what number proves it worked. Practice the layer below your prepared answer, because bar raisers will always go one level deeper than your planned version. If you know your story up to the result, practice explaining what you learned from it, how you would do it differently, and what the downstream effects were.
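If it helps to keep stories in a notes file, the three-part structure can be sketched as a small record with a self-check. Everything here is an illustrative assumption: the `Story` class, the placeholder problem and action text, and the digit check, which is a crude heuristic for "has a number" and nothing more. Only the two result strings come from the examples earlier in this post.

```python
import re
from dataclasses import dataclass


@dataclass
class Story:
    problem: str  # what was the problem
    action: str   # what specifically did you do
    result: str   # what number proves it worked

    def has_evidence(self) -> bool:
        """Crude check: does the result contain at least one digit?"""
        return re.search(r"\d", self.result) is not None


def percent_reduction(before: float, after: float) -> float:
    """Size of a drop as a percentage of the starting value."""
    return (before - after) * 100 / before


vague = Story(
    problem="(placeholder) slow checkout requests",
    action="(placeholder) profiled and rewrote the hot path",
    result="We improved the system",
)
concrete = Story(
    problem=vague.problem,
    action=vague.action,
    result="We cut p99 latency from 400ms to 120ms across six services in one quarter",
)

print(vague.has_evidence())         # False
print(concrete.has_evidence())      # True
print(percent_reduction(400, 120))  # 70.0
```

Knowing the derived figure (a 70 percent reduction) as well as the raw numbers is a cheap way to prepare for the first follow-up.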
SpaceComplexity runs voice-based mock interviews that simulate exactly this arc: the initial problem, the follow-up constraint shift, and the behavioral depth probing that makes onsite and bar raiser rounds difficult to practice alone on LeetCode. If you've been drilling problems but not doing actual spoken reps under pressure, the gap between your prep and the real thing is larger than it feels.
Three Rounds, Three Different Bars
- Phone screen: Prove you're not obviously unqualified. One medium problem, communicate throughout, don't reveal a fundamental gap. The bar here is a floor, not a ceiling.
- Onsite: Prove consistent depth across four to six rounds. Every write-up carries equal weight in the committee. Follow-up variants, not the initial solution, are where the round is actually decided.
- Bar raiser: Prove you'll raise the team's average. One strong signal plus no lowered bars is enough. Stories need data. Follow-ups will go deeper than you prepared.
The common mistake is treating all three as the same test with escalating difficulty. They're not. Each one is asking a different question about you. Know which question you're answering before you walk in.
If you want to understand what happens after you leave each round, the pieces on what your interviewer is writing while you work and how the hiring committee reads the full packet fill in the back half of the picture.
Further Reading
- How candidates are evaluated in coding interviews at top tech companies (Tech Interview Handbook)
- Amazon's Bar Raiser Program (AWS official)
- Technical interview performance is kind of arbitrary (interviewing.io)
- Software Engineering Interview Guide (Tech Interview Handbook)
- Google Coding Interview Prep (Educative)