Failing Project Interview Question: Your Answer Skips the Diagnosis

- Detection trigger separates strong answers: self-catching a problem before escalation signals the highest diagnostic skill
- Root cause diagnosis earns the score, not the execution of the fix — name the cause, not just the symptom
- De-escalation decision is the hardest part: cutting scope you already invested in requires reversing your own commitment
- 90% complete equals 0% complete until you integrate — experienced engineers scan for this signal weeks before anyone calls it a crisis
- Systemic change closes the answer: without it, you've described firefighting, not leadership
- STAR time split: 15% context, 60% diagnosis and intervention, 25% outcome and systemic fix
- Five killers: no detection trigger, actions without diagnosis, external blame, "worked harder" as the fix, no process change at the end
Every engineer who's been around long enough has a rescue story. The sprint from hell. The death march. The six-week crunch that somehow became six months. You showed up, you ground through it, you shipped the thing. Probably.
The problem is that everyone has this story. And when you tell it in an interview, you're telling the same story as every other candidate who has ever touched a troubled project. "It was bad. I helped. We delivered." The interviewer nods. The score is mediocre.
This question scores three things: how early you caught it, how hard the course correction was, and what you changed so it doesn't happen again.
Most answers nail the middle part and skip the other two. That's the gap.
The Standish Group CHAOS Report has tracked software project outcomes for decades. Recent editions put the share of projects that finish on time, on budget, and with full scope at roughly 31%. The other 69% either fail outright or limp to the finish line looking only vaguely like what was planned. Which means your war story is not special. What's special is whether you saw it coming.
What the Interviewer Is Actually Measuring
Three signals. Not one.
The first is the detection trigger. Did you catch the failure yourself, or did a stakeholder, a metric, or your manager surface it first? The earlier you caught it, and the more it was self-detected, the stronger the signal. Anyone can respond to a Slack escalation at 4pm on a Friday. Very few engineers develop a habit of reading weak signals before the situation becomes someone else's crisis.
The second is the de-escalation decision. Barry Staw's 1976 paper "Knee-Deep in the Big Muddy" documented something uncomfortable: people who are personally responsible for a failing decision are far more likely to keep throwing resources at it than people who merely inherited the same situation. The psychological pull to justify your past choices is real, and it is well documented.
Cutting scope, walking back an approach you spent three weeks on, having the uncomfortable conversation with a product director about a timeline that was never realistic: that is the hard part. That is also what the interviewer wants to hear.
The third is the systemic fix. What changed in your team's practice, your estimation process, or your communication protocol so this category of failure became less likely? Without this, you've described firefighting. With it, you've described engineering leadership.
Most answers cover the rescue execution in detail. That's the middle third. The detection and the systemic fix are what actually score.
The Diagnosis Gap
STAR (Situation, Task, Action, Result) is the right frame, but most candidates sprint through the S and T, land in Action, and describe what they did. The diagnosis gets one sentence. Maybe. That is the structural problem.
The Action section in this answer needs two distinct things, in order:
- How you diagnosed the problem: what signals you saw, what root cause you identified, how you separated symptom from cause
- What you did about it: the actual intervention
Jumping straight to the intervention without the diagnosis is like describing a surgery without the examination that led to it. The actions look impressive. There is no evidence of judgment.
Strong diagnosis answers three questions: what were the visible symptoms, what was the root cause beneath them, and how did you establish the difference?
For software projects, the visible symptom is almost always a missed deadline, or a sprint that looks fine on paper but produces nothing a human can touch. The root cause is usually one of a handful of things: scope crept incrementally without being acknowledged, a technical dependency was assumed rather than validated, teams worked in isolation and deferred integration to "later," or estimates were optimistic and nobody felt safe saying so.
Naming the root cause specifically is what earns the score. "We were falling behind" is a symptom. "We'd let scope grow inside every ticket without surfacing it to stakeholders, so we were permanently '80% complete' on tasks that had quietly tripled in complexity" is a diagnosis.

[Figure: When every component is "almost done" and nothing ships, this is what the backlog actually contained.]
The 90% Complete Problem
Most failing projects do not look like they are failing. They look like they are almost done.
Someone is always "nearly finished." Every task is "in progress." The board looks healthy. The stand-ups are calm. And yet two months have passed and a stakeholder cannot click on anything.
In software, 90% complete and 0% complete are functionally the same until you integrate. A system where every component is "almost done" has delivered nothing. The signal that exposes this is not a missed deadline. It is velocity data that looks fine while the working-software count stays at zero.
This is what experienced engineers learn to scan for: tasks that stay "in progress" for a week, sprint reviews where everything is "unblocked" but nothing closes, tickets estimated at two days that are now on day eight without a revised estimate. These are the weak signals that appear long before anyone calls the project failing.
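To make those weak signals concrete, here is a minimal sketch of a board scan in Python. Everything in it is an assumption for illustration: the `Ticket` fields, the `weak_signals` helper, the seven-day and 2x thresholds, and the sample tickets are hypothetical and do not come from any real tracker's API. It counts calendar days, so adjust it if your team thinks in working days.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class Ticket:
    # Hypothetical ticket record; field names are illustrative only.
    key: str
    status: str
    estimate_days: float
    started: date
    revised_estimate_days: Optional[float] = None


def weak_signals(tickets, today, stale_after_days=7, overrun_factor=2.0):
    """Return (key, reason) pairs for in-progress tickets that look stuck."""
    flagged = []
    for t in tickets:
        if t.status != "in progress":
            continue
        age = (today - t.started).days  # calendar days since work started
        # Signal 1: the ticket has sat "in progress" for a week or more.
        if age >= stale_after_days:
            flagged.append((t.key, f"in progress for {age} days"))
        # Signal 2: far past the original estimate and nobody revised it.
        if t.revised_estimate_days is None and age >= t.estimate_days * overrun_factor:
            flagged.append(
                (t.key, f"day {age} of a {t.estimate_days:g}-day estimate, no revision")
            )
    return flagged


# Sample board (made-up data).
board = [
    Ticket("PIPE-1", "in progress", 2, date(2024, 3, 4)),
    Ticket("PIPE-2", "in progress", 5, date(2024, 3, 10)),
    Ticket("PIPE-3", "done", 3, date(2024, 3, 1)),
]

for key, reason in weak_signals(board, today=date(2024, 3, 12)):
    print(key, "-", reason)
```

With the sample board, only PIPE-1 is flagged: once for sitting in progress for eight days, and once for being on day eight of a two-day estimate with no revision. The point is not the script. It is that these signals are sitting in data you already have, weeks before a deadline slips.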
The ability to read these signals early, before they compound into a crisis, is exactly what this question is designed to surface. At the senior level, the interviewer is not trying to find out whether you know how to work hard. They are trying to find out whether you are calibrated enough to know when hard work is going in the wrong direction.
How to Structure the Answer
Beat 1: Set context briefly. Two sentences on the project, your role, and the stakes. No more than 15% of the answer. The interviewer does not need a project specification. Nobody does.
Beat 2: Describe the detection trigger. This is where you establish credibility. Explain specifically what signal you noticed, how you spotted it, and what told you it was not noise. A velocity chart that had been flat for three weeks. Every sprint task marked "in progress" with nothing closing. A stakeholder who switched from weekly to daily status check-ins, unprompted. Be precise about what you saw and when. Vague is worse than short.
Beat 3: Diagnosis, then intervention. First explain what root cause you found. Then explain what you gave up: the scope that got cut, the technical approach you walked back, the timeline you reset. The reversal is the story. What did you stop doing that you had already invested in?
Beat 4: Outcome and systemic change. Name the actual result, even if imperfect. Then describe what changed in your process or practice. The story ends with a mechanism, not just a delivery.
Time split: roughly 15% context, 60% diagnosis and intervention, 25% result and systemic change.
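If it helps to see that split in seconds, the arithmetic is simple. The sketch below assumes a three-minute answer, which is an assumed length and not a rule; `time_budget` is a hypothetical helper written for this example.

```python
def time_budget(total_seconds, split=None):
    """Convert a percentage split into whole seconds per section."""
    if split is None:
        # The 15 / 60 / 25 split described above.
        split = {
            "context": 15,
            "diagnosis and intervention": 60,
            "result and systemic change": 25,
        }
    if sum(split.values()) != 100:
        raise ValueError("split must add up to 100 percent")
    return {section: round(total_seconds * pct / 100) for section, pct in split.items()}


# Assumed answer length: 3 minutes.
for section, seconds in time_budget(180).items():
    print(f"{section}: {seconds}s")
```

For a three-minute answer that works out to 27 seconds of context, 108 seconds of diagnosis and intervention, and 45 seconds of result and systemic change. If your context is still running at the one-minute mark, you have already spent time that belonged to the diagnosis.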
A Concrete Example
Context: I was tech lead on a data pipeline rebuild. Three months in, twelve engineers working across five microservices, nothing deliverable.
Detection: Sprint velocity looked fine on paper. But we hadn't shipped anything a stakeholder could touch in six weeks. I pulled up the board and found every major task had been "in progress" for at least two weeks. We had a 90% complete problem: every component was "almost done" and nothing was fully done.
Diagnosis: The root cause was not execution speed. We had broken the system into microservices before validating the interfaces between them. Each team was optimizing their service in isolation. We had no integration tests and kept deferring integration to "later." There was no later. We were also estimating at the story level, which hid enormous complexity inside individual tickets without anyone flagging it.
Intervention: I called a two-hour scope audit with all the tech leads. We reclassified every remaining task as phase-1 must-have or defer. Forty percent of the backlog moved out. I shifted to interface-first integration testing: every piece of work had to produce something that ran in the full system, not just in isolation. Then I had a hard conversation with the product director. The original timeline was wrong and I should have flagged the integration risk earlier. I gave her a revised estimate with a confidence interval instead of a single date.
Result: We shipped phase 1 five weeks later. Full delivery was three months after that, one month late on the original plan, versus the six-plus months we were tracking toward. I changed our estimation process so every epic required a 30-minute breakdown into sub-day tasks before we committed, surfacing scope risk at planning instead of mid-sprint.
That answer covers all three signals. I caught it myself, before anyone escalated. I cut scope and reset the timeline. And I changed the estimation process so the same failure mode became visible earlier.
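The estimation change in that example can be sketched as a simple planning gate. This is one possible reading of "sub-day tasks," not a standard practice: the `needs_breakdown` helper, the one-day threshold, and the epic names are all hypothetical.

```python
def needs_breakdown(epics, max_task_days=1.0):
    """Return names of epics that are not yet safe to commit to.

    `epics` maps an epic name to a list of task estimates in days.
    An epic fails the gate if it has no tasks at all, or if any
    single task is estimated at a day or longer.
    """
    return [
        name
        for name, task_days in epics.items()
        if not task_days or any(d >= max_task_days for d in task_days)
    ]


# Made-up planning data for illustration.
planning = {
    "ingest-service": [0.5, 0.5, 0.75],
    "schema-registry": [3.0],  # one opaque multi-day ticket
    "backfill-job": [],        # never broken down
}

print(needs_breakdown(planning))  # ['schema-registry', 'backfill-job']
```

The gate does not make the estimates accurate. It only forces the hidden complexity inside a large ticket to show up at planning, which is the mechanism the example answer describes.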
Five Killers

[Figure: Staw's 1976 finding in comic form: the more personally invested you are, the harder you push after the signal to stop.]
No detection trigger. The story starts with the project already in obvious crisis. Someone escalated to you, or it became impossible to ignore. This signals reactivity. You are describing a response to a fire alarm, not a habit of smelling smoke.
Actions without diagnosis. You explain what you did but not what you found. The intervention looks arbitrary. An interviewer cannot score your judgment from a list of actions without the reasoning that produced them. "We held a two-hour scope audit" is meaningless without "because we found that 40% of the backlog was scope that had drifted in over three months."
External blame. Requirements changed. The tech was unproven. The timeline was always unrealistic. These may all be true. But without owning what you missed or what you would catch earlier next time, the story becomes a complaint. Complaints do not score.
The turnaround was "working harder." If the recovery involved extended hours and grinding through a backlog, you have described escalating commitment rather than changing course. At the senior level, this is the wrong story. Interviewers want to hear what you stopped doing, not how hard you pushed on what was not working.
No systemic change. The story ends with delivery. Without describing what changed in process or practice, you have demonstrated you can respond to a crisis. You have not demonstrated you can prevent the next one. Those score differently.
The production incident interview question follows the same arc: diagnosis, intervention, prevention. If you have prepared one well, the structure transfers directly.
The Short Version
- The question scores detection trigger, de-escalation decision, and systemic fix.
- Most answers skip the diagnosis. That is where the score is.
- The hardest part of any turnaround is not the execution. It is reversing a course you personally committed to.
- Strong answers name the root cause, not just the symptoms.
- In software, 90% complete is functionally 0% complete until you integrate. Learning to see this early is the skill.
- Time split: 15% context, 60% diagnosis and intervention, 25% result and systemic fix.
- End with what changed, not just what shipped.
If you want to practice this out loud against a rubric, SpaceComplexity runs voice-based mock interviews with structured feedback on every dimension of your answer. Reading about the structure is useful. Saying it under pressure is where the gaps actually show up.
For related behavioral questions with the same structural logic, see "tell me about a time you failed" and "recovered from a bad decision."