How Companies Detect AI Cheating in Coding Interviews

You passed the take-home. You crushed the recruiter screen. Now you're forty minutes into a live coding round, and you just produced a perfect O(n log n) solution to a medium-hard graph problem in under three minutes. Zero thinking aloud. Zero wrong turns. Zero questions about the input constraints. The interviewer makes a note.

Not a good one.

Between July 2025 and January 2026, Fabric analyzed 19,368 AI-powered interviews and found that 38.5% of candidates triggered cheating signals of some kind. The rate climbed from 9% in July to 45% by September and never really came back down. AI cheating in coding interviews is no longer an edge case. It's close to the median.

Here is what interviewers and proctoring systems actually watch for, and why the only reliable strategy is being genuinely good.

The Tool Behind All of This

In early 2025, two Columbia students built Interview Coder, an app that could listen to an interview in real time, send the question to an LLM, and display the answer invisibly. When Columbia suspended them, they dropped out, rebranded as Cluely, and raised $5.3 million in seed funding, followed by a $15 million Series A from Andreessen Horowitz.

Dropout story. AI pivot. a16z money. You know how this goes.

The core trick is a graphics-layer exploit. On Windows, Cluely uses DirectX overlays; on macOS, it uses Metal framework layers. Both render at a depth that sits below what Zoom, Teams, and Google Meet capture when you share your screen. You see the AI-generated solution floating over your IDE. Your interviewer sees a clean screen.

That is the specific threat companies and platforms are now trying to detect. And they have gotten quite good at it.

[Image: Linus Torvalds' minimal desk setup vs. an elaborate triple-monitor ChatGPT setup for copying code]

Setup required to build Linux. Setup required to cheat your way through a 45-minute graph problem.

Signal One: The Answer Arrives Before the Thinking Does

In a real interview, response time correlates with difficulty. A candidate who knows arrays asks one clarifying question about edge cases and starts coding in ninety seconds. The same candidate stalls on a graph problem, talks through three wrong approaches, and recovers. That variance is normal. It is what thinking looks like.

AI-assisted candidates show a different signature: uniform latency after every question. The tool needs three to five seconds regardless of difficulty. Audio gets captured, sent to the LLM, and processed; then the answer renders. Harder questions don't take longer, because the AI doesn't care whether the problem is two-sum or shortest path in a weighted directed graph. That flat response curve across difficulty is one of the strongest behavioral signals interviewers report noticing.
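To make the "flat response curve" concrete, here is a minimal sketch of how a reviewer might quantify it. Everything in it is an assumption for illustration: the `(difficulty, seconds)` input format, the function names, and both thresholds are hypothetical, and no platform mentioned in this article is known to compute it this way.

```python
from statistics import mean, pstdev

def latency_profile(samples):
    """samples: list of (difficulty_level, seconds_until_first_answer) pairs."""
    xs = [d for d, _ in samples]
    ys = [t for _, t in samples]
    mx, my = mean(xs), mean(ys)
    # Coefficient of variation: how much latency spreads relative to its mean.
    cv = pstdev(ys) / my if my else 0.0
    # Least-squares slope: extra seconds per additional difficulty level.
    var_x = sum((x - mx) ** 2 for x in xs)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = cov / var_x if var_x else 0.0
    return {"mean": my, "cv": cv, "slope": slope}

def looks_flat(samples, max_cv=0.25, min_rel_slope=0.1):
    """Hypothetical rule: low spread AND latency barely grows with difficulty."""
    p = latency_profile(samples)
    return p["cv"] < max_cv and p["slope"] < min_rel_slope * p["mean"]

# Illustrative, made-up timings (difficulty 1 = easy, 5 = hard).
human = [(1, 20), (2, 45), (3, 90), (4, 200), (5, 310)]
assisted = [(1, 4.1), (2, 4.4), (3, 3.8), (4, 4.5), (5, 3.9)]

print(looks_flat(human), looks_flat(assisted))
```

A rule like this would only ever be one input among many. A strong candidate on a run of easy questions can look flat too, which is why the signal is meaningful only when difficulty actually varies.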

The suspicious case isn't just speed. It's uncanny speed combined with immediate optimality. Real candidates almost always propose a brute force first. They explore. A candidate who skips directly to an O(n log n) sort-then-two-pointer approach with perfect variable names, a clean edge case analysis, and zero hesitation is doing something that very few human beings can actually do under pressure.

Signal Two: The Eyes Give It Away

Interviewers on video calls have a clear view of your face, and your eyes are harder to control than your hands.

Reading text produces horizontal saccades. Thinking produces inward focus, upward drift, or brief defocusing. When a candidate is reading a rendered overlay to their left or right, their eye movements look like reading, not like reasoning. Proctoring platforms like Fabric and Talview track gaze direction, blink rate, and focus consistency to build a behavioral fingerprint across the interview.
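As a rough illustration of what "looks like reading" could mean in data, the toy heuristic below counts reading-style sweeps in a trace of horizontal gaze positions: several small rightward steps followed by one large jump back to the left. The input format (x-coordinates normalized to 0..1), the function name, and the thresholds are all assumptions made for this sketch. It is not how Fabric, Talview, or any other vendor is known to implement gaze analysis.

```python
def count_reading_sweeps(xs, min_steps=3, return_jump=0.15):
    """Count left-to-right runs that end in a large leftward return.

    xs: horizontal gaze positions sampled at a fixed rate, normalized to 0..1.
    """
    sweeps = 0
    rightward = 0
    for prev, cur in zip(xs, xs[1:]):
        dx = cur - prev
        if dx <= -return_jump:
            # Large leftward jump: counts as a line return if a run preceded it.
            if rightward >= min_steps:
                sweeps += 1
            rightward = 0
        elif dx > 0:
            rightward += 1
        # Small leftward jitter is ignored and does not reset the run.
    return sweeps

# Made-up traces: one scanning lines of text, one hovering near the centre.
reading = [0.10, 0.15, 0.20, 0.25, 0.30, 0.10, 0.15, 0.21, 0.26, 0.31, 0.11, 0.16]
thinking = [0.50, 0.52, 0.49, 0.50, 0.48, 0.51, 0.50]

print(count_reading_sweeps(reading), count_reading_sweeps(thinking))
```

Real gaze data is far noisier than this, and plenty of honest candidates read the problem statement repeatedly. A sweep count only becomes interesting when it appears right after each question and away from where the problem text sits on screen.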

A UC Berkeley School of Information project in 2025 investigated whether gaze and pupil tracking alone can reliably separate cheating from non-cheating behavior using open-source eye-tracking hardware. Early results are promising enough that multiple platforms now combine eye tracking with at least four other signals before flagging anyone.

Experienced interviewers describe candidates whose eyes drift consistently to one side after each question, or who seem to be reading rather than thinking. One hiring manager described it as "watching someone sight-read sheet music versus improvise." The cadence is different.

Some companies have gone lower-tech. A viral tweet in late 2025 described a ByteDance interviewer who simply said, mid-interview: "Close your eyes and answer this question." Absurd. Also effective.

Signal Three: You Can't Explain Your Own Code

This is the one that ends interviews. If you wrote the code, you can answer questions about it. If you copied it from an AI overlay, you probably cannot.

Follow-up questions have become the primary human defense. A typical sequence after you submit a solution:

"Walk me through line seven. Why did you initialize the visited set before the loop instead of inside it?"
"What happens if the graph has negative edge weights?"
"Can you modify this to return the actual path, not just the cost?"

None of these require memorization. If you understand what you wrote, they take seconds. If you're working from a solution you read off a screen and typed out, you will stall, guess, or give answers that contradict the code you just wrote.
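To see why those three questions are quick for someone who wrote the code, consider a hypothetical submission they could be aimed at: a standard Dijkstra shortest-path routine, already extended to return the path. This is an illustrative sketch, not a solution taken from any real interview, and its line numbers will not match the "line seven" in the question above.

```python
import heapq

def shortest_path(graph, source, target):
    """graph: {node: [(neighbor, weight), ...]}, all weights non-negative.

    Nodes should be mutually comparable so the heap can break ties.
    Returns (cost, path); (inf, []) if target is unreachable.
    """
    dist = {source: 0}
    prev = {}                 # predecessor map, used to rebuild the path
    visited = set()           # created once, before the loop
    heap = [(0, source)]
    while heap:
        d, node = heapq.heappop(heap)
        if node in visited:
            continue          # stale heap entry for an already-settled node
        visited.add(node)
        if node == target:
            break
        for nbr, w in graph.get(node, []):
            nd = d + w
            if nd < dist.get(nbr, float("inf")):
                dist[nbr] = nd
                prev[nbr] = node
                heapq.heappush(heap, (nd, nbr))
    if target not in dist:
        return float("inf"), []
    path = [target]
    while path[-1] != source:
        path.append(prev[path[-1]])
    return dist[target], path[::-1]

graph = {"a": [("b", 1), ("c", 4)], "b": [("c", 2), ("d", 6)], "c": [("d", 3)], "d": []}
print(shortest_path(graph, "a", "d"))
```

The answers fall out of the code. The visited set lives outside the loop because re-creating it on every iteration would forget which nodes are already settled. Negative edge weights break the assumption that a popped node's distance is final, so the usual fix is a different algorithm such as Bellman-Ford. Returning the path only requires the `prev` map and a walk backward from the target.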

[Image: ChatGPT says "I scanned your GitHub and stole your code." Programmer replies: "Cool. Did you get it to work?"]

The interview version of this ends with "walk me through what you stole."

This is the interrogation layer that AI tools cannot yet reliably help with. The overlay gives you code. It cannot teach you to explain that code in real time, in your own voice, with full context about what you said ten minutes earlier. The multi-turn conversational trap is where the deception collapses.

Strong interviewers also introduce deliberate errors into a candidate's solution and ask them to debug it. If you understand the code, you spot the error. If you don't, you stare at it and either say nothing or describe symptoms without finding the cause. This maps directly to what separates good debugging from guessing.

Signal Four: Keystroke Dynamics Don't Match Real Coding

Platforms that let you code in a browser-based IDE collect everything. Not just the final solution, but every keystroke, every backspace, every pause, every edit.

Real coding has noise. You type a variable name, delete two characters, retype it. You write a conditional, then go back and add the else branch. You pause in the middle of a line because you're working something out. The edit history looks like a mental process made physical.

AI-assisted code has a different fingerprint. A block of correct, clean code appears in a short burst. No false starts. No reconsidering. Minimal backspaces. The keystroke entropy is low. HackerRank's behavioral model tracks dozens of signals including submission timing, code similarity against known solutions, and typing patterns. Academic work on keystroke dynamics puts detection accuracy somewhere between 75% and 86% in controlled conditions.

Copy-paste events are also logged. Pasting a block of code from outside the IDE window is flagged immediately on most platforms. Even if you sidestep this by retyping the overlay's answer by hand, the low-backspace, clean-burst signature still persists.
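The features described in this section can be sketched from a raw event log. The version below assumes a hypothetical log of `(timestamp_ms, kind)` tuples where `kind` is `"char"`, `"backspace"`, or `"paste"`. The feature names and the 100 ms binning are illustrative choices, not HackerRank's or anyone else's actual model.

```python
import math
from collections import Counter

def keystroke_features(events, bin_ms=100, max_bin=20):
    """events: list of (timestamp_ms, kind), kind in {"char", "backspace", "paste"}."""
    kinds = [kind for _, kind in events]
    typed = [(t, kind) for t, kind in events if kind != "paste"]
    # Gaps between consecutive typed keys, bucketed into coarse time bins.
    gaps = [b[0] - a[0] for a, b in zip(typed, typed[1:])]
    bins = Counter(min(g // bin_ms, max_bin) for g in gaps)
    total = sum(bins.values())
    # Shannon entropy of the gap distribution: low when rhythm is uniform.
    entropy = -sum((c / total) * math.log2(c / total) for c in bins.values()) if total else 0.0
    return {
        "backspace_ratio": kinds.count("backspace") / len(typed) if typed else 0.0,
        "paste_events": kinds.count("paste"),
        "gap_entropy_bits": entropy,
    }

# Made-up logs: a hesitant human session and a steady transcription burst.
noisy = [(0, "char"), (150, "char"), (900, "backspace"), (1000, "char"),
         (1400, "char"), (3600, "backspace"), (3700, "char"), (3950, "char")]
burst = [(i * 120, "char") for i in range(40)]

print(keystroke_features(noisy))
print(keystroke_features(burst))
```

In this toy data the human session has a backspace ratio of 0.25 and a spread-out gap distribution, while the transcription burst has no backspaces and every gap in the same bin. Real sessions sit somewhere in between, which is consistent with the imperfect accuracy figures reported for keystroke-based detection.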

How Platforms Are Catching AI Cheating in Coding Interviews

Human interviewers catch obvious cases. Platforms catch the subtle ones.

HackerEarth's Smart Browser locks the candidate's environment, monitors for forbidden tools, and takes periodic AI-powered snapshots. HireVue disables copy-paste, tracks mouse movements, and monitors audio for external voices. Talview added a secondary camera for candidates, claiming their system detects eight times more infractions than traditional AI monitoring.

Fabric's system does something different: it generates a cheating probability score from 20+ simultaneous signals across biometric, telemetric, and content dimensions. Not a single flag, but a composite score with timestamped evidence so a reviewer can see exactly when the behavioral anomalies occurred. In their dataset, 61% of flagged cheaters would have advanced through a standard hiring process without this kind of layered detection.
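Fabric has not published how its score is computed, so the following is only a sketch of the general idea: combine several weak signals into one probability and keep the timestamped evidence alongside it. The signal names, weights, bias, and the logistic combination are all assumptions invented for this example.

```python
import math
from dataclasses import dataclass

@dataclass
class Signal:
    name: str
    timestamp_s: float   # when in the interview the anomaly occurred
    strength: float      # 0.0 (nothing unusual) to 1.0 (strong anomaly)

# Hypothetical weights and bias; a real system would learn these from data.
WEIGHTS = {"flat_latency": 2.0, "reading_gaze": 1.5, "clean_burst": 1.5, "paste_event": 2.5}
BIAS = -3.0

def cheating_probability(signals, evidence_threshold=0.5):
    """Return (probability, evidence), with evidence sorted by timestamp."""
    z = BIAS + sum(WEIGHTS.get(s.name, 0.0) * s.strength for s in signals)
    prob = 1.0 / (1.0 + math.exp(-z))   # logistic squashing into 0..1
    evidence = sorted(
        (s for s in signals if s.strength >= evidence_threshold),
        key=lambda s: s.timestamp_s,
    )
    return prob, evidence

session = [
    Signal("paste_event", 1260.0, 1.0),
    Signal("flat_latency", 300.0, 1.0),
    Signal("reading_gaze", 410.0, 1.0),
    Signal("clean_burst", 1275.0, 1.0),
]
prob, evidence = cheating_probability(session)
print(round(prob, 3), [(s.name, s.timestamp_s) for s in evidence])
```

The design point is the second return value. A bare probability gives a reviewer nothing to check, whereas a list of what fired and when lets a human look at those moments in the recording and decide.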

Zero Assist, a newer entrant, focuses specifically on real-time detection during live technical interviews rather than asynchronous assessments.

When the Format Is the Defense

The smarter response isn't just better detection. It's making the AI less useful.

Standard LeetCode-style problems are the easiest thing in the world for an LLM to solve, because the entire problem space is in the training data. Companies that have moved away from canonical problems toward vague, real-world debugging scenarios report sharply lower cheating rates. "Here is a production service log from last Tuesday, the latency spiked at 2:47 AM, tell me what happened" is not a problem Cluely can solve by recalling a known solution from its model's training data.

Some companies have returned to in-person interviews. Google's CEO raised this at an internal town hall in early 2025. Others have moved to multi-stage interrogation formats: a short coding exercise followed by an extended discussion that assumes full comprehension of every line.

The formats that survive the AI-cheating era reward what old formats only pretended to test: understanding, not just output.

The Gap Closes on Day One

The candidate who cheated through the technical round will face a first code review, a first production incident, and a first debugging session with a senior engineer watching. The gap between claimed and actual ability closes very fast.

There is also a more immediate concern. The detection systems are improving faster than the cheating tools. When Cluely's overlay trick gets patched at the platform level (platforms are actively developing counter-measures to the GPU-layer exploit), candidates who built their prep around the tool will have nothing. The ones who built it around actual pattern recognition will be fine.

The real interview is multi-turn. It involves clarifying questions, trade-off discussions, edge case exploration, and follow-ups. If you practice the way deliberate practice actually demands, those conversations become natural. The AI cannot replicate the fluency that comes from having genuinely worked through two hundred problems with full comprehension.

Practicing with voice-based mock interviews at SpaceComplexity is one way to stress-test that fluency before you're in the actual room. The platform runs you through the same multi-turn interrogation format now standard at companies that have hardened against AI cheating: problem setup, approach discussion, live coding, and follow-up questions you have to answer in your own words.

The Bottom Line

Fabric found cheating signals in 38.5% of 19,368 AI-powered interviews between July 2025 and January 2026; the monthly rate rose from 9% in July to 45% by September.
Tools like Cluely use GPU-layer exploits to render invisible overlays below the screen-capture depth of Zoom and Teams.
One of the strongest behavioral signals is uniform response latency across difficulty: thinking takes variable time, AI tools take constant time.
The follow-up question layer breaks most AI assistance: if you can't explain line seven, you didn't write line seven.
Platform-level detection combines gaze tracking, keystroke dynamics, copy-paste events, behavioral timing, and code similarity into composite cheating scores.
The formats being deployed to defeat AI (real-world debugging, custom problems, interrogation-style follow-ups) test understanding more directly than canonical LeetCode problems ever did.
Genuine understanding is not just the ethical path. It's the only one that holds up under scrutiny.