Do AI Detectors Really Work? The Truth Behind False Positives

Andrew Ng · 3 hours ago

In late 2025 and into 2026, a strange trend started showing up in higher education reporting: students who had never touched ChatGPT were paying for “AI humanizer” tools — not to cheat, but to make their own, entirely human writing pass detection software that kept flagging it anyway. One educator quoted in the coverage put it bluntly: students were now trying to prove they’re human, despite never having used AI at all.

That sentence is a good summary of where AI detection has landed by 2026, and it’s worth dwelling on after our earlier comparison of the free detection tools. There, we focused on why accuracy claims contradict each other from one review to the next. Here, the question is narrower and more serious: when these tools get it wrong, who pays for the mistake, and is anything meaningful being done about it?

What the Research Actually Says

The foundational study on this problem came out of Stanford in 2023, and its core finding hasn’t aged well for the detector industry: GPT-style detectors consistently misclassified non-native English writing as AI-generated, while accurately identifying writing from native English speakers in the same test set. A 2026 follow-up extended that work specifically on TOEFL essays and found a mean false positive rate of 61.3% for essays written by Chinese students, compared to just 5.1% for essays from US students run through the same detector under the same conditions. Researchers link the gap to “low-perplexity” writing patterns — simpler, more predictable sentence construction — that show up disproportionately in writing from people working in a second language, and that detectors read as a signal of AI generation.

It isn’t only a language-background problem, either. A widely cited academic study of real disciplinary cases found that 78% of students formally accused of using AI on an assignment had, in fact, written it themselves. Even tools with genuinely low false-positive rates fail in aggregate at scale: one analysis pointed out that Copyleaks’ own claimed 0.02% false positive rate would still produce roughly 80 falsely accused students per year at a single 20,000-student university, once you multiply that tiny percentage across every assignment in every course (80 flags at that rate corresponds to about 400,000 checked submissions, or 20 per student). And Turnitin’s own Chief Product Officer has publicly acknowledged a 4% false positive rate for its AI-writing detection, a number that, at a school the size of Ohio State, works out to thousands of students a year.
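The scale arithmetic behind those two figures is easy to check. The sketch below is a back-of-the-envelope calculation, not a model of any real campus: the function name is ours, the 20 checked submissions per student is an assumption chosen because it reproduces the "roughly 80" figure, and the 60,000-student enrollment is an illustrative stand-in for a school the size of Ohio State. It also assumes every submission is human-written, since a false positive rate only applies to honest work.

```python
def expected_false_flags(students: int,
                         submissions_per_student: float,
                         false_positive_rate: float) -> float:
    """Expected number of human-written submissions wrongly flagged per year."""
    return students * submissions_per_student * false_positive_rate


# Copyleaks' claimed 0.02% rate at a 20,000-student university.
# ASSUMPTION: 20 checked submissions per student per year (not stated in the
# analysis; it is the value that reproduces the "roughly 80" figure).
copyleaks = expected_false_flags(20_000, 20, 0.0002)

# Turnitin's acknowledged 4% rate.
# ASSUMPTION: about 60,000 students, each with only ONE checked submission.
turnitin = expected_false_flags(60_000, 1, 0.04)

print(round(copyleaks), round(turnitin))
```

Even with the conservative one-submission assumption, the 4% rate lands in the thousands, which is why a percentage that sounds small on a product page looks very different on an academic integrity docket.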

Why This Keeps Happening

The technical explanation is the same one we covered in the detector comparison: these tools measure statistical patterns like predictability and sentence-length variation, not some unique AI signature. The uncomfortable part is what that means in practice. The writing styles most likely to get flagged — simple, direct, formulaic sentence structure — overlap heavily with how people write in a non-native language, how technical and scientific writing is conventionally taught, and how students are explicitly instructed to write in standardized test prep. The detector isn’t catching dishonesty. It’s catching a writing style, and assuming a cause.
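To make "sentence-length variation" concrete, here is a toy sketch of one such statistic. It is not how any commercial detector works, as far as we know: real tools lean on language-model predictability scores and proprietary thresholds. The function names are hypothetical, and the sentence splitter is deliberately crude. The point is only that a plain, uniform style and a varied one produce different numbers without saying anything about who wrote the text.

```python
import re
import statistics


def sentence_lengths(text: str) -> list[int]:
    """Split on ., !, ? and return the word count of each non-empty sentence."""
    sentences = [s.strip() for s in re.split(r"[.!?]+", text) if s.strip()]
    return [len(s.split()) for s in sentences]


def length_variation(text: str) -> float:
    """Coefficient of variation of sentence length (std dev / mean).

    Lower values mean more uniform sentences.
    Returns 0.0 when there are fewer than two sentences.
    """
    lengths = sentence_lengths(text)
    if len(lengths) < 2:
        return 0.0
    return statistics.pstdev(lengths) / statistics.mean(lengths)


# Two human-written samples: one plain and formulaic, one uneven.
uniform = "The test was hard. The class was long. The essay was short."
varied = ("I failed. After three weeks of late nights and rewritten drafts, "
          "the essay finally came together. Barely.")

print(length_variation(uniform) < length_variation(varied))
```

Both samples here were typed by a person, yet the first scores as perfectly uniform. A tool that treats low variation as suspicious would flag it on style alone.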

OpenAI itself has weighed in on this, stating plainly that its own research did not find AI detectors reliable enough for educators to base consequential judgments about students on them. That’s a notable thing for the company that makes ChatGPT to say out loud, and it’s part of why the institutional response over the past year has shifted so sharply.

What Universities Are Actually Doing About It

The policy response has moved faster than most people outside higher ed probably realize. More than 25 major universities — including MIT, Yale, NYU, UC Berkeley, the University of Toronto, the University of British Columbia, Macquarie University, and the University of Manchester — have banned or significantly restricted the use of AI detection tools in disciplinary decisions. UCLA reviewed Turnitin’s AI detection accuracy internally and declined to adopt it at all. The University of Sydney has gone further, stating its educators have never had institutional access to AI detection software in the first place, citing the same false-positive and false-negative research discussed above.

It isn’t just policy memos, either. A Yale student filed what may be the first lawsuit of its kind against the university over a false AI accusation, and a University of Michigan student filed a similar suit in 2026. Courts are beginning to establish a position that an AI detection score, on its own, doesn’t constitute sufficient evidence of misconduct — which is a meaningful shift from how these scores were often treated just a year or two earlier.

To be clear, this is general information, not legal advice — if you’re personally facing a misconduct proceeding tied to a flagged AI score, talk to your school’s student advocate or ombudsperson, or consult an education attorney who handles these cases specifically.

If You’ve Been Wrongly Flagged, Here’s What Actually Helps

Buying a tool to rewrite your own writing so it scores lower isn’t a real fix. It treats a symptom of a broken system, and it’s the same technique some actual cheaters use to evade detection after the fact, so using it sends a confusing signal either way. The research and the university policy shifts above point to approaches that are more effective:

Keep your drafting process documented as you go, not after you’re flagged. Google Docs and Word both retain version history automatically; that revision trail is far stronger evidence of authorship than any detector score could ever be evidence against you.
Ask for the specifics, not just the accusation. A vague “this seems AI-generated” should be met with a request for the actual report — which sections were flagged, on what basis, and using which tool.
Push for a conversation, not just a score. The strongest-performing institutional policies we found treat a flag as the start of a review — asking a student to explain specific paragraphs or walk through their sources — rather than as a finding on its own.
Know that the legal and policy ground has shifted under this issue. A flagged score that might have ended a case in 2023 increasingly isn’t treated as sufficient on its own in 2026, and citing that shift (the university bans, the lawsuits, OpenAI’s own public statement) is a legitimate part of responding to an accusation.

Quick FAQ

Are schools required to tell students which AI detector they use?

There’s no universal rule — policy varies enormously by country, institution, and sometimes even by individual instructor. Checking your specific syllabus and institutional policy is the only reliable way to know.

Does a low false-positive rate mean a detector is basically safe to rely on?

Not at the scale a university operates at. Even a tool with a genuinely low rate, like Copyleaks’ claimed 0.02%, still produces real false accusations once that percentage is applied across thousands of students and assignments a year.

Is this specifically a problem for international or non-native English-speaking students?

The research base is strongest there — the original Stanford study and the 2026 TOEFL follow-up both found dramatically higher false-positive rates for non-native English writing. But formulaic or highly technical writing styles in general appear to carry elevated risk too, regardless of a writer’s first language.

If a university won’t reverse a false accusation, is there any recourse?

Some students have pursued formal appeals or, in a small but growing number of cases, litigation — the Yale and Michigan cases referenced above are examples. This is genuinely a legal question rather than a general one, so it’s worth speaking to someone qualified rather than relying on general advice from an article like this one.

The Bottom Line

“Do AI detectors work?” turns out to be the wrong question. They work in the narrow sense that they produce a score, and that score is sometimes right. The better question — the one universities, courts, and even the companies that make these tools are now actively wrestling with — is whether a probabilistic guess should ever have been treated as proof in the first place. As of 2026, the institutional answer is increasingly no, even as the tools themselves remain in daily use across classrooms that haven’t caught up yet.